Neural Audio Codec with Continuous Vectors #8
manmay-nakhashi
started this conversation in
Ideas
Replies: 1 comment 2 replies
-
@manmay-nakhashi yeah, they do assert that, but they never showed any experiments comparing the two. What they actually did in the paper follows all the other recent successes: they used the SoundStream architecture with residual VQ, and even had a special loss for each quantizer codebook.
2 replies
-
@lucidrains This is something different from EnCodec and SoundStream:
"We use a neural audio codec to convert speech waveform into continuous vectors instead of discrete tokens"
and
"The audio encoder consists of several convolutional blocks with a total downsampling rate of 200 for 16 kHz audio."
That means each second of 16 kHz audio (16,000 samples) is compressed into 80 continuous vectors.
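The arithmetic behind that number can be checked with a quick sketch. Only the 16 kHz sample rate and the downsampling rate of 200 come from the quoted text; the function and variable names are made up for illustration, and this says nothing about the dimensionality of each vector.

```python
def frames_per_second(sample_rate_hz: int, downsampling_rate: int) -> float:
    """Number of latent frames the encoder emits per second of audio."""
    return sample_rate_hz / downsampling_rate


# Values quoted in the thread: 16 kHz audio, total downsampling rate of 200.
sample_rate_hz = 16_000
downsampling_rate = 200

fps = frames_per_second(sample_rate_hz, downsampling_rate)
frame_ms = 1000.0 / fps  # duration of audio covered by one latent frame

print(f"{fps:.0f} frames per second, {frame_ms} ms of audio per frame")
```

So the encoder emits 80 vectors per second, each covering 12.5 ms of audio.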