SYS/02 — NEURAL AUDIO
KrisCodec
Music-specialized neural audio codec, built from first principles, producing a custom .kris format.
GitHub ↗
PyTorch · RVQ · Audio DSP
- TESTS: 55 passing
- DECODE: ~7.4 ms
- QUANT: RVQ
- ACTIVATION: Snake
KrisCodec is a neural audio codec specialized for music, written from first principles to understand codec design end to end — from the DSP front end through vector quantization to the decoder.
ARCHITECTURE
The codec uses Snake activations, periodic activations suited to audio's oscillatory structure, and a Residual Vector Quantizer (RVQ) to discretize the latent into a compact, layered code: each quantizer stage encodes the residual left by the stage before it. The encoded output is written to a custom .kris format.
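As a rough illustration of the two components named above, here is a minimal, dependency-free sketch. It assumes the standard Snake form x + (1/α)·sin²(αx) and nearest-neighbor RVQ over fixed toy codebooks. It is written in plain Python rather than PyTorch so that it runs anywhere; the function names, the codebook values, and the two-stage depth are illustrative assumptions, not KrisCodec's actual code or configuration.

```python
import math

def snake(x, alpha=1.0):
    """Snake activation: x + (1/alpha) * sin^2(alpha * x)."""
    return x + (math.sin(alpha * x) ** 2) / alpha

def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def sq_dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))

def rvq_encode(codebooks, vec):
    """Quantize vec stage by stage; each stage codes the previous residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        # Subtract the chosen entry so the next stage sees only what is left.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices

def rvq_decode(codebooks, indices):
    """Reconstruct by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, i in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out

# Hypothetical toy codebooks: a coarse stage and a finer stage (not learned).
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25), (0.25, 0.25)],
]

if __name__ == "__main__":
    latent = (1.2, 0.3)
    codes = rvq_encode(CODEBOOKS, latent)
    print(codes)                          # [1, 3]
    print(rvq_decode(CODEBOOKS, codes))   # [1.25, 0.25]
```

In this toy case the first stage leaves a residual of about (0.2, 0.3) and the second stage reduces it to about (-0.05, 0.05), which is the sense in which the code is "layered": each extra index refines the reconstruction.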
PIPELINE & VALIDATION
A full end-to-end training pipeline takes raw audio through the encoder, quantizer, and decoder. A 55-test suite covers the model components and the training path. Decode is benchmarked at roughly 7.4 ms.
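The page does not say how the ~7.4 ms figure was measured (hardware, clip length, batch size). One common way to time a decode step is sketched below: a few warm-up calls, then the median of repeated wall-clock measurements. `benchmark_ms` and `fake_decode` are hypothetical names, and `fake_decode` is only a stand-in for a real decoder call.

```python
import statistics
import time

def benchmark_ms(fn, *args, warmup=3, runs=20):
    """Median wall-clock time of fn(*args) in milliseconds."""
    for _ in range(warmup):          # warm caches before timing
        fn(*args)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

def fake_decode(codes):
    """Hypothetical placeholder for a decoder call; does trivial work."""
    return [c * 0.5 for c in codes]

if __name__ == "__main__":
    ms = benchmark_ms(fake_decode, list(range(1024)))
    print(f"median decode time: {ms:.3f} ms")
```

If the decoder runs on a GPU, the device generally has to be synchronized before the clock is read (in PyTorch, `torch.cuda.synchronize()`); otherwise the timer can capture little more than kernel launch overhead.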