Model introduction:
Kitten ML has released the open-source code and weights for a preview of their new TTS model.
GitHub: https://github.com/KittenML/KittenTTS
Hugging Face: https://huggingface.co/KittenML/kitten-tts-nano-0.1
The model is under 25 MB, with around 15M parameters. The full release next week will include another open-source model of ~80M parameters with the same 8 voices, which can also run on CPU.
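To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the approximate figures from the post (about 15M and 80M parameters, under 25 MB). The precision table, the 1 MB = 10^6 bytes convention, and the function names are my own assumptions for illustration. The post does not say how the weights are actually stored.

```python
# Rough size arithmetic for the figures quoted in the post.
# Assumptions (not from the post): 1 MB = 1e6 bytes, and the candidate
# storage precisions listed in PRECISIONS.

NANO_PARAMS = 15e6     # ~15M parameters (preview model)
LARGER_PARAMS = 80e6   # ~80M parameters (planned model)
NANO_SIZE_MB = 25      # the post says "less than 25 MB"

# Bytes needed per parameter at some common precisions.
PRECISIONS = {"float32": 4, "float16": 2, "int8": 1}


def bytes_per_param(size_mb: float, n_params: float) -> float:
    """Average bytes of storage per parameter for a file of size_mb."""
    return size_mb * 1e6 / n_params


def size_table(n_params: float) -> dict:
    """Estimated weight size in MB at each precision in PRECISIONS."""
    return {name: n_params * b / 1e6 for name, b in PRECISIONS.items()}


if __name__ == "__main__":
    budget = bytes_per_param(NANO_SIZE_MB, NANO_PARAMS)
    print(f"Upper bound for the preview model: {budget:.2f} bytes/param")
    for label, n in (("~15M", NANO_PARAMS), ("~80M", LARGER_PARAMS)):
        for name, mb in size_table(n).items():
            print(f"{label} params at {name}: {mb:.0f} MB")
```

Under these assumptions, a file below 25 MB leaves at most about 1.67 bytes per parameter, which is less than the 2 bytes that plain 16-bit weights would need. That would be consistent with some lower-precision or compressed storage, but that is a guess from the arithmetic, not something the post confirms.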
Key features and advantages:
Eight different expressive voices: 4 female and 4 male. For a tiny model, the expressivity sounds pretty impressive. This release supports TTS in English, with multilingual support expected in future releases.
Super-small in size: the two text-to-speech models will be ~15M and ~80M parameters.
Can literally run anywhere lol: forget "No GPU required." This thing can even run on Raspberry Pis and phones. Great news for GPU-poor folks like me.
Open source (hell yeah!): the model can be used for free.
Discussion: r/LocalLLaMA (473 points, 97 comments)