🤖 Miso Labs Introduces MisoTTS 8B

MisoTTS 8B is an open-source model for high-quality conversational speech generation. It uses an RVQ (residual vector quantization) Transformer architecture with a Llama-8B backbone and a compact Llama-300M audio decoder, and reaches a latency of 110 ms.
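For readers new to RVQ, here is a minimal, generic sketch of the idea in plain Python. It is not taken from the MisoTTS code: the function names, codebook sizes, and random codebooks are all illustrative assumptions, and real systems learn their codebooks during training. Each stage quantizes whatever the previous stage left over, so one audio frame becomes a short list of integer codes.

```python
import random

def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    best_i, best_d = 0, float("inf")
    for i, entry in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(entry, vec))
        if d < best_d:
            best_i, best_d = i, d
    return best_i

def rvq_encode(codebooks, vec):
    """Quantize vec in stages; each stage encodes the residual left by the previous one."""
    residual = list(vec)
    codes = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        codes.append(idx)
        # Subtract the chosen entry so the next stage sees only the remaining error.
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return codes

def rvq_decode(codebooks, codes):
    """Reconstruct the vector by summing the selected entry from each codebook."""
    out = [0.0] * len(codebooks[0][0])
    for cb, idx in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out

# Toy setup (hypothetical numbers): 3 random codebooks of 16 entries, 4-dim frames.
rng = random.Random(0)
DIM, LEVELS, SIZE = 4, 3, 16
codebooks = [
    [[rng.gauss(0, 1.0 / (2 ** level)) for _ in range(DIM)] for _ in range(SIZE)]
    for level in range(LEVELS)
]
frame = [rng.gauss(0, 1) for _ in range(DIM)]
codes = rvq_encode(codebooks, frame)    # one integer per codebook level
approx = rvq_decode(codebooks, codes)   # coarse reconstruction of the frame
```

In an RVQ-based speech model, a Transformer typically predicts such code indices and a decoder turns them back into audio; how MisoTTS splits that work between its backbone and audio decoder is not detailed in this post.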

🌍 MisoTTS sets a new standard for voice AI agents: it runs in real time and can be deployed locally to keep data private.

👤 It is now possible to build ultra-fast voice assistants on open-source models that sound virtually indistinguishable from live conversation and do not require cloud APIs.

Source 1: https://github.com/MisoLabsAI/MisoTTS
Source 2: https://www.misolabs.ai/