Yesterday, French AI research lab Kyutai unveiled Moshi, a real-time AI model that processes both voice and text and is slated for a fully open-source release.
Developed in under six months by a small team of eight, Moshi comes from a lab backed by significant funding: about $330 million from backers including Eric Schmidt, former CEO of Google.
Moshi responds with roughly 200 ms of latency, reportedly faster than GPT-4o, and its rapid processing even lets it interrupt the speaker.
The model, with 7 billion parameters, is designed to run on limited hardware, making it highly accessible.
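To put "limited hardware" in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights of a 7-billion-parameter model need at different precisions. The helper name `weight_memory_gib` and the choice of 16-, 8-, and 4-bit precisions are illustrative assumptions, not details from Kyutai's announcement, and the estimate covers weights only, not activations or audio buffers.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 2**30


PARAMS = 7e9  # parameter count stated in the announcement

# Precisions below are assumed for illustration only.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gib(PARAMS, bits):.1f} GiB")
```

Under these assumptions the weights take roughly 13 GiB at 16-bit and a little over 3 GiB at 4-bit, which is the kind of footprint that fits on a single consumer GPU or a recent laptop.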
Unlike OpenAI’s GPT-4o, Moshi’s code, model weights, and research paper will all be released openly.
This initial version, created in under six months, promises continual improvement.
France continues to excel in AI innovation, with Kyutai joining the ranks of Hugging Face and Mistral. Congratulations to Kyutai! 🇫🇷 🇪🇺