Zonos Beta Release: Open-Source Voice Cloning by Zyphra

February 12, 2025 13:24 · 3 min read

Zyphra launches **Zonos-v0.1**, an open-source voice cloning model with **high-fidelity speech, real-time performance, and multilingual support**. It can clone voices with **5-30s of audio** and runs efficiently on **RTX 4090 GPUs**. A major step for AI voice synthesis.

Zyphra has introduced Zonos-v0.1, an open-source voice cloning model under the Apache 2.0 license, offering high-fidelity speech synthesis with multilingual support. Here’s what you need to know:

Key Highlights

1️⃣ Models & Architecture

Two models: a 1.6B-parameter transformer model and a hybrid model of similar size.
Both are available under Apache 2.0, which permits free use, modification, and redistribution by developers and researchers.

2️⃣ Training & Multilingual Capabilities

Trained on 200,000 hours of speech data across multiple languages.
The training data is predominantly English, but the model also supports Chinese, Japanese, French, Spanish, and German.

3️⃣ Core Features

Voice Cloning: Requires just 5 to 30 seconds of sample audio for cloning.
Expressiveness: Controls for speaking rate, pitch, audio quality, and emotions like happiness, anger, fear, sadness, and surprise.
Audio Quality: Native 44 kHz output for realistic speech generation.
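To make the 5 to 30 second requirement concrete, here is a minimal sketch of a pre-flight check a developer might run on a reference clip before attempting a clone. It is not part of Zonos: the function and constant names are hypothetical, the bounds are simply the figures quoted above, and it uses only Python's standard `wave` module, so it assumes the clip is an uncompressed WAV file.

```python
import wave

# Bounds taken from the "5 to 30 seconds" figure quoted in the article.
# These names are illustrative and do not come from the Zonos codebase.
MIN_REFERENCE_SECONDS = 5.0
MAX_REFERENCE_SECONDS = 30.0


def is_usable_reference(duration_seconds: float) -> bool:
    """Return True if a clip length falls inside the 5-30 second window."""
    return MIN_REFERENCE_SECONDS <= duration_seconds <= MAX_REFERENCE_SECONDS


def wav_duration_seconds(path: str) -> float:
    """Compute a WAV file's duration from its frame count and sample rate."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
```

A caller could combine the two as `is_usable_reference(wav_duration_seconds("speaker.wav"))`, where `speaker.wav` is a placeholder path, and prompt the user for a longer or shorter sample when the check fails.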

4️⃣ Performance & Efficiency

Optimized for real-time applications with a latency of 200-300ms.
Runs efficiently on NVIDIA RTX 4090 GPUs with a real-time factor above 1, meaning audio is generated faster than it plays back.
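The real-time factor figure is easier to interpret with a small calculation. The sketch below assumes the convention this article appears to use, seconds of audio produced per second of compute, so a value above 1 means generation outpaces playback. Some speech literature uses the inverse ratio, where lower is better. The function name and the numbers in the usage line are illustrative, not measurements of Zonos.

```python
def real_time_factor(audio_seconds: float, compute_seconds: float) -> float:
    """Seconds of audio produced per second of compute.

    Under this convention, a value above 1 means the model generates
    speech faster than it takes to play that speech back.
    """
    if compute_seconds <= 0:
        raise ValueError("compute_seconds must be positive")
    return audio_seconds / compute_seconds


# Illustrative numbers only: 10 s of audio generated in 5 s of compute.
print(real_time_factor(10.0, 5.0))  # prints 2.0
```

Note that real-time factor describes sustained throughput, while the 200-300ms latency figure describes how long a listener waits before audio starts. A system needs both to feel responsive.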

5️⃣ Accessibility & Deployment

Available on Hugging Face, with model weights for both architectures.
Supports a Gradio-based UI, Docker setup, and an API for cloud-based use.

6️⃣ Feedback & Limitations

Praised for high-quality output and expressiveness, but some users report occasional audio artifacts and alignment issues.
Zyphra plans future updates to enhance language support, pronunciation accuracy, emotional control, and inference efficiency.

7️⃣ Community & Reception

Posts on X (formerly Twitter) show strong enthusiasm, particularly around its open-source nature and potential for developers.

Why It Matters

Zonos-v0.1 represents a major step forward in open-source voice cloning, offering tools that rival proprietary models. With its high-quality speech synthesis, low-latency performance, and multilingual capabilities, it opens new possibilities for TTS applications, AI-powered assistants, and creative content development.

For developers and researchers in text-to-speech and AI voice cloning, this is an exciting opportunity to contribute, test, and build upon a truly open AI model.
