If inference fails with a 503 error, the Space is probably sleeping. Open it on huggingface.co to wake it up, wait about 30 seconds, then try again.
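The wait-and-retry step above can be sketched in code. This is a minimal illustration, not part of the tool: `call_with_wake_retry` and `send_request` are hypothetical names, and the 30-second wait simply mirrors the hint above. It does not replace opening the Space; it only automates waiting and trying again.

```python
import time
from typing import Callable, Tuple


def call_with_wake_retry(
    send_request: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 3,
    wait_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send_request (hypothetical), retrying while it returns HTTP 503.

    send_request must return (status_code, body). The sleep function is
    injectable so the helper can be exercised without real waiting.
    """
    status, body = send_request()
    for _ in range(max_attempts - 1):
        if status != 503:
            break
        sleep(wait_seconds)  # give the sleeping Space time to start
        status, body = send_request()
    return status, body
```

If the last attempt still returns 503, the caller receives that status and can decide whether to keep waiting.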
Text-to-Speech — RVCNEW
1.0
Fetches audio from Google Translate TTS, converts to WAV, then auto-sets it as the Input Audio for RVC.
1.0
Uses your browser's built-in TTS (Edge voices on Microsoft Edge). It plays the audio and records from the microphone at the same time, so make sure your mic can pick up the speakers, or use a loopback device.
Mic required
Preview (will be used as RVC input):
Model Source
Quick presets:
Click or drag .pth file here
Model weights file
Click or drag .index file here
Improves voice similarity
Input Audio
Click or drag audio file here
.wav · .mp3 · .ogg · .flac · .m4a
Parameters
0
Pitch shift in semitones. 0 = no pitch change. +12 = one octave up. Use +12 for male-to-female, -12 for female-to-male.
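As a quick check on these values: in equal temperament, a shift of n semitones multiplies frequency by 2^(n/12), so +12 doubles the pitch and -12 halves it. The helper name `semitones_to_ratio` is made up for this illustration.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift given in semitones."""
    return 2.0 ** (semitones / 12.0)


for shift in (-12, 0, 12):
    # -12 halves the frequency, 0 leaves it unchanged, +12 doubles it
    print(shift, semitones_to_ratio(shift))
```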
rmvpe · crepe · crepe-tiny · harvest · dio · pm
rmvpe is recommended. pm / dio are fastest. crepe is slowest but accurate.
0.75
How much the index file influences the output timbre. 0 = disabled.
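One plausible reading of this setting is a linear blend between the features extracted from the input and the features retrieved from the index. The sketch below assumes that reading; the function name and the plain lists of floats are illustrative, not the tool's actual data structures.

```python
from typing import List


def blend_features(
    extracted: List[float], retrieved: List[float], index_rate: float
) -> List[float]:
    """Linear blend (assumed): 0 keeps extracted features, 1 uses retrieved ones."""
    if not 0.0 <= index_rate <= 1.0:
        raise ValueError("index_rate must be between 0 and 1")
    return [
        index_rate * r + (1.0 - index_rate) * e
        for e, r in zip(extracted, retrieved)
    ]
```

With `index_rate` at 0 the retrieved features contribute nothing, which matches "0 = disabled" above.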
3
If >= 3, applies median filter to reduce pitch artifacts (harvest/dio only).
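To illustrate what a median filter does to a pitch curve, here is a small standard-library sketch. How this UI maps the slider value to a filter window is not stated here, so the `window` parameter is an assumption, and edge frames simply use a shorter window.

```python
from statistics import median
from typing import List


def median_filter(f0: List[float], window: int = 3) -> List[float]:
    """Replace each pitch value with the median of its neighbourhood."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(f0)):
        lo = max(0, i - half)            # clamp the window at the start
        hi = min(len(f0), i + half + 1)  # clamp the window at the end
        smoothed.append(median(f0[lo:hi]))
    return smoothed
```

A single-frame octave jump such as `[220.0, 221.0, 440.0, 222.0, 223.0]` is removed, because an isolated outlier never becomes the median of its three-frame neighbourhood.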
Target sample rate of the output file. 0 = same as model.
0.25
0 = match output loudness to input. 1 = use converted audio loudness.
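The two endpoints can be illustrated with a gain that interpolates between "match the input" and "leave the converted audio alone". This sketch uses a single RMS value per clip and a geometric interpolation; both are simplifying assumptions, and the names are hypothetical.

```python
import math
from typing import List


def rms(samples: List[float]) -> float:
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_loudness(
    input_audio: List[float], converted_audio: List[float], mix: float
) -> List[float]:
    """Scale converted audio: mix=0 matches input loudness, mix=1 keeps its own."""
    rms_in = rms(input_audio)
    rms_out = rms(converted_audio)
    if rms_out == 0.0:
        return list(converted_audio)  # silent output: nothing to scale
    gain = (rms_in / rms_out) ** (1.0 - mix)
    return [s * gain for s in converted_audio]
```

Values between 0 and 1 give a loudness between the two extremes.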
0.33
Protect breath and unvoiced sounds from conversion artifacts. 0.5 = fully disabled.
RVC (Retrieval-based Voice Conversion) is an open-source project. All voice models are community-made. Respect original voice actors and use responsibly.