
Voice Generator
Give your words a voice.
This node turns text into natural-sounding speech. You can use a default voice, or clone your own (or someone else's) by uploading audio samples. Perfect for narration, character dialogue, or any time you need spoken audio.
When to Use It
Use Voice Generator when you need spoken audio without recording. It works well for:
- Voiceovers: narration for videos or presentations
- Character dialogue: give your characters unique voices
- Podcasts and audio content: scriptable speech
- Dubbing: create audio in different voices
- Prototyping: test dialogue before recording real actors
Average time: 10-30 seconds, depending on text length
Cost: Low-Medium
Controls and Parameters
Transcript (Text, Required)
The text you want spoken. Write it naturally, with punctuation for pacing.
Tip: Add commas for pauses and periods for longer breaks. "Hello, how are you?" sounds more natural than "Hello how are you".
Supports up to 2 speakers. Label each speaker in your transcript.
Speaker Sample Audios (Array, Required)
Upload 1-2 audio clips (5-15 seconds each) of the voice(s) you want to clone.
Tip: Use clear, quiet audio with consistent tone. Avoid background music or noise.
- 1 sample: clones one voice
- 2 samples: supports dialogue between two different voices
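The sample requirements above (1-2 clips, 5-15 seconds each) can be checked before uploading. The following is a minimal sketch, not part of the node itself: it assumes the clips are uncompressed WAV files so that Python's standard `wave` module can read them, and the helper names `clip_seconds` and `check_samples` are hypothetical.

```python
import wave
from pathlib import Path

MIN_SECONDS = 5.0   # lower bound from the guidance above
MAX_SECONDS = 15.0  # upper bound from the guidance above
MAX_SAMPLES = 2     # the node accepts 1-2 samples


def clip_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_samples(paths):
    """Return a list of problems; an empty list means the set looks fine."""
    problems = []
    if not 1 <= len(paths) <= MAX_SAMPLES:
        problems.append(f"expected 1-{MAX_SAMPLES} samples, got {len(paths)}")
    for path in paths:
        seconds = clip_seconds(path)
        if not MIN_SECONDS <= seconds <= MAX_SECONDS:
            problems.append(
                f"{Path(path).name}: {seconds:.1f}s is outside "
                f"{MIN_SECONDS:.0f}-{MAX_SECONDS:.0f}s"
            )
    return problems
```

This only checks clip count and length. Background noise and tone consistency still need to be judged by ear.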
Voice Speed (Slider)
How fast the speech should be:
- 0.8: slower, deliberate
- 1.0: normal conversational pace
- 1.2: faster, energetic
CFG (Slider, 0-5)
Controls how closely the AI follows your voice sample:
- Low (0.5-1.0): more creative variation
- Medium (1.3): balanced (recommended)
- High (2.0+): very close match to sample
Temperature (Slider, 0-1)
Controls speech variation and expressiveness:
- Low (0.5-0.7): consistent, monotone
- Medium (0.95): natural variation (recommended)
- High (1.0): more expressive, less predictable
Seed (Number)
Controls randomness. Same seed + same inputs = same result.
Available Models
Vibe Voice (Default)
Advanced voice cloning and text-to-speech engine with natural intonation and emotion. Features:
- Clone up to 2 different voices from audio samples
- Multi-speaker support with speaker labeling
- Fine control over speed, CFG, and temperature
- Natural-sounding prosody and emphasis
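To keep track of the controls described on this page, it can help to gather them in one place and check them against the documented ranges. The sketch below is illustrative only: `VoiceSettings` and its field names are hypothetical and are not the node's actual API. The defaults use the recommended values listed above. The speed check (positive only) is an assumption, since no slider range is given for Voice Speed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VoiceSettings:
    """Hypothetical container for the node's controls (not an official API)."""
    transcript: str
    sample_paths: List[str]
    speed: float = 1.0          # normal conversational pace
    cfg: float = 1.3            # recommended "balanced" value
    temperature: float = 0.95   # recommended "natural variation" value
    seed: Optional[int] = None  # set a fixed seed to reproduce a result

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the settings look valid."""
        errors = []
        if not self.transcript.strip():
            errors.append("transcript is required")
        if not 1 <= len(self.sample_paths) <= 2:
            errors.append("provide 1 or 2 sample audio clips")
        if not 0.0 <= self.cfg <= 5.0:
            errors.append("cfg must be between 0 and 5")
        if not 0.0 <= self.temperature <= 1.0:
            errors.append("temperature must be between 0 and 1")
        if self.speed <= 0:  # assumed bound; the slider range is not documented
            errors.append("speed must be positive")
        return errors


settings = VoiceSettings(
    transcript="Hello, how are you?",
    sample_paths=["narrator.wav"],  # placeholder file name
    seed=42,
)
print(settings.validate())  # []
```

Recording the seed alongside the other values makes it easier to regenerate the same audio later, since the same seed with the same inputs gives the same result.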
What to Expect
Voice cloning captures the tone and timbre of your sample audio, but it's not a perfect match.
The AI will:
- Mimic the pitch and character of the voice
- Follow the pacing and emphasis in your transcript
- Handle most common words naturally
It may struggle with:
- Very long, complex sentences
- Unusual names or technical terms
- Extreme emotional range (shouting, whispering)
For best results:
- Use clean audio samples with no background noise
- Write clear, natural-sounding text
- Keep sentences reasonably short
Quick Tips
- Record your voice samples in a quiet room with consistent tone
- 10-15 seconds of sample audio is usually enough
- Write your transcript the way you'd naturally speak it
- Use punctuation to control pacing (commas = short pause, periods = longer pause)
- For dialogue, clearly label speakers in your transcript
- Generated audio matches the length of your text, so longer text means longer processing time
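Since long, complex sentences are harder for the model and punctuation controls pacing, a quick pre-check of the transcript can catch problems before generating. This is a rough sketch under stated assumptions: the 25-word threshold is an arbitrary choice (the guidance only says to keep sentences reasonably short), and the sentence splitter is a simple heuristic that breaks on `.`, `!`, and `?`.

```python
import re

MAX_WORDS = 25  # assumed threshold; adjust to taste


def split_sentences(transcript):
    """Split on whitespace that follows ., ! or ? (a rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [p for p in parts if p]


def long_sentences(transcript, max_words=MAX_WORDS):
    """Return the sentences whose word count exceeds max_words."""
    return [s for s in split_sentences(transcript) if len(s.split()) > max_words]
```

Any sentence this flags is a candidate for splitting in two or for an extra comma where a natural pause belongs.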

