EachLabs Voice & Audio

Text-to-speech, speech-to-text transcription, voice conversion, and audio utilities via the EachLabs Predictions API.

Authentication

Header: X-API-Key: <your-api-key>

Set the EACHLABS_API_KEY environment variable; the examples below read the key from it. Get your key at eachlabs.ai.
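
As a minimal sketch of this setup, the helper below reads the key from EACHLABS_API_KEY and builds the request headers. The name auth_headers is hypothetical, and the Content-Type header is copied from the curl examples in this document rather than from a stated requirement.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from the EACHLABS_API_KEY environment variable.

    auth_headers is an illustrative helper, not part of any EachLabs SDK.
    """
    api_key = env.get("EACHLABS_API_KEY")
    if not api_key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("EACHLABS_API_KEY is not set")
    return {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
    }
```

Passing the environment as an argument keeps the helper easy to exercise without touching the real process environment.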

Available Models

Text-to-Speech

Model	Slug	Best For
ElevenLabs TTS	`elevenlabs-text-to-speech`	High quality TTS
ElevenLabs TTS w/ Timestamps	`elevenlabs-text-to-speech-with-timestamp`	TTS with word timing
ElevenLabs Text to Dialogue	`elevenlabs-text-to-dialogue`	Multi-speaker dialogue
ElevenLabs Sound Effects	`elevenlabs-sound-effects`	Sound effect generation
ElevenLabs Voice Design v2	`elevenlabs-voice-design-v2`	Custom voice design
Kling V1 TTS	`kling-v1-tts`	Kling text-to-speech
Kokoro 82M	`kokoro-82m`	Lightweight TTS
Play AI Dialog	`play-ai-text-to-speech-dialog`	Dialog TTS
Stable Audio 2.5	`stable-audio-2-5-text-to-audio`	Text to audio

Speech-to-Text

Model	Slug	Best For
ElevenLabs Scribe v2	`elevenlabs-speech-to-text-scribe-v2`	Best quality transcription
ElevenLabs STT	`elevenlabs-speech-to-text`	Standard transcription
Wizper with Timestamp	`wizper-with-timestamp`	Timestamped transcription
Wizper	`wizper`	Basic transcription
Whisper	`whisper`	Open-source transcription
Whisper Diarization	`whisper-diarization`	Speaker identification
Incredibly Fast Whisper	`incredibly-fast-whisper`	Fastest transcription

Voice Conversion & Cloning

Model	Slug	Best For
RVC v2	`rvc-v2`	Voice conversion
Train RVC	`train-rvc`	Train custom voice model
ElevenLabs Voice Clone	`elevenlabs-voice-clone`	Voice cloning
ElevenLabs Voice Changer	`elevenlabs-voice-changer`	Voice transformation
ElevenLabs Voice Design v3	`elevenlabs-voice-design-v3`	Advanced voice design
ElevenLabs Dubbing	`elevenlabs-dubbing`	Video dubbing
Chatterbox S2S	`chatterbox-speech-to-speech`	Speech to speech
Open Voice	`openvoice`	Open-source voice clone
XTTS v2	`xtts-v2`	Multi-language voice clone
Stable Audio 2.5 Inpaint	`stable-audio-2-5-inpaint`	Audio inpainting
Stable Audio 2.5 A2A	`stable-audio-2-5-audio-to-audio`	Audio transformation
Audio Trimmer	`audio-trimmer-with-fade`	Audio trimming with fade

Audio Utilities

Model	Slug	Best For
FFmpeg Merge Audio Video	`ffmpeg-api-merge-audio-video`	Merge audio with video
Toolkit Video Convert	`toolkit`	Video/audio conversion

Prediction Flow

1. Check the model: GET https://api.eachlabs.ai/v1/model?slug=<slug>. This confirms the model exists and returns its request_schema, which lists the exact input parameters. Do this before creating a prediction so that the input matches the schema.
2. Create the prediction: POST https://api.eachlabs.ai/v1/prediction with the model slug, version "0.0.1", and an input object matching the schema.
3. Poll GET https://api.eachlabs.ai/v1/prediction/{id} until status is "success" or "failed".
4. Extract the output from the response.
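
The flow above can be sketched in Python with only the standard library. This is an illustration under stated assumptions, not a reference client: the helper names (http_json, run_prediction), the polling interval, and the poll limit are hypothetical, and the name of the field that carries the new prediction's ID ("predictionID" here) is an assumption that should be checked against a real response.

```python
import json
import time
import urllib.parse
import urllib.request

API_BASE = "https://api.eachlabs.ai/v1"


def http_json(method, url, api_key, body=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )
    # urlopen raises urllib.error.HTTPError if the server answers 4xx/5xx.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def run_prediction(slug, inputs, api_key, call=http_json,
                   poll_interval=2.0, max_polls=150,
                   id_field="predictionID"):
    """Check the model, create a prediction, and poll until it finishes.

    id_field is an ASSUMED response field name; adjust it to match the API.
    """
    # Step 1: look up the model; the response carries request_schema.
    query = urllib.parse.urlencode({"slug": slug})
    call("GET", f"{API_BASE}/model?{query}", api_key)

    # Step 2: create the prediction with the documented envelope.
    created = call(
        "POST",
        f"{API_BASE}/prediction",
        api_key,
        {"model": slug, "version": "0.0.1", "input": inputs},
    )
    prediction_id = created[id_field]

    # Step 3: poll until the status is terminal.
    for _ in range(max_polls):
        result = call("GET", f"{API_BASE}/prediction/{prediction_id}", api_key)
        if result.get("status") in ("success", "failed"):
            # Step 4: the caller extracts the output from this response.
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"prediction {prediction_id} did not finish in time")
```

The call parameter exists so the network layer can be replaced with a stub during testing. The function returns the final response as-is, since the exact shape of the output depends on the model.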

Examples

Text-to-Speech with ElevenLabs

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-text-to-speech",
    "version": "0.0.1",
    "input": {
      "text": "Welcome to our product demo. Today we will walk through the key features.",
      "voice_id": "EXAVITQu4vr4xnSDxMaL",
      "model_id": "eleven_v3",
      "stability": 0.5,
      "similarity_boost": 0.7
    }
  }'

Transcription with ElevenLabs Scribe

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-speech-to-text-scribe-v2",
    "version": "0.0.1",
    "input": {
      "media_url": "https://example.com/recording.mp3",
      "diarize": true,
      "timestamps_granularity": "word"
    }
  }'

Transcription with Wizper (Whisper)

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "wizper-with-timestamp",
    "version": "0.0.1",
    "input": {
      "audio_url": "https://example.com/audio.mp3",
      "language": "en",
      "task": "transcribe",
      "chunk_level": "segment"
    }
  }'

Speaker Diarization with Whisper

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "whisper-diarization",
    "version": "0.0.1",
    "input": {
      "file_url": "https://example.com/meeting.mp3",
      "num_speakers": 3,
      "language": "en",
      "group_segments": true
    }
  }'

Voice Conversion with RVC v2

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "rvc-v2",
    "version": "0.0.1",
    "input": {
      "input_audio": "https://example.com/vocals.wav",
      "rvc_model": "CUSTOM",
      "custom_rvc_model_download_url": "https://example.com/my-voice-model.zip",
      "pitch_change": 0,
      "output_format": "wav"
    }
  }'

Merge Audio with Video

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "ffmpeg-api-merge-audio-video",
    "version": "0.0.1",
    "input": {
      "video_url": "https://example.com/video.mp4",
      "audio_url": "https://example.com/narration.mp3",
      "start_offset": 0
    }
  }'

ElevenLabs Voice IDs

The `elevenlabs-text-to-speech` model supports these voice IDs. Pass the raw ID string as voice_id:

Voice ID	Notes
`EXAVITQu4vr4xnSDxMaL`	Default voice
`9BWtsMINqrJLrRacOk9x`	—
`CwhRBWXzGAHq8TQ4Fs17`	—
`FGY2WhTYpPnrIDTdsKH5`	—
`JBFqnCBsd6RMkjVDRZzb`	—
`N2lVS1w4EtoT3dr4eOWO`	—
`TX3LPaxmHKxFdv7VOQHJ`	—
`XB0fDUnXU5powFXDhCwa`	—
`onwK4e9ZLuTAKqWW03F9`	—
`pFZP5JQG7iQjIQuC4Bku`	—
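
A client could check a voice ID against this table before sending a request. The sketch below is one way to do that; the names ELEVENLABS_VOICE_IDS and check_voice_id are hypothetical, and the set only mirrors the table above, so it may go stale if the model's supported voices change.

```python
# Voice IDs copied from the table above; the first entry is the default voice.
DEFAULT_VOICE_ID = "EXAVITQu4vr4xnSDxMaL"
ELEVENLABS_VOICE_IDS = frozenset({
    DEFAULT_VOICE_ID,
    "9BWtsMINqrJLrRacOk9x",
    "CwhRBWXzGAHq8TQ4Fs17",
    "FGY2WhTYpPnrIDTdsKH5",
    "JBFqnCBsd6RMkjVDRZzb",
    "N2lVS1w4EtoT3dr4eOWO",
    "TX3LPaxmHKxFdv7VOQHJ",
    "XB0fDUnXU5powFXDhCwa",
    "onwK4e9ZLuTAKqWW03F9",
    "pFZP5JQG7iQjIQuC4Bku",
})


def check_voice_id(voice_id=DEFAULT_VOICE_ID):
    """Return voice_id unchanged if it is in the documented list.

    Illustrative helper only; the API itself is the final authority.
    """
    if voice_id not in ELEVENLABS_VOICE_IDS:
        raise ValueError(f"unknown ElevenLabs voice ID: {voice_id!r}")
    return voice_id
```

The returned value can be placed directly in the voice_id field of the input object.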

Parameter Reference

See references/MODELS.md for complete parameter details for each model.
