Chichi Speech Service
This skill provides a FastAPI-based REST service for Qwen3 TTS. It is configured to reuse a single high-quality reference audio prompt, which keeps voice cloning efficient and consistent. The service is packaged as an installable CLI.
Installation
Prerequisites: Python >= 3.10.
pip install -e .
Usage
1. Start the Service
The service runs on port 9090 by default.
# Start the server (runs in foreground, use & for background or a separate terminal)
# Optional: Update to your own reference audio and text for voice cloning
chichi-speech --port 9090 --host 127.0.0.1 --ref-audio "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen3-TTS-Repo/clone_2.wav" --ref-text "Okay. Yeah. I resent you. I love you. I respect you. But you know what? You blew it! And thanks to you."
2. Verify Service is Running
Check that the API docs page responds:
curl http://localhost:9090/docs
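The same check can be scripted, for example to wait for the server before sending requests. The following is a minimal sketch using only the Python standard library. It assumes the service answers `GET /docs` with HTTP 200 once it is ready; the helper names `is_service_up` and `wait_for_service` are illustrative and not part of the chichi-speech package.

```python
import time
import urllib.error
import urllib.request


def is_service_up(base_url: str = "http://localhost:9090", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/docs answers with HTTP 200 (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(f"{base_url}/docs", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: treat as "not ready".
        return False


def wait_for_service(
    base_url: str = "http://localhost:9090",
    attempts: int = 30,
    delay: float = 1.0,
    timeout: float = 2.0,
) -> bool:
    """Poll the docs page up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if is_service_up(base_url, timeout=timeout):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    print("service ready" if wait_for_service() else "service not reachable")
```

Model loading can take a while on first start, so polling with a generous number of attempts is usually safer than a single check.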
3. Generate Speech
Use cURL:
curl -X POST "http://localhost:9090/synthesize" \
-H "Content-Type: application/json" \
-d '{
"text": "Nice to meet you",
"language": "English"
}' \
--create-dirs --output output/nice_to_meet.wav
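Use Python:

The cURL call above can also be made from Python. This is a minimal sketch using only the standard library. It assumes, as the cURL example implies, that the response body is the raw audio data; the function names `build_synthesize_request` and `synthesize_to_file` are illustrative and not part of the service.

```python
import json
import urllib.request
from pathlib import Path


def build_synthesize_request(
    text: str,
    language: str = "English",
    base_url: str = "http://localhost:9090",
) -> urllib.request.Request:
    """Build the POST /synthesize request with the same JSON body as the cURL example."""
    payload = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/synthesize",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(
    text: str,
    out_path: str,
    language: str = "English",
    base_url: str = "http://localhost:9090",
    timeout: float = 120.0,
) -> Path:
    """Send the request and write the response bytes to out_path.

    Assumption: the service returns the audio file directly in the response body.
    """
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)  # create output/ if it is missing
    request = build_synthesize_request(text, language, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        out.write_bytes(resp.read())
    return out


if __name__ == "__main__":
    print(synthesize_to_file("Nice to meet you", "output/nice_to_meet.wav"))
```

Building the request separately from sending it makes the payload easy to inspect without a running server.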
Functionality
- Endpoint: POST /synthesize
- Default Port: 9090
- Voice Cloning: Uses a pre-computed voice prompt from reference files to ensure the cloned voice is consistent and generation is fast.
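The idea behind the pre-computed voice prompt can be illustrated with a generic compute-once, reuse-many pattern. The sketch below is not the service's implementation and does not call the qwen-tts library. `VoicePrompt`, `get_voice_prompt`, and `synthesize` are hypothetical stand-ins, and the expensive prompt extraction is replaced by a placeholder.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class VoicePrompt:
    """Hypothetical stand-in for the features a TTS model derives from the reference."""
    ref_audio: str
    ref_text: str


@lru_cache(maxsize=1)
def get_voice_prompt(ref_audio: str, ref_text: str) -> VoicePrompt:
    """Build the prompt once per (ref_audio, ref_text) pair and cache the result.

    In a real service this step would load the audio and run the model's
    prompt extraction; here it only wraps the inputs (placeholder).
    """
    return VoicePrompt(ref_audio, ref_text)


def synthesize(text: str, ref_audio: str, ref_text: str) -> str:
    """Placeholder synthesis: every request reuses the cached prompt."""
    prompt = get_voice_prompt(ref_audio, ref_text)
    return f"<audio for {text!r} using voice from {prompt.ref_audio}>"


if __name__ == "__main__":
    synthesize("Nice to meet you", "clone_2.wav", "reference transcript")
    synthesize("See you later", "clone_2.wav", "reference transcript")
    # The second call is served from the cache instead of rebuilding the prompt.
    print(get_voice_prompt.cache_info())
```

Because every request shares the same prompt object, the voice stays identical across calls and the per-request cost is limited to generation itself.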
Requirements
- Python 3.10+
- qwen-tts (Qwen3 model library)
- Access to a reference audio file for voice cloning.
- By default, it uses public sample audio from Qwen3.
- CRITICAL: You can provide your own reference audio using the --ref-audio and --ref-text flags.