VoxCPM API - Text to Speech Voice Design API
by Openbmb
The VoxCPM API lets developers integrate tokenizer-free speech synthesis into their applications. Because the model operates in a continuous latent space rather than on discrete audio tokens, it supports context-aware expressiveness, zero-shot voice cloning, and custom voice design while avoiding the quantization artifacts associated with discrete audio tokens.

Openbmb VoxCPM 2.0 Text to Speech API Documentation
POST https://gateway.pixazo.ai/voxcpm/v1/text-to-speech
Authentication
All requests require an API key, passed in the Ocp-Apim-Subscription-Key request header.
| Header | Type | Required | Description |
|---|---|---|---|
| Ocp-Apim-Subscription-Key | string | Yes | Your API subscription key |
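One way to keep the subscription key out of source code is to read it from the environment and build the headers in a single place. The sketch below is illustrative only: the `PIXAZO_API_KEY` environment variable and the `build_headers` helper are hypothetical names chosen for this example; only the header names and values come from this documentation.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build the request headers documented for the VoxCPM endpoints."""
    # PIXAZO_API_KEY is a hypothetical variable name chosen for this example.
    key = api_key or os.environ.get("PIXAZO_API_KEY")
    if not key:
        raise ValueError("missing subscription key")
    return {
        "Content-Type": "application/json",
        "Cache-Control": "no-cache",
        "Ocp-Apim-Subscription-Key": key,
    }
```

The returned dictionary can be passed as the `headers` argument of whichever HTTP client is in use.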
Text to Speech - Openbmb VoxCPM2
Request Code
HTTP
POST https://gateway.pixazo.ai/voxcpm/v1/text-to-speech
Content-Type: application/json
Cache-Control: no-cache
Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY

{
  "text": "Hello, from Pixazo.",
  "cfg_value": 2.0,
  "dit_steps": 10
}
Python
import requests

url = "https://gateway.pixazo.ai/voxcpm/v1/text-to-speech"
headers = {
    "Content-Type": "application/json",
    "Cache-Control": "no-cache",
    "Ocp-Apim-Subscription-Key": "YOUR_SUBSCRIPTION_KEY"
}
data = {
    "text": "Hello, from Pixazo.",
    "cfg_value": 2.0,
    "dit_steps": 10
}

# Send the JSON payload and print the decoded response body.
response = requests.post(url, json=data, headers=headers)
print(response.json())
JavaScript
const url = 'https://gateway.pixazo.ai/voxcpm/v1/text-to-speech';
const headers = {
  'Content-Type': 'application/json',
  'Cache-Control': 'no-cache',
  'Ocp-Apim-Subscription-Key': 'YOUR_SUBSCRIPTION_KEY'
};
const data = {
  text: 'Hello, from Pixazo.',
  cfg_value: 2.0,
  dit_steps: 10
};
fetch(url, {
  method: 'POST',
  headers: headers,
  body: JSON.stringify(data)
})
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));
cURL
curl -v -X POST "https://gateway.pixazo.ai/voxcpm/v1/text-to-speech" \
  -H "Content-Type: application/json" \
  -H "Cache-Control: no-cache" \
  -H "Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY" \
  --data-raw '{
    "text": "Hello, from Pixazo.",
    "cfg_value": 2.0,
    "dit_steps": 10
  }'
Output
{
  "output": "https://pub-582b7213209642b9b995c96c95a30381.r2.dev/openbmb-voxcpm2/1768578707564-851083.wav"
}
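The response carries only a URL, so a client still has to fetch the audio itself. The following is a minimal sketch using the Python standard library. The helper names are hypothetical, the `output` field is taken from the sample response above, and the sketch assumes the returned URL can be downloaded without authentication, which this documentation does not state explicitly.

```python
import posixpath
import urllib.request
from urllib.parse import urlparse


def audio_url_from_response(body: dict) -> str:
    """Return the audio URL from a text-to-speech response body."""
    url = body.get("output")
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("response has no usable 'output' URL")
    return url


def filename_from_url(url: str) -> str:
    """Derive a local file name from the last path segment of the URL."""
    name = posixpath.basename(urlparse(url).path)
    return name or "output.wav"


def download_audio(url: str, dest: str, timeout: float = 60.0) -> str:
    """Download the audio file to dest and return dest.

    Assumes the URL can be fetched without authentication.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest, "wb") as fh:
        fh.write(resp.read())
    return dest
```

A typical call chain would be `download_audio(url, filename_from_url(url))` after extracting `url` with `audio_url_from_response`.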
Request Parameters - Text to Speech
| Parameter | Required | Type | Default | Description |
|---|---|---|---|---|
| text | Yes | string | — | The text to convert to speech. Supports natural-language sentences and punctuation for prosody control. |
| cfg_value | No | number | 2.0 | Classifier-free guidance scale. Higher values follow the text more strictly at the cost of naturalness. |
| dit_steps | No | integer | 10 | Number of diffusion-transformer inference steps. Higher values can improve quality but take longer. |
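The parameter table above can be turned into a small client-side check before a request is sent. This is a sketch under stated assumptions rather than the service's own validation: the defaults mirror the table, the helper names are hypothetical, and the lower bound on `dit_steps` is an assumption, since the documentation gives no allowed range for either optional parameter.

```python
import json
import urllib.request

TTS_URL = "https://gateway.pixazo.ai/voxcpm/v1/text-to-speech"


def build_tts_payload(text, cfg_value=2.0, dit_steps=10):
    """Validate the documented parameters and return the JSON payload."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(cfg_value, bool) or not isinstance(cfg_value, (int, float)):
        raise TypeError("cfg_value must be a number")
    if isinstance(dit_steps, bool) or not isinstance(dit_steps, int):
        raise TypeError("dit_steps must be an integer")
    # Assumption: at least one inference step is needed; the documentation
    # does not state an allowed range for either parameter.
    if dit_steps < 1:
        raise ValueError("dit_steps must be at least 1")
    return {"text": text, "cfg_value": cfg_value, "dit_steps": dit_steps}


def build_tts_request(payload, api_key):
    """Wrap the payload in a POST request with the documented headers."""
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Cache-Control": "no-cache",
            "Ocp-Apim-Subscription-Key": api_key,
        },
        method="POST",
    )
```

The request object can then be sent with `urllib.request.urlopen(request, timeout=60)`; the timeout value here is arbitrary.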
Request Headers
| Header | Value |
|---|---|
| Content-Type | application/json |
| Cache-Control | no-cache |
| Ocp-Apim-Subscription-Key | YOUR_SUBSCRIPTION_KEY |
Response Handling
The endpoint commonly returns the following HTTP status codes:
| Code | Meaning |
|---|---|
| 200 | Success — audio generated |
| 400 | Bad Request |
| 401 | Unauthorized |
| 402 | Insufficient Balance |
| 403 | Forbidden |
| 429 | Too Many Requests |
| 500 | Internal Server Error |
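A client usually needs to decide what to do with each of these codes. The sketch below maps the codes in the table to one of three actions. Treating 429 and 500 as retryable is a common convention and an assumption here, not something this documentation specifies; the function names are hypothetical.

```python
# Meanings copied from the status code table above.
STATUS_MEANINGS = {
    200: "Success",
    400: "Bad Request",
    401: "Unauthorized",
    402: "Insufficient Balance",
    403: "Forbidden",
    429: "Too Many Requests",
    500: "Internal Server Error",
}

# Assumption: rate limiting and server errors may succeed on a later attempt.
RETRYABLE = {429, 500}


def classify_status(code: int) -> str:
    """Return 'ok', 'retry', or 'fail' for an HTTP status code."""
    if code == 200:
        return "ok"
    if code in RETRYABLE:
        return "retry"
    return "fail"


def describe_status(code: int) -> str:
    """Return the documented meaning, or a generic label for other codes."""
    return STATUS_MEANINGS.get(code, "Unexpected status")
```

Codes such as 401 and 402 are classified as `fail` because repeating the same request would not change the outcome until the key or balance is fixed.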
Openbmb VoxCPM 2.0 Text to Speech API Pricing
| Resolution | Price (USD) |
|---|---|
| default | $0 |
Openbmb VoxCPM 2.0 Text to Speech (Voice Design) API Documentation
POST https://gateway.pixazo.ai/voxcpm/v1/voice-cloning
Authentication
All requests require an API key, passed in the Ocp-Apim-Subscription-Key request header.
| Header | Type | Required | Description |
|---|---|---|---|
| Ocp-Apim-Subscription-Key | string | Yes | Your API subscription key |
Voice Cloning - Openbmb VoxCPM2
Request Code
HTTP
POST https://gateway.pixazo.ai/voxcpm/v1/voice-cloning
Content-Type: application/json
Cache-Control: no-cache
Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY

{
  "text": "Hello, this is a test.",
  "reference_audio_url": "https://your-audio-file.wav",
  "prompt_text": "transcript of reference audio (optional)"
}
Python
import requests

url = "https://gateway.pixazo.ai/voxcpm/v1/voice-cloning"
headers = {
    "Content-Type": "application/json",
    "Cache-Control": "no-cache",
    "Ocp-Apim-Subscription-Key": "YOUR_SUBSCRIPTION_KEY"
}
data = {
    "text": "Hello, this is a test.",
    "reference_audio_url": "https://your-audio-file.wav",
    "prompt_text": "transcript of reference audio (optional)"
}

# Send the JSON payload and print the decoded response body.
response = requests.post(url, json=data, headers=headers)
print(response.json())
JavaScript
const url = 'https://gateway.pixazo.ai/voxcpm/v1/voice-cloning';
const headers = {
  'Content-Type': 'application/json',
  'Cache-Control': 'no-cache',
  'Ocp-Apim-Subscription-Key': 'YOUR_SUBSCRIPTION_KEY'
};
const data = {
  text: 'Hello, this is a test.',
  reference_audio_url: 'https://your-audio-file.wav',
  prompt_text: 'transcript of reference audio (optional)'
};
fetch(url, {
  method: 'POST',
  headers: headers,
  body: JSON.stringify(data)
})
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));
cURL
curl -v -X POST "https://gateway.pixazo.ai/voxcpm/v1/voice-cloning" \
  -H "Content-Type: application/json" \
  -H "Cache-Control: no-cache" \
  -H "Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY" \
  --data-raw '{
    "text": "Hello, this is a test.",
    "reference_audio_url": "https://your-audio-file.wav",
    "prompt_text": "transcript of reference audio (optional)"
  }'
Output
{
  "url": "https://pub-xxx.r2.dev/voxcpm/abc123.wav",
  "audio_url": "https://pub-xxx.r2.dev/voxcpm/abc123.wav",
  "sample_rate": 48000,
  "elapsed_s": 2.3,
  "status": "done"
}
Request Parameters - Voice Cloning
| Parameter | Required | Type | Default | Description |
|---|---|---|---|---|
| text | Yes | string | — | The text to synthesize in the cloned voice. Supports natural-language sentences and punctuation for prosody control. |
| reference_audio_url | Yes | string (URL) | — | Publicly accessible URL of the reference audio file (.wav recommended) that the model will clone the voice from. The reference should be a clean recording of a single speaker. |
| prompt_text | No | string | — | Exact transcript of the audio at reference_audio_url. Providing it improves voice-cloning fidelity; if omitted, the model attempts to transcribe the reference audio itself. |
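The voice-cloning parameters can be assembled with a small helper that leaves out `prompt_text` when no transcript is available. This is a sketch with a hypothetical function name. The URL check only confirms that the value is an absolute http(s) URL; it cannot confirm that the file is publicly reachable, which the service requires.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def build_clone_payload(
    text: str,
    reference_audio_url: str,
    prompt_text: Optional[str] = None,
) -> Dict[str, str]:
    """Build the JSON payload for the voice-cloning endpoint."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")
    if not isinstance(reference_audio_url, str):
        raise ValueError("reference_audio_url is required")
    parsed = urlparse(reference_audio_url)
    # Local sanity check only; reachability is verified by the service.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("reference_audio_url must be an absolute http(s) URL")
    payload = {"text": text, "reference_audio_url": reference_audio_url}
    # prompt_text is optional, so include it only when a transcript is given.
    if prompt_text:
        payload["prompt_text"] = prompt_text
    return payload
```

Omitting the key entirely, rather than sending an empty string, keeps the request aligned with the documented behavior for a missing transcript.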
Response Fields
| Field | Type | Description |
|---|---|---|
| url | string | Public URL of the generated .wav audio file (R2 CDN). Equivalent to audio_url. |
| audio_url | string | Alias of url kept for compatibility with audio-focused clients. |
| sample_rate | integer | Sample rate of the output audio, in Hz (e.g. 48000). |
| elapsed_s | number | Server-side processing time in seconds. |
| status | string | Always "done" on a 200 success. Error cases return non-2xx status codes (see Response Handling). |
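The response fields above map naturally onto a small typed record. The sketch below is one possible way to parse a successful body; the `CloneResult` class and `parse_clone_response` function are hypothetical names, and the parser accepts either `url` or `audio_url` because the table describes them as aliases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloneResult:
    audio_url: str
    sample_rate: int
    elapsed_s: float
    status: str


def parse_clone_response(body: dict) -> CloneResult:
    """Convert a successful voice-cloning response body into a CloneResult."""
    # url and audio_url are documented as aliases, so accept either one.
    audio_url = body.get("url") or body.get("audio_url")
    if not audio_url:
        raise ValueError("response contains neither 'url' nor 'audio_url'")
    # The table states that status is always "done" on a 200 response.
    if body.get("status") != "done":
        raise ValueError(f"unexpected status: {body.get('status')!r}")
    return CloneResult(
        audio_url=audio_url,
        sample_rate=int(body["sample_rate"]),
        elapsed_s=float(body["elapsed_s"]),
        status=body["status"],
    )
```

Parsing into a frozen dataclass gives callers attribute access and makes a malformed body fail early instead of at the point of use.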
Request Headers
| Header | Value |
|---|---|
| Content-Type | application/json |
| Cache-Control | no-cache |
| Ocp-Apim-Subscription-Key | YOUR_SUBSCRIPTION_KEY |
Response Handling
The endpoint commonly returns the following HTTP status codes:
| Code | Meaning |
|---|---|
| 200 | Success — cloned audio generated |
| 400 | Bad Request (missing/invalid text or reference_audio_url, unreachable reference URL) |
| 401 | Unauthorized |
| 402 | Insufficient Balance |
| 403 | Forbidden |
| 429 | Too Many Requests |
| 500 | Internal Server Error |
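For transient failures, a client may want to retry with increasing delays. The following sketch wraps any function that sends a request and returns a `(status_code, body)` pair. Which codes are worth retrying, the attempt count, and the delays are all assumptions of this example; the documentation does not describe a retry policy or rate-limit headers.

```python
import time
from typing import Callable, Tuple

# Assumption: only rate limiting and server errors are worth retrying.
RETRYABLE = (429, 500)


def call_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status not in RETRYABLE:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** attempt)
        status, body = send()
    return status, body
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use the default `time.sleep` applies. A 400 caused by an unreachable reference URL is returned immediately, since repeating the same request would not help.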
Openbmb VoxCPM 2.0 Text to Speech (Voice Design) API Pricing
| Resolution | Price (USD) |
|---|---|
| default | $0 |