Zonos 2 API: Pricing, Documentation

by Zyphra

The Zonos 2 API is a text-to-speech and voice synthesis API for generating natural-sounding speech from text, with support for voice cloning, multilingual speech generation, and expressive audio controls. It clones a voice from a short reference audio sample and offers control over speaking style, pitch, speed, and emotional tone. Typical integrations include virtual assistants, content creation tools, audiobooks, customer support systems, and accessibility solutions, all served through a cloud-based interface.

Zonos 2 API Documentation

https://gateway.pixazo.ai/zonos-2/v1/zonos-2-request

Authentication

All requests require an API key passed via header.

| Header | Type | Required | Description |
| --- | --- | --- | --- |
| Ocp-Apim-Subscription-Key | string | Yes | Your API subscription key |

Zonos 2 generate request

Request Code

POST https://gateway.pixazo.ai/zonos-2/v1/zonos-2-request
Content-Type: application/json
Cache-Control: no-cache
Ocp-Apim-Subscription-Key: YOUR_API_KEY

{
  "reference_audio_url": "https://pub-582b7213209642b9b995c96c95a30381.r2.dev/Reference.wav",
  "text": "Hello, this is a sample of my cloned voice speaking naturally.",
  "language": "en_us",
  "accurate_mode": true,
  "clean_speaker_background": false,
  "temperature": 1.15,
  "top_p": 0,
  "min_p": 0.18,
  "top_k": 106
}

Python

import requests

url = "https://gateway.pixazo.ai/zonos-2/v1/zonos-2-request"
headers = {
    "Content-Type": "application/json",
    "Cache-Control": "no-cache",
    "Ocp-Apim-Subscription-Key": "YOUR_API_KEY"
}
# Python booleans are True/False; requests serializes them as JSON true/false.
data = {
    "reference_audio_url": "https://pub-582b7213209642b9b995c96c95a30381.r2.dev/Reference.wav",
    "text": "Hello, this is a sample of my cloned voice speaking naturally.",
    "language": "en_us",
    "accurate_mode": True,
    "clean_speaker_background": False,
    "temperature": 1.15,
    "top_p": 0,
    "min_p": 0.18,
    "top_k": 106
}

response = requests.post(url, json=data, headers=headers)
print(response.json())

JavaScript

const url = "https://gateway.pixazo.ai/zonos-2/v1/zonos-2-request";
const headers = {
  "Content-Type": "application/json",
  "Cache-Control": "no-cache",
  "Ocp-Apim-Subscription-Key": "YOUR_API_KEY"
};
const data = {
  "reference_audio_url": "https://pub-582b7213209642b9b995c96c95a30381.r2.dev/Reference.wav",
  "text": "Hello, this is a sample of my cloned voice speaking naturally.",
  "language": "en_us",
  "accurate_mode": true,
  "clean_speaker_background": false,
  "temperature": 1.15,
  "top_p": 0,
  "min_p": 0.18,
  "top_k": 106
};

fetch(url, {
  method: "POST",
  headers: headers,
  body: JSON.stringify(data)
})
.then(response => response.json())
.then(data => console.log(data));

cURL

curl -X POST "https://gateway.pixazo.ai/zonos-2/v1/zonos-2-request" \
  -H "Content-Type: application/json" \
  -H "Cache-Control: no-cache" \
  -H "Ocp-Apim-Subscription-Key: YOUR_API_KEY" \
  --data-raw '{
    "reference_audio_url": "https://pub-582b7213209642b9b995c96c95a30381.r2.dev/Reference.wav",
    "text": "Hello, this is a sample of my cloned voice speaking naturally.",
    "language": "en_us",
    "accurate_mode": true,
    "clean_speaker_background": false,
    "temperature": 1.15,
    "top_p": 0,
    "min_p": 0.18,
    "top_k": 106
  }'

Output

{
  "request_id": "zonos-2_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "QUEUED",
  "polling_url": "https://gateway.pixazo.ai/v2/requests/status/zonos-2_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

Webhook (Optional)

Add the X-Webhook-URL header to your generate request to receive a POST callback instead of polling.

X-Webhook-URL: https://your-server.com/webhook/callback
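
A minimal sketch of a receiver for that callback, using only the Python standard library. It assumes the POST body is JSON with the same shape as the status response (request_id, status, error, output); the source does not document the callback payload separately. The names parse_callback, CallbackHandler, and serve, and the port, are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_callback(body: bytes) -> dict:
    """Parse a callback body and pull out the fields a client usually needs.

    Assumption: the body mirrors the status-endpoint JSON.
    """
    payload = json.loads(body)
    output = payload.get("output") or {}  # output is null on errors
    return {
        "request_id": payload.get("request_id"),
        "status": payload.get("status"),
        "error": payload.get("error"),
        "media_urls": output.get("media_url", []),
    }


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        result = parse_callback(self.rfile.read(length))
        print(result["request_id"], result["status"])
        # Acknowledge receipt; the response the gateway expects is not documented.
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver; expose it at the URL passed in X-Webhook-URL."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Calling serve() starts the listener. In production the handler would typically sit behind HTTPS and verify that the request_id matches one you issued.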

Request Parameters - Zonos 2 generate request

| Parameter | Required | Type | Default | Allowed values / range | Description |
| --- | --- | --- | --- | --- | --- |
| reference_audio_url | Yes | string | | | URL of the reference audio to clone the voice from. Supported formats: MP3, OGG, WAV, M4A, AAC. |
| text | No | string | "Hello, this is a sample of my cloned voice speaking naturally." | | Text to synthesize in the cloned voice. If omitted, a built-in example is used. |
| language | No | enum | "en_us" | "en_us", "en_gb", "fr_fr", "de", "es", "it", "pt_br", "ja", "cmn", "ko" | Text-normalization language code. |
| accurate_mode | No | boolean | true | | true = closer voice match; false = more expressive delivery. |
| clean_speaker_background | No | boolean | false | | Mark the reference audio as having a clean background (removes ambient noise processing). |
| temperature | No | float | 1.15 | 0–2 | Sampling temperature. Higher values increase randomness. |
| top_p | No | float | 0 | 0–1 | Nucleus sampling probability. A value of 0 disables nucleus sampling. |
| min_p | No | float | 0.18 | 0–1 | Minimum-probability sampling threshold. Filters out tokens below this probability. |
| top_k | No | integer | 106 | 0–2048 | Top-k sampling cutoff. A value of 0 disables top-k sampling. |
| max_tokens | No | integer | | 1–6144 | Maximum audio frames to generate. Limits output duration. |
| seed | No | integer | | 0–2147483647 | Random seed for reproducibility. |
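
The ranges above can be checked on the client before a request is sent, which avoids spending a round trip on a payload the API would reject. The following is a sketch only: validate_payload and the constants are hypothetical names, the limits are copied from the parameter list, and the server's actual validation rules may be stricter or differ in detail.

```python
ALLOWED_LANGUAGES = {"en_us", "en_gb", "fr_fr", "de", "es", "it", "pt_br", "ja", "cmn", "ko"}

# Inclusive (min, max) ranges copied from the parameter list.
NUMERIC_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "min_p": (0, 1),
    "top_k": (0, 2048),
    "max_tokens": (1, 6144),
    "seed": (0, 2147483647),
}
INTEGER_FIELDS = {"top_k", "max_tokens", "seed"}


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    if not payload.get("reference_audio_url"):
        problems.append("reference_audio_url is required")
    language = payload.get("language", "en_us")
    if language not in ALLOWED_LANGUAGES:
        problems.append(f"language {language!r} is not supported")
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in payload:
            continue  # optional parameter, server default applies
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number")
        elif name in INTEGER_FIELDS and not isinstance(value, int):
            problems.append(f"{name} must be an integer")
        elif not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")
    return problems
```

A caller would send the request only when validate_payload(data) returns an empty list, and otherwise report the listed problems.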

Example Request

{
  "reference_audio_url": "https://pub-582b7213209642b9b995c96c95a30381.r2.dev/Reference.wav",
  "text": "Hello, this is a sample of my cloned voice speaking naturally.",
  "language": "en_us",
  "accurate_mode": true,
  "clean_speaker_background": false,
  "temperature": 1.15,
  "top_p": 0,
  "min_p": 0.18,
  "top_k": 106
}

Response

{
  "request_id": "zonos-2_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "QUEUED",
  "polling_url": "https://gateway.pixazo.ai/v2/requests/status/zonos-2_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

Request Headers

| Header | Value |
| --- | --- |
| Content-Type | application/json |
| Cache-Control | no-cache |
| Ocp-Apim-Subscription-Key | YOUR_API_KEY |

Response Handling

Common status codes.

| Code | Meaning |
| --- | --- |
| 202 | Accepted: request queued |
| 400 | Bad Request |
| 401 | Unauthorized |
| 402 | Insufficient Balance |
| 403 | Forbidden |
| 429 | Too Many Requests |
| 500 | Internal Server Error |
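
One way a client might act on these codes is sketched below. The retry policy is an assumption, not something the source states: Too Many Requests (429) and Internal Server Error (500) are treated as transient and retried with exponential backoff, while the other 4xx codes are treated as errors to fix before resending. All names here are hypothetical.

```python
RETRYABLE = {429, 500}        # assumption: transient, retry with backoff
FATAL = {400, 401, 402, 403}  # fix the request, key, or balance first


def classify_response(status_code: int) -> str:
    """Map an HTTP status code from the generate endpoint to a client action."""
    if status_code == 202:
        return "queued"
    if status_code in RETRYABLE:
        return "retry"
    if status_code in FATAL:
        return "fail"
    return "unknown"


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

For a "retry" result, the client would sleep for each delay in backoff_delays(n) in turn before resending, and give up once the schedule is exhausted.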

Error Responses

Queue system errors and model validation errors.

Queue System Errors

// 402 — Insufficient balance
{
  "error": "Insufficient Balance",
  "message": "Your wallet does not have enough balance."
}
// 400 — Model not found
{
  "error": "Model not found",
  "message": "Model 'zonos-2' not found or is disabled"
}

Error via Status/Webhook

{
  "request_id": "zonos-2_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "ERROR",
  "model_id": "zonos-2",
  "error": "Description of the error",
  "output": null
}

Retrieving Results

Poll the universal status endpoint to check progress and retrieve results.

Endpoint

GET https://gateway.pixazo.ai/v2/requests/status/{request_id}
Ocp-Apim-Subscription-Key: YOUR_API_KEY

cURL Example

curl -H "Ocp-Apim-Subscription-Key: YOUR_API_KEY" \
  "https://gateway.pixazo.ai/v2/requests/status/zonos-2_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

Response (Completed)

{
  "request_id": "zonos-2_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "COMPLETED",
  "model_id": "zonos-2",
  "error": null,
  "output": {
    "media_url": ["https://pub-582b7213209642b9b995c96c95a30381.r2.dev/v1/zonos-2_019dxxxx/output.wav"],
    "media_type": "audio/wav",
    "file_name": "output.wav",
    "file_size": 1245678
  },
  "created_at": "2026-03-31T10:00:00.000Z",
  "updated_at": "2026-03-31T10:00:15.000Z",
  "completed_at": "2026-03-31T10:00:15.000Z"
}
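
Because output.media_url is an array and output is null on errors, reading the result takes a little care. The helpers below are a sketch under the assumption that the completed response always has the shape shown above; completed_media and download_first are hypothetical names.

```python
import urllib.request


def completed_media(status_payload: dict) -> list[str]:
    """Return the output URLs from a status response, or [] if it is not completed."""
    if status_payload.get("status") != "COMPLETED":
        return []
    output = status_payload.get("output") or {}
    urls = output.get("media_url", [])
    # The documented example uses an array; tolerate a bare string as well.
    return [urls] if isinstance(urls, str) else list(urls)


def download_first(status_payload: dict, dest: str) -> str:
    """Save the first output file to `dest` and return the path."""
    urls = completed_media(status_payload)
    if not urls:
        raise ValueError(f"no media available; status is {status_payload.get('status')}")
    urllib.request.urlretrieve(urls[0], dest)
    return dest
```

The output.file_name field from the response ("output.wav" in the example) is a natural choice for dest.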

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| request_id | string | Unique request identifier |
| status | string | QUEUED, PROCESSING, COMPLETED, FAILED, or ERROR |
| model_id | string | Model that processed the request |
| error | string\|null | Error message if failed |
| output.media_url | array | URLs to generated media (R2 CDN) |
| output.media_type | string | MIME type of the output |
| created_at | string | When request was created |
| completed_at | string | When request completed |
| polling_url | string | Status URL (initial response only) |

Status Values

| Status | Description |
| --- | --- |
| QUEUED | Request accepted, waiting to be processed |
| PROCESSING | Being processed by the model |
| COMPLETED | Done; output contains the result |
| FAILED | Failed; check the error field |
| ERROR | System error; not charged |

Status Flow

QUEUED → PROCESSING → COMPLETED
                    → FAILED
                    → ERROR

Typical Workflow

  1. Send a generate request to the API endpoint
  2. Save the request_id from the response
  3. Poll every 5-10 seconds: GET /v2/requests/status/{request_id}
  4. When status is "COMPLETED", download the file from the URL(s) in output.media_url
  5. If status is "FAILED" or "ERROR", stop polling and read the error field

Tip: Use X-Webhook-URL header to get a callback instead of polling.
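
The polling steps in this workflow can be sketched with the standard library as follows. The status URL and header come from the documentation above; the function names, the 5-second default interval (the low end of the suggested 5-10 seconds), and the 300-second timeout are assumptions chosen for illustration.

```python
import json
import time
import urllib.request

STATUS_URL = "https://gateway.pixazo.ai/v2/requests/status/{request_id}"
TERMINAL = {"COMPLETED", "FAILED", "ERROR"}


def fetch_status(request_id: str, api_key: str) -> dict:
    """GET the status endpoint and decode the JSON body."""
    req = urllib.request.Request(
        STATUS_URL.format(request_id=request_id),
        headers={"Ocp-Apim-Subscription-Key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_result(request_id, api_key, interval=5.0, timeout=300.0,
                    fetch=fetch_status, sleep=time.sleep):
    """Poll until the request reaches a terminal status or `timeout` seconds pass.

    `fetch` and `sleep` are injectable so the loop can be exercised offline.
    """
    waited = 0.0
    while True:
        payload = fetch(request_id, api_key)
        if payload.get("status") in TERMINAL:
            return payload
        if waited >= timeout:
            raise TimeoutError(
                f"{request_id} still {payload.get('status')} after {timeout}s"
            )
        sleep(interval)
        waited += interval
```

The returned payload is the final status response, so the caller checks its status field and then reads either output.media_url or error.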

Zonos 2 API Pricing

Your request will cost $0.01 per minute of audio. That is about $0.10 for a 10-minute narration, equivalent to $0.60 per hour of audio.
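
The arithmetic can be written as a small estimator. This sketch assumes cost scales linearly with audio length; the source does not say whether billing is prorated per second or rounded to whole minutes, so treat the result as an estimate. The name estimate_cost is hypothetical.

```python
from decimal import Decimal

PRICE_PER_MINUTE = Decimal("0.01")  # USD per minute of audio, from the pricing above


def estimate_cost(audio_seconds: float) -> Decimal:
    """Estimated USD cost, assuming linear proration by audio length."""
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    return (minutes * PRICE_PER_MINUTE).quantize(Decimal("0.0001"))
```

estimate_cost(600) gives 0.1000 for the 10-minute narration and estimate_cost(3600) gives 0.6000 for one hour, matching the figures quoted above.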