
MMAudio V2 API - AI Audio Generation APIs

by Sony AI

MMAudio V2 API is an audio synthesis interface that generates sound effects and soundtracks from text descriptions or in sync with video content. It aligns the generated audio with visual motion over time, which makes it useful for creators who want to automate sound design. Supported features include prompt-based audio generation, negative prompting for finer control, and adjustable sample rates.

MMAudio v2 Text to Audio API Documentation

https://gateway.pixazo.ai/mmaudio-v2-text-to-audio/v1

Authentication

All requests require an API key passed via header.

Header Type Required Description
Ocp-Apim-Subscription-Key string Yes Your API subscription key

MMAudio V2 Text to Audio generate request

Request Code

HTTP

POST https://gateway.pixazo.ai/mmaudio-v2-text-to-audio/v1/mmaudio-v2-text-to-audio-request
Content-Type: application/json
Cache-Control: no-cache
Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY

{
  "prompt": "Gentle ocean waves crashing on a sandy beach with seagulls",
  "negative_prompt": "",
  "num_steps": 25,
  "duration": 8,
  "cfg_strength": 4.5,
  "mask_away_clip": false
}

Python

import requests

url = "https://gateway.pixazo.ai/mmaudio-v2-text-to-audio/v1/mmaudio-v2-text-to-audio-request"
headers = {
    "Content-Type": "application/json",
    "Cache-Control": "no-cache",
    "Ocp-Apim-Subscription-Key": "YOUR_SUBSCRIPTION_KEY"
}
data = {
    "prompt": "Gentle ocean waves crashing on a sandy beach with seagulls",
    "negative_prompt": "",
    "num_steps": 25,
    "duration": 8,
    "cfg_strength": 4.5,
    "mask_away_clip": False
}

response = requests.post(url, json=data, headers=headers)
print(response.json())

JavaScript

const url = "https://gateway.pixazo.ai/mmaudio-v2-text-to-audio/v1/mmaudio-v2-text-to-audio-request";
const headers = {
  "Content-Type": "application/json",
  "Cache-Control": "no-cache",
  "Ocp-Apim-Subscription-Key": "YOUR_SUBSCRIPTION_KEY"
};
const data = {
  "prompt": "Gentle ocean waves crashing on a sandy beach with seagulls",
  "negative_prompt": "",
  "num_steps": 25,
  "duration": 8,
  "cfg_strength": 4.5,
  "mask_away_clip": false
};

fetch(url, {
  method: "POST",
  headers: headers,
  body: JSON.stringify(data)
})
.then(response => response.json())
.then(data => console.log(data))
.catch(error => console.error("Error:", error));

cURL

curl -X POST "https://gateway.pixazo.ai/mmaudio-v2-text-to-audio/v1/mmaudio-v2-text-to-audio-request" \
  -H "Content-Type: application/json" \
  -H "Cache-Control: no-cache" \
  -H "Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY" \
  --data-raw '{
    "prompt": "Gentle ocean waves crashing on a sandy beach with seagulls",
    "negative_prompt": "",
    "num_steps": 25,
    "duration": 8,
    "cfg_strength": 4.5,
    "mask_away_clip": false
  }'

Output

{
  "request_id": "mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "QUEUED",
  "polling_url": "https://gateway.pixazo.ai/v2/requests/status/mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

Webhook (Optional)

Add the X-Webhook-URL header to your submit request to receive a POST callback when the job completes — no polling required.

Webhook Headers

Header Required Default Description
X-Webhook-URL Yes (to enable) (none) HTTPS endpoint on your server that will receive the POST callback. Must respond 2xx within a few seconds (process async if needed).
X-Webhook-Mode No terminal Either terminal (fires once at the final status: COMPLETED, FAILED, or ERROR) or sync (fires on every poll cycle plus the terminal event, and caps the queue's polling delay at 15s for tighter progress updates).

Example: enable webhook

X-Webhook-URL: https://your-server.com/webhook/callback
X-Webhook-Mode: terminal
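
A minimal Python sketch of attaching these headers to a submit request, assuming the HTTPS-only rule and the two mode values described in this section. The helper name `webhook_headers` is hypothetical and not part of the Pixazo API; it only builds a header dictionary and sends nothing.

```python
from urllib.parse import urlparse

# Mode values taken from the Webhook Headers table.
VALID_MODES = {"terminal", "sync"}


def webhook_headers(api_key, callback_url, mode="terminal"):
    """Return submit-request headers with the webhook callback enabled.

    Hypothetical helper: checks locally what the documentation says the
    gateway enforces (HTTPS callback URL, known mode value).
    """
    parsed = urlparse(callback_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("X-Webhook-URL must be an https:// URL")
    if mode not in VALID_MODES:
        raise ValueError(f"X-Webhook-Mode must be one of {sorted(VALID_MODES)}")
    return {
        "Content-Type": "application/json",
        "Cache-Control": "no-cache",
        "Ocp-Apim-Subscription-Key": api_key,
        "X-Webhook-URL": callback_url,
        "X-Webhook-Mode": mode,
    }
```

The returned dictionary can be passed as the `headers` argument of the Python request example earlier on this page.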

Callback Payload

Your endpoint receives a POST application/json with the same shape as the GET /v2/requests/status/{request_id} response. Example terminal callback (mode terminal):

{
  "request_id": "mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "COMPLETED",
  "model_id": "mmaudio-v2-text-to-audio",
  "error": null,
  "output": {
    "media_url": [
      "https://pub-582b7213209642b9b995c96c95a30381.r2.dev/v1/mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/output.wav"
    ],
    "media_type": "audio/wav"
  },
  "created_at": "2026-05-22T13:17:32.110Z",
  "updated_at": "2026-05-22 13:19:23",
  "completed_at": "2026-05-22 13:19:23"
}
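
The example payload above shows two timestamp styles: ISO 8601 with a trailing `Z` for `created_at`, and a space-separated form for `updated_at` and `completed_at`. The sketch below parses both. It assumes, without confirmation from the API, that the space-separated form is also UTC; `parse_timestamp` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# The two formats that appear in the example callback payload.
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%d %H:%M:%S")


def parse_timestamp(value):
    """Parse a callback timestamp into a timezone-aware UTC datetime.

    Returns None for a null field. Treating the space-separated form as
    UTC is an assumption, not documented behavior.
    """
    if value is None:
        return None
    for fmt in _FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # try the next known format
    raise ValueError(f"unrecognized timestamp format: {value!r}")
```

Under that assumption, subtracting the parsed `created_at` from `completed_at` gives the time the job spent in the queue and in processing.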

Failure callback shape

{
  "request_id": "mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "ERROR",
  "model_id": "mmaudio-v2-text-to-audio",
  "error": "Description of the error",
  "output": null,
  "created_at": "...",
  "updated_at": "...",
  "completed_at": "..."
}

Delivery semantics

  • terminal mode (default): exactly one POST when the request reaches a terminal status. No callback during PROCESSING.
  • sync mode: a POST on every status poll (with delay capped at ~15s) plus a final POST at terminal status. Use when you want progress updates.
  • Idempotency: use request_id as your idempotency key. Network retries can deliver the same callback more than once; your handler must tolerate duplicates.
  • Response: respond 200 OK within a few seconds. The queue does not block on slow handlers, but persistent failures may stop further deliveries.
  • HTTPS required: plain http:// URLs are rejected.
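
The delivery rules above can be sketched as a small receiver built on Python's standard library. This is an illustration under stated assumptions, not a reference implementation: the names `handle_callback` and `WebhookHandler` are hypothetical, and the in-memory `set` used for de-duplication would need to be a persistent store in a real deployment.

```python
import json
from http.server import BaseHTTPRequestHandler

TERMINAL_STATUSES = {"COMPLETED", "FAILED", "ERROR"}


def handle_callback(raw_body, processed):
    """Classify one callback body, using request_id as the idempotency key.

    `processed` is a set of request_ids whose terminal callback was
    already handled (in-memory here; an assumption for the sketch).
    """
    payload = json.loads(raw_body)
    request_id = payload["request_id"]
    status = payload["status"]
    if status not in TERMINAL_STATUSES:
        return "progress"      # sync mode: intermediate update
    if request_id in processed:
        return "duplicate"     # retried delivery, safe to ignore
    processed.add(request_id)
    return "completed" if status == "COMPLETED" else "failed"


class WebhookHandler(BaseHTTPRequestHandler):
    processed = set()  # shared across requests for this process

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Acknowledge first so the sender is not kept waiting.
        self.send_response(200)
        self.end_headers()
        try:
            handle_callback(body, self.processed)
        except (ValueError, KeyError, TypeError):
            pass  # malformed body: already acknowledged, nothing to do
```

To try it locally, `HTTPServer(("", 8080), WebhookHandler).serve_forever()` from `http.server` starts the receiver; because callbacks require HTTPS, this assumes a TLS-terminating proxy sits in front of it.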

Request Parameters - MMAudio V2 Text to Audio generate request

Field Type Required Default Description
prompt string Yes A detailed text description of the desired audio. Example: "Gentle ocean waves crashing on a sandy beach with seagulls".
negative_prompt string No "" Describes sounds to avoid in the generated audio. Leave empty for no exclusion.
num_steps integer No 25 Number of denoising steps. Higher values improve quality but increase generation time. Range: 10–100.
duration integer No 8 Duration of the generated audio in seconds. Range: 2–30.
cfg_strength number No 4.5 Classifier-Free Guidance strength. Controls how closely the output follows the prompt. Higher values increase prompt adherence. Range: 1.0–10.0.
mask_away_clip boolean No false If true, masks out the beginning and end of the audio to avoid abrupt cuts. Recommended for seamless loops.
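
The constraints in the table can be checked on the client before a request is sent. The sketch below is one way to do that in Python; it assumes the listed ranges are inclusive and that an empty prompt is invalid, neither of which the table states explicitly. `validate_params` is a hypothetical helper, not part of the API.

```python
def validate_params(payload):
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt: required, must be a non-empty string")

    if not isinstance(payload.get("negative_prompt", ""), str):
        problems.append("negative_prompt: must be a string")

    # (field, allowed types, minimum, maximum) taken from the table above.
    numeric = [
        ("num_steps", (int,), 10, 100),
        ("duration", (int,), 2, 30),
        ("cfg_strength", (int, float), 1.0, 10.0),
    ]
    for field, types, low, high in numeric:
        if field not in payload:
            continue  # omitted fields fall back to the server defaults
        value = payload[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, types):
            problems.append(f"{field}: wrong type")
        elif not low <= value <= high:
            problems.append(f"{field}: must be between {low} and {high}")

    if not isinstance(payload.get("mask_away_clip", False), bool):
        problems.append("mask_away_clip: must be a boolean")

    return problems
```

Returning a list instead of raising on the first problem lets a caller report every invalid field at once.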

Minimum Request

{
  "prompt": "Gentle ocean waves crashing on a sandy beach with seagulls"
}

Full Request (all options)

{
  "prompt": "Gentle ocean waves crashing on a sandy beach with seagulls",
  "negative_prompt": "",
  "num_steps": 25,
  "duration": 8,
  "cfg_strength": 4.5,
  "mask_away_clip": false
}

Response

{
  "request_id": "mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "QUEUED",
  "polling_url": "https://gateway.pixazo.ai/v2/requests/status/mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

Request Headers

Header Value
Content-Type application/json
Cache-Control no-cache
Ocp-Apim-Subscription-Key Your API subscription key

Response Handling

Common status codes for MMAudio V2 Text to Audio generate request.

Code Meaning
202 Accepted: request queued
400 Bad Request
401 Unauthorized
402 Insufficient Balance
403 Forbidden
404 Not Found
429 Too Many Requests
500 Internal Server Error
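
One way a client might act on these codes is sketched below in Python. Treating 429 (Too Many Requests) and 500 as retryable, and every other non-202 code as a permanent failure, is an assumption of this sketch and not documented gateway behavior; `next_action` and `backoff_delays` are hypothetical helper names.

```python
# Assumption: only rate limiting and server errors are worth retrying.
RETRYABLE = {429, 500}


def next_action(status_code):
    """Map the HTTP status of a submit request to a client-side action."""
    if status_code == 202:
        return "poll"    # request queued: start polling or wait for the webhook
    if status_code in RETRYABLE:
        return "retry"   # transient condition: try again after a delay
    return "fail"        # e.g. 400, 401, 402, 403, 404: fix the request first


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

For example, `backoff_delays(4)` yields waits of 1, 2, 4, and 8 seconds between successive retries.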

Error Responses

Queue system errors and model validation errors.

Queue System Errors

// 402 — Insufficient balance
{
  "error": "Insufficient Balance",
  "message": "Your wallet does not have enough balance."
}
// 400 — Model not found
{
  "error": "Model not found",
  "message": "Model 'mmaudio-v2-text-to-audio' not found or is disabled"
}

Error via Status/Webhook

{
  "request_id": "mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "ERROR",
  "model_id": "mmaudio-v2-text-to-audio",
  "error": "Description of the error",
  "output": null
}

Retrieving Results

Poll the universal status endpoint to check progress and retrieve results.

Endpoint

GET https://gateway.pixazo.ai/v2/requests/status/{request_id}
Ocp-Apim-Subscription-Key: YOUR_API_KEY

cURL Example

curl -H "Ocp-Apim-Subscription-Key: YOUR_API_KEY" \
  "https://gateway.pixazo.ai/v2/requests/status/mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

Response (Completed)

{
  "request_id": "mmaudio-v2-text-to-audio_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "COMPLETED",
  "model_id": "mmaudio-v2-text-to-audio",
  "error": null,
  "output": {
    "media_url": [
      "https://pub-582b7213209642b9b995c96c95a30381.r2.dev/v1/mmaudio-v2-text-to-audio_019dxxxx-xxxx/output.wav"
    ],
    "media_type": "audio/wav"
  },
  "created_at": "2026-03-31T10:00:00.000Z",
  "updated_at": "2026-03-31T10:00:15.000Z",
  "completed_at": "2026-03-31T10:00:15.000Z"
}

Response Fields

Field Type Description
request_id string Unique request identifier
status string QUEUED, PROCESSING, COMPLETED, FAILED, or ERROR
model_id string Model that processed the request
error string|null Error message if failed
output.media_url array URLs to generated media (R2 CDN)
output.media_type string MIME type of the output
created_at string When request was created
completed_at string|null When request completed
polling_url string Status URL (initial response only)
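
Given a status payload with these fields, a client typically wants either the media URLs or a clear error. The Python sketch below shows one possible shape for that; `GenerationError` and `media_urls` are hypothetical names, and returning `None` for a job that is still running is a design choice of this example.

```python
class GenerationError(Exception):
    """Raised when a request ends in FAILED or ERROR."""


def media_urls(status_payload):
    """Return output URLs for a COMPLETED payload.

    Returns None while the job is still QUEUED or PROCESSING and raises
    GenerationError, carrying the `error` field, on FAILED or ERROR.
    """
    status = status_payload.get("status")
    if status == "COMPLETED":
        output = status_payload.get("output") or {}
        return list(output.get("media_url") or [])
    if status in ("FAILED", "ERROR"):
        raise GenerationError(f"{status}: {status_payload.get('error')}")
    return None
```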

Status Values

Status Description
QUEUED Request accepted, waiting to be processed
PROCESSING Being processed by the model
COMPLETED Done; output contains the result
FAILED Failed; check the error field
ERROR System error; not charged

Status Flow

QUEUED → PROCESSING → COMPLETED
                    → FAILED
                    → ERROR

Typical Workflow

  1. Send a generate request to the API endpoint
  2. Save the request_id from the response
  3. Poll every 5-10 seconds: GET /v2/requests/status/{request_id}
  4. When status is "COMPLETED", download from output.media_url

Tip: Use X-Webhook-URL header to get a callback instead of polling.
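
The polling part of this workflow can be sketched in Python with the standard library only. The status URL and header come from this page; the function names, the 5-second default interval (the low end of the suggested range), and the 300-second overall limit are assumptions chosen for illustration.

```python
import json
import time
import urllib.request

STATUS_URL = "https://gateway.pixazo.ai/v2/requests/status/{request_id}"
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "ERROR"}


def fetch_status(request_id, api_key, timeout=30):
    """GET the status payload for one request (performs a network call)."""
    req = urllib.request.Request(
        STATUS_URL.format(request_id=request_id),
        headers={"Ocp-Apim-Subscription-Key": api_key},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def wait_for_result(request_id, fetch, interval=5.0, max_wait=300.0, sleep=time.sleep):
    """Poll `fetch(request_id)` until a terminal status or until max_wait.

    `fetch` and `sleep` are injected so the loop can be exercised
    without network access or real delays.
    """
    waited = 0.0
    while True:
        payload = fetch(request_id)
        if payload["status"] in TERMINAL_STATUSES:
            return payload
        if waited >= max_wait:
            raise TimeoutError(f"{request_id} still {payload['status']} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

A typical call would be `wait_for_result(request_id, lambda rid: fetch_status(rid, "YOUR_API_KEY"))`, after which the caller checks `status` and, on COMPLETED, downloads from `output.media_url`.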
