PlayHT - AI Voice Generator

PlayHT is a voice AI platform offering text‑to‑speech with hundreds of realistic voices, instant and high‑fidelity voice cloning, a developer API for streaming and real‑time synthesis, multi‑turn, multi‑speaker dialog generation, Voice Agents (no‑code tools plus SDKs for web and mobile), and PlayNote, which turns documents and PDFs into multi‑speaker podcasts.

What is PlayHT (PlayAI)?

PlayHT / PlayAI is a voice AI platform offering text‑to‑speech (TTS) with hundreds of realistic voices, instant & high‑fidelity voice cloning, a developer API for streaming/real‑time synthesis, multi‑turn, multi‑speaker dialog generation, Voice Agents (no‑code+SDKs for web/mobile), and PlayNote, which turns documents/PDFs into multi‑speaker podcasts.

Key product pillars:

Studio (play.ht): web editor to paste text, pick voices, use SSML, set pronunciations, and export audio. Studio product copy advertises over 42 languages.
Developer platform (docs.play.ht / docs.play.ai): models (PlayDialog, Play3.0‑mini, PlayHT2.0‑turbo), HTTP & WebSocket streaming, batch jobs, and SDKs for Node/Python.
Voice Agents: build & embed voice agents with web and Flutter SDKs; add knowledge bases and actions.
PlayNote: convert PDFs and docs into multi‑speaker podcasts.

Model lineup (v2.3 docs):
• PlayDialog – most expressive, context‑aware, multi‑turn dialog; supports multi‑voice outputs.
• Play3.0‑mini – multilingual, low‑latency, cost‑efficient; streaming latency targeted <200 ms, 48kHz out, and 36 languages.
• PlayHT2.0‑turbo – legacy model.

PlayHT Key Features

Ultra‑realistic TTS + SSML support
Studio workflow includes SSML controls for rate, pitch, volume, pauses, plus pronunciation dictionaries and preview before render.
Multi‑speaker, multi‑turn dialog generation
From one prompt you can generate conversations with multiple voices, or drive two named speakers via the API.
Cross‑language voice cloning & multilingual synthesis
Clone a voice and render it across languages; product pages emphasize multilingual speech synthesis and cross‑language cloning.
Voice Cloning (Instant & High‑Fidelity)

Instant clone: from 30 seconds or more of audio.
High‑Fidelity clone: 20+ minutes (up to 30 min) recommended for best results. Both modes are managed in‑app with guidance.

Low‑latency streaming & Groq acceleration
PlayDialog is available on GroqCloud, delivering 140–200+ characters/second generation with sub‑second response, enabling real‑time agents and IVR.
AI Voice Agents
Create and embed voice agents in minutes; provide documents/websites as knowledge bases, wire custom actions, and drop in the Web SDK or Flutter SDK.
PlayNote (Docs→Podcast)
Automatically restructures PDFs/Documents into a multi‑speaker conversational audio “podcast” with selectable voices.

PlayHT Pros & Cons

Pros

Expressive, dialog‑aware voices (PlayDialog) with multi‑speaker control.
Very low latency via Groq—useful for live agents, IVR, or streaming overlays.
Two cloning modes (quick vs. high‑fidelity) with clear audio requirements.
Rich studio tooling (SSML, pronunciations, preview) + developer SDKs.

Cons

Service stability & support concerns have been raised by users (see recent G2 & Trustpilot reviews).
Pricing clarity: public info varies by source; “Unlimited” carries a fair‑usage cap (2.5M chars/month), and the refund policy is restrictive (requests within 24 hours, under 5,000 characters used).
Post‑acquisition uncertainty: Meta’s acquisition was confirmed in July 2025; third‑party notices mention deprecations and wind‑downs in places—check status before integrating.

Who is using PlayHT?

Creators & teams: podcasters, YouTubers, marketers, e‑learning, accessibility. Industry distribution on G2 shows usage across SMBs in e‑learning, IT services, marketing and online media.
Companies: Third‑party “customers using Play.ht” lists name organizations such as Marathon Health, Quorum Analytics, and Grin Technologies. Treat such lists as indicative, not exhaustive.

PlayHT Pricing (What we can verify publicly)

Important: Pricing changed several times pre‑ and post‑acquisition and can vary between Studio (play.ht) and API (play.ai). Confirm current tiers in‑app.

Historic Studio tiers often cited by reviewers/directories:
- Free: limited characters; non‑commercial.
- Professional / Creator (~$39/mo): entry paid tier (older listings show 50k words/mo or ~600k/yr allowances).
- Unlimited (~$99/mo): subject to fair‑usage policy (2.5M chars/mo, 30M/yr).
- Team/Enterprise: multi‑seat or custom.
Discounts: 20% off for students, educators & nonprofits.
Refund policy: requests within 24 hours and under 5,000 characters used.
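
The fair‑usage and refund figures above reduce to simple threshold checks. The following is a minimal sketch that uses only the numbers quoted in this section; the function names are illustrative, whether the caps are inclusive is an assumption, and the thresholds themselves may have changed since these listings were published.

```python
# Figures quoted in the pricing notes above; treat them as historical, not current.
FAIR_USE_CHARS_PER_MONTH = 2_500_000
FAIR_USE_CHARS_PER_YEAR = 30_000_000
REFUND_WINDOW_HOURS = 24
REFUND_MAX_CHARS = 5_000


def within_fair_use(chars_this_month: int, chars_this_year: int) -> bool:
    """True if usage stays inside both quoted caps (inclusive bounds are an assumption)."""
    return (chars_this_month <= FAIR_USE_CHARS_PER_MONTH
            and chars_this_year <= FAIR_USE_CHARS_PER_YEAR)


def refund_eligible(hours_since_purchase: float, chars_used: int) -> bool:
    """True if a request is within 24 hours and under 5,000 characters used."""
    return (hours_since_purchase <= REFUND_WINDOW_HOURS
            and chars_used < REFUND_MAX_CHARS)
```

Note that the yearly figure is simply twelve times the monthly one, so a steady 2.5M characters per month sits exactly at the annual cap.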

What makes PlayHT unique?

Conversational delivery, not just TTS. PlayDialog uses an Adaptive Speech Contextualizer (ASC) to maintain emotional tone and prosody across turns, which suits agents, podcasts, and support scripts.
Real‑time speed at scale. Groq acceleration enables sub‑second responses and 140–200+ chars/sec speech generation—important for live calls and assistants.
Document‑to‑podcast automation. PlayNote automatically turns dense docs into multi‑voice podcasts, which is handy for training and internal comms.
End‑to‑end stack. Studio for creators, Agents for deployment, and APIs for devs under one roof.

Comprehensive Tutorial — How to Use PlayHT / PlayAI

A) Studio (play.ht): Create a polished voiceover fast

Sign up / Log in → open Playground/Studio. Paste your script.
Pick a voice & language. Filter by style (narration, explainer, conversational), gender, or accent. The site materials reference 40+ languages overall and “over 42 languages” in the Studio product copy.
Tune delivery with SSML & controls.
- Adjust rate, pitch, volume; add breaks and emphasis; create pronunciation rules (e.g., brand names).
- Use Preview paragraph‑by‑paragraph before you commit.
Multi‑voice scenes. Assign different voices to different paragraphs for dialog‑style content (e.g., podcast host + guest).
Dubbing / multilingual. Try cross‑language voice cloning to keep the same voice while switching languages for localization.
Export. Download WAV/MP3, then mix in your NLE/DAW as needed. (Check your license and plan limits.)

SSML tips that usually work well across TTS services (supported in Play’s studio per product pages):

Add short breaks between sentences for pacing.
Use <prosody rate="90%"> to slow tricky lines; <prosody pitch="+2st"> to brighten a flat passage.
Define <sub alias="…">…</sub> for brand names and abbreviations.
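
The tips above can be sketched as small string helpers. This is an illustrative example using standard SSML element names (prosody, break, sub); the helper names are hypothetical, and whether a given Play voice honors each tag is something to confirm with a preview.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text, rate=None, pitch=None):
    """Wrap text in a <prosody> element with optional rate and pitch attributes."""
    attrs = ""
    if rate is not None:
        attrs += f" rate={quoteattr(rate)}"
    if pitch is not None:
        attrs += f" pitch={quoteattr(pitch)}"
    return f"<prosody{attrs}>{escape(text)}</prosody>"


def sub(text, alias):
    """Ask the engine to speak `alias` wherever `text` is written."""
    return f"<sub alias={quoteattr(alias)}>{escape(text)}</sub>"


def pause(milliseconds):
    """Insert a short break, for example between sentences."""
    return f'<break time="{int(milliseconds)}ms"/>'


# Example: slow one line slightly, then pause before the next sentence.
line = prosody("Read this part carefully.", rate="90%") + pause(300)
```

Escaping the text keeps characters such as & and < from breaking the markup.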

Licensing: The studio FAQ describes a freemium model; commercial usage requires the appropriate subscription. Always confirm asset licensing inside your account.

B) Voice Cloning (Instant vs High‑Fidelity)

Open “Create Voice Clone.”
Choose Instant Clone (≥ 30 sec of clean audio; fastest) or High‑Fidelity (recommended 20–30 minutes total for the best results).
Upload/record in a quiet room with a decent mic; include varied intonation.
Wait for training (≈1 minute for Instant; longer for HF), then preview and save.
Use your cloned voice in Studio or via API.

C) PlayNote: Turn documents into a multi‑speaker podcast

Go to PlayNote in your Play AI account.
Upload a PDF/doc or import from a URL.
Assign voices for narrator and “guest,” choose tone, and generate.
Review the structure; regenerate segments if needed; export audio for distribution.

D) Voice Agents (web/mobile)

In Agents, click Create Agent and define personality, capabilities, and knowledge base (upload docs/URLs).
Add Actions/Integrations for tasks the agent can perform.
Embed on your site with the Web SDK or Web Embed snippet, or use the Flutter SDK for mobile.
Test latency and barge‑in behavior; deploy.

E) Developer Guide — API & SDKs (current endpoints & models)

Heads‑up: Play provides two doc surfaces:
docs.play.ai (current PlayAI platform; endpoints such as https://api.play.ai/...)
docs.play.ht (prior API surface; endpoints such as https://api.play.ht/api/v2/...)
Favor the PlayAI endpoints unless your account requires legacy APIs.

1) Quickstart: Create audio via HTTP (PlayAI)

cURL (PlayAI/api.play.ai)

curl -X POST 'https://api.play.ai/api/v1/tts/stream' \
  -H "Authorization: Bearer $PLAYAI_KEY" \
  -H "X-USER-ID: $PLAYAI_USER_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "PlayDialog",
    "text": "Hello! This is my first text-to-speech audio using PlayAI!",
    "voice": "s3://voice-cloning-zero-shot/.../manifest.json",
    "outputFormat": "wav"
  }' --output hello.wav

This is the current quickstart shape in docs; SDKs exist for Python/Node/Go/Dart/Swift.
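
For readers who prefer Python without an SDK, the same request can be sketched with the standard library. The URL, headers, and body fields mirror the cURL call above; the function names are illustrative, and the example has not been verified against a live account.

```python
import json
import urllib.request

API_URL = "https://api.play.ai/api/v1/tts/stream"


def build_tts_request(text, voice, api_key, user_id,
                      model="PlayDialog", output_format="wav"):
    """Build (but do not send) the POST request shown in the cURL quickstart."""
    body = {
        "model": model,
        "text": text,
        "voice": voice,  # a voice manifest URL from the pre-built voices list
        "outputFormat": output_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-USER-ID": user_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(request, path):
    """Send the request and stream the audio response to disk in chunks."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while chunk := response.read(8192):
            out.write(chunk)
```

A typical call would read PLAYAI_KEY and PLAYAI_USER_ID from the environment, pass them to build_tts_request along with a real manifest URL, and hand the result to synthesize_to_file with a target such as hello.wav.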

Node SDK (PlayAI): initialize with your userId and apiKey, then call stream() with model: "PlayDialog" (for dialog) or "Play3.0-mini" (for lower‑latency multilingual synthesis).

2) Real‑time streaming (WebSocket)

Use the WebSocket API for low‑latency continuous audio (useful for agents). Docs outline the socket route and streaming packets; Play also surfaces an HTTP streaming endpoint if sockets don’t fit your stack.
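
This article does not reproduce the socket route or message format, so the sketch below covers only the client‑side preparation step: splitting a script into sentence‑aligned chunks that could then be sent one at a time. The chunk size and the splitting rule are assumptions for illustration, not values from Play's docs.

```python
import re
from typing import Iterator


def sentence_chunks(text: str, max_chars: int = 200) -> Iterator[str]:
    """Yield chunks that end on sentence boundaries and stay within max_chars.

    A single sentence longer than max_chars is emitted whole rather than cut.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            yield current
            current = sentence
        else:
            current = candidate
    if current:
        yield current
```

Keeping chunks aligned to sentences avoids sending half a phrase, which tends to matter for prosody in streamed speech.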

3) Batch TTS jobs

For large scripts, submit a Batch TTS Job, then poll Job Details until ready; download or stitch child jobs as needed. Endpoints are listed under “Batch Text‑to‑Speech.”
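
The submit‑then‑poll pattern can be sketched independently of the exact endpoints. In the example below, fetch_details stands in for whatever call retrieves Job Details, and the "status" key with the values "completed" and "failed" are hypothetical placeholders; check the Batch Text‑to‑Speech reference for the real response shape.

```python
import time
from typing import Callable, Mapping


def wait_for_job(fetch_details: Callable[[], Mapping],
                 interval_s: float = 2.0,
                 timeout_s: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> Mapping:
    """Poll fetch_details() until the job finishes, fails, or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        details = fetch_details()
        status = details.get("status")  # hypothetical field name
        if status == "completed":
            return details
        if status == "failed":
            raise RuntimeError(f"batch job failed: {dict(details)}")
        if clock() >= deadline:
            raise TimeoutError("batch job did not finish before the timeout")
        sleep(interval_s)
```

Passing sleep and clock as parameters keeps the loop easy to test without real waiting.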

4) Multi‑speaker dialog generation

With PlayDialog, you can generate conversation from one request—either via inline turn prefixes or by passing multiple voice manifests (e.g., voice and voice_2) and “Country Mouse / Town Mouse” prefixes, as shown in docs.
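
A two‑speaker request body might be assembled as follows. The voice, voice_2, and turn_prefix names come from the description above; turn_prefix_2 is an assumption made by analogy, and field naming may differ between the play.ht and play.ai surfaces, so confirm against the docs for your account before relying on it.

```python
def build_dialog_payload(turns, voice_1, voice_2,
                         prefix_1="Country Mouse:", prefix_2="Town Mouse:"):
    """Build a PlayDialog body where each line starts with its speaker prefix.

    `turns` is a list of (speaker, line) pairs with speaker equal to 1 or 2.
    """
    lines = []
    for speaker, line in turns:
        if speaker not in (1, 2):
            raise ValueError(f"speaker must be 1 or 2, got {speaker!r}")
        prefix = prefix_1 if speaker == 1 else prefix_2
        lines.append(f"{prefix} {line}")
    return {
        "model": "PlayDialog",
        "text": "\n".join(lines),
        "voice": voice_1,
        "voice_2": voice_2,
        "turn_prefix": prefix_1,
        "turn_prefix_2": prefix_2,  # assumed name, not confirmed by this article
    }
```

The prefixes in the text must match the prefix fields exactly, since that is how the model is told which voice speaks each turn.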

5) Models & when to use them

PlayDialog → Most expressive & context‑aware; ideal for agents, podcasts, & scripts needing emotion.
Play3.0‑mini → Fastest streaming, reduced hallucinations, 48kHz output; 36 languages.
PlayHT2.0‑turbo → legacy compatibility.

6) Listing available voices

The List of Pre‑Built Voices page links to the official inventory (voice manifests + sample URLs). Use these IDs in your API calls.

7) SSML in code

The studio supports SSML; for API flows you can send SSML text in your text field (wrapping with <speak>…</speak>). Use prosody, break, sub (alias), etc., as needed and verify voice support.
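
Before sending SSML through the API, it helps to confirm that the markup is well‑formed. The sketch below wraps a fragment in <speak> and parses it with the standard library; the helper name is illustrative, and passing the parser says nothing about which tags a particular voice supports.

```python
import xml.etree.ElementTree as ET


def wrap_speak(inner_ssml: str) -> str:
    """Wrap an SSML fragment in <speak> and check that it is well-formed XML."""
    document = f"<speak>{inner_ssml}</speak>"
    # Raises xml.etree.ElementTree.ParseError if a tag is unclosed or mismatched.
    ET.fromstring(document)
    return document


# The returned string is what would go in the request's "text" field.
text_field = wrap_speak('Welcome back. <break time="300ms"/> Shall we begin?')
```

Plain text that contains & or < needs escaping (for example with xml.sax.saxutils.escape) before it is placed inside the fragment.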

8) Groq turbo endpoints

When available on your account, the Dialog‑turbo (Groq) option boosts throughput (>200 chars/sec)—useful when you need near‑instantaneous voice responses.
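
To put the throughput figures in perspective, generation time is roughly script length divided by characters per second. The short calculation below uses the 140 and 200 chars/sec figures quoted in this article; it estimates total generation time only and says nothing about time to first audio.

```python
def generation_seconds(num_chars: int, chars_per_second: float) -> float:
    """Estimate how long it takes to generate speech for a script of num_chars."""
    if chars_per_second <= 0:
        raise ValueError("chars_per_second must be positive")
    return num_chars / chars_per_second


# A 1,000-character reply at the low and high ends of the quoted range.
slow = generation_seconds(1000, 140)  # about 7.1 seconds
fast = generation_seconds(1000, 200)  # 5.0 seconds
```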

Best Practices & Tips

For cloning: record clean, varied audio; keep room tone consistent; include Q/A, numbers, and different emotions. (HF clones recommend 20+ minutes.)
For SSML: small changes go far; keep rate adjustments under ±15% for naturalness. (Not all tags are supported across all voices—preview first.)
For agents: test barge‑in, timeouts, and latency on real networks; wire action fallbacks and error messages.
For scale: use Batch TTS for long scripts; for live apps choose WebSocket.

Reputation & Reviews (balanced view)

Editorial recognition: Play.ht is regularly listed among notable TTS tools by mainstream tech outlets.
User reviews (mixed): G2 shows ~4.3/5 across SMB‑heavy usage but recent reviews flag support/billing issues; Trustpilot feedback is more negative and inconsistent. Read recent posts before committing.

Status Watch: Acquisition & Deprecations

Meta acquired PlayAI (Play.ht) in mid‑July 2025; multiple outlets (TechCrunch, Bloomberg/Yahoo, Engadget) confirmed the deal.
Deprecations: Third‑party platform notices indicate PlayHT APIs/voices have been deprecated in places, and Play’s own help center states that older PlayHT 1.0 models reached end of life in June 2025. Verify current operability for your plan.

“All Commands” Cheat‑Sheet (most‑used API operations)

Note: Endpoint shapes are summarized from docs; always check your account’s current docs & versions.

Synthesize (HTTP streaming)
POST https://api.play.ai/api/v1/tts/stream
Body: { "model": "PlayDialog" | "Play3.0-mini", "text": "...", "voice": "<manifest url>", "outputFormat": "mp3|wav" }
Headers: Authorization: Bearer <key>, X-USER-ID: <id>
Synthesize (legacy surface)
POST https://api.play.ht/api/v2/tts/stream
Headers: AUTHORIZATION: <apiKey>, X-USER-ID: <userId> (v2.3 docs)
WebSocket streaming
Route & frames as documented under WebSocket API; send text chunks and receive audio packets in real time.
Batch TTS
POST Create Batch TTS Job → poll Get Batch TTS Job Details until completed.
Multi‑speaker dialog
Use PlayDialog with voice + voice_2 and turn_prefix fields to alternate speakers in a single request.
List voices
See List of Pre‑Built Voices and use IDs like s3://voice-cloning-zero-shot/.../manifest.json.
Voice Cloning (app)
In app: Instant (≥30s) vs High‑Fidelity (20–30 min); then reference your clone’s ID in API calls.
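
The two synthesize entries in the cheat‑sheet differ in both URL and auth header format, which is easy to mix up. A small sketch that keeps them side by side, using only the values listed above (the surface labels "playai" and "legacy" are illustrative names):

```python
ENDPOINTS = {
    "playai": "https://api.play.ai/api/v1/tts/stream",
    "legacy": "https://api.play.ht/api/v2/tts/stream",
}


def auth_headers(surface: str, api_key: str, user_id: str) -> dict:
    """Return the auth headers for the chosen API surface."""
    if surface == "playai":
        # Current surface: bearer token.
        return {"Authorization": f"Bearer {api_key}", "X-USER-ID": user_id}
    if surface == "legacy":
        # Prior surface: raw API key without the Bearer prefix.
        return {"AUTHORIZATION": api_key, "X-USER-ID": user_id}
    raise ValueError(f"unknown surface: {surface!r}")
```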

Frequently Asked Questions

Is SSML supported?
Yes—studio literature highlights SSML for rate, pitch, volume, pauses, pronunciations. Preview to confirm the result per voice.

How many voices/languages are there?
Play markets a large catalog and multilingual support; model docs list 36 languages for Play3.0‑mini, while studio pages advertise “over 42 languages.” Catalog size and availability can vary by model/tier.

Can I use cloned voices commercially?
Check your plan’s license and Terms; ensure you have the right to clone any voice you upload.

Is “Unlimited” really unlimited?
No—fair usage policies apply (e.g., 2.5M chars/month, 30M/year in recent help‑center guidance).

What about refunds?
Policy requires requests within 24 hours and <5,000 characters used.

The Bottom Line

Play.ht/PlayAI remains a feature‑rich voice stack with standout dialog‑aware models and low‑latency options (via Groq) for real‑time voice experiences. That said, given the Meta acquisition and reported deprecations, new buyers should verify current service status and terms, especially when building long‑lived products or high‑volume pipelines.


Related Tools & Articles


FineVoice - All‑in‑One AI Voice Studio for Creators

Resemble AI - AI Voice Generation Platform

Hume AI - Voice Intelligence Powered By Empathic AI

Eleven Labs - AI Voice Generator for Realistic Speech

SpeechGen - AI Text‑to‑Speech / AI Voice Generator

Uberduck AI - AI Voice Generator
