🔒 Private by design — run it in your own VPC

Voice AI you can run inside your own walls

Studio-quality text-to-speech in 30 languages, with voice design and controllable cloning. Use our cloud — or self-host the engine so your audio never leaves your network.

Built on the open-source, Apache-2.0 VoxCPM engine. No lock-in. Commercial-ready.
Designed for regulated teams: 🏥 Healthcare 🏦 Financial services 🏛️ Government ⚖️ Legal 📞 Contact centres
Capabilities

Everything the cloud players do — without sending your data away

One engine, three ways to make a voice, thirty languages.

🎨

Voice Design

Describe a voice in plain words — “warm, middle-aged, calm” — and get a brand-new voice. No recording needed.

🎛️

Controllable Cloning

Clone any voice from a short clip, with consent built in. Steer emotion, pace and style while keeping the timbre.

🌍

30 Languages

From English and Mandarin to Hindi, Arabic and Swahili — plus dialects. Just type; no language tag required.

🔊

48kHz Studio Audio

Crisp, broadcast-grade output with built-in super-resolution. Ready for video, IVR and audiobooks.

Real-Time Streaming

Audio starts before generation finishes. Fast enough for live agents and conversational apps.

🔌

Drop-in API

Swap one base URL and migrate off your current provider. SDKs for Python and JavaScript.

The difference

Your audio is sensitive. Keep it that way.

Most voice AI is cloud-only — every recording and transcript leaves your perimeter. Vocala lets you run the whole engine in your own VPC.

Cloud-only voice AI

  • Your audio + transcripts sent to a third party
  • Off-limits for many healthcare / finance / gov teams
  • Proprietary lock-in, opaque pricing
  • Data residency is whatever they decide

Vocala self-host edition

  • Engine runs inside your network — audio never leaves
  • Meets data-residency & compliance requirements
  • Open Apache-2.0 engine — no lock-in, commercial-ready
  • Production targets blocked by a built-in safety guard
  • SSO, audit logs, per-team usage metering
Talk to us about self-hosting →
Trust & governance

Every clip is disclosed, traceable, and tamper-evident

The controls regulated teams need — built in, not bolted on. This is what sets Vocala apart from a model with an API.

🔏

Signed provenance

Every generation returns an Ed25519-signed manifest that discloses the audio is AI-generated, binds to the audio bytes, and can be verified by anyone with the public key. No trust in us required.
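As a rough illustration of what "binds to the audio bytes" can mean, the sketch below checks that a manifest's recorded digest matches the clip it ships with. The manifest layout and the `audio_sha256` field name are assumptions made for this example, not Vocala's documented format, and the Ed25519 signature check over the manifest itself is a separate step that is not shown here.

```python
import hashlib
import hmac


def audio_digest(audio_bytes: bytes) -> str:
    """Return the hex SHA-256 digest of the raw audio bytes."""
    return hashlib.sha256(audio_bytes).hexdigest()


def manifest_matches_audio(manifest: dict, audio_bytes: bytes) -> bool:
    """Check that the manifest refers to exactly these audio bytes.

    Assumes a hypothetical 'audio_sha256' field holding a hex digest.
    """
    expected = manifest.get("audio_sha256", "")
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(expected, audio_digest(audio_bytes))


# Example with made-up bytes standing in for a generated clip.
clip = b"fake-audio-bytes"
manifest = {"ai_generated": True, "audio_sha256": audio_digest(clip)}
print(manifest_matches_audio(manifest, clip))          # unmodified clip
print(manifest_matches_audio(manifest, clip + b"x"))   # tampered clip
```

A verifier would run a check like this only after confirming the manifest's signature against the published public key, so that both the disclosure and the digest are known to be authentic.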

💧

Inaudible watermark

An AudioSeal watermark is embedded at synthesis and can be detected after the fact, with detection scores of 0.0 on clean audio and 1.0 on watermarked audio.

Consent-first cloning

Cloning requires a recorded consent acknowledgement, stored in a per-voice consent ledger. Responsible by design.

🛡️

Content guardrails

Redact PII (email, phone, card) and block disallowed content by policy — before a word is ever spoken.
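To make the redaction step concrete, here is a minimal sketch of pattern-based PII masking applied to text before it reaches synthesis. The patterns and the `[EMAIL]`, `[CARD]` and `[PHONE]` placeholders are illustrative assumptions, not Vocala's actual policy engine, and a production filter would need far broader coverage.

```python
import re

# Hypothetical patterns for illustration only.
# Order matters: cards are masked before phones so that the looser
# phone pattern cannot swallow a card number first.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact_pii(text: str) -> str:
    """Replace each matched span with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Email jo@example.com or call +61 412 345 678, card 4111 1111 1111 1111."
print(redact_pii(sample))
# Email [EMAIL] or call [PHONE], card [CARD].
```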

🗣️

Pronunciation control

Per-team lexicons teach the engine your brand names, drug names and tickers, so they're said right every time.
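One way to picture a lexicon is as a lookup table applied to the text before synthesis. The sketch below assumes a simple mapping from written form to a respelling; the entries, the respelling style and the `apply_lexicon` helper are all hypothetical and not Vocala's lexicon format.

```python
import re

# Hypothetical per-team lexicon: written form -> how it should be read.
LEXICON = {
    "ASX": "A S X",
    "NVDA": "N V D A",
    "metoprolol": "meh-TOE-pro-lol",
}


def apply_lexicon(text: str, lexicon: dict) -> str:
    """Substitute whole-word lexicon terms with their respellings."""
    if not lexicon:
        return text
    # Longest terms first so a longer entry wins over one it contains.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda m: lexicon[m.group(1)], text)


print(apply_lexicon("Buy ASX before taking metoprolol.", LEXICON))
# Buy A S X before taking meh-TOE-pro-lol.
```

Matching is case-sensitive and limited to whole words here, which keeps a ticker such as "ASX" from rewriting unrelated words that merely contain those letters.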

🏢

Self-host licensing

Run the whole stack in your VPC with an offline license; only usage counts — never audio — leave for billing.

How it works

From text to voice in four steps

Type or paste your text

Any of 30 languages. Long-form or a single line.

Pick, design or clone a voice

Use a preset, describe a new voice, or clone one with consent.

Generate — in cloud or your VPC

The engine runs where you choose. We never see self-hosted audio.

Stream, download or call the API

Play in the Studio, export 48kHz WAV, or hit the REST API.

🎙️

Try it free in the Studio

Create a workspace and generate your first clip in under a minute. No credit card.

Open the Studio →
Pricing

Simple plans. Self-host when you need it.

Prices in AUD per seat / month. Cancel anytime. Enterprise unlocks self-hosting, SSO and unlimited volume.

Free

A$0
Try the Studio, watermarked, non-commercial.
  • 1 seat
  • 10k characters / mo
  • Preset voices
  • Cloud only
Start free

Creator

A$29/seat/mo
For solo creators & small teams.
  • 3 seats
  • 500k characters / mo
  • Voice Design + Cloning
  • Commercial use, 48kHz
Choose Creator
Most popular

Pro

A$149/seat/mo
For product teams shipping voice.
  • 10 seats
  • 2M characters / mo
  • Everything in Creator
  • REST API + keys
  • Batch generation
Choose Pro

Enterprise

Custom
For regulated & high-volume teams.
  • Unlimited seats & volume
  • Self-host in your VPC
  • SSO / SAML + audit logs
  • SLA & priority support
  • Custom branded voices
Book a demo

Usage-based API billing also available (~A$0.18 / 1k characters).
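For a rough sense of how the usage rate compares with the seat plans, the arithmetic can be sketched as below. It takes the approximate A$0.18 per 1k characters at face value and ignores taxes, rounding rules and any volume discounts, none of which are stated on this page.

```python
from decimal import Decimal

# Approximate usage rate quoted above, in AUD per 1,000 characters.
USAGE_RATE_PER_1K = Decimal("0.18")


def usage_cost(characters: int) -> Decimal:
    """Estimated usage-based cost in AUD, rounded to cents."""
    cost = Decimal(characters) / 1000 * USAGE_RATE_PER_1K
    return cost.quantize(Decimal("0.01"))


def break_even_characters(monthly_price: str) -> int:
    """Characters per month at which usage billing equals a flat price."""
    return int(Decimal(monthly_price) / USAGE_RATE_PER_1K * 1000)


print(usage_cost(500_000))           # Creator-sized volume: 90.00
print(usage_cost(2_000_000))         # Pro-sized volume: 360.00
print(break_even_characters("149"))  # one Pro seat: 827777
```

Under these assumptions, a single A$149 Pro seat costs less than usage billing once monthly volume passes roughly 828k characters.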

FAQ

Questions, answered

What does “self-hosted” actually mean?

The voice engine runs on your own infrastructure — your cloud VPC or on-prem. Your text and audio never touch our servers. You manage it from the same Vocala control plane; only metadata (usage counts, plan) syncs.

How is this different from ElevenLabs or OpenAI TTS?

Quality is comparable, but those services are cloud-only and proprietary. Vocala is built on the open Apache-2.0 VoxCPM engine, so you can run it yourself, avoid lock-in, and keep regulated data in-house, usually at a lower cost at high volume.

Is voice cloning safe and legal?

Cloning requires an explicit, recorded consent acknowledgement that we store in a per-voice consent ledger. We strongly discourage impersonation and provide audit trails for every cloned voice.

Which languages are supported?

30, including English, Chinese (and several dialects), Spanish, Hindi, Arabic, Japanese, French, German, Portuguese, Russian and more — no language tag needed.

Can I try before paying?

Yes. The Free plan needs no credit card — create a workspace and generate right away. Upgrade in-app when you’re ready.

Give your product a voice — without giving away your data.

Start free in the cloud today, move to self-host when compliance calls for it.