Text to Speech - Oration AI Documentation

The Text to Speech (TTS) node converts text into spoken audio and plays it to the caller in real time.

Behavior

Parameter	Type	Default	Range / Options	Description
`text`	string	`""`	—	The text to speak. Supports `{{variable}}` interpolation.
`provider`	enum	`"ElevenlabsTTSConfig"`	Provider-specific	TTS provider to use
`model`	enum	`"eleven_flash_v2_5"`	Model-specific	TTS model identifier
`voice`	string	`"default"`	—	Voice identifier
`voiceId`	string	`""`	—	Specific voice ID from the provider
`language`	string	`"en"`	—	Language code (e.g., `en`, `en-US`, `es`)
`speed`	number	`1.0`	0.5–2.0	Speech rate multiplier
`pitch`	number	`1.0`	0.5–2.0	Speech pitch multiplier
`emphasis`	enum	`"moderate"`	`none`, `moderate`, `strong`, `reduced`	Emphasis level for speech delivery
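
To make the defaults and ranges in the table concrete, here is a minimal sketch that mirrors them in a Python dataclass with a range check. The class name `TTSNodeConfig`, the `validate` method, and the error messages are hypothetical and are not part of the Oration AI API. Whether the platform rejects or clamps out-of-range values is not stated here, so the sketch only reports them.

```python
from dataclasses import dataclass
from typing import List

# Allowed emphasis values, taken from the parameter table.
EMPHASIS_LEVELS = ("none", "moderate", "strong", "reduced")


@dataclass
class TTSNodeConfig:
    """Hypothetical container mirroring the documented parameters and defaults."""
    text: str = ""
    provider: str = "ElevenlabsTTSConfig"
    model: str = "eleven_flash_v2_5"
    voice: str = "default"
    voiceId: str = ""
    language: str = "en"
    speed: float = 1.0
    pitch: float = 1.0
    emphasis: str = "moderate"

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means all values are in range."""
        problems = []
        if not 0.5 <= self.speed <= 2.0:
            problems.append("speed must be between 0.5 and 2.0")
        if not 0.5 <= self.pitch <= 2.0:
            problems.append("pitch must be between 0.5 and 2.0")
        if self.emphasis not in EMPHASIS_LEVELS:
            problems.append("emphasis must be one of: " + ", ".join(EMPHASIS_LEVELS))
        return problems


config = TTSNodeConfig(text="Please hold while I look that up.", speed=0.8)
print(config.validate())  # prints []
```

Checking values before saving a flow is one way to catch a typo such as `speed=8` early, rather than discovering it through the Error handle at call time.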

Handle	Description
End	Speech playback completed — flow continues to the next node
Error	TTS generation or playback failed

Dynamic greeting

Set `text` to `"Hello {{customer_name}}, thank you for calling."` after looking up the caller in a Database node, so that `customer_name` is populated before the TTS node runs.
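
The `{{variable}}` substitution in the greeting can be illustrated with a short sketch. This is not the platform's implementation: the `interpolate` function is hypothetical, and leaving unknown placeholders untouched is an assumption made for the example, since the behavior for missing variables is not documented here.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def interpolate(template: str, variables: dict) -> str:
    """Replace each {{name}} with its value; keep unknown placeholders as-is (assumption)."""
    def replace(match):
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        return match.group(0)
    return PLACEHOLDER.sub(replace, template)


greeting = interpolate(
    "Hello {{customer_name}}, thank you for calling.",
    {"customer_name": "Ada"},  # e.g. the result of the Database node lookup
)
print(greeting)  # prints "Hello Ada, thank you for calling."
```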

Slow-paced instructions

Set `speed` to `0.8` for complex instructions that callers need time to process.

Multi-language support

Use a Switch node on the caller's language preference, then route each branch to its own TTS node configured with the matching `language` and `voiceId`.
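
The routing idea can be sketched in Python as a lookup from language preference to a per-language TTS configuration. Everything named here is an assumption for illustration: the `TTS_NODES` mapping, the `select_tts_node` function, the placeholder `voiceId` values, and the fallback to English when no branch matches. In the product this logic lives in the Switch node, not in code.

```python
# Hypothetical per-language TTS node settings; the voiceId values are placeholders.
TTS_NODES = {
    "en": {"language": "en", "voiceId": "voice-id-en-placeholder"},
    "es": {"language": "es", "voiceId": "voice-id-es-placeholder"},
}


def select_tts_node(preference, default="en"):
    """Pick the TTS settings for a language preference such as "es" or "es-MX"."""
    key = (preference or "").strip().lower()
    if key in TTS_NODES:
        return TTS_NODES[key]
    # Fall back from a regional code ("es-mx") to its base language ("es").
    base = key.split("-")[0]
    return TTS_NODES.get(base, TTS_NODES[default])


print(select_tts_node("es-MX")["language"])  # prints "es"
```

A default branch is worth having in the Switch as well, so that callers with an unrecognized or missing preference still hear a prompt.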