Behavior
- Listens for the caller’s speech using the configured ASR model
- Returns a transcript with a confidence score
- If confidence is below the threshold, emits SPEECH.NO_MATCH
- Supports grammar constraints and recognition hints for improved accuracy
- Falls back to DTMF input if enabled
Configuration
Recognition
| Parameter | Type | Default | Range / Options | Description |
|---|---|---|---|---|
| language | string | "en-US" | — | Language code for recognition |
| model | enum | "default" | default, enhanced, medical, phone_call | ASR model selection |
| confidenceThreshold | number | 0.7 | 0–1 | Minimum confidence score to accept (below this triggers NO_MATCH) |
| grammar | string | "" | — | Grammar constraint (SRGS format or plain text word list) |
| hints | string[] | [] | — | Recognition hints: words or phrases the ASR model should prioritize |
| partialResults | boolean | false | — | Return partial (interim) results during recognition |
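
The grammar parameter accepts SRGS as well as a plain word list. As an illustration, here is a minimal SRGS XML grammar for a yes/no answer, held in a Python string and parsed to confirm it is well-formed. The rule name `answer` is made up for this example, and whether the node expects a complete SRGS document or only a fragment is an assumption that should be checked against the platform.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SRGS 1.0 specification.
SRGS_NS = "http://www.w3.org/2001/06/grammar"

# Hypothetical grammar: the rule id "answer" is illustrative only.
srgs_yes_no = f"""<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="{SRGS_NS}" version="1.0" xml:lang="en-US" root="answer" mode="voice">
  <rule id="answer">
    <one-of>
      <item>yes</item>
      <item>no</item>
    </one-of>
  </rule>
</grammar>"""

# Sanity check: the grammar parses as XML and lists the expected words.
root = ET.fromstring(srgs_yes_no.encode("utf-8"))
words = [item.text for item in root.iter(f"{{{SRGS_NS}}}item")]
```

The plain-text form of the same constraint would be a word list such as "yes no".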
Timing & retries
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| timeout | number | 10 | 1–60 | Maximum listening duration in seconds |
| silenceTimeout | number | 3 | 1–15 | Stop listening after this many seconds of silence |
| retries | number | 2 | 1–5 | Number of retry attempts on no match |
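
One way to picture the retries parameter is as a bounded loop around the listen step. The sketch below is not the node's actual implementation. It assumes that `retries` counts additional attempts after the first one and that only a no-match is retried, neither of which the table states. The `listen` callable is a hypothetical stand-in for the ASR call.

```python
from typing import Callable, Optional, Tuple

# Hypothetical result shape: (transcript, confidence), or None when no speech was heard.
ListenResult = Optional[Tuple[str, float]]


def listen_with_retries(
    listen: Callable[[], ListenResult],
    confidence_threshold: float = 0.7,
    retries: int = 2,
) -> Tuple[str, ListenResult]:
    """Return (outcome, last_result); outcome is 'success', 'no_match', or 'timeout'."""
    outcome, result = "timeout", None
    # Assumption: one initial attempt plus `retries` further attempts.
    for _ in range(1 + retries):
        result = listen()
        if result is None:
            # Assumption: retries apply to no-match only, so a timeout ends the loop.
            outcome = "timeout"
            break
        if result[1] >= confidence_threshold:
            return "success", result
        outcome = "no_match"
    return outcome, result
```

With the defaults, a caller who is never understood is prompted three times in total under these assumptions.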
Output & fallback
| Parameter | Type | Default | Description |
|---|---|---|---|
| variableName | string | "speech_input" | Variable to store the recognized transcript |
| fallbackToDTMF | boolean | true | Accept DTMF input if speech recognition fails |
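
The defaults and ranges from the three parameter tables can be collected into one structure and checked before a flow is deployed. The class name `SpeechInputConfig` and its `validate` method are hypothetical helpers written for this sketch. Only the field names, defaults, and ranges come from the tables.

```python
from dataclasses import dataclass, field
from typing import List

MODELS = {"default", "enhanced", "medical", "phone_call"}


@dataclass
class SpeechInputConfig:
    # Defaults copied from the parameter tables.
    language: str = "en-US"
    model: str = "default"
    confidenceThreshold: float = 0.7
    grammar: str = ""
    hints: List[str] = field(default_factory=list)
    partialResults: bool = False
    timeout: float = 10
    silenceTimeout: float = 3
    retries: int = 2
    variableName: str = "speech_input"
    fallbackToDTMF: bool = True

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means every value is in range."""
        problems = []
        if self.model not in MODELS:
            problems.append(f"model must be one of {sorted(MODELS)}")
        if not 0 <= self.confidenceThreshold <= 1:
            problems.append("confidenceThreshold must be between 0 and 1")
        if not 1 <= self.timeout <= 60:
            problems.append("timeout must be between 1 and 60 seconds")
        if not 1 <= self.silenceTimeout <= 15:
            problems.append("silenceTimeout must be between 1 and 15 seconds")
        if not 1 <= self.retries <= 5:
            problems.append("retries must be between 1 and 5")
        return problems
```

For example, `SpeechInputConfig(timeout=90).validate()` reports the out-of-range timeout, while the default configuration yields no problems.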
ASR models
| Model | Best for |
|---|---|
| default | General-purpose recognition |
| enhanced | Higher accuracy with larger vocabulary |
| medical | Medical terminology and clinical conversations |
| phone_call | Optimized for telephony audio quality |
Output handles
| Handle | Description |
|---|---|
| Success | Speech recognized with confidence above threshold |
| No Match | Speech detected but confidence below threshold, or no matching grammar |
| Timeout | No speech detected within timeout |
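
The mapping from a recognition attempt to an output handle can be sketched as a small function. The function name and its arguments are hypothetical. A confidence exactly equal to the threshold is treated as Success here, following the description of confidenceThreshold as the minimum score to accept. That boundary behavior is an assumption.

```python
from typing import Optional


def select_handle(
    confidence: Optional[float],
    threshold: float = 0.7,
    grammar_matched: bool = True,
) -> str:
    """Map one recognition attempt to 'Success', 'No Match', or 'Timeout'."""
    if confidence is None:
        # Assumption: None stands for "no speech detected within timeout".
        return "Timeout"
    if confidence < threshold or not grammar_matched:
        return "No Match"
    return "Success"
```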
Output variables
| Variable | Type | Description |
|---|---|---|
| transcript | string | The recognized speech text |
| confidence | number | Recognition confidence score (0–1) |
Use cases
Name capture
Set hints to common first names and confidenceThreshold: 0.6 for flexible name recognition. Store the result in {{customer_name}}.
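
As a rough illustration, the name-capture settings could be written as a plain dictionary. The key names come from the parameter tables, but the dictionary form, the sample names, and the use of variableName to back the {{customer_name}} reference are assumptions.

```python
# Hypothetical representation of the node settings; the serialization is assumed.
name_capture = {
    "hints": ["Maria", "James", "Aisha", "Chen"],  # sample names, not an official list
    "confidenceThreshold": 0.6,  # lower than the 0.7 default to tolerate unusual names
    "variableName": "customer_name",  # assumed to back the {{customer_name}} reference
}
```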
Yes/No confirmation
Set grammar to "yes no" and model: "phone_call" for reliable binary responses over phone lines.
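
A flow that branches on the answer usually needs the transcript as a boolean. The sketch below pairs the settings with a hypothetical `to_bool` helper. The dictionary form of the settings and the helper itself are assumptions, not part of the node.

```python
from typing import Optional

# Settings from the use case; the dictionary form is assumed.
yes_no = {"grammar": "yes no", "model": "phone_call"}


def to_bool(transcript: str) -> Optional[bool]:
    """Map a transcript constrained to the 'yes no' grammar to True or False."""
    word = transcript.strip().lower()
    if word == "yes":
        return True
    if word == "no":
        return False
    # Anything else is unexpected under this grammar; treat it like a no-match.
    return None
```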
Medical intake
Use model: "medical" with hints for medication names and symptoms. Set silenceTimeout: 5 to give patients time to think.
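
The medical intake settings, written out in the same hedged way. The hint terms are sample values chosen for illustration and the dictionary form is an assumption.

```python
# Hypothetical representation of the node settings; the serialization is assumed.
medical_intake = {
    "model": "medical",
    "hints": ["ibuprofen", "metformin", "shortness of breath"],  # sample terms only
    "silenceTimeout": 5,  # seconds; within the documented 1-15 range
}
```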