Connectors
The Start node exposes four required connectors, each representing one part of the conversational stack:
- Hear — Select the transcriber model
- Think — Select the chat model
- Speak — Select the voice model
- Agent / Path — The first node in the workflow’s logic
Agent nodes inherit the Start node’s Hear/Think/Speak settings unless explicitly overridden.
Main Settings
Opening the Start node shows the configuration panel where you set the defaults for the entire workflow.
External Inputs
External Inputs allow you to pass runtime parameters into the workflow. These are commonly used for:
- CRM or backend identifiers (customer_id, phone_number)
- Preloaded context (appointment_time, plan_type)
- Dynamic variables needed by later nodes
Each External Input is defined by three fields:
- Parameter Name — A variable name (e.g., customer_name, case_id).
- Input Type — Allowed types: String, Number, Integer, Boolean.
- Required — When enabled, the workflow cannot run unless the parameter is provided.
For outbound calls:
- If triggered inside Breez → the value must be supplied when Testing or Calling.
- If triggered via the API → the value must be included in the request payload.
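The sketch below illustrates how required External Inputs behave at run time. It is a minimal Python sketch, not the actual Breez API: the parameter names, types, and the validate_inputs helper are hypothetical.

```python
# Hypothetical declaration of External Inputs; names and types are illustrative.
EXTERNAL_INPUTS = {
    "customer_name": {"type": str, "required": True},
    "case_id":       {"type": str, "required": True},
    "plan_type":     {"type": str, "required": False},
}

def validate_inputs(payload: dict) -> dict:
    """Reject the run if a required parameter is missing or has the wrong type."""
    for name, spec in EXTERNAL_INPUTS.items():
        if name not in payload:
            if spec["required"]:
                raise ValueError(f"Missing required external input: {name}")
            continue
        if not isinstance(payload[name], spec["type"]):
            raise TypeError(f"External input {name!r} must be {spec['type'].__name__}")
    return payload

# An API-triggered call would carry these values in its request payload:
validate_inputs({"customer_name": "Ada", "case_id": "C-1042"})
```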
Master Instruction
The Master Instruction is the top-level system directive that governs the agent's behavior across the entire workflow.
Important: This instruction is automatically appended to the end of every Agent node's instruction, regardless of the Agent's individual settings. This ensures:
- Consistent personality
- Shared rules across the call
- A unified global prompt
The default instruction bundled with a new workflow is only a template — you can modify it freely.
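As a rough illustration of the appending behavior described above (a simplified sketch; the actual prompt assembly is internal to the platform):

```python
def effective_instruction(agent_instruction: str, master_instruction: str) -> str:
    # The Master Instruction is appended after each Agent node's own
    # instruction, so the global rules apply regardless of per-node settings.
    return f"{agent_instruction.strip()}\n\n{master_instruction.strip()}"
```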
Fast Turn Delay (Semantic/Provider)
Fast Turn Delay determines how quickly the agent responds when smart, semantic turn detection is available. In this mode, the agent can tell whether the user is mid-sentence or actually finished, so it can react after very short pauses.
Lower values (e.g., 0.05s):
- Responds almost immediately
- Fast, snappy interactions
- Risk: may interrupt if the user pauses briefly mid-sentence
Higher values:
- Waits a bit longer before replying
- Feels more natural and less interruptive
- Trade-off: responses start slightly slower
Slow Turn Delay (VAD Only)
Slow Turn Delay is used when only basic Voice Activity Detection (VAD) is available. VAD can detect sound vs. silence but cannot interpret meaning, so the agent needs a longer pause to avoid interrupting the user.
Lower values (e.g., 0.1s):
- Responds quickly once silence is detected
- Suitable for fast exchanges
- Risk: may cut in when users pause to think or breathe
Higher values:
- Gives users more room to pause mid-sentence
- Feels smoother on calls or with slower speech
- Trade-off: responses feel slower
Key Difference
- Fast Turn Delay → Used when the agent can understand meaning and determine if the user is finished.
- Slow Turn Delay → Used when the agent only knows that the user stopped making sound.
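A minimal sketch of this distinction, assuming the platform exposes a flag for whether semantic turn detection is available; the function name, flag, and default values are illustrative only:

```python
def response_delay(semantic_detection_available: bool,
                   caller_seems_finished: bool,
                   fast_turn_delay: float = 0.05,
                   slow_turn_delay: float = 0.7) -> float:
    """Choose how long to wait after the caller's audio goes quiet."""
    if semantic_detection_available and caller_seems_finished:
        # The model judges that the caller has finished their thought,
        # so only the short Fast Turn Delay is needed.
        return fast_turn_delay
    # Either no semantic detection (VAD only) or the caller appears to be
    # mid-sentence: fall back to the longer Slow Turn Delay.
    return slow_turn_delay
```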
Advanced Timing (Optional)
Enabling Show Advanced Timing reveals fine-grained control of turn-taking behavior, interruptions, and silence handling. These settings allow you to tune how natural, fast, or cautious the agent feels in conversation. Below is a reference for each setting.
Maximum Semantic Delay
Sets the maximum time the agent will wait before responding when the semantic turn detector is unsure whether the caller has finished speaking. Most interactions are handled confidently by the turn detector. However, in ambiguous moments - such as trailing speech, mumbling, or background noise - the detector may hesitate. Max Semantic Delay acts as a safety cap for these uncertain cases.
- Higher values: the agent waits longer, reducing the chance of interrupting the caller but increasing response latency.
- Lower values: the agent responds faster, but with a higher risk of speaking over the caller.
Minimum Interruption Duration
Defines how long (in seconds) the caller must speak continuously before the agent accepts it as a valid interruption and stops talking - also known as a barge-in. This prevents the agent from pausing its response due to tiny noises or accidental sounds.
- Higher values: the caller must speak for longer before interrupting the agent.
- Lower values: even very short utterances can interrupt the agent.
Minimum Interruption Words
Specifies how many words the caller must speak before the agent treats it as a valid interruption (barge-in). This works alongside Minimum Interruption Duration to ensure the agent only stops talking when the caller clearly intends to speak — not due to background noise or accidental sounds.
- Higher values: the caller must say more words before interrupting.
- Lower values: even very short phrases (e.g., “wait”) can interrupt the agent.
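A sketch of how the two interruption thresholds might combine. The documentation does not state the exact rule, so this assumes both thresholds must be met, and the default values are illustrative:

```python
def is_valid_interruption(speech_duration_s: float,
                          word_count: int,
                          min_interruption_duration_s: float = 0.5,
                          min_interruption_words: int = 2) -> bool:
    """Accept caller audio as a barge-in only when it clears both thresholds."""
    # Short noises (a cough, a mic bump) fail the duration check;
    # single filler words ("uh", "mm-hmm") fail the word-count check.
    return (speech_duration_s >= min_interruption_duration_s
            and word_count >= min_interruption_words)
```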
VAD Activation Threshold
Controls how sensitive the local voice-activity detector (VAD) is when deciding whether the caller is actually speaking. VAD listens to the audio signal and tries to distinguish real speech from background noise. This setting adjusts how strong the speech signal must be before VAD treats it as intentional speech.
- Lower values (more sensitive): The system treats quieter or softer sounds as speech. Useful in quiet environments, but may trigger false positives in noisy ones.
- Higher values (less sensitive): The caller must speak more clearly or with stronger volume for VAD to recognize it. Helps reduce accidental triggers in noisy conditions.
VAD Minimum Speech Duration
Defines how long the caller must speak continuously before VAD treats the sound as real speech rather than background noise. This helps the system avoid reacting to coughs, mic pops, keyboard clicks, or other brief sounds.
- Higher values: the caller must speak for longer before being recognized as actively talking.
- Lower values: shorter bursts of speech are accepted, making the agent more responsive but more prone to accidental triggers.
Prefix Padding
Specifies how much audio to retain from just before VAD detects speech, which helps capture the very beginning of what the caller says. Without prefix padding, the system might clip off the first syllable or two when the caller interrupts (barge-in). Adding padding ensures a more complete transcription.
- Higher values: more audio is included before the detected speech; safer for capturing soft or quick interruptions.
- Lower values: less pre-roll audio; tighter and faster barge-in handling, but may miss the start of a sentence.
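The last three settings work together. The toy sketch below shows one plausible way a frame-based detector could use them; the class, thresholds, and frame size are assumptions for illustration, not the platform's implementation:

```python
from collections import deque

class ToyVAD:
    """Illustrative frame-based VAD; values and units are assumptions."""

    def __init__(self, threshold=0.5, min_speech_s=0.2,
                 prefix_padding_s=0.3, frame_s=0.02):
        self.threshold = threshold                       # VAD Activation Threshold
        self.min_frames = int(min_speech_s / frame_s)    # VAD Minimum Speech Duration
        self.pre_roll = deque(maxlen=int(prefix_padding_s / frame_s))  # Prefix Padding
        self.voiced = []

    def feed(self, energy: float, frame: bytes):
        """Feed one audio frame; return a speech segment once speech is confirmed."""
        if energy < self.threshold:
            # Quiet frame: remember it so the start of the next utterance
            # (including a soft first syllable) isn't clipped.
            self.pre_roll.append(frame)
            self.voiced.clear()
            return None
        self.voiced.append(frame)
        if len(self.voiced) >= self.min_frames:
            # Enough continuous speech: emit it, prefixed with the padded audio.
            segment = list(self.pre_roll) + self.voiced
            self.pre_roll.clear()
            self.voiced = []
            return segment
        return None
```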
Allow Interruptions
Controls whether the caller is allowed to interrupt the agent while it is speaking (a behavior known as barge-in).
- Enabled: the caller can speak at any time, and the agent will stop talking as soon as an interruption is detected. This creates a more natural, conversational experience.
- Disabled: the agent completes its full message before listening again, preventing mid-sentence interruptions.
Resume After Brief Sounds
Determines whether the agent should continue speaking after short, non-meaningful sounds from the caller — such as coughs, “mm-hmm,” throat clears, or other quick noises. When enabled, the agent will pause momentarily but then resume its message automatically unless the caller continues speaking long enough to trigger a full interruption.
- Enabled: brief noises do not interrupt the agent. It resumes speaking after a short delay.
- Disabled: any detected sound is treated as speech, making interruptions more sensitive.
Resume Timeout
Determines how long the agent waits before continuing its response after a non-interrupting sound (such as a cough, “mm-hmm,” or short vocalization). This setting only applies when:
- The user makes a brief noise that does not meet the interruption thresholds (e.g., Minimum Interruption Duration or Minimum Interruption Words), and
- Resume After Brief Sounds is enabled.
- Shorter values: the agent continues speaking sooner after minor noises.
- Longer values: the agent waits longer before resuming, which can feel more natural for callers who make short filler sounds.
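Putting these interruption settings together, a small decision sketch (reusing is_valid_interruption from the barge-in sketch above; the return values and defaults are illustrative, not the platform's API):

```python
def on_caller_sound(sound_duration_s: float, word_count: int,
                    resume_after_brief_sounds: bool = True,
                    resume_timeout_s: float = 1.0) -> str:
    """Decide what the agent does when it hears the caller while speaking."""
    if is_valid_interruption(sound_duration_s, word_count):
        return "stop_speaking"          # genuine barge-in: yield the turn
    if resume_after_brief_sounds:
        # Brief cough or "mm-hmm": pause, wait out the Resume Timeout, then
        # continue the message unless the caller keeps talking in that window.
        return f"resume_after_{resume_timeout_s}s"
    return "stop_speaking"              # every detected sound counts as speech
```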
End Call on Silence
Enables the workflow to automatically end the call if the caller remains silent for too long. This prevents the agent from staying on the line indefinitely when a caller has hung up, walked away, or abandoned the session. When enabled:
- The system begins monitoring for extended silence.
- If no meaningful speech is detected within the configured timeout window, the call ends gracefully.
This is especially useful for:
- Outbound calls where recipients often pick up but say nothing.
- Workflows with long pauses that need guardrails to avoid unnecessary usage charges.
- Any scenario where abandoned calls should resolve without agent intervention.
Silence Timeout (s)
Defines how long the system should wait (in seconds) before ending the call due to sustained silence. This value is only used when End Call on Silence is enabled.
- Lower values (10–30s): end abandoned calls quickly and reduce wasted minutes.
- Higher values (60–120s): allow more time for callers who pause frequently or require a longer thinking period.
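A minimal sketch of the silence guardrail; the function name and the 30-second default are illustrative:

```python
import time

def should_end_call(last_speech_time: float,
                    end_call_on_silence: bool = True,
                    silence_timeout_s: float = 30.0) -> bool:
    """Return True once the caller has been silent past the timeout window."""
    if not end_call_on_silence:
        return False
    # last_speech_time is a time.monotonic() timestamp of the last moment
    # meaningful speech was detected on the call.
    return time.monotonic() - last_speech_time >= silence_timeout_s
```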
Best Practices
- Keep Master Instructions concise and universal — use Agent nodes for case-specific instructions.
- Use External Inputs only for data you actually need; avoid cluttering the interface.
- Adjust timing carefully — overly aggressive interruption settings may feel unnatural.
- Always verify test runs after changing timing or transcriber/voice models.
Summary
The Start node establishes the baseline configuration for the workflow:
- Default models (Hear/Think/Speak)
- Master instruction
- Runtime inputs
- Timing & interruption behavior
- The starting point of the node graph

