Each model node represents a different provider or model family, but they all serve the same purpose: powering the agent’s thought process during a session. These nodes let you customize:
- Model selection — choose the specific LLM used for generation.
- Advanced settings — such as reasoning depth or provider-specific optimization.
- Performance tradeoffs — balance accuracy, latency, and cost for your use case.

Model nodes can be attached in two places:
- Start Node — sets the default model for the entire call.
- Agent Nodes — optionally override the Start Node’s model for specific tool paths.
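To make the override behavior concrete, here is a minimal sketch of how the effective model could be resolved for a given path. The node shapes and the "model" field name are hypothetical, not the platform's actual schema:

```python
# Hypothetical sketch of Start Node vs. Agent Node model resolution.
# Field names and node shapes are illustrative, not the platform's API.

def resolve_model(start_node: dict, agent_node: dict) -> str:
    """Return the agent node's override if set, else the call-wide default."""
    return agent_node.get("model") or start_node["model"]

start = {"model": "gpt-4.1"}      # default for the entire call
billing_agent = {"model": "o3"}   # overrides the default for this tool path
support_agent = {}                # no override: inherits the Start Node model

print(resolve_model(start, billing_agent))  # o3
print(resolve_model(start, support_agent))  # gpt-4.1
```

The point is the precedence order: an Agent Node's model, when present, wins over the Start Node default for that path only.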
OpenAI Brain
The OpenAI Brain node selects which OpenAI large-language model powers the agent’s reasoning and responses. This model is invoked whenever the agent “thinks” or generates text during a call.
Main Settings
Model
Choose the OpenAI model used for this agent. Models vary in speed, cost, and reasoning depth. The o-series models (o1, o3, o4) are optimized for advanced reasoning tasks, while the GPT-4.x and GPT-5.x families provide strong general performance for real-time voice agents. The dropdown includes:
- GPT-5.1 and GPT-5.x variants
- GPT-4.1 and GPT-4.x variants
- o-series models (o1/o3/o4, Mini/Nano variants)
Select the model that best fits your latency and reasoning requirements.
Reasoning Effort (GPT-5.1)
Controls how much internal reasoning GPT-5.1 performs. This setting only applies when using a GPT-5.1 model. Options:
- None (Recommended for Voice) — allows OpenAI to route reasoning automatically and is generally preferred for real-time conversational latency.
- Low
- Medium
- High
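As a rough illustration of the rule above, the setting can be sketched as a small validator that only applies an effort level when a GPT-5.1 model is selected. The function name and schema are hypothetical, not the node's real implementation:

```python
# Hypothetical sketch: Reasoning Effort only applies to GPT-5.1 models.
# Values mirror the documented options; the schema is illustrative.

VALID_EFFORTS = {"none", "low", "medium", "high"}

def reasoning_effort_for(model: str, requested: str = "none"):
    """Return the effort to apply, or None when the model ignores the setting."""
    if not model.startswith("gpt-5.1"):
        return None  # the setting has no effect outside GPT-5.1
    if requested not in VALID_EFFORTS:
        raise ValueError(f"unknown reasoning effort: {requested!r}")
    return requested

print(reasoning_effort_for("gpt-5.1", "none"))  # none (recommended for voice)
print(reasoning_effort_for("gpt-4.1", "high"))  # None: ignored for this model
```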
OpenAI Realtime Brain
The OpenAI Realtime Brain is a multimodal realtime model that handles hearing, thinking, and speaking within a single node. When selected on the Start node or an Agent node, it automatically replaces the Hear, Think, and Speak connectors (unless you enable a separate TTS voice from the settings).
Main Settings
Use Separate Voice (cheaper)
Controls whether the realtime model returns audio. Options:
- No (Use OpenAI built-in voice) — Use OpenAI’s built-in realtime voice. Simpler, but slightly more expensive than using a separate Voice model.
- Yes (Use separate TTS node, cheaper) — Run the realtime model in text-only mode and use a separate Voice model for audio. This is usually cheaper and gives you more control over voices. When enabled:
  - A Speak connector becomes available so you can attach a separate TTS node.
  - Voice and Speed fields are hidden.
Voice
Voice used when the model returns built-in audio. Only shown when Use Separate Voice = No.
Model
Select the OpenAI Realtime model to use.
Speed
Controls playback speed of generated audio. Range: 0.25 – 1.5.
Only shown when Use Separate Voice = No.
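Putting the Realtime Brain settings together, here is a minimal sketch of how the Use Separate Voice toggle reshapes the node. The field names and the `realtime_node_config` helper are hypothetical, not the platform's actual schema; only the toggle behavior and the speed range come from the settings above:

```python
# Hypothetical sketch of how "Use Separate Voice" changes the Realtime Brain
# node: text-only mode exposes a Speak connector, audio mode exposes
# Voice and Speed (clamped to the documented 0.25 - 1.5 range).

def realtime_node_config(use_separate_voice: bool, voice: str = "alloy",
                         speed: float = 1.0) -> dict:
    if use_separate_voice:
        # Text-only mode: Speak connector appears; Voice and Speed are hidden.
        return {"audio_out": False, "connectors": ["Speak"]}
    if not 0.25 <= speed <= 1.5:
        raise ValueError("speed must be within 0.25 - 1.5")
    return {"audio_out": True, "voice": voice, "speed": speed}

print(realtime_node_config(True))
# {'audio_out': False, 'connectors': ['Speak']}
print(realtime_node_config(False, speed=1.2))
# {'audio_out': True, 'voice': 'alloy', 'speed': 1.2}
```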

