Each model node represents a different provider or model family, but they all serve the same purpose: powering the agent’s thought process during a session. These nodes let you customize:
- Model selection — choose the specific LLM used for generation.
- Advanced settings — such as reasoning depth or provider-specific optimization.
- Performance tradeoffs — balance accuracy, latency, and cost for your use case.

Model nodes can be attached in two places:
- Start Node — sets the default model for the entire call.
- Agent Nodes — optionally override the Start Node’s model for specific tool paths.
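To make the override behavior concrete, here is a minimal sketch of how the effective model could be resolved for a given path. The node shapes and the "model" field name are hypothetical, not the platform's actual schema:

```python
# Hypothetical sketch of Start Node vs. Agent Node model resolution.
# Field names and node shapes are illustrative, not the platform's API.

def resolve_model(start_node: dict, agent_node: dict) -> str:
    """Return the agent node's override if set, else the call-wide default."""
    return agent_node.get("model") or start_node["model"]

start = {"model": "gpt-4.1"}      # default for the entire call
billing_agent = {"model": "o3"}   # overrides the default for this tool path
support_agent = {}                # no override: inherits the Start Node model

print(resolve_model(start, billing_agent))  # o3
print(resolve_model(start, support_agent))  # gpt-4.1
```

The point is the precedence order: an Agent Node's model, when present, wins over the Start Node default for that path only.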
OpenAI Brain
The OpenAI Brain node selects which OpenAI large-language model powers the agent’s reasoning and responses. This model is invoked whenever the agent “thinks” or generates text during a call.
Main Settings
Model
Choose the OpenAI model used for this agent. Models vary in speed, cost, and reasoning depth. The o-series models (o1, o3, o4) are optimized for advanced reasoning tasks, while the GPT-4.x and GPT-5.x families provide strong general performance for real-time voice agents. The dropdown includes:
- GPT-5.1 and GPT-5.x variants
- GPT-4.1 and GPT-4.x variants
- o-series models (o1/o3/o4, Mini/Nano variants)
Select the model that best fits your latency and reasoning requirements.
Reasoning Effort (GPT-5.1)
Controls how much internal reasoning GPT-5.1 performs. This setting only applies when using a GPT-5.1 model. Options:
- None (Recommended for Voice) — allows OpenAI to route reasoning automatically and is generally preferred for real-time conversational latency.
- Low
- Medium
- High
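As a rough illustration of the rule above, the setting can be sketched as a small validator that only applies an effort level when a GPT-5.1 model is selected. The function name and schema are hypothetical, not the node's real implementation:

```python
# Hypothetical sketch: Reasoning Effort only applies to GPT-5.1 models.
# Values mirror the documented options; the schema is illustrative.

VALID_EFFORTS = {"none", "low", "medium", "high"}

def reasoning_effort_for(model: str, requested: str = "none"):
    """Return the effort to apply, or None when the model ignores the setting."""
    if not model.startswith("gpt-5.1"):
        return None  # the setting has no effect outside GPT-5.1
    if requested not in VALID_EFFORTS:
        raise ValueError(f"unknown reasoning effort: {requested!r}")
    return requested

print(reasoning_effort_for("gpt-5.1", "none"))  # none (recommended for voice)
print(reasoning_effort_for("gpt-4.1", "high"))  # None: ignored for this model
```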
OpenAI Realtime Brain
The OpenAI Realtime Brain is a multimodal realtime model that handles hearing, thinking, and speaking within a single node. When selected on the Start node or an Agent node, it automatically replaces the Hear, Think, and Speak connectors (unless you enable a separate TTS voice from the settings).
Main Settings
Use Separate Voice (cheaper)
Controls whether the realtime model returns audio. Options:
- No (Use OpenAI built-in voice) — Use OpenAI’s built-in realtime voice. Simpler, but slightly more expensive than using a separate Voice model.
- Yes (Use separate TTS node, cheaper) — Run the realtime model in text-only mode and use a separate Voice model for audio. This is usually cheaper and gives you more control over voices. When enabled:
  - A Speak connector becomes available so you can attach a separate TTS node.
  - Voice and Speed fields are hidden.
Voice
Voice used when the model returns built-in audio. Only shown when Use Separate Voice = No.
Model
Select the OpenAI Realtime model to use.
Speed
Controls playback speed of generated audio. Range: 0.25 – 1.5.
Only shown when Use Separate Voice = No.
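Putting the Realtime Brain settings together, here is a minimal sketch of how the Use Separate Voice toggle reshapes the node. The field names and the `realtime_node_config` helper are hypothetical, not the platform's actual schema; only the toggle behavior and the speed range come from the settings above:

```python
# Hypothetical sketch of how "Use Separate Voice" changes the Realtime Brain
# node: text-only mode exposes a Speak connector, audio mode exposes
# Voice and Speed (clamped to the documented 0.25 - 1.5 range).

def realtime_node_config(use_separate_voice: bool, voice: str = "alloy",
                         speed: float = 1.0) -> dict:
    if use_separate_voice:
        # Text-only mode: Speak connector appears; Voice and Speed are hidden.
        return {"audio_out": False, "connectors": ["Speak"]}
    if not 0.25 <= speed <= 1.5:
        raise ValueError("speed must be within 0.25 - 1.5")
    return {"audio_out": True, "voice": voice, "speed": speed}

print(realtime_node_config(True))
# {'audio_out': False, 'connectors': ['Speak']}
print(realtime_node_config(False, speed=1.2))
# {'audio_out': True, 'voice': 'alloy', 'speed': 1.2}
```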

