A step-by-step guide to writing Speak node prompts, variable extraction instructions, and outcomes that qualify inbound leads accurately — without breaking when real callers deviate from the script.
A rigid qualification script loses callers. A vague one wastes rep time. The difference between the two often comes down to how prompts, variables, and outcomes are written inside the agent builder.
This guide walks through how to write Speak node prompts, variable extraction instructions, and outcomes in Thoughtly that qualify inbound leads accurately, without breaking when real people talk like real people. Every example uses consumer-industry lead conversion scenarios: insurance quote requests, mortgage inquiries, home services estimates, and education enrollment calls.
Overfitting happens when your agent only works for callers who say exactly what you expected. In lead qualification, the symptoms are specific:
The result: callers abandon, variables come back empty, and your CRM fills with incomplete records or false negatives. The agent technically runs, just not for anyone who doesn't follow the script in your head.
When you use a Prompt speak node, you're giving the agent a mini playbook — not a script. The most common mistake is writing prompts that describe what to say word-for-word. Instead, write prompts that describe the goal, the constraints, and the tone.
Here is a prompt pattern that overfits:
Ask the caller: Do you currently have homeowners insurance?
If yes, ask who their current provider is.
If no, ask if they've ever had a policy before.

And here is a prompt pattern that works:
Determine whether the caller currently has homeowners coverage
and, if so, who their provider is. Keep it conversational —
they may volunteer this without a direct question. If they're
unsure or don't have coverage, note that and move on.

The second version handles real variations: the caller who says "Yeah, I'm with State Farm but I'm shopping around," the one who says "I let my policy lapse last year," and the one who just says "No." All three produce usable data. The first version stalls on the second and third.
Authoring checklist for Prompt speak nodes:
Reference Variables using the bolt icon for personalization — first name, lead source, or metadata passed from your CRM or automation trigger. The Advanced prompt in Settings is for persona and guardrails. Keep routing logic in nodes and outcomes, not in the global prompt.
Variables are where most overfitting happens. The extraction instructions field is the heart of each variable, and the quality of those instructions determines whether the agent captures real data or comes back empty.
Thoughtly's docs recommend this extraction template:
Goal: Extract the <thing> the caller states.
If multiple candidates: choose the most recent, high-confidence value.
If absent or unclear: return an empty value (no placeholder text).
Normalization: <how to clean or format, e.g., lowercase, numbers only>.
Do not invent values.

| Field | Value |
|---|---|
| Variable name | coverage_type |
| Format | Text |
| Source | Current speak node |
| Extraction instructions | Extract the type of insurance coverage the caller is asking about (auto, home, renters, umbrella, life, etc.). Normalize to lowercase. If the caller mentions multiple types, capture the primary one they're calling about. If unclear, leave empty. Do not invent values. |
This handles "I need car insurance," "looking for auto coverage," "my wife wants to add me to her policy — it's for the house," and "I'm not sure, I think renters?" — all without breaking.
Compare that with an overfit version of the same instructions:

Extract whether the caller said 'auto insurance' or 'home insurance'.
Return one of those exact strings.

That version misses renters, umbrella, life, and every synonym the caller might naturally use.
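To make the contract in the extraction template concrete, here is a minimal Python sketch of the same behavior written as a plain function. It is not how Thoughtly extracts values (the agent does that from your instructions); the synonym map and the function name `normalize_coverage_type` are hypothetical and exist only to illustrate "normalize, prefer the most recent mention, return empty when unclear, never invent."

```python
import re

# Hypothetical synonym map for illustration; extend it with phrases your callers use.
COVERAGE_SYNONYMS = {
    "auto": ["auto", "car", "vehicle"],
    "home": ["home", "house", "homeowners"],
    "renters": ["renters", "renter's", "apartment"],
    "umbrella": ["umbrella"],
    "life": ["life"],
}


def normalize_coverage_type(utterance: str) -> str:
    """Return a lowercase coverage type, or "" when nothing recognizable is said."""
    text = utterance.lower()
    best_type, best_pos = "", -1
    for coverage_type, phrases in COVERAGE_SYNONYMS.items():
        for phrase in phrases:
            # Whole-word matches only, so "automatic" does not count as "auto".
            for match in re.finditer(rf"\b{re.escape(phrase)}\b", text):
                # Keep the latest mention: "most recent value wins".
                if match.start() > best_pos:
                    best_type, best_pos = coverage_type, match.start()
    return best_type  # stays "" if absent: no placeholder, no invented value


for said in [
    "I need car insurance",
    "My wife wants to add me to her policy, it's for the house",
    "I'm not sure, I think renters?",
    "What time do you close?",
]:
    print(repr(normalize_coverage_type(said)))
```

Keyword matching like this is far cruder than a language model and will misfire on phrases such as "calling from home." The point is the shape of the output: one normalized value per call, or nothing at all.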
Thoughtly offers two outcome types per node. Picking the wrong one is a reliable source of overfitting — and you cannot mix both types in the same node.
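Rule-based outcomes branch on conditions over captured variables, for example: coverage_type == "auto", zip_code is not empty, callback_ok == true.

For prompt-based outcomes, labels like "Qualified" and "Highly qualified" overlap, so the AI cannot reliably distinguish them. Use concrete, mutually exclusive labels instead: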
Rules that are too specific — budget > 500000 AND coverage_type == "home" AND zip_code matches "9..." — create dead ends for callers who qualify on two of three criteria. Layer rules across multiple nodes instead, so each node checks one or two signals. And always add an Else/default outcome to catch unexpected responses.
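The layering idea can be sketched in a few lines of Python. This is an illustration of the routing logic only, not Thoughtly's rule engine: the `route` helper, the node names, and the variable dictionaries are all hypothetical. Each node checks one signal, and every node has an explicit default so no caller hits a dead end.

```python
from typing import Callable

# A rule pairs a condition on the captured variables with a target node name.
Rule = tuple[Callable[[dict], bool], str]


def route(variables: dict, rules: list[Rule], default: str) -> str:
    """Return the target of the first matching rule, or the Else/default target."""
    for condition, target in rules:
        if condition(variables):
            return target
    return default


# Node A looks only at coverage type.
coverage_rules: list[Rule] = [
    (lambda v: v.get("coverage_type") in {"auto", "home"}, "location_check"),
]

# Node B looks only at location.
location_rules: list[Rule] = [
    (lambda v: bool(v.get("zip_code")), "next_step"),
]

# A caller who stated a coverage type but has not given a zip code yet.
caller = {"coverage_type": "home", "zip_code": ""}

print(route(caller, coverage_rules, default="clarify_coverage"))
print(route(caller, location_rules, default="re_ask_location"))
```

With a single compound rule, this caller would fail the whole check because one value is missing. Split across two nodes, they pass the first check and are simply asked again for the second.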
The biggest structural mistake in qualification agents is cramming everything into one or two nodes. When a single Prompt speak node tries to collect five data points, the extraction instructions compete, the prompt becomes unwieldy, and callers feel interrogated rather than helped.
A better pattern is one signal per node, with graceful progression:
| Node | Type | Goal | Variables | Outcome type |
|---|---|---|---|---|
| 1 — Start | Start (Message) | Greeting, identify yourself, compliance language | None | N/A |
| 2 — Coverage need | Speak → Prompt | Find out what coverage they need and if they're switching or new | coverage_type, new_or_switch | Prompt-based: Stated need / Has questions / Not sure yet |
| 3 — Location + timeline | Speak → Prompt | Confirm state or metro area and general urgency | location, urgency | Rule-based: location is not empty → Node 4, else → re-ask |
| 4 — Next step | Speak → Prompt | Offer a quote callback or immediate transfer | None | Prompt-based: Ready now → Transfer / Prefers callback → booking / Not interested → End |
Each node collects one or two signals, validates them with a variable, and routes with a focused outcome. The caller experiences a conversation, not an intake form.
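If you keep a written spec of your flow alongside the builder, the "one or two signals per node" guideline is easy to check mechanically. The sketch below mirrors the table above as plain Python data; the `Node` class and the `overloaded_nodes` helper are hypothetical conveniences for a design review, not a Thoughtly export format.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    goal: str
    variables: list[str] = field(default_factory=list)
    outcome_type: str = ""  # "prompt", "rule", or "" when the node has no outcomes


FLOW = [
    Node("Start", "Greeting, identify yourself, compliance language"),
    Node("Coverage need", "Find out what coverage they need and if they're switching or new",
         ["coverage_type", "new_or_switch"], "prompt"),
    Node("Location + timeline", "Confirm state or metro area and general urgency",
         ["location", "urgency"], "rule"),
    Node("Next step", "Offer a quote callback or immediate transfer", [], "prompt"),
]


def overloaded_nodes(flow: list[Node], max_signals: int = 2) -> list[str]:
    """Names of nodes that try to collect more signals than the guideline allows."""
    return [node.name for node in flow if len(node.variables) > max_signals]


print(overloaded_nodes(FLOW))
```

An empty list means every node stays within the limit. A node that tries to collect five variables at once would show up by name.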
If callers ask product questions mid-qualification — "Do you cover flood damage?" or "What's your deductible range?" — don't build separate branching nodes for every possible question. Attach a Genius knowledge base with Q&A pairs for your most common product questions. The agent answers inline using retrieved knowledge and returns to the qualification flow without additional nodes.
Overfitting survives because builders test with ideal-path callers. Thoughtly's testing tools help you stress-test before going live.
The fastest way to check variable extraction and outcome routing. Run through these caller types:
Test with a real voice call to check pronunciation, pacing, and interruption behavior. Listen for moments where the agent pauses too long, talks over the caller, or re-asks something the caller already answered.
Use sample metadata to simulate different lead sources and CRM contexts. Pass in test data like a lead source, coverage type, or priority level to verify how the agent personalizes its approach.
After deploying a qualification agent, track these metrics to catch overfitting and underfitting early:
| Metric | What it tells you | Red flag |
|---|---|---|
| Variable fill rate | % of calls producing a non-empty value for each qualification variable | Any core variable below 60% fill rate needs looser extraction instructions |
| Outcome distribution | How callers distribute across outcome branches | 80%+ hitting the Else/default path means outcomes are too narrow |
| False positive rate | % of "qualified" leads that don't convert with a human rep | Agent is qualifying everyone — underfitting |
| False negative rate | How many callers routed to "not qualified" actually had intent | Check call recordings — the agent is rejecting real leads |
| Avg. call duration | Length of the qualification conversation | Under 1 min: not collecting enough. Over 5 min: interrogating |
| Caller drop-off by node | Where in the flow callers hang up | Spike at a specific node means that step is too aggressive or slow |
Use Thoughtly's call history and analytics to review outcomes, variable values, and transcripts across calls. Compare the agent's disposition against your CRM conversion data to close the feedback loop — the agent's "qualified" tag should correlate with downstream bookings and closed deals.
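Two of the metrics in the table, variable fill rate and outcome distribution, are simple ratios. The following Python sketch shows the arithmetic under the assumption that you have call records available as dictionaries with `variables` and `outcome` keys. That record shape and the sample data are hypothetical, not Thoughtly's export schema; the thresholds come from the table above.

```python
from collections import Counter


def fill_rate(calls: list[dict], variable: str) -> float:
    """Share of calls where the variable came back non-empty."""
    if not calls:
        return 0.0
    filled = sum(1 for call in calls if call.get("variables", {}).get(variable))
    return filled / len(calls)


def outcome_distribution(calls: list[dict]) -> dict[str, float]:
    """Share of calls that ended on each outcome branch."""
    counts = Counter(call.get("outcome", "else") for call in calls)
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}


# Made-up sample records for illustration only.
calls = [
    {"variables": {"coverage_type": "auto"}, "outcome": "qualified"},
    {"variables": {"coverage_type": ""}, "outcome": "else"},
    {"variables": {"coverage_type": "home"}, "outcome": "else"},
    {"variables": {}, "outcome": "else"},
]

rate = fill_rate(calls, "coverage_type")
dist = outcome_distribution(calls)

if rate < 0.60:
    print(f"coverage_type fill rate {rate:.0%}: loosen the extraction instructions")
if dist.get("else", 0.0) >= 0.80:
    print("Else/default share is 80% or more: outcomes are too narrow")
```

In this sample the fill rate is 50%, which trips the first flag, while the Else share is 75% and stays just under the second threshold.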
For inbound consumer leads, 3–5 core signals is the practical range: type of need, location or service area, urgency or timeline, eligibility or fit, and preferred next step. More than that and you're running an intake form. If deeper qualification is needed, split it across an initial call and a follow-up.
Use prompt-based outcomes for classifying caller intent — "interested," "has objections," "not a fit" — and rule-based outcomes for branching on structured data like zip code, coverage type, or consent flags. You can't mix both in the same node, so pick the type that matches the primary decision at that step.
A vague prompt lacks a clear goal: "Talk to the caller about insurance." A well-written flexible prompt has a specific goal but allows conversational variation: "Determine what type of insurance the caller needs. They may say it directly or describe a situation — capture the type either way." The goal is precise; the path to it stays open.
Overfitting signals: high variable-empty rates, callers getting stuck or routed to the wrong node, frequent Else/default hits, and the agent re-asking for information the caller already provided. Underfitting signals: every caller routes to "qualified," variables accept any value without validation, and human reps report that transferred leads are not actually ready.
No. Genius is a knowledge retrieval layer — it answers product and policy questions using a Q&A database. Qualification logic (what signals to collect, how to route based on answers, when to transfer) belongs in Speak nodes, Variables, and Outcomes. Use Genius to handle FAQ-type interruptions during qualification so you don't need separate branching nodes for every product question.