How to Write Lead-Qualification Prompts Without Overfitting

Last updated July 2026

A rigid qualification script loses callers. A vague one wastes rep time. Avoiding both often comes down to how prompts, variables, and outcomes are written inside the agent builder.

This guide walks through how to write Speak node prompts, variable extraction instructions, and outcomes in Thoughtly that qualify inbound leads accurately — without breaking when real people talk like real people. Every example uses consumer-industry lead conversion scenarios: insurance quote requests, mortgage inquiries, home services estimates, and education enrollment calls.

What You'll Need

A Thoughtly account with access to the Agent Builder
At least one inbound voice agent deployed or ready to deploy
Familiarity with Speak nodes, Variables, and Outcomes (the Thoughtly docs cover each in detail)
A clear definition of what "qualified" means for your team — the signals that predict conversion, not just a checklist

What Overfitting Looks Like in a Voice Agent

Overfitting happens when your agent only works for callers who say exactly what you expected. In lead qualification, the symptoms are specific:

Extraction instructions that require a specific phrase like "I need auto insurance" but miss "I'm shopping for car coverage" or "my wife said I should call about the house"
Boolean variables forced onto ambiguous answers. A true/false variable for "qualified" breaks when the real answer is "maybe — depends on the quote"
Overlapping prompt-based outcome labels like "Interested" and "Somewhat interested" that the AI cannot reliably distinguish
A single massive Speak node that tries to collect name, email, location, intent, urgency, and coverage type in one turn

The result: callers abandon, variables come back empty, and your CRM fills with incomplete records or false negatives. The agent technically runs — just not for anyone who doesn't follow the script in your head.

Write Goal-First Prompt Instructions

When you use a Prompt speak node, you're giving the agent a mini playbook — not a script. The most common mistake is writing prompts that describe what to say word-for-word. Instead, write prompts that describe the goal, the constraints, and the tone.

Here is a prompt pattern that overfits:

Ask the caller: Do you currently have homeowners insurance?
If yes, ask who their current provider is.
If no, ask if they've ever had a policy before.

Pattern that overfits

And here is a prompt pattern that works:

Determine whether the caller currently has homeowners coverage
and, if so, who their provider is. Keep it conversational —
they may volunteer this without a direct question. If they're
unsure or don't have coverage, note that and move on.

Pattern that allows natural conversation

The second version handles real variations: the caller who says "Yeah, I'm with State Farm but I'm shopping around," the one who says "I let my policy lapse last year," and the one who just says "No." All three produce usable data. The first version, followed literally, re-asks what the first two callers already volunteered: the provider in one case, the lapsed policy in the other.

Authoring checklist for Prompt speak nodes:

State the goal in one sentence
List constraints — what not to say, length limits, any compliance language
Note must-say points only if regulation requires them (consent disclosures, licensing disclaimers)
End with a tone cue: "Keep it concise," "Stay warm but direct," or "Match the caller's pace"
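
If you draft prompts outside the builder, for example in a shared repo, the checklist can be kept as a small structure so every prompt is written in the same order. The following is a minimal sketch, not a Thoughtly API: `PromptSpec` and `render_prompt` are hypothetical names, and the constraint strings are illustrative. The rendered text would still be pasted into the Speak node by hand.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the four checklist items; not a Thoughtly type."""
    goal: str
    constraints: list[str] = field(default_factory=list)
    must_say: list[str] = field(default_factory=list)  # regulatory points only
    tone: str = "Keep it concise."


def render_prompt(spec: PromptSpec) -> str:
    """Assemble prompt text in checklist order: goal, constraints, must-say, tone."""
    lines = [spec.goal.strip()]
    if spec.constraints:
        lines.append("Constraints: " + "; ".join(spec.constraints) + ".")
    if spec.must_say:
        lines.append("Must say: " + "; ".join(spec.must_say) + ".")
    lines.append(spec.tone.strip())
    return "\n".join(lines)


# Illustrative values; the constraints are examples, not recommendations.
homeowners = PromptSpec(
    goal=(
        "Determine whether the caller currently has homeowners coverage "
        "and, if so, who their provider is."
    ),
    constraints=["do not quote prices", "ask at most one question per turn"],
    tone="Keep it conversational.",
)
prompt_text = render_prompt(homeowners)
print(prompt_text)
```

Leaving `must_say` empty by default mirrors the checklist: must-say points appear only when a regulation requires them.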

Reference Variables using the bolt icon for personalization — first name, lead source, or metadata passed from your CRM or automation trigger. The Advanced prompt in Settings is for persona and guardrails. Keep routing logic in nodes and outcomes, not in the global prompt.

Design Variables That Extract Without Breaking

Variables are where most overfitting happens. The extraction instructions field is the heart of each variable, and the quality of those instructions determines whether the agent captures real data or comes back empty.

Thoughtly's docs recommend this extraction template:

Goal: Extract the <thing> the caller states.
If multiple candidates: choose the most recent, high-confidence value.
If absent or unclear: return an empty value (no placeholder text).
Normalization: <how to clean or format, e.g., lowercase, numbers only>.
Do not invent values.

Variable extraction template
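
When many variables share this template, filling it programmatically keeps the fallback and "Do not invent values" lines from being dropped by accident. This is a sketch under the assumption that you author instructions as text and paste them into the variable's extraction field; `extraction_instructions` is a hypothetical helper and the ZIP code example is illustrative.

```python
def extraction_instructions(thing: str, normalization: str) -> str:
    """Fill the five-line extraction template for one variable."""
    return "\n".join([
        f"Goal: Extract the {thing} the caller states.",
        "If multiple candidates: choose the most recent, high-confidence value.",
        "If absent or unclear: return an empty value (no placeholder text).",
        f"Normalization: {normalization}.",
        "Do not invent values.",
    ])


# Illustrative variable; the normalization wording is an assumption.
zip_instructions = extraction_instructions("ZIP code", "digits only, five characters")
print(zip_instructions)
```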

Real example: insurance lead qualification

Field	Value
Variable name	coverage_type
Format	Text
Source	Current speak node
Extraction instructions	Extract the type of insurance coverage the caller is asking about (auto, home, renters, umbrella, life, etc.). Normalize to lowercase. If the caller mentions multiple types, capture the primary one they're calling about. If unclear, leave empty. Do not invent values.

This handles "I need car insurance," "looking for auto coverage," "my wife wants to add me to her policy — it's for the house," and "I'm not sure, I think renters?" — all without breaking.

Contrast with an overfit version

Extract whether the caller said 'auto insurance' or 'home insurance'.
Return one of those exact strings.

Overfit extraction — misses most real-world phrasing

That version misses renters, umbrella, life, and every synonym the caller might naturally use.

Variable design rules

Use Text format for open-ended qualification signals. Reserve Number for truly numeric values (budget, credit score, square footage) and Boolean only for clear yes/no decisions (consent, callback permission)
Set Source to Current speak node when the question is direct and you want only the latest answer. Use Conversation history when the caller may have mentioned the value earlier
Always include the empty-value fallback. "If absent or unclear: return an empty value" prevents the AI from inventing data
Include 2–3 examples of real caller phrasing in the extraction instructions — not to restrict extraction, but to show the AI the range of responses it should handle
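
Some of these rules can be checked mechanically before a variable goes live. The sketch below assumes you keep a local record of each variable's settings; `VariableSpec` and `lint_variable` are hypothetical names, and the string checks are crude heuristics that flag likely problems rather than prove correctness.

```python
from dataclasses import dataclass

FORMATS = {"Text", "Number", "Boolean"}
SOURCES = {"Current speak node", "Conversation history"}


@dataclass
class VariableSpec:
    """Hypothetical local record of one variable's settings."""
    name: str
    fmt: str
    source: str
    instructions: str


def lint_variable(var: VariableSpec) -> list[str]:
    """Return readable warnings; an empty list means no rule was tripped."""
    text = var.instructions.lower()
    warnings = []
    if var.fmt not in FORMATS:
        warnings.append(f"{var.name}: unknown format {var.fmt!r}")
    if var.source not in SOURCES:
        warnings.append(f"{var.name}: unknown source {var.source!r}")
    if "empty" not in text:
        warnings.append(f"{var.name}: no empty-value fallback")
    if "do not invent" not in text:
        warnings.append(f"{var.name}: missing 'Do not invent values'")
    if "exact string" in text:
        warnings.append(f"{var.name}: demands exact strings, likely overfit")
    return warnings


good = VariableSpec(
    "coverage_type", "Text", "Current speak node",
    "Extract the type of insurance coverage the caller is asking about. "
    "Normalize to lowercase. If unclear, leave empty. Do not invent values.",
)
overfit = VariableSpec(
    "coverage_type", "Text", "Current speak node",
    "Extract whether the caller said 'auto insurance' or 'home insurance'. "
    "Return one of those exact strings.",
)
print(lint_variable(good))
print(lint_variable(overfit))
```

The first call reports nothing, while the overfit version trips three checks: no fallback, no "do not invent" line, and an exact-string demand.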

Choose the Right Outcome Type for Each Decision

Thoughtly offers two outcome types per node. Picking the wrong one is a reliable source of overfitting — and you cannot mix both types in the same node.

When to use prompt-based outcomes

Classifying open-ended intent: "Is this caller interested, objecting, or asking a question?"
Handling varied phrasing where exact keywords are unpredictable
Routing on 3–5 distinct paths based on conversational tone or stated preference

When to use rule-based outcomes

Branching on captured variable values: coverage_type == "auto", zip_code is not empty, callback_ok == true
Deterministic logic for compliance decisions — consent checks, eligibility gates, licensing requirements
Checking the result of a mid-call Action (CRM lookup, scheduling API, verification)

The overfitting trap with prompt-based outcomes

Labels like "Qualified" and "Highly qualified" overlap — the AI cannot reliably distinguish them. Use concrete, mutually exclusive labels instead:

Wants to get a quote now
Interested but needs to talk to spouse
Not interested
Has questions before deciding
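
A rough way to catch overlapping labels before testing is to flag any pair where one label's words are entirely contained in the other's, as with "Qualified" and "Highly qualified." This is a heuristic sketch, not how the agent classifies outcomes: `overlapping_labels` is a hypothetical helper, and labels can still overlap in meaning without sharing words.

```python
from itertools import combinations


def overlapping_labels(labels: list[str]) -> list[tuple[str, str]]:
    """Return label pairs where one label's words are all contained in the other's."""
    pairs = []
    for a, b in combinations(labels, 2):
        words_a, words_b = set(a.lower().split()), set(b.lower().split())
        if words_a <= words_b or words_b <= words_a:
            pairs.append((a, b))
    return pairs


vague = ["Qualified", "Highly qualified", "Not interested"]
concrete = [
    "Wants to get a quote now",
    "Interested but needs to talk to spouse",
    "Not interested",
    "Has questions before deciding",
]
print(overlapping_labels(vague))     # flags the Qualified / Highly qualified pair
print(overlapping_labels(concrete))  # no pairs flagged
```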

The overfitting trap with rule-based outcomes

Rules that are too specific — budget > 500000 AND coverage_type == "home" AND zip_code matches "9..." — create dead ends for callers who qualify on two of three criteria. Layer rules across multiple nodes instead, so each node checks one or two signals. And always add an Else/default outcome to catch unexpected responses.
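
The layering idea can be modeled outside the builder to reason about where callers end up. In this sketch each node tests a single signal and always names an Else target, so a caller who matches only some criteria still lands somewhere. The node names, the `route` function, and the data shape are all hypothetical; rule-based outcomes in Thoughtly are configured in the builder, not in Python.

```python
from typing import Any, Callable

Lead = dict[str, Any]
Rule = tuple[Callable[[Lead], bool], str]  # (condition, next node)


def wants_home(lead: Lead) -> bool:
    return lead.get("coverage_type") == "home"


def high_budget(lead: Lead) -> bool:
    budget = lead.get("budget")
    return isinstance(budget, (int, float)) and budget > 500000


# Hypothetical node names. Each node tests one signal and names an Else target.
NODES: dict[str, tuple[list[Rule], str]] = {
    "check_coverage": ([(wants_home, "check_budget")], "general_quote"),
    "check_budget": ([(high_budget, "senior_rep")], "standard_rep"),
}


def route(lead: Lead, start: str = "check_coverage") -> list[str]:
    """Follow the first matching rule at each node, falling back to Else."""
    path = [start]
    while path[-1] in NODES:
        rules, default = NODES[path[-1]]
        path.append(next((dest for test, dest in rules if test(lead)), default))
    return path


print(route({"coverage_type": "home", "budget": 350000}))
print(route({"coverage_type": "home"}))  # budget never captured
print(route({}))                         # nothing captured
```

A lead with a missing budget still reaches `standard_rep`, and a lead with nothing captured reaches `general_quote`; with a single three-way AND rule, both would have hit a dead end.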

Layer Qualification Across Multiple Nodes

The biggest structural mistake in qualification agents is cramming everything into one or two nodes. When a single Prompt speak node tries to collect five data points, the extraction instructions compete, the prompt becomes unwieldy, and callers feel interrogated rather than helped.

A better pattern is one signal per node, with graceful progression:

Example flow: inbound insurance qualification

Node	Type	Goal	Variables	Outcome type
1 — Start	Start (Message)	Greeting, identify yourself, compliance language	None	N/A
2 — Coverage need	Speak → Prompt	Find out what coverage they need and if they're switching or new	coverage_type, new_or_switch	Prompt-based: Stated need / Has questions / Not sure yet
3 — Location + timeline	Speak → Prompt	Confirm state or metro area and general urgency	location, urgency	Rule-based: location is not empty → Node 4, else → re-ask
4 — Next step	Speak → Prompt	Offer a quote callback or immediate transfer	None	Prompt-based: Ready now → Transfer / Prefers callback → booking / Not interested → End

Each node collects one or two signals, validates them with a variable, and routes with a focused outcome. The caller experiences a conversation, not an intake form.
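
Writing the flow down as data makes the structural rules easy to check before building. The sketch below encodes the four nodes from the table and tests two things: no node exceeds the three-variable ceiling this guide suggests, and every rule-based node has an Else outcome. `FlowNode` and `check_flow` are hypothetical names, not part of Thoughtly.

```python
from dataclasses import dataclass, field

MAX_VARIABLES_PER_NODE = 3  # practical ceiling suggested in this guide


@dataclass
class FlowNode:
    """Hypothetical description of one node in a qualification flow."""
    name: str
    variables: list[str] = field(default_factory=list)
    outcome_type: str = "none"  # "prompt", "rule", or "none"
    outcomes: list[str] = field(default_factory=list)


FLOW = [
    FlowNode("Start"),
    FlowNode("Coverage need", ["coverage_type", "new_or_switch"], "prompt",
             ["Stated need", "Has questions", "Not sure yet"]),
    FlowNode("Location + timeline", ["location", "urgency"], "rule",
             ["location is not empty", "else"]),
    FlowNode("Next step", [], "prompt",
             ["Ready now", "Prefers callback", "Not interested"]),
]


def check_flow(flow: list[FlowNode]) -> list[str]:
    """Return structural problems; an empty list means both checks passed."""
    problems = []
    for node in flow:
        if len(node.variables) > MAX_VARIABLES_PER_NODE:
            problems.append(f"{node.name}: {len(node.variables)} variables in one node")
        has_else = any(o.lower() == "else" for o in node.outcomes)
        if node.outcome_type == "rule" and not has_else:
            problems.append(f"{node.name}: rule-based outcomes without Else")
    return problems


print(check_flow(FLOW))
```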

When to use Genius instead of prompt logic

If callers ask product questions mid-qualification — "Do you cover flood damage?" or "What's your deductible range?" — don't build separate branching nodes for every possible question. Attach a Genius knowledge base with Q&A pairs for your most common product questions. The agent answers inline using retrieved knowledge and returns to the qualification flow without additional nodes.

Test with Diverse Caller Scenarios

Overfitting survives because builders test with ideal-path callers. Thoughtly's testing tools help you stress-test before going live.

Test Agent (text chat)

The fastest way to check variable extraction and outcome routing. Run through these caller types:

The cooperative caller who answers everything directly
The rambling caller who buries the answer in a story about their neighbor's roof
The defensive caller who gives one-word answers and pushes back on questions
The confused caller who doesn't know what type of coverage they need
The out-of-order caller who volunteers their location before you ask and skips the coverage question
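
Keeping these caller types as a written scenario list makes test runs repeatable. The sketch below assumes you copy the extracted variable values out of each Test Agent run by hand; nothing here calls Thoughtly. `Scenario`, `mismatches`, the utterances, and the observed values are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    caller_type: str
    utterance: str
    expected: dict[str, str]  # variable name -> value; "" means "should be empty"


SCENARIOS = [
    Scenario("cooperative", "I need car insurance.", {"coverage_type": "auto"}),
    Scenario("confused", "Honestly, I don't know what I need.", {"coverage_type": ""}),
    Scenario("out-of-order", "I'm in Ohio, by the way.", {"location": "ohio"}),
]


def mismatches(expected: dict[str, str], actual: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map each wrong variable to (expected, actual); missing keys count as empty."""
    return {
        name: (want, actual.get(name, ""))
        for name, want in expected.items()
        if actual.get(name, "") != want
    }


# Values observed in a test run, entered by hand for this illustration.
observed = [
    {"coverage_type": "auto"},
    {"coverage_type": "home"},  # a value was invented for the confused caller
    {},                         # the volunteered location was missed
]

report = {s.caller_type: mismatches(s.expected, got) for s, got in zip(SCENARIOS, observed)}
failures = {caller: diff for caller, diff in report.items() if diff}
print(failures)
```

Treating an expected empty value as a first-class assertion matters: an invented value for the confused caller is a failure, even though the variable came back filled.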

Call Me (real call)

Test with a real voice call to check pronunciation, pacing, and interruption behavior. Listen for moments where the agent pauses too long, talks over the caller, or re-asks something the caller already answered.

What to verify in each test run

Did variables extract correctly, or are they empty/wrong?
Did outcomes route to the right node?
Did the agent adapt when the caller deviated from the expected flow?
Did the Else/default outcome catch edge cases?
Does a standard qualification finish within 3–4 minutes?

Use sample metadata to simulate different lead sources and CRM contexts. Pass in test data like a lead source, coverage type, or priority level to verify how the agent personalizes its approach.

Common Mistakes

Writing prompts like scripts. Prompt speak nodes are instructions, not dialogue. Tell the agent what to accomplish. Use Message mode with Verbatim only for compliance-required language.
Using Conversation history when Current speak node is safer. If you ask "What's your email?" and extract from the full conversation, the agent might pull an email mentioned earlier in a different context.
Overlapping outcome labels. "Positive" and "Very positive" are meaningless to the AI. Use action-oriented labels: "Wants a quote," "Needs more info," "Not a fit."
Skipping the Else outcome. Every rule-based outcome set needs a default fallback. Without one, callers who don't match any rule get stuck in silence.
Too many variables per node. Three variables per node is a practical ceiling. Beyond that, extraction accuracy drops because the instructions compete for attention in the same response window.
Not testing negative paths. If you only test the happy path, you'll never find the overfitting. Test with callers who say no, who are confused, who interrupt, and who change their mind mid-call.
Putting qualification logic in the Advanced prompt. The Advanced prompt in Settings is for persona, tone, and guardrails. Keep routing and qualification behavior in Speak nodes and Outcomes.

Measuring Success

After deploying a qualification agent, track these metrics to catch overfitting and underfitting early:

Metric	What it tells you	Red flag
Variable fill rate	% of calls producing a non-empty value for each qualification variable	Any core variable below 60% fill rate needs looser extraction instructions
Outcome distribution	How callers distribute across outcome branches	80%+ hitting the Else/default path means outcomes are too narrow
False positive rate	% of "qualified" leads that don't convert with a human rep	A high rate means the agent is qualifying everyone (underfitting)
False negative rate	% of callers routed to "not qualified" who actually had intent	A high rate means the agent is rejecting real leads; check call recordings to confirm
Avg. call duration	Length of the qualification conversation	Under 1 min: not collecting enough. Over 5 min: interrogating
Caller drop-off by node	Where in the flow callers hang up	Spike at a specific node means that step is too aggressive or slow

Use Thoughtly's call history and analytics to review outcomes, variable values, and transcripts across calls. Compare the agent's disposition against your CRM conversion data to close the feedback loop — the agent's "qualified" tag should correlate with downstream bookings and closed deals.
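
The first two red flags in the table can be computed from exported call records. This is a minimal sketch: the record shape (an `outcome` string plus a `variables` dict per call) and the `"else"` label are assumptions about how you store your export, not a documented Thoughtly format.

```python
from collections import Counter

FILL_RATE_FLOOR = 0.60     # red-flag threshold for variable fill rate
ELSE_SHARE_CEILING = 0.80  # red-flag threshold for the Else/default path


def fill_rates(calls: list[dict], variables: list[str]) -> dict[str, float]:
    """Share of calls with a non-empty value for each variable."""
    if not calls:
        return {}
    return {
        name: sum(1 for c in calls if c["variables"].get(name)) / len(calls)
        for name in variables
    }


def outcome_shares(calls: list[dict]) -> dict[str, float]:
    """Share of calls ending in each outcome."""
    counts = Counter(c["outcome"] for c in calls)
    return {outcome: n / len(calls) for outcome, n in counts.items()}


def red_flags(calls: list[dict], variables: list[str], else_label: str = "else") -> list[str]:
    flags = [
        f"{name}: fill rate {rate:.0%} is below {FILL_RATE_FLOOR:.0%}"
        for name, rate in fill_rates(calls, variables).items()
        if rate < FILL_RATE_FLOOR
    ]
    else_share = outcome_shares(calls).get(else_label, 0.0)
    if else_share >= ELSE_SHARE_CEILING:
        flags.append(f"{else_share:.0%} of calls hit the Else/default path")
    return flags


# Illustrative records, not real call data.
calls = [
    {"outcome": "wants_quote", "variables": {"coverage_type": "auto", "location": "ohio"}},
    {"outcome": "wants_quote", "variables": {"coverage_type": "home", "location": ""}},
    {"outcome": "else", "variables": {"coverage_type": "", "location": ""}},
    {"outcome": "not_interested", "variables": {"coverage_type": "renters", "location": "texas"}},
    {"outcome": "callback", "variables": {"coverage_type": "auto"}},
]
print(red_flags(calls, ["coverage_type", "location"]))
```

False positive and false negative rates are left out on purpose: they need conversion data from your CRM, which this sketch does not model.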

Frequently Asked Questions

How many qualification variables should an agent collect per call?

For inbound consumer leads, 3–5 core signals is the practical range: type of need, location or service area, urgency or timeline, eligibility or fit, and preferred next step. More than that and you're running an intake form. If deeper qualification is needed, split it across an initial call and a follow-up.

Should I use prompt-based or rule-based outcomes for qualification routing?

Use prompt-based outcomes for classifying caller intent — "interested," "has objections," "not a fit" — and rule-based outcomes for branching on structured data like zip code, coverage type, or consent flags. You can't mix both in the same node, so pick the type that matches the primary decision at that step.

What is the difference between a vague prompt and one that avoids overfitting?

A vague prompt lacks a clear goal: "Talk to the caller about insurance." A well-written flexible prompt has a specific goal but allows conversational variation: "Determine what type of insurance the caller needs. They may say it directly or describe a situation — capture the type either way." The goal is precise; the path to it stays open.

How do I know if my agent is overfitting versus underfitting?

Overfitting signals: high variable-empty rates, callers getting stuck or routed to the wrong node, frequent Else/default hits, and the agent re-asking for information the caller already provided. Underfitting signals: every caller routes to "qualified," variables accept any value without validation, and human reps report that transferred leads are not actually ready.

Can Genius replace qualification prompts entirely?

No. Genius is a knowledge retrieval layer — it answers product and policy questions using a Q&A database. Qualification logic (what signals to collect, how to route based on answers, when to transfer) belongs in Speak nodes, Variables, and Outcomes. Use Genius to handle FAQ-type interruptions during qualification so you don't need separate branching nodes for every product question.

Sources and Further Reading

Thoughtly Agent Builder overview — docs.thoughtly.com/agents/overview
Thoughtly Variables documentation — docs.thoughtly.com/agents/variables
Thoughtly Outcomes and branching — docs.thoughtly.com/agents/outcomes
Thoughtly Speak nodes reference — docs.thoughtly.com/agents/nodes
Thoughtly Tips and tricks — docs.thoughtly.com/agents/tips-and-tricks
Thoughtly Genius knowledge base — docs.thoughtly.com/genius/getting-started
How to Qualify Inbound Leads Over Voice and SMS with Thoughtly — thoughtly.com/blog/how-to-qualify-inbound-leads-voice-sms-thoughtly
How to Build a Speed-to-Lead AI Agent with Thoughtly — thoughtly.com/blog/how-to-build-speed-to-lead-ai-agent-thoughtly
How to Route Hot Leads to Humans with Early Call Summaries — thoughtly.com/blog/how-to-route-hot-leads-humans-early-call-summaries
How to Build an Appointment-Setting AI Agent — thoughtly.com/blog/how-to-build-appointment-setting-ai-agent
