Deep Expertise Track · Lesson 7
Handoff Orchestration: Dynamic Delegation
Lesson 7 — Multi-agent Pattern 3: agents hand control to each other based on context
- What handoff orchestration is (triage, routing, dynamic delegation)
- How it differs from sequential (fixed order) and concurrent (parallel)
- Build a customer support triage agent that hands off to specialists
- The infinite handoff loop problem and how to prevent it
The Pattern
Source: Microsoft Azure — AI Agent Orchestration Patterns
Handoff vs Sequential vs Concurrent
| Pattern | Who decides order? | Parallel? | Agents talk? |
|---|---|---|---|
| Sequential | Developer (fixed) | No | Hand off output only |
| Concurrent | Developer (all run) | Yes | No — independent |
| Handoff | The agent decides | No (one at a time) | Transfer full context |
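To make the table concrete, here is a minimal sketch of the three control flows. The agents are plain stub functions standing in for LLM calls, and every name (`draft`, `review`, `billing`, `tech`, `triage`) is hypothetical. Only the shape of the control flow matters.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub "agents": plain functions standing in for LLM calls (hypothetical).
def draft(text: str) -> str:
    return f"draft({text})"

def review(text: str) -> str:
    return f"review({text})"

def billing(text: str) -> str:
    return f"billing handled: {text}"

def tech(text: str) -> str:
    return f"tech handled: {text}"

def sequential(text: str) -> str:
    # The developer fixes the order: draft always runs before review.
    return review(draft(text))

def concurrent(text: str) -> list:
    # The developer runs every agent on the same input, independently.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda agent: agent(text), [draft, review]))

def triage(text: str) -> str:
    # Stand-in for the agent's decision; a real triage agent would ask an LLM.
    return "billing" if "invoice" in text.lower() else "tech"

def handoff(text: str) -> str:
    # The next agent is chosen at run time, one agent at a time.
    agents = {"billing": billing, "tech": tech}
    return agents[triage(text)](text)
```

In `sequential` and `concurrent` the set of agents that run is visible in the code. In `handoff` it depends on what `triage` returns for each input.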
Real-World Example: Customer Support
Microsoft's guide describes a telecom CRM with handoff agents:
Build It: BA Triage System
import os

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="deepseek-chat", api_key=os.getenv("DEEPSEEK_API_KEY"),
                 base_url="https://api.deepseek.com", temperature=0)

# Triage agent: decides who to hand off to
triage_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a BA triage agent. Analyze the user's request and decide
which specialist should handle it. Choose ONE:
- "requirements" → for writing user stories, BRDs, acceptance criteria
- "process" → for process mapping, workflow design, gap analysis
- "data" → for data analysis, SQL queries, report requirements
- "done" → if you can answer directly without a specialist
Respond with ONLY the specialist name or "done"."""),
    ("user", "{request}"),
])

# Specialist agents: each one is a prompt piped into the shared LLM
specialists = {
    "requirements": ChatPromptTemplate.from_messages([
        ("system", "You are a requirements specialist. Write detailed user stories "
                   "with acceptance criteria in BDD format (Given/When/Then)."),
        ("user", "{request}"),
    ]) | llm,
    "process": ChatPromptTemplate.from_messages([
        ("system", "You are a process analysis specialist. Map the process flow, "
                   "identify bottlenecks, and suggest improvements."),
        ("user", "{request}"),
    ]) | llm,
    "data": ChatPromptTemplate.from_messages([
        ("system", "You are a data analysis specialist. Define data requirements, "
                   "suggest SQL queries, and identify data quality issues."),
        ("user", "{request}"),
    ]) | llm,
}

def run_handoff(request: str, max_handoffs: int = 5):
    """Triage agent routes the request to a specialist.

    This basic version returns as soon as the first specialist responds.
    The loop and the max_handoffs cap matter once specialists can hand back.
    """
    current_request = request
    for i in range(max_handoffs):
        # Step 1: Triage decides who handles it
        route = (triage_prompt | llm).invoke({"request": current_request}).content.strip()
        print(f"Handoff {i+1}: routing to '{route}'")
        if route == "done":
            return "Triage agent answered directly."
        if route not in specialists:
            return f"Error: unknown specialist '{route}'"
        # Step 2: Specialist handles it
        result = specialists[route].invoke({"request": current_request}).content
        print(f"  Specialist response: {result[:100]}...")
        return result
    return "Max handoffs reached."

result = run_handoff("Write user stories for a login page with OTP verification")
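The triage prompt asks for a bare label, but chat models sometimes wrap the label in quotes, change its case, or add a full stop, and then the exact match `route not in specialists` fails. One way to make the routing step more forgiving is a small normaliser. The helper name `parse_route` and the `VALID_ROUTES` set are assumptions for illustration, not part of the lesson's code.

```python
import re
from typing import Optional

# Labels the triage prompt is allowed to return (copied from the prompt above).
VALID_ROUTES = {"requirements", "process", "data", "done"}

def parse_route(raw: str) -> Optional[str]:
    """Return the first valid route label found in the model's reply, else None."""
    for word in re.findall(r"[a-z]+", raw.lower()):
        if word in VALID_ROUTES:
            return word
    return None
```

You could call `parse_route(...)` on the triage reply and treat `None` as the unknown-specialist case. Note that it takes the first valid label it sees, so a chatty reply that mentions two labels resolves to whichever appears first.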
The Infinite Handoff Problem
Agents can bounce a task back and forth forever: Triage → Specialist A → "Not my domain" → Triage → Specialist B → "Actually A should handle this" → Triage → ...
Always set max_handoffs, just as you would set max_iterations in a ReAct loop. Without a cap, a task that keeps bouncing between agents burns tokens indefinitely.
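Here is a minimal sketch of a loop where specialists really can hand back, guarded in two ways: the `max_handoffs` cap, plus a record of which specialists were already tried. The function name `run_with_cap`, the `(answer, reason)` return convention, and the `triage(request, tried)` signature are all assumptions made for this example. Agents are passed in as plain callables so the loop can be tested without an LLM.

```python
from typing import Callable, Dict, Optional, Set, Tuple

# A specialist returns (answer, None) when it handles the task,
# or (None, reason) when it hands the task back to triage.
Specialist = Callable[[str], Tuple[Optional[str], Optional[str]]]

def run_with_cap(
    request: str,
    triage: Callable[[str, Set[str]], str],
    specialists: Dict[str, Specialist],
    max_handoffs: int = 5,
) -> str:
    tried: Set[str] = set()
    for _ in range(max_handoffs):
        # Triage sees which specialists already declined.
        route = triage(request, tried)
        if route in tried:
            # Same specialist picked twice: this is the A -> B -> A bounce.
            return f"Stopped: '{route}' was already tried."
        if route not in specialists:
            return f"Stopped: unknown specialist '{route}'."
        tried.add(route)
        answer, _reason = specialists[route](request)
        if answer is not None:
            return answer
    # Hard cap reached: give up instead of looping forever.
    return "Stopped: max handoffs reached."
```

The `tried` check catches a bounce on the second visit, while the cap is the backstop for cases the check cannot see, such as a long chain of different specialists that all decline.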
The one-sentence summary
Handoff orchestration lets agents dynamically transfer control to the right specialist — best when you can't predict which expert is needed, but always cap handoffs to prevent infinite loops.
Practice Drill
- Create ba-work-agent/handoff_system.py with the code above
- Test with different inputs: "Map the order fulfillment process", "Write SQL for monthly sales report", "Analyze login requirements"
- Does the triage agent route correctly every time?
- Add a 4th specialist: "testing" for test cases and UAT plans
- Try a tricky input that could go to 2 specialists. Does the triage agent pick one?
Q1: What's the key difference between handoff and sequential orchestration?
In sequential, the developer fixes the order (A → B → C always). In handoff, the agent decides who to transfer to based on the conversation. The order is dynamic and unpredictable.
Q2: When should you NOT use handoff?
When the right specialist is identifiable from the initial input. If you always know "billing issues go to billing agent," use deterministic routing instead — it's cheaper and more predictable. Handoff is for when the right specialist emerges during processing.
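For comparison, deterministic routing can be as small as a keyword table with no LLM call at all. This is a sketch under assumptions: the agent names, the keywords, and the fallback label `"triage"` are made up for illustration.

```python
# Hypothetical keyword table: agent name -> words that identify its requests.
ROUTES = {
    "billing": ("invoice", "refund", "charge"),
    "technical": ("outage", "router", "slow"),
}

def route_deterministic(request: str, default: str = "triage") -> str:
    """Pick an agent by keyword match; fall back to `default` if nothing matches."""
    text = request.lower()
    for agent, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return default
```

The same input always produces the same route and costs no tokens. Returning a fallback label lets you reserve the handoff pattern for the requests the table cannot classify.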