Deep Expertise Track · Lesson 7

Handoff Orchestration

Multi-agent Pattern 3: agents hand control to each other dynamically

What you'll learn
  1. What handoff orchestration is (triage, routing, dynamic delegation)
  2. How it differs from sequential (fixed order) and concurrent (parallel)
  3. How to build a triage agent that hands off to specialists (a BA triage system)
  4. The infinite handoff loop problem and how to prevent it

The Pattern

┌──────────────────────────────────────────────────────────────┐
│                    HANDOFF ORCHESTRATION                     │
│           (triage / routing / dynamic delegation)            │
│                                                              │
│  ┌──────────┐                                                │
│  │ Triage   │──── "this is a billing issue" ──▶ ┌──────────┐ │
│  │ Agent    │──── "this is technical"       ──▶ │ Billing  │ │
│  │ (first)  │──── "this needs a human"      ──▶ │ Agent    │ │
│  └──────────┘                                   └──────────┘ │
│       │                                              │       │
│       │   ONE agent active at a time.                │       │
│       │   Control TRANSFERS (not shared).            ▼       │
│       │                                        ┌───────────┐ │
│       └────────────────────────────────────────│ Technical │ │
│                                                │ Agent     │ │
│                                                └───────────┘ │
│                                                              │
│  Key: The triage agent DECIDES who to hand off to,           │
│  based on the conversation. The order is NOT predetermined.  │
└──────────────────────────────────────────────────────────────┘

Source: Microsoft Azure — AI Agent Orchestration Patterns

Handoff vs Sequential vs Concurrent

Pattern      Who decides order?     Parallel?            Agents talk?
Sequential   Developer (fixed)      No                   Hand off output only
Concurrent   Developer (all run)    Yes                  No (independent)
Handoff      The agent decides      No (one at a time)   Transfer full context
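
To make the comparison concrete, here is a minimal sketch of the three control flows using plain functions in place of LLM calls. The stub agents (summarize, translate) and the keyword_router are hypothetical stand-ins invented for this illustration; in a real system the router would be an LLM triage call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical stub "agents": plain functions standing in for LLM calls.
def summarize(text: str) -> str:
    return f"summary({text})"

def translate(text: str) -> str:
    return f"translation({text})"

def run_sequential(text: str, steps: list[Callable[[str], str]]) -> str:
    """Sequential: the developer fixes the order; each step gets the previous output."""
    for step in steps:
        text = step(text)
    return text

def run_concurrent(text: str, agents: list[Callable[[str], str]]) -> list[str]:
    """Concurrent: every agent gets the same input and runs independently."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda agent: agent(text), agents))

def keyword_router(text: str) -> str:
    # Stand-in for the triage decision an LLM would make at runtime.
    return "translate" if "french" in text.lower() else "summarize"

def run_handoff_toy(text: str, router: Callable[[str], str],
                    agents: dict[str, Callable[[str], str]]) -> str:
    """Handoff: the router picks the agent at runtime; only that agent runs."""
    return agents[router(text)](text)

if __name__ == "__main__":
    agents = {"summarize": summarize, "translate": translate}
    print(run_sequential("x", [summarize, translate]))
    print(run_concurrent("x", [summarize, translate]))
    print(run_handoff_toy("Put this in French", keyword_router, agents))
```

The only structural difference is where the decision lives: in the list order (sequential), nowhere (concurrent), or inside the router (handoff).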

Real-World Example: Customer Support

Microsoft's guide describes a telecom CRM with handoff agents:

CUSTOMER SUPPORT HANDOFF CHAIN:

User: "My internet is down and I got charged twice this month"

┌──────────────┐
│ Triage Agent │── "This has 2 issues: technical + billing"
│ (generalist) │── "Hand off to Technical Agent first"
└──────┬───────┘
       ▼
┌──────────────┐
│ Technical    │── "Checking network status in your area..."
│ Agent        │── "Found outage. Will be fixed in 2 hours."
│              │── "Hand off to Billing Agent for the charge issue"
└──────┬───────┘
       ▼
┌──────────────┐
│ Billing Agent│── "Found duplicate charge. Refunding ₹500."
│              │── "Both issues resolved. Hand off to Triage for closure."
└──────┬───────┘
       ▼
┌──────────────┐
│ Triage Agent │── "Both issues resolved. Closing ticket."
└──────────────┘
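
The chain above can be sketched as a tiny state machine in which each agent returns the name of the next agent, and one shared ticket object carries the full context across every handoff. Everything here is an illustrative assumption: the Ticket class, the stub agents, and the pre-filled open_issues list (which a real triage agent would derive from the user's message with an LLM).

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Ticket:
    """Shared context that travels with every handoff (hypothetical structure)."""
    message: str
    open_issues: list[str]
    transcript: list[str] = field(default_factory=list)

def triage(ticket: Ticket) -> Optional[str]:
    # Route to the first unresolved issue, or close the ticket when none remain.
    if ticket.open_issues:
        nxt = ticket.open_issues[0]
        ticket.transcript.append(f"triage: handing off to {nxt}")
        return nxt
    ticket.transcript.append("triage: all issues resolved, closing ticket")
    return None

def technical(ticket: Ticket) -> Optional[str]:
    ticket.open_issues.remove("technical")
    ticket.transcript.append("technical: outage found, fix scheduled")
    # Hand off directly to billing if that issue is still open.
    return "billing" if "billing" in ticket.open_issues else "triage"

def billing(ticket: Ticket) -> Optional[str]:
    ticket.open_issues.remove("billing")
    ticket.transcript.append("billing: duplicate charge refunded")
    return "triage"

AGENTS = {"triage": triage, "technical": technical, "billing": billing}

def run_chain(ticket: Ticket, max_handoffs: int = 10) -> list[str]:
    """Run agents one at a time until one returns None; return the visited path."""
    current, path = "triage", []
    for _ in range(max_handoffs):
        path.append(current)
        nxt = AGENTS[current](ticket)
        if nxt is None:
            return path
        current = nxt
    raise RuntimeError(f"handoff cap reached after {path}")

if __name__ == "__main__":
    t = Ticket("My internet is down and I got charged twice this month",
               open_issues=["technical", "billing"])
    print(run_chain(t))
    print("\n".join(t.transcript))
```

Only one agent function runs per loop iteration, which mirrors the "one agent active at a time" rule of the pattern.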

Build It: BA Triage System

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import os

llm = ChatOpenAI(model="deepseek-chat", api_key=os.getenv("DEEPSEEK_API_KEY"),
                 base_url="https://api.deepseek.com", temperature=0)

# Triage agent: decides who to hand off to
triage_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a BA triage agent. Analyze the user's request and decide
which specialist should handle it. Choose ONE:
- "requirements" → for writing user stories, BRDs, acceptance criteria
- "process" → for process mapping, workflow design, gap analysis
- "data" → for data analysis, SQL queries, report requirements
- "done" → if you can answer directly without a specialist

Respond with ONLY the specialist name or "done"."""),
    ("user", "{request}"),
])

# Specialist agents
specialists = {
    "requirements": ChatPromptTemplate.from_messages([
        ("system", "You are a requirements specialist. Write detailed user stories "
                   "with acceptance criteria in BDD format (Given/When/Then)."),
        ("user", "{request}"),
    ]) | llm,
    
    "process": ChatPromptTemplate.from_messages([
        ("system", "You are a process analysis specialist. Map the process flow, "
                   "identify bottlenecks, and suggest improvements."),
        ("user", "{request}"),
    ]) | llm,
    
    "data": ChatPromptTemplate.from_messages([
        ("system", "You are a data analysis specialist. Define data requirements, "
                   "suggest SQL queries, and identify data quality issues."),
        ("user", "{request}"),
    ]) | llm,
}

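def run_handoff(request: str, max_handoffs: int = 5) -> str:
    """Triage routes to a specialist; a specialist may hand back to triage.

    The HANDBACK reply is a convention defined in this lesson, not a LangChain feature.
    """
    handback_note = ('\n\nIf this request is outside your specialty, reply with '
                     'exactly "HANDBACK: <reason>" and nothing else.')
    triage_chain = triage_prompt | llm
    current_request = request

    for i in range(max_handoffs):
        # Step 1: Triage decides who handles it (normalize quotes and casing)
        raw = triage_chain.invoke({"request": current_request}).content
        route = raw.strip().strip("\"'").lower()
        print(f"Handoff {i+1}: routing to '{route}'")

        if route == "done":
            # No specialist needed: answer the original request directly
            return llm.invoke(request).content

        if route not in specialists:
            return f"Error: unknown specialist '{route}'"

        # Step 2: Specialist handles it, or hands control back to triage
        result = specialists[route].invoke({"request": request + handback_note}).content
        if result.strip().upper().startswith("HANDBACK"):
            # Tell triage who declined so it can pick a different specialist
            current_request += f"\n\n[The '{route}' specialist declined: {result.strip()}]"
            continue

        print(f"  Specialist response: {result[:100]}...")
        return result

    return "Max handoffs reached."

result = run_handoff("Write user stories for a login page with OTP verification")
print(result)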
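
Models do not always reply with a bare label, even when the prompt asks for one. As a hedged sketch, a small parser like the one below could sit between the triage reply and the routing decision. The parse_route helper and VALID_ROUTES tuple are hypothetical names introduced here; the labels mirror the ones in the triage prompt above.

```python
import re
from typing import Optional

# Assumed to match the labels offered in the triage prompt.
VALID_ROUTES = ("requirements", "process", "data", "done")

def parse_route(raw: str, valid: tuple[str, ...] = VALID_ROUTES) -> Optional[str]:
    """Return the single route named in a triage reply, or None if unclear.

    Accepts quoted, capitalized, or punctuated replies. Rejects replies that
    name no known route or more than one, so the caller can fall back safely.
    """
    words = re.findall(r"[a-z]+", raw.lower())
    found = {word for word in words if word in valid}
    return found.pop() if len(found) == 1 else None

if __name__ == "__main__":
    for reply in ['"Requirements"', "Route: data.", "requirements or process", "billing"]:
        print(repr(reply), "->", parse_route(reply))
```

Returning None for ambiguous replies is a deliberate choice in this sketch: it forces the caller to decide what to do (ask again, escalate to a human) instead of silently guessing. It can still misfire when a label word such as "data" appears inside a longer explanation, so a strict prompt remains the first line of defense.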

The Infinite Handoff Problem

Key Risk

Agents can bounce a task back and forth forever: Triage → Specialist A → "Not my domain" → Triage → Specialist B → "Actually A should handle this" → Triage → ...

Always set max_handoffs, just as you would set max_iterations in a ReAct loop. Without a cap, a routing loop keeps calling the model and burning tokens until something external stops it.
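
A hard cap stops the loop eventually, but a ping-pong can be caught earlier by also counting visits per agent. The sketch below shows one way to combine both checks. HandoffGuard, HandoffLimitError, and the max_visits_per_agent parameter are hypothetical names for this illustration, not part of any library.

```python
class HandoffLimitError(RuntimeError):
    """Raised when a handoff would exceed one of the configured limits."""

class HandoffGuard:
    def __init__(self, max_handoffs: int = 5, max_visits_per_agent: int = 2):
        self.max_handoffs = max_handoffs
        self.max_visits_per_agent = max_visits_per_agent
        self.path: list[str] = []  # agents visited so far, in order

    def record(self, agent: str) -> None:
        """Call before transferring control to `agent`; raises if a limit is hit."""
        if len(self.path) >= self.max_handoffs:
            raise HandoffLimitError(
                f"cap of {self.max_handoffs} handoffs reached: {self.path}")
        if self.path.count(agent) >= self.max_visits_per_agent:
            raise HandoffLimitError(
                f"'{agent}' already visited {self.max_visits_per_agent} times: {self.path}")
        self.path.append(agent)

def simulate(route_sequence: list[str], guard: HandoffGuard) -> str:
    """Feed a fixed sequence of routes through the guard (stands in for real agents)."""
    try:
        for agent in route_sequence:
            guard.record(agent)
    except HandoffLimitError as exc:
        return f"stopped: {exc}"
    return "completed"

if __name__ == "__main__":
    bouncing = ["triage", "A", "triage", "B", "triage", "A"]
    print(simulate(bouncing, HandoffGuard()))
```

With the default limits, the bouncing sequence is stopped at the third visit to triage, before the overall cap of five handoffs is reached.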

The one-sentence summary

Handoff orchestration lets agents dynamically transfer control to the right specialist — best when you can't predict which expert is needed, but always cap handoffs to prevent infinite loops.

Practice Drill

  1. Create ba-work-agent/handoff_system.py with the code above
  2. Test with different inputs: "Map the order fulfillment process", "Write SQL for monthly sales report", "Analyze login requirements"
  3. Does the triage agent route correctly every time?
  4. Add a 4th specialist: "testing" for test cases and UAT plans
  5. Try a tricky input that could go to 2 specialists. Does the triage agent pick one?

⚡ Quick Check

Q1: What's the key difference between handoff and sequential orchestration?

In sequential, the developer fixes the order (A → B → C always). In handoff, the agent decides who to transfer to based on the conversation. The order is dynamic and unpredictable.

Q2: When should you NOT use handoff?

When the right specialist is identifiable from the initial input. If you always know "billing issues go to billing agent," use deterministic routing instead — it's cheaper and more predictable. Handoff is for when the right specialist emerges during processing.
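
For contrast, deterministic routing can be as simple as a keyword table with no model call at all. This is a minimal sketch; the ROUTING_RULES keywords and the "human" default are assumptions chosen to echo the customer support example, not rules from the source guide.

```python
# Hypothetical keyword table: the first rule with a matching keyword wins.
ROUTING_RULES = {
    "billing": ("charge", "refund", "invoice", "payment"),
    "technical": ("outage", "internet", "router", "slow"),
}

def route_deterministic(message: str, default: str = "human") -> str:
    """Pick an agent by keyword match; fall back to `default` when nothing matches."""
    text = message.lower()
    for agent, keywords in ROUTING_RULES.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return default

if __name__ == "__main__":
    print(route_deterministic("I was charged twice this month"))
    print(route_deterministic("My internet is down"))
    print(route_deterministic("I want to cancel my plan"))
```

The same sketch also shows where fixed rules run out: a message that mentions both an outage and a double charge is sent only to the first matching agent, which is the kind of case where handoff earns its extra cost.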
