Deep Expertise Track · Lesson 8
Group Chat Orchestration
Multi-agent Pattern 4: agents collaborate through shared conversation
Group Chat Orchestration: Debate and Maker-Checker
- What group chat orchestration is (roundtable, multi-agent debate, council)
- The maker-checker sub-pattern (generator-verifier loop, reflection)
- How it maps to Anthropic's "Evaluator-Optimizer" workflow
- Why Microsoft recommends limiting to 3 agents max
The Pattern
Source: Microsoft Azure — AI Agent Orchestration Patterns
The Maker-Checker Sub-Pattern
Microsoft defines a specific type of group chat called maker-checker (also known as evaluator-optimizer, generator-verifier, or reflection loop):
Source: Anthropic — Building Effective Agents (Evaluator-Optimizer section)
When to Use vs Avoid
| Use when | Avoid when |
|---|---|
| Consensus-building needed | Basic task delegation is sufficient |
| Quality control via debate | Real-time processing (chat is slow) |
| Multidisciplinary discussion | Deterministic workflow without discussion |
| Iterative refinement (maker-checker) | No clear way to determine completion |
Microsoft's guidance: "To maintain effective control, consider limiting group chat orchestration to three or fewer agents." Each additional agent makes it harder to manage conversation flow and to prevent infinite loops.
Build It: Maker-Checker for BRD Quality
import os

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# DeepSeek exposes an OpenAI-compatible endpoint, so ChatOpenAI works with a custom base_url.
llm = ChatOpenAI(
    model="deepseek-chat",
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com",
    temperature=0,
)

maker_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a BA document writer. Write or revise a BRD section based on "
               "the requirements and any feedback from the reviewer."),
    ("user", "Requirements: {requirements}\n\nPrevious draft: {draft}\n\n"
             "Reviewer feedback: {feedback}"),
])

checker_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a strict BRD quality reviewer. Evaluate the draft against:
1. Are user stories in proper format (As a... I want... So that...)?
2. Are acceptance criteria testable (Given/When/Then)?
3. Is scope clearly defined?
4. Are edge cases covered?
If ALL criteria are met, respond: "APPROVED"
If issues remain, respond: "REVISION NEEDED: [list issues]"
"""),
    ("user", "{draft}"),
])


def run_maker_checker(requirements: str, max_rounds: int = 3) -> str:
    """Alternate maker and checker until approval or until max_rounds is reached."""
    draft = ""
    feedback = "No previous draft. Write the first version."
    for round_num in range(max_rounds):
        print(f"\n--- Round {round_num + 1} ---")

        # Maker generates or revises the draft using the latest feedback
        print("Maker: Writing draft...")
        draft = (maker_prompt | llm).invoke({
            "requirements": requirements,
            "draft": draft or "(none yet)",
            "feedback": feedback,
        }).content

        # Checker evaluates the draft against the four criteria
        print("Checker: Reviewing draft...")
        review = (checker_prompt | llm).invoke({"draft": draft}).content

        # Match the verdict at the start of the reply: a plain substring test would
        # also fire on reviews such as "REVISION NEEDED: cannot be APPROVED until ..."
        if review.strip().upper().startswith("APPROVED"):
            print("Checker: APPROVED!")
            return draft

        print(f"Checker: {review[:150]}...")
        feedback = review  # the review becomes the maker's input for the next round

    # Iteration cap: stop instead of looping forever when the checker never approves
    print("Max rounds reached. Returning last draft.")
    return draft


if __name__ == "__main__":
    result = run_maker_checker(
        "Login page with OTP, forgot password, social login, and session timeout"
    )
    print(f"\n=== FINAL BRD ===\n{result}")
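The control flow of the loop above can also be exercised offline, without API calls. The following is a minimal sketch, assuming plain Python functions stand in for the maker and checker chains. The names `maker_checker_loop`, `stub_maker`, `lenient_checker`, and `impossible_checker` are hypothetical and are not part of the lesson's code or of LangChain; the verdict is matched with `startswith`, which is also an assumption of this sketch.

```python
from typing import Callable, Tuple


def maker_checker_loop(make: Callable[[str, str], str],
                       check: Callable[[str], str],
                       max_rounds: int = 3) -> Tuple[str, bool, int]:
    """Generic generate -> evaluate -> revise loop.

    Returns (last_draft, approved, rounds_used).
    """
    draft = ""
    feedback = "No previous draft. Write the first version."
    for round_num in range(1, max_rounds + 1):
        draft = make(draft, feedback)      # maker writes or revises
        review = check(draft)              # checker evaluates
        if review.strip().upper().startswith("APPROVED"):
            return draft, True, round_num
        feedback = review                  # feed the critique back to the maker
    return draft, False, max_rounds        # iteration cap reached


# Hypothetical stubs standing in for LLM calls.
def stub_maker(draft: str, feedback: str) -> str:
    # Only adds acceptance criteria once the reviewer asks for them.
    if "acceptance criteria" in feedback.lower():
        return draft + " Given/When/Then criteria added."
    return "As a user, I want OTP login so that my account is secure."


def lenient_checker(draft: str) -> str:
    if "Given/When/Then" in draft:
        return "APPROVED"
    return "REVISION NEEDED: acceptance criteria missing"


def impossible_checker(draft: str) -> str:
    return "REVISION NEEDED: still not good enough"


ok_draft, ok_approved, ok_rounds = maker_checker_loop(stub_maker, lenient_checker)
print(ok_approved, ok_rounds)

capped_draft, capped_approved, capped_rounds = maker_checker_loop(
    stub_maker, impossible_checker, max_rounds=3
)
print(capped_approved, capped_rounds)
```

With these stubs, the first run is approved once the maker responds to the reviewer's feedback, while the second run never receives an approval and stops at the round cap, returning the last draft unapproved. Returning the `approved` flag alongside the draft lets the caller tell those two outcomes apart.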
The one-sentence summary
Group chat orchestration lets agents collaborate through a shared conversation — best for consensus-building and quality control, but keep it to 3 agents max and always set iteration caps to prevent infinite debate loops.
Practice Drill
- Create ba-work-agent/maker_checker.py with the code above
- Run it. How many rounds does it take to get APPROVED?
- Make the checker stricter — add a 5th criterion. Does it take more rounds?
- Try making the checker too strict (never approves). What happens at max_rounds?
Q1: What's the difference between maker-checker and the evaluator-optimizer pattern from Anthropic?
They're the same pattern under different names. Anthropic calls it "evaluator-optimizer" — one LLM generates, another evaluates and gives feedback, loop until good enough. Microsoft calls it "maker-checker." Same concept: generate → evaluate → revise → repeat.
Q2: Why does Microsoft recommend max 3 agents for group chat?
Conversation flow becomes unmanageable with more agents. The chat manager has to decide who speaks next, and with 5+ agents, turn-taking gets chaotic, loops become likely, and the conversation thread grows too long. 3 agents with distinct roles is the sweet spot.
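The chat manager's job described in this answer (choosing the next speaker, capping the number of agents, and stopping the conversation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Microsoft's implementation: it uses simple round-robin turn-taking, stub functions in place of LLM-backed agents, and hypothetical names (`run_group_chat`, `MAX_AGENTS`, `is_done`, and the three stub agents).

```python
from typing import Callable, List, Tuple

# Hypothetical types for this sketch: a message is (speaker, text); an agent is
# a name plus a function that reads the shared thread and returns its reply.
Message = Tuple[str, str]
Agent = Tuple[str, Callable[[List[Message]], str]]

MAX_AGENTS = 3  # mirrors the "three or fewer agents" guidance


def run_group_chat(agents: List[Agent], task: str,
                   is_done: Callable[[List[Message]], bool],
                   max_turns: int = 6) -> List[Message]:
    """Round-robin chat manager with an agent cap and a turn cap."""
    if not 1 <= len(agents) <= MAX_AGENTS:
        raise ValueError(f"expected 1 to {MAX_AGENTS} agents, got {len(agents)}")
    thread: List[Message] = [("user", task)]
    for turn in range(max_turns):
        name, speak = agents[turn % len(agents)]  # decide who speaks next
        thread.append((name, speak(thread)))      # each agent sees the whole thread
        if is_done(thread):                       # explicit completion check
            break
    return thread


# Stub agents standing in for LLM calls.
def analyst(thread: List[Message]) -> str:
    return "Proposal: OTP login with a 5-minute code expiry."


def security(thread: List[Message]) -> str:
    return "Concern: add rate limiting on OTP requests."


def lead(thread: List[Message]) -> str:
    return "CONSENSUS: OTP login, 5-minute expiry, rate limiting."


thread = run_group_chat(
    [("analyst", analyst), ("security", security), ("lead", lead)],
    task="Agree on the OTP login design.",
    is_done=lambda t: t[-1][1].startswith("CONSENSUS"),
)
for speaker, text in thread:
    print(f"{speaker}: {text}")
```

The turn cap and the `is_done` predicate address the two failure modes named in the lesson: endless debate and having no clear way to determine completion. A real chat manager might pick the next speaker with an LLM instead of round-robin, but the same two guards would still apply.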