Deep Expertise Track · Lesson 2
The ReAct Loop
Building an agent from scratch with zero frameworks — the Think-Act-Observe loop
- What ReAct (Reason + Act) actually is — from the original 2022 paper
- How to build a working agent in 50 lines of Python with NO framework
- Why the loop is the defining feature of an agent (not the tools, not the LLM)
- The 3 failure modes of the ReAct loop and how to guard against them
What is ReAct?
ReAct (pronounced "ree-act") was introduced in a 2022 paper by Yao et al., a collaboration between Princeton University and Google Research. The name is a portmanteau of Reasoning + Acting. The core insight is simple but powerful:
Reasoning alone (chain-of-thought) happens entirely inside the model: it can't look anything up or take action, so mistakes and hallucinations go unchecked. Acting alone (emitting tool calls with no explicit reasoning) leaves the model without a plan to track or revise. Combine both in a loop (reason about what to do, take action by calling a tool, observe the result, reason again) and you get an agent that does better than either approach alone.
Source: Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models" (2022)
The ReAct Loop Visualized
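The loop can also be traced in plain text. The sketch below is a minimal offline illustration, not the lesson's agent: `SCRIPTED_OUTPUTS` stands in for the LLM and `lookup_price` is a hypothetical stub tool, so nothing here calls an API.

```python
# Offline trace of Reason -> Act -> Observe. No API calls are made.
# SCRIPTED_OUTPUTS and lookup_price are illustrative stand-ins (assumptions).

SCRIPTED_OUTPUTS = [
    "Thought: I need the current price.\nAction: lookup_price\nAction Input: SBIN",
    "Thought: I now have enough information\nFinal Answer: SBIN trades at 1,054.",
]

def lookup_price(ticker: str) -> str:
    """Stub tool returning a fixed string."""
    return f"{ticker} is at 1,054"

def trace_loop(outputs):
    """Walk the scripted outputs and return the trace as a list of lines."""
    trace = []
    for step, output in enumerate(outputs, start=1):
        # REASON: the first line of each output is the Thought
        trace.append(f"[{step}] REASON: {output.splitlines()[0]}")
        if "Final Answer:" in output:
            answer = output.split("Final Answer:", 1)[1].strip()
            trace.append(f"[{step}] DONE: {answer}")
            break
        # ACT: call the tool with the argument the "model" asked for
        arg = output.split("Action Input:", 1)[1].strip()
        observation = lookup_price(arg)
        trace.append(f"[{step}] ACT: lookup_price({arg!r})")
        # OBSERVE: the tool result is what the next step gets to reason about
        trace.append(f"[{step}] OBSERVE: {observation}")
    return trace

for line in trace_loop(SCRIPTED_OUTPUTS):
    print(line)
```

Step 1 produces a reason/act/observe triple; step 2 reasons once more and stops because it contains a Final Answer.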
Build It From Scratch (No Framework)
Anthropic recommends starting with LLM APIs directly, noting that many patterns can be implemented in a few lines of code. Let's prove it. Here's a working ReAct agent in ~50 lines of Python using only the OpenAI SDK (which works with DeepSeek's OpenAI-compatible endpoint):
Source: Anthropic — Building Effective Agents
import os
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint, so the OpenAI SDK works as-is.
# The key is read from the DEEPSEEK_API_KEY environment variable.
client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")

# --- Define tools as plain Python functions (stubs that return canned data) ---
def get_stock_price(ticker: str) -> str:
    """Get current stock price"""
    return f"{ticker} is at ₹1,054"

def get_financials(ticker: str) -> str:
    """Get quarterly financials"""
    return f"{ticker}: Revenue ₹85,000Cr, Net Profit ₹17,000Cr, NIM 3.2%"

def search_news(query: str) -> str:
    """Search recent news"""
    return "RBI may cut rates next quarter. Positive for banks."

# Registry: tool name → function
TOOLS = {
    "get_stock_price": get_stock_price,
    "get_financials": get_financials,
    "search_news": search_news,
}

# --- The ReAct Loop ---
def run_agent(goal: str, max_iterations: int = 10):
    """The simplest possible agent. No framework. Just a loop."""
    # The system prompt tells the LLM HOW to reason
    system_prompt = f"""You are a stock research agent.
Available tools: {list(TOOLS.keys())}
To use a tool, output EXACTLY this format:
Thought: your reasoning about what to do next
Action: tool_name
Action Input: the argument to pass to the tool
When you have enough information, output:
Thought: I now have enough information
Final Answer: your complete answer"""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": goal},
    ]
    for i in range(max_iterations):
        print(f"\n--- Iteration {i+1} ---")
        # 1. Call the LLM
        response = client.chat.completions.create(
            model="deepseek-chat",
            messages=messages,
            temperature=0,
        )
        output = response.choices[0].message.content or ""
        print(output)
        # 2. Check if the LLM gave a final answer
        if "Final Answer:" in output:
            return output.split("Final Answer:", 1)[1].strip()
        # 3. Parse the action from the LLM output (tolerate leading whitespace)
        lines = [line.strip() for line in output.split("\n")]
        action_lines = [line for line in lines if line.startswith("Action:")]
        input_lines = [line for line in lines if line.startswith("Action Input:")]
        if not action_lines or not input_lines:
            # Parse failure: remind the model of the format and try again
            messages.append({"role": "assistant", "content": output})
            messages.append({"role": "user", "content": "Please use a tool or give a Final Answer."})
            continue
        # Extract tool name and input
        action = action_lines[0].split("Action:", 1)[1].strip()
        action_input = input_lines[0].split("Action Input:", 1)[1].strip()
        # 4. Call the tool and get observation
        if action in TOOLS:
            observation = TOOLS[action](action_input)
        else:
            observation = f"Error: unknown tool '{action}'"
        print(f"Observation: {observation}")
        # 5. Feed observation back to the LLM (THE LOOP)
        messages.append({"role": "assistant", "content": output})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    return "Max iterations reached without a final answer."

# --- Run it ---
result = run_agent("Should I hold or sell SBIN?")
print(f"\n=== RESULT ===\n{result}")
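The parsing step is the most fragile part of the loop, so it can help to pull it out into a function you can test without calling the API. The following is one possible sketch, not part of the lesson's code: `parse_step` and its tuple return shapes are hypothetical, and stripping surrounding quotes from the input is an assumption about how models tend to format arguments.

```python
import re

# Match "Action: name" and "Action Input: arg" at the start of a line,
# allowing leading and trailing spaces or tabs.
ACTION_RE = re.compile(r"^[ \t]*Action:[ \t]*(.+?)[ \t]*$", re.MULTILINE)
INPUT_RE = re.compile(r"^[ \t]*Action Input:[ \t]*(.*?)[ \t]*$", re.MULTILINE)

def parse_step(output: str):
    """Return ("final", answer), ("action", name, arg), or ("error", reason)."""
    if "Final Answer:" in output:
        return ("final", output.split("Final Answer:", 1)[1].strip())
    action = ACTION_RE.search(output)
    if action is None:
        return ("error", "no Action line found")
    arg = INPUT_RE.search(output)
    if arg is None:
        return ("error", "no Action Input line found")
    # Assumption: drop quotes the model may have wrapped around the argument
    return ("action", action.group(1), arg.group(1).strip("\"'"))

print(parse_step("Thought: check price\nAction: get_stock_price\nAction Input: SBIN"))
print(parse_step("I think SBIN looks fine."))
```

Returning an explicit "error" result instead of raising lets the loop decide what to send back to the model.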
What's Happening in Each Iteration
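Each iteration that calls a tool adds two messages to the conversation: the LLM's Thought/Action output as an "assistant" message, and the tool result as a "user" message prefixed with "Observation:". Here is a minimal sketch of that growth, assuming the message layout used in the loop; `append_step` is a hypothetical helper, and the strings are hand-written rather than real model output.

```python
def append_step(messages, llm_output, observation):
    """Return a new list extended by one assistant turn and its observation."""
    return messages + [
        {"role": "assistant", "content": llm_output},
        {"role": "user", "content": f"Observation: {observation}"},
    ]

# Starting state: system prompt plus the user's goal
history = [
    {"role": "system", "content": "You are a stock research agent."},
    {"role": "user", "content": "Should I hold or sell SBIN?"},
]

# Iteration 1: the model asks for the price, the tool answers
history = append_step(
    history,
    "Thought: I need the price.\nAction: get_stock_price\nAction Input: SBIN",
    "SBIN is at ₹1,054",
)

# Iteration 2: the model asks for financials, the tool answers
history = append_step(
    history,
    "Thought: I need fundamentals.\nAction: get_financials\nAction Input: SBIN",
    "SBIN: Revenue ₹85,000Cr, Net Profit ₹17,000Cr, NIM 3.2%",
)

roles = [m["role"] for m in history]
print(roles)
```

Because the whole list is re-sent on every call, the model sees all earlier thoughts and observations each time it reasons, and the prompt gets longer with every iteration.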
The 3 Failure Modes
| Failure | What happens | Guard |
|---|---|---|
| Infinite loop | LLM keeps calling tools without ever giving a Final Answer | max_iterations cap. Always set this. |
| Parse failure | LLM doesn't follow the Thought/Action format, so the output can't be parsed. | Send a "please use the correct format" reminder back to the LLM and retry (LangChain's AgentExecutor exposes this as handle_parsing_errors). |
| Hallucinated tool | LLM calls a tool that doesn't exist | Tool registry check. Return error message, LLM retries with correct tool. |
These are the exact same failure modes that LangChain's AgentExecutor handles for you. That's why frameworks exist — they solve these problems once so you don't have to. But now you know what's happening under the hood.
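To see all three guards fire without spending API calls, you can run scripted outputs through the same checks. This is a rough sketch under stated assumptions: `run_with_guards` and its event names are hypothetical, and `cycle` simulates a model that never stops talking.

```python
from itertools import cycle, islice

def run_with_guards(scripted_outputs, tools, max_iterations=4):
    """Feed scripted outputs through the three guards; return the events seen."""
    events = []
    # cycle() repeats the script forever; islice() enforces the iteration cap
    for output in islice(cycle(scripted_outputs), max_iterations):
        if "Final Answer:" in output:
            events.append("final")
            return events
        lines = [line.strip() for line in output.splitlines()]
        action = next(
            (line[len("Action:"):].strip() for line in lines if line.startswith("Action:")),
            None,
        )
        if action is None:
            events.append("parse_failure")   # guard: ask for the format again
        elif action not in tools:
            events.append("unknown_tool")    # guard: return an error observation
        else:
            events.append("tool_call")
    events.append("max_iterations_reached")  # guard: the cap ended the loop
    return events

tools = {"get_stock_price"}
print(run_with_guards(["Action: get_stock_price\nAction Input: SBIN"], tools, 3))
print(run_with_guards(
    ["I will just chat.", "Action: get_weather\nAction Input: Mumbai", "Final Answer: Hold."],
    tools,
))
```

The first call never produces a Final Answer, so only the cap stops it. The second hits a parse failure, then a hallucinated tool, then finishes normally.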
The one-sentence summary
ReAct is a loop where the LLM reasons (Thought), picks a tool (Action), sees the result (Observation), and repeats until it has enough to answer — and you can build it in 50 lines with no framework.
Practice Drill
- Create a new file react_from_scratch.py in your ba-work-agent project
- Copy the code above and add your DEEPSEEK_API_KEY
- Run it: python react_from_scratch.py
- Watch the loop execute. Count how many iterations it takes.
- Now change the goal to something vague like "Tell me about SBIN" — does the agent handle it differently?
- Try removing a tool from the TOOLS dict — does the agent recover when it tries to call a missing tool?
Q1: In the ReAct loop, what triggers the loop to end?
The LLM outputs "Final Answer:" — the parser detects this and returns the answer, breaking the loop. If the LLM never does this, max_iterations is the safety net.
Q2: Why does the observation get sent back as a "user" message, not an "assistant" message?
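Because the observation is the tool's output, not the LLM's output. The LLM is the assistant; tools are external, so their results arrive from outside the model, the same way user input does. The message alternation (assistant → user → assistant) also keeps the conversation well-formed: some chat APIs reject or mishandle consecutive messages with the same role, although not every provider enforces strict alternation. APIs with native tool calling use a dedicated "tool" role for this instead.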
Q3: What's the difference between this from-scratch agent and your LangChain ba-work-agent?
At the core, the loop is the same. LangChain's create_react_agent + AgentExecutor runs the same cycle: parse LLM output, extract action, call tool, feed observation back. The difference is what LangChain adds on top: (1) more robust handling of parsing errors, (2) tool registration with schemas, (3) verbose logging, (4) async support, (5) streaming. With the framework you pay an abstraction tax in exchange for that convenience.
Where to Go Deeper
- ReAct Paper (Yao et al., 2022) — Read sections 1-3 for the theoretical foundation.
- IBM: What is a ReAct Agent? — Accessible overview for quick reference.
- Anthropic: Building Effective Agents — The "Agents" section at the bottom covers the autonomous agent loop.