Prompt Engineering: The ReAct Pattern
Learn the ReAct pattern, a prompting technique that combines reasoning and action to build effective tool-using LLM agents.
What you'll learn
- ✓ What ReAct is and why it works
- ✓ The Thought-Action-Observation loop
- ✓ How to write a ReAct prompt
- ✓ When ReAct beats plain chain-of-thought
- ✓ Common pitfalls and how to fix them
Prerequisites
- Familiar with APIs
- Basic prompting
What and Why
ReAct stands for “Reasoning and Acting.” It is a prompting pattern introduced in a 2022 paper that interleaves natural language reasoning with concrete tool calls. Instead of asking the model to either think silently or just call a tool, ReAct asks it to alternate between thinking out loud and acting on the world, observing the result, and thinking again.
The pattern works because it gives the model a structured way to plan, use external information, and self-correct. Plain chain-of-thought is great for closed problems but fails when the answer requires fresh data, computation, or a system query. ReAct lets the model reach outside its weights without losing its train of thought.
Mental Model
Think of a person solving a crossword with a dictionary on the table. They look at a clue, mutter a guess (thought), reach for the dictionary to check a word (action), read the entry (observation), and update their guess. Repeat until done. The thinking and the looking-up are interleaved, not separated.
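ReAct codifies this into a transcript. The LLM writes a Thought: line and an Action: tool(args) line, the surrounding code runs the tool and appends the result as an Observation: line, and the cycle repeats until the model writes a Final Answer.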
Hands-on Example
Here is a minimal ReAct prompt for an agent with two tools, search and calc:
You have access to:
- search(query): web search
- calc(expression): math
Use this format:
Thought: reason about what to do
Action: tool_name(args)
Observation: tool result
... (repeat) ...
Thought: I now know the answer
Final Answer: ...
Question: What is the population of Tokyo divided by the population of Paris?
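If the tool list changes often, the prompt above can be assembled from a small registry instead of being edited by hand. The following is a minimal sketch, not a prescribed implementation: the names TOOLS, FORMAT_BLOCK, and build_react_prompt are hypothetical and introduced here only for illustration.

```python
# Hypothetical registry: tool name -> (argument name, one-line description).
TOOLS = {
    "search": ("query", "web search"),
    "calc": ("expression", "math"),
}

# The format instructions, copied from the prompt shown above.
FORMAT_BLOCK = (
    "Use this format:\n"
    "Thought: reason about what to do\n"
    "Action: tool_name(args)\n"
    "Observation: tool result\n"
    "... (repeat) ...\n"
    "Thought: I now know the answer\n"
    "Final Answer: ..."
)


def build_react_prompt(question: str) -> str:
    """Assemble the tool list, the format block, and the question."""
    tool_lines = [
        f"- {name}({arg}): {description}"
        for name, (arg, description) in TOOLS.items()
    ]
    parts = ["You have access to:", *tool_lines, FORMAT_BLOCK, f"Question: {question}"]
    return "\n".join(parts)


prompt = build_react_prompt(
    "What is the population of Tokyo divided by the population of Paris?"
)
print(prompt)
```

Keeping the tool descriptions in one place means the prompt and the code that executes the tools can share the same registry, which reduces the chance that the model is told about a tool that does not exist.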
A typical model trace looks like:
Thought: I need both populations.
Action: search("population of Tokyo 2025")
Observation: ~13.96 million
Thought: Now Paris.
Action: search("population of Paris 2025")
Observation: ~2.10 million
Thought: Divide.
Action: calc("13.96 / 2.10")
Observation: 6.65
Thought: I now know the answer.
Final Answer: About 6.65.
+---------+
| Thought |<-----+
+---------+      |
     |           |
     v           |
+---------+      |
| Action  |      |
+---------+      |
     |           |
     v           |
+---------+      |
| Observ. |------+
+---------+
     |
     v
Final Answer
In code, you parse each model output for Action:, execute the tool, paste the result as Observation:, and continue generating. Most agent frameworks (LangChain agents, LlamaIndex) implement this loop for you.
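The loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular framework implements it: run_react, fake_search, fake_calc, and the scripted "model" are all hypothetical stand-ins so the example runs offline, and the parser assumes each Action takes a single quoted string argument.

```python
import re
from typing import Callable, Dict

# Assumed text conventions: Action: tool_name("argument") and Final Answer: ...
ACTION_RE = re.compile(r'^Action:\s*(\w+)\("(.*)"\)\s*$', re.MULTILINE)
FINAL_RE = re.compile(r"^Final Answer:\s*(.*)$", re.MULTILINE)


def run_react(prompt: str,
              generate: Callable[[str], str],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 6) -> str:
    """Generate, parse the Action, run the tool, append the Observation."""
    transcript = prompt
    for _ in range(max_steps):
        output = generate(transcript)  # hypothetical model call
        transcript += "\n" + output
        final = FINAL_RE.search(output)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(output)
        if action is None:
            observation = "Error: no valid Action line found."
        elif action.group(1) not in tools:
            observation = f"Error: unknown tool {action.group(1)}."
        else:
            observation = tools[action.group(1)](action.group(2))
        transcript += f"\nObservation: {observation}"
    return "No answer within the step limit."


# Offline stand-ins: canned search results and a division-only calculator.
def fake_search(query: str) -> str:
    return "~13.96 million" if "Tokyo" in query else "~2.10 million"


def fake_calc(expression: str) -> str:
    left, right = expression.split("/")
    return str(round(float(left) / float(right), 2))


# A scripted "model" that replays the trace from the example above.
scripted = iter([
    'Thought: I need both populations.\nAction: search("population of Tokyo 2025")',
    'Thought: Now Paris.\nAction: search("population of Paris 2025")',
    'Thought: Divide.\nAction: calc("13.96 / 2.10")',
    "Thought: I now know the answer.\nFinal Answer: About 6.65.",
])

answer = run_react(
    "Question: What is the population of Tokyo divided by the population of Paris?",
    lambda transcript: next(scripted),
    {"search": fake_search, "calc": fake_calc},
)
print(answer)  # About 6.65.
```

With a real model, it is common to pass "Observation:" as a stop sequence so the model does not invent the tool result itself. The harness, not the model, should be the only source of observations.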
Trade-offs
ReAct is more expensive than plain prompting. Each loop is a model call, and tool calls add their own latency. Five iterations can easily mean a 10-second response. It also produces a lot of tokens you do not show to the user, so cost climbs.
It can also loop forever if the model gets confused or a tool returns garbage. You must enforce max steps, a stop condition, and a fallback.
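One way to express the step cap, a stop condition, and a fallback is a small guard that the loop consults before each iteration. The sketch below is illustrative only: should_stop and fallback_answer are hypothetical names, and treating three identical actions in a row as "stuck" is an assumed heuristic, not a rule from the ReAct paper.

```python
from typing import List, Optional


def should_stop(actions: List[str],
                max_steps: int = 6,
                max_repeats: int = 3) -> Optional[str]:
    """Return a reason to stop, or None if the loop may continue.

    `actions` is the list of Action strings issued so far, oldest first.
    """
    if len(actions) >= max_steps:
        return "step limit reached"
    recent = actions[-max_repeats:]
    if len(recent) == max_repeats and len(set(recent)) == 1:
        return "same action repeated"
    return None


def fallback_answer(reason: str) -> str:
    """Message returned to the caller when the loop is cut short."""
    return f"I could not finish this task ({reason})."


history = ['search("population of Tokyo 2025")'] * 3
reason = should_stop(history)
if reason is not None:
    print(fallback_answer(reason))
```

Returning a reason string rather than a bare boolean makes the logged traces easier to read, because each aborted run records why it was stopped.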
Compared to native function calling (built into modern model APIs), ReAct is more transparent but less reliable in format. Function calling enforces JSON schemas; ReAct relies on the model to follow text conventions. In production today, most teams use function calling for execution and keep ReAct-style reasoning steps as visible thoughts.
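To make the contrast concrete, here is a rough sketch of what schema-enforced arguments look like next to a free-text Action line. The tool definition uses a generic JSON Schema shape and is not the exact format of any specific vendor API; SEARCH_TOOL, validate_call, and the step record with its thought key are hypothetical names chosen for this example.

```python
import json

# Generic JSON Schema style definition (an assumption, not a vendor format).
SEARCH_TOOL = {
    "name": "search",
    "description": "web search",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def validate_call(raw_arguments: str, tool: dict) -> dict:
    """Check a JSON argument string against the tool's declared parameters."""
    args = json.loads(raw_arguments)  # raises ValueError on invalid JSON
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    params = tool["parameters"]
    missing = [key for key in params["required"] if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    for key, value in args.items():
        if key not in params["properties"]:
            raise ValueError(f"unknown argument: {key}")
        if params["properties"][key]["type"] == "string" and not isinstance(value, str):
            raise ValueError(f"argument {key} must be a string")
    return args


# A step record that keeps the ReAct-style thought next to the structured call.
step = {
    "thought": "I need both populations.",
    "tool": "search",
    "arguments": validate_call('{"query": "population of Tokyo 2025"}', SEARCH_TOOL),
}
print(step)
```

The point of the comparison is that a malformed call fails loudly at validation time instead of slipping through a regular expression, while the thought text stays available for logging and display.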
ReAct shines when the problem genuinely requires multiple tool calls in sequence, with each call depending on the previous result. For simple one-shot tool use, plain function calling is enough.
Practical Tips
- Cap the number of iterations. Six is a reasonable default.
- Parse strictly. If the model emits malformed Action: lines, return an error observation and let it retry.
- Include a few-shot example in the prompt showing exactly the format you want.
- Truncate noisy observations. A 10,000-token search dump kills the next call.
- Log every Thought-Action-Observation trace. It is the most useful debugging artifact you will ever get.
- For production, prefer model-native tool calling but keep an internal thought field for visible reasoning.
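Two of the tips above, strict parsing and truncating observations, are easy to sketch. The code below is one possible shape under stated assumptions: parse_action and truncate_observation are hypothetical helpers, the parser accepts only a single quoted string argument, and the 2,000-character limit is an arbitrary budget that uses characters as a rough proxy for tokens.

```python
import re
from typing import Optional, Tuple

# Strict pattern: the whole line must be Action: tool_name("argument").
ACTION_LINE = re.compile(r'Action:\s*(\w+)\("([^"]*)"\)')
MAX_OBSERVATION_CHARS = 2000  # assumed budget, tune for your model

ParsedAction = Tuple[str, str]


def parse_action(line: str) -> Tuple[Optional[ParsedAction], Optional[str]]:
    """Return ((tool, argument), None) on success or (None, error observation)."""
    match = ACTION_LINE.fullmatch(line.strip())
    if match is None:
        error = 'Error: malformed Action line. Expected: Action: tool_name("args")'
        return None, error
    return (match.group(1), match.group(2)), None


def truncate_observation(text: str, limit: int = MAX_OBSERVATION_CHARS) -> str:
    """Cut long tool output and say how much was dropped."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [truncated {len(text) - limit} characters]"


print(parse_action('Action: search("population of Paris 2025")'))
print(parse_action("Action: search population of Paris"))
print(truncate_observation("x" * 10, limit=4))
```

Feeding the error text back as the next Observation gives the model a chance to correct its own formatting, and the truncation note tells it that the tool output was cut rather than complete.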
Wrap-up
ReAct gives an LLM a clean structure for combining reasoning with tool use. The loop of thinking, acting, observing, and revising lets models tackle multi-step problems that go beyond their training data. It is more expensive and more brittle than a single call, but for any task that genuinely needs external information or computation, ReAct and its native function-calling descendants are the foundation of modern LLM agents.