The Four Design Patterns of Agentic Workflows
Master the core reasoning architectures—Reflection, Tool Use, Planning, and Multi-Agent Collaboration—that transform LLMs into autonomous problem solvers.
The Shift from Prompting to Agentic Reasoning
Traditional interaction with Large Language Models typically follows a linear path where a user provides a prompt and the model returns a direct response. This zero-shot approach works well for creative writing or simple data transformation but often fails when tasked with multi-step logical reasoning. Engineers frequently encounter the limitations of this model when the complexity of the request exceeds the reasoning capacity of a single inference pass.
Agentic workflows represent a fundamental shift by treating the model as the core engine of an iterative loop rather than a static endpoint. Instead of trying to get the perfect answer in one go, we design systems that allow the model to think, act, observe the results, and then refine its approach. This iterative nature mimics the human software development process where we rarely write a perfect feature without several rounds of debugging and testing.
The primary driver behind this transition is the need for reliability in production environments where hallucinations or logic errors can be costly. By breaking down a monolithic task into a series of smaller, verifiable steps, we create a system that can recover from its own mistakes. This architecture moves us away from prompt engineering and toward system engineering, where the focus is on the flow of data and the logic of the feedback loops.
A critical mental model for understanding this shift is the difference between an algorithm and a heuristic. Standard software relies on hard-coded algorithms to solve predictable problems, whereas agentic systems use heuristics to navigate ambiguity. By giving an agent the ability to pause and evaluate its own work, we unlock the capability to solve non-linear problems that were previously impossible for automated systems.
Defining the Agentic Loop
At its core, an agentic loop consists of four distinct phases: planning, acting, observing, and reflecting. During the planning phase, the model decomposes a high-level goal into a sequence of actionable steps or tool calls. The acting phase involves executing these steps, which might include querying a database, searching the web, or running a block of code.
Once an action is taken, the system captures the observation, which serves as the external feedback for the model. The reflection phase is where the magic happens, as the model evaluates whether the observation moved it closer to the final goal or if a change in strategy is required. This closed-loop design ensures that the agent stays aligned with the desired outcome even when conditions in the external environment change unexpectedly.
- Deterministic Logic: Best for predictable inputs with fixed outcomes like calculating tax or parsing standard JSON.
- Agentic Reasoning: Necessary for open-ended tasks like automated bug fixing or personalized research reports.
- Hybrid Approaches: Using agents to handle high-level logic while delegating specialized sub-tasks to traditional microservices.
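The four phases described above can be sketched as a minimal loop. This is an illustrative skeleton, not a production framework: `plan`, `act`, and `reflect` are hypothetical callables standing in for model calls and tool integrations.

```python
def run_agent(goal, plan, act, reflect, max_steps=5):
    """Minimal plan-act-observe-reflect loop (illustrative sketch)."""
    # Planning: decompose the goal into candidate steps
    steps = plan(goal)
    history = []
    while steps and len(history) < max_steps:
        step = steps.pop(0)
        observation = act(step)              # Acting: execute one step
        history.append((step, observation))  # Observing: capture feedback
        verdict = reflect(goal, history)     # Reflecting: evaluate progress
        if verdict == "done":
            break
        if verdict == "replan":
            steps = plan(goal)               # Change strategy if required
    return history
```

The `max_steps` cap matters even in a toy version: it is the simplest guard against the runaway loops discussed later in this article.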
Implementing the Reflection Pattern for Quality Control
One of the most effective ways to improve the performance of an LLM is through the Reflection pattern. This architectural design involves two distinct prompts or agents: one to generate an initial output and another to critique that output for errors or improvements. By forcing the model to look back at its own work from a critical perspective, we significantly reduce the rate of logical fallacies and hallucinations.
In a real-world scenario, such as generating a complex SQL query for a data analyst, a single-pass prompt might misinterpret the schema or join conditions. With a reflection loop, the system takes that generated SQL and runs it through a critique agent tasked with finding edge cases or performance bottlenecks. The feedback is then passed back to the generator to produce a final, high-quality version of the query.
This pattern is particularly powerful because it allows the developer to inject specific domain knowledge into the critique phase. You can instruct the reflection agent to check for security vulnerabilities, style guide adherence, or computational efficiency. This modularity ensures that the agentic system adheres to the same standards as a human code review process.
Code Generation and Self-Correction
To implement a robust reflection loop, you must manage the state of the conversation and ensure the critique provides actionable feedback. If the feedback is too vague, the generator will likely repeat the same mistakes in the next iteration. Good critique instructions focus on specific metrics like execution time, error messages from a compiler, or logical consistency with the original user intent.
```python
def agentic_code_generator(user_goal, max_iterations=3):
    # Initial draft of the solution
    current_code = llm.generate_code(user_goal)

    for i in range(max_iterations):
        # Run code in a sandboxed environment to get real feedback
        result, error = sandbox.execute(current_code)

        if not error and result.is_valid:
            return current_code

        # Reflection: Analyze the failure and plan a fix
        critique = llm.critique_code(current_code, error)

        # Update the code based on the critique
        current_code = llm.refine_code(current_code, critique)

    return current_code  # Return best effort after max attempts
```

Notice how the code example uses an external sandbox to provide objective feedback to the model. Grounding the loop in real execution data, rather than the model's own intuition, creates far stronger feedback. This prevents the system from getting stuck in a hallucination cycle where it keeps insisting that incorrect code is actually functional.
Autonomous Tool Use and Planning Architectures
Agents become truly useful when they can interact with the external world through tool use. A tool can be anything from a simple REST API to a complex shell environment or a vector database. The challenge lies in teaching the agent when to use a tool, how to format the input parameters correctly, and how to interpret the tool's output within the context of the larger task.
Modern planning architectures like ReAct combine reasoning and acting into a seamless flow. The model generates a 'Thought' explaining its current understanding, followed by an 'Action' selecting a tool to call. This transparency allows developers to audit the agent's decision-making process in real-time, making it easier to identify where the logic might be breaking down.
Effective planning requires the agent to maintain a global view of the task while executing granular steps. Without a strong planning component, agents often get distracted by irrelevant information or enter infinite loops where they repeatedly call the same tool with the same parameters. We mitigate this by implementing planning modules that explicitly track progress toward the final objective.
The ReAct Pattern Implementation
Implementing the ReAct pattern involves crafting a system prompt that encourages the model to document its internal monologue. This serves as a form of scratchpad that helps the model stay focused and provides the necessary context for subsequent steps. When the model outputs a specific token indicating a tool call, the orchestration layer intercepts the output, executes the tool, and feeds the result back to the model as an 'Observation'.
```javascript
async function runAgentLoop(task) {
  let history = [
    { role: 'system', content: 'You are a helpful assistant with tool access.' },
    { role: 'user', content: task } // Seed the loop with the user's task
  ];
  let status = 'IN_PROGRESS';

  while (status === 'IN_PROGRESS') {
    const response = await model.generate(history);
    history.push(response);

    if (response.toolCall) {
      // Execute the requested tool and append the observation
      const toolResult = await executeTool(response.toolCall.name, response.toolCall.args);
      history.push({ role: 'observation', content: JSON.stringify(toolResult) });
    } else {
      // No more tools needed, task is complete
      status = 'COMPLETED';
    }
  }
  return history[history.length - 1];
}
```

Tool use turns an LLM from a chatterbox into an operator. The integrity of your system depends entirely on how well you define the boundaries and error handling of those tools.
Multi-Agent Orchestration and Communication
For massive tasks like building a full-stack application or conducting deep market research, a single agent often becomes overwhelmed by the context size and the diversity of required skills. Multi-agent systems solve this by creating a team of specialized agents, each with a narrow focus and a clear set of responsibilities. This division of labor mirrors a human organization where experts collaborate to solve complex problems.
There are two primary ways to organize these agents: hierarchical and peer-to-peer. In a hierarchical structure, a 'Manager' agent delegates tasks to 'Worker' agents and synthesizes their outputs. In a peer-to-peer structure, agents post messages to a shared workspace or message bus, allowing for more fluid and decentralized collaboration. Choosing the right architecture depends on whether your task requires strict top-down control or creative cross-pollination.
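The hierarchical variant can be reduced to a small orchestration sketch. Here `decompose`, `workers`, and `synthesize` are hypothetical stand-ins for the manager's planning prompt, the specialized worker agents, and the manager's synthesis step.

```python
def manager_run(task, decompose, workers, synthesize):
    """Hierarchical orchestration sketch: a manager decomposes the task,
    routes each sub-task to a specialized worker, then synthesizes."""
    subtasks = decompose(task)  # Manager plans the division of labor
    results = {}
    for role, subtask in subtasks:
        results[role] = workers[role](subtask)  # Delegate to a worker agent
    return synthesize(results)  # Manager merges the worker outputs
```

A peer-to-peer version would replace the manager's loop with agents reading from and posting to a shared message bus, trading top-down control for decentralized collaboration.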
Communication protocols between agents are vital for preventing confusion. We typically use structured data formats like JSON to pass messages, ensuring that the 'Writer' agent knows exactly what the 'Editor' agent is asking for. By standardizing the interface between agents, we can swap out different models or prompt versions for specific roles without breaking the entire workflow.
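One way to standardize that interface is a fixed message envelope. The schema below is an illustrative assumption, not a standard protocol; the point is that every agent serializes to and from the same shape, so roles can be swapped without breaking the workflow.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class AgentMessage:
    """Hypothetical inter-agent envelope with a fixed, explicit schema."""
    sender: str      # e.g. "editor"
    recipient: str   # e.g. "writer"
    intent: str      # what the recipient is being asked to do
    payload: dict    # task-specific structured data

def encode(msg: AgentMessage) -> str:
    # Serialize to JSON for transport over a queue or message bus
    return json.dumps(asdict(msg))

def decode(raw: str) -> AgentMessage:
    # Reconstruct the typed message; malformed JSON fails loudly here
    return AgentMessage(**json.loads(raw))
```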
Designing Specialized Agent Personas
A persona is more than just a stylistic choice; it defines the agent's constraints and priorities. A 'Security Auditor' agent should be prompted to be skeptical and pessimistic, while a 'Product Manager' agent should focus on user value and high-level requirements. These conflicting perspectives ensure that the final output has been thoroughly vetted from multiple angles.
- Context Isolation: Each agent only sees the information relevant to its specific sub-task to prevent context window clutter.
- State Handoffs: Explicitly defining what data is passed from one agent to the next to maintain a clear chain of custody.
- Conflict Resolution: Implementing logic to handle cases where two agents disagree on the best path forward.
Successful multi-agent systems often use a shared state object that tracks the progress of the entire team. This prevents redundant work and allows the manager agent to make informed decisions about who should take the next action based on the current state of the project.
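A shared state object along these lines can be as simple as a status map plus collected artifacts. This is a minimal sketch; a real system would add locking or a database behind the same interface.

```python
class TeamState:
    """Sketch of a shared state object tracking sub-task progress so a
    manager agent can pick the next action and avoid redundant work."""

    def __init__(self, subtasks):
        self.status = {name: "pending" for name in subtasks}
        self.artifacts = {}  # Completed outputs, keyed by sub-task

    def complete(self, name, artifact):
        self.status[name] = "done"
        self.artifacts[name] = artifact

    def next_pending(self):
        # The manager consults this to decide who acts next
        for name, state in self.status.items():
            if state == "pending":
                return name
        return None  # Everything is done
```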
Production Challenges and State Management
Moving an agentic workflow from a local script to a production environment introduces significant challenges regarding state management and reliability. Unlike a simple API call, an agentic loop can run for minutes and span dozens of individual inferences. This requires a robust persistence layer to store the conversation history, tool outputs, and internal reasoning steps in case of a system failure or network timeout.
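A minimal persistence layer can checkpoint the loop's history after every step. The sketch below assumes a JSON-serializable history and uses an atomic write-then-rename so a crash mid-save never corrupts the checkpoint.

```python
import json
import os
import tempfile

def save_checkpoint(path, history):
    """Persist agent history atomically: write to a temp file in the
    same directory, then rename over the target."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(history, f)
    os.replace(tmp, path)  # Atomic on POSIX and Windows

def load_checkpoint(path):
    """Resume from the last saved state, or start with empty history."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return []
```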
Cost and latency are the two most common hurdles in agentic systems. Because these workflows involve multiple model calls per user request, token usage can balloon quickly if not carefully monitored, and fan-out across reflection loops or sub-agents compounds the effect. Developers must implement strict limits on the number of iterations and use cheaper, faster models for simple reflection or data parsing tasks to keep the system economically viable.
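Both mitigations fit in a small guard object. The model names and thresholds below are illustrative assumptions, not real model identifiers or prices.

```python
class BudgetGuard:
    """Sketch of a cost guard: cap iterations and total tokens, and
    route lightweight steps to a cheaper model."""

    def __init__(self, max_steps=10, max_tokens=50_000):
        self.max_steps, self.max_tokens = max_steps, max_tokens
        self.steps = 0
        self.tokens = 0

    def charge(self, tokens_used):
        # Call once per model invocation; raises when the budget is spent
        self.steps += 1
        self.tokens += tokens_used
        if self.steps > self.max_steps or self.tokens > self.max_tokens:
            raise RuntimeError("agent budget exhausted")

    def pick_model(self, task_kind):
        # Hypothetical routing: reflection and parsing go to a small model
        return "small-model" if task_kind in ("reflect", "parse") else "large-model"
```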
Observability is the final piece of the puzzle. Traditional logging isn't enough when you need to understand why an agent decided to delete a database table or ignore a specific constraint. You need specialized tracing tools that visualize the entire agentic graph, showing the inputs, thoughts, actions, and observations for every single step in the process.
Managing Infinite Loops and Hallucinations
The most dangerous failure mode for an autonomous agent is the infinite loop, where it repeatedly attempts a failing action without changing its approach. To prevent this, we implement 'circuit breakers' that terminate the execution after a certain number of failed tool calls or if the agent enters a repetitive reasoning pattern. These safeguards are essential for protecting both your cloud budget and the underlying infrastructure.
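A circuit breaker of this kind only needs to count repeated calls and accumulated failures. This is a minimal sketch; production versions might also hash reasoning traces to detect repetitive thought patterns.

```python
from collections import Counter

class CircuitBreaker:
    """Trip when the same (tool, args) call repeats too often or
    failures accumulate, terminating a stuck agent."""

    def __init__(self, max_repeats=3, max_failures=5):
        self.calls = Counter()
        self.failures = 0
        self.max_repeats, self.max_failures = max_repeats, max_failures

    def check(self, tool, args, failed=False):
        # Canonicalize args so {"a": 1, "b": 2} and {"b": 2, "a": 1} match
        key = (tool, repr(sorted(args.items())))
        self.calls[key] += 1
        if failed:
            self.failures += 1
        if self.calls[key] > self.max_repeats or self.failures > self.max_failures:
            raise RuntimeError("circuit breaker tripped: terminating agent")
```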
Autonomy is a spectrum, not a binary. In production, always start with more constraints and gradually loosen them as you gain confidence in the agent's stability.
By implementing a human-in-the-loop (HITL) mechanism for high-stakes decisions, you can enjoy the benefits of agentic automation while maintaining a safety net. The system can pause the loop and ask for human confirmation before executing a destructive action, ensuring that the agent remains an assistant rather than a liability.
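The HITL gate can be expressed as a thin wrapper around tool execution. Everything here is a hypothetical sketch: the destructive-tool list, and the `execute` and `confirm` callbacks, which in practice might post to a review queue and block until a human decides.

```python
# Hypothetical set of tools considered high-stakes enough to gate
DESTRUCTIVE_TOOLS = {"drop_table", "delete_file", "send_email"}

def guarded_execute(tool_name, args, execute, confirm):
    """Pause before destructive actions and require human approval.
    `execute` runs the tool; `confirm` returns True only if a human
    approves the pending action."""
    if tool_name in DESTRUCTIVE_TOOLS and not confirm(tool_name, args):
        return {"status": "rejected", "tool": tool_name}
    return execute(tool_name, args)
```

Read-only tools pass straight through, so the safety net adds latency only where the blast radius justifies it.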
