Engineering · 15 min read

AI Agent Architecture Patterns: ReAct, Plan-and-Execute, and Multi-Agent

A deep dive into the three dominant AI agent architecture patterns, with TypeScript code examples, trade-offs, and guidance on when to use each.

Marcus Johnson · 2026-02-10

The AI agent landscape has matured significantly. At Obaro Labs, we have built agents across all three dominant architecture patterns, and each has a distinct sweet spot. This post breaks down the ReAct pattern, Plan-and-Execute, and Multi-Agent orchestration with real code examples and honest assessments of when each pattern works - and when it fails.

Why Architecture Matters for Agents

An AI agent is more than a prompt and an API call. It is a software system that reasons, takes actions, observes results, and iterates. The architecture you choose determines how the agent reasons, how reliably it completes tasks, how observable its behavior is, and how much it costs to run.

Choosing the wrong pattern leads to agents that are expensive, unreliable, or impossible to debug. We have seen all three failure modes in production.

Pattern 1: ReAct (Reasoning + Acting)

ReAct is the simplest and most widely adopted agent pattern. The agent alternates between reasoning (thinking about what to do) and acting (calling a tool), then observes the result and reasons again. This loop continues until the agent determines it has enough information to respond.

How It Works:

  1. The agent receives a user query
  2. It reasons about what information it needs
  3. It selects and calls a tool
  4. It observes the tool output
  5. It reasons again - either calling another tool or generating a final response

When to Use ReAct:

  • Single-objective tasks (answer a question, complete a form, look up information)
  • Tasks requiring 1-5 tool calls
  • When you need simplicity and debuggability
  • When latency budget allows for sequential tool calls

When ReAct Breaks Down:

  • Complex multi-step tasks requiring more than 7-8 tool calls (the LLM loses context)
  • Tasks requiring parallel information gathering
  • When the agent needs to backtrack or revise its approach

// ReAct Agent Implementation
import { ChatOpenAI } from "@langchain/openai";

interface Tool {
  name: string;
  description: string;
  execute: (input: string) => Promise<string>;
}

interface AgentStep {
  thought: string;
  action: string;
  actionInput: string;
  observation: string;
}

async function reactAgent(
  query: string,
  tools: Tool[],
  maxSteps: number = 8
): Promise<string> {
  const llm = new ChatOpenAI({ model: "gpt-4o", temperature: 0 });
  const steps: AgentStep[] = [];
  const toolDescriptions = tools
    .map((t) => t.name + ": " + t.description)
    .join("\n");

  for (let i = 0; i < maxSteps; i++) {
    const prompt = buildReActPrompt(query, toolDescriptions, steps);
    const response = await llm.invoke(prompt);
    const parsed = parseReActResponse(response.content as string);

    if (parsed.action === "FINISH") {
      return parsed.actionInput; // Final answer
    }

    const tool = tools.find((t) => t.name === parsed.action);
    if (!tool) {
      steps.push({
        ...parsed,
        observation: "Error: Tool " + parsed.action + " not found",
      });
      continue;
    }

    const observation = await tool.execute(parsed.actionInput);
    steps.push({ ...parsed, observation });

    // Log for observability
    console.log("Step " + (i + 1) + ": " + parsed.action
      + "(" + parsed.actionInput + ") -> "
      + observation.substring(0, 200));
  }

  return "Agent reached maximum steps without completing the task.";
}
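
The loop above leans on two helpers, `buildReActPrompt` and `parseReActResponse`, which are elided. A minimal sketch of both, assuming a plain-text Thought/Action/Action Input format (the exact prompt wording is illustrative, not what we run in production):

```typescript
// AgentStep repeated here so the sketch is self-contained.
interface AgentStep {
  thought: string;
  action: string;
  actionInput: string;
  observation: string;
}

// Builds the prompt for the next reasoning step. The format
// follows the classic ReAct convention; tune it to your model.
function buildReActPrompt(
  query: string,
  toolDescriptions: string,
  steps: AgentStep[]
): string {
  const history = steps
    .map((s) =>
      "Thought: " + s.thought
      + "\nAction: " + s.action
      + "\nAction Input: " + s.actionInput
      + "\nObservation: " + s.observation
    )
    .join("\n");
  return "Answer the question using the tools below.\n\nTools:\n"
    + toolDescriptions
    + "\n\nUse this format:\nThought: <your reasoning>"
    + "\nAction: <tool name, or FINISH to answer>"
    + "\nAction Input: <tool input, or the final answer>"
    + "\n\nQuestion: " + query
    + (history ? "\n\n" + history : "")
    + "\nThought:";
}

// Parses the model's reply into thought/action/actionInput.
// Falls back to FINISH if no action line is found.
function parseReActResponse(content: string): {
  thought: string;
  action: string;
  actionInput: string;
} {
  const thought =
    content.match(/Thought:\s*(.*)/)?.[1]?.trim()
    ?? content.split("\n")[0].trim();
  const action =
    content.match(/Action:\s*(.*)/)?.[1]?.trim() ?? "FINISH";
  const actionInput =
    content.match(/Action Input:\s*([\s\S]*)/)?.[1]?.trim() ?? "";
  return { thought, action, actionInput };
}
```

Regex parsing like this is brittle; where the model supports native tool calling, prefer structured tool arguments over free-text parsing.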

Performance Characteristics:

In our deployments, ReAct agents average 2.3 tool calls per task completion, with a median latency of 3.2 seconds. The pattern works well for 85-90% of single-turn agent tasks. Error rates increase sharply beyond 5 sequential tool calls.

Pattern 2: Plan-and-Execute

Plan-and-Execute separates planning from execution. The agent first creates a complete plan (a list of steps), then executes each step, optionally re-planning if something goes wrong. This pattern is inspired by classical AI planning and works well for complex, multi-step tasks.

How It Works:

  1. A planning LLM call analyzes the task and produces a step-by-step plan
  2. An execution loop processes each step, calling tools as needed
  3. After each step, the agent can optionally re-plan based on new information
  4. The agent synthesizes results from all steps into a final response

When to Use Plan-and-Execute:

  • Complex tasks requiring 5-15+ steps
  • Tasks where the steps are somewhat predictable
  • When you need better cost control (planning is one LLM call, execution can use cheaper models)
  • Workflows that benefit from showing the user a plan before executing

When Plan-and-Execute Breaks Down:

  • Highly dynamic tasks where the next step depends entirely on the previous result
  • Tasks requiring real-time adaptation (the upfront planning step adds latency)
  • When the task is simple enough that planning is overhead

// Plan-and-Execute Agent Implementation
interface PlanStep {
  id: number;
  description: string;
  tool: string;
  input: string;
  dependsOn: number[];
  status: "pending" | "running" | "completed" | "failed";
  result?: string;
}

interface ExecutionPlan {
  goal: string;
  steps: PlanStep[];
}

async function planAndExecuteAgent(
  query: string,
  tools: Tool[],
): Promise<string> {
  const planner = new ChatOpenAI({ model: "gpt-4o", temperature: 0 });
  const executor = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });

  // Phase 1: Create the plan
  const plan = await createPlan(planner, query, tools);
  console.log("Plan created with " + plan.steps.length + " steps");

  // Phase 2: Execute each step
  const results: Map<number, string> = new Map();

  for (const step of plan.steps) {
    // Check dependencies are met
    const depsReady = step.dependsOn.every(
      (depId) => results.has(depId)
    );
    if (!depsReady) {
      step.status = "failed";
      continue;
    }

    // Gather context from dependencies
    const depContext = step.dependsOn
      .map((id) => "Step " + id + " result: " + results.get(id))
      .join("\n");

    step.status = "running";
    const tool = tools.find((t) => t.name === step.tool);

    if (tool) {
      const enrichedInput = await executor.invoke(
        "Given context:\n" + depContext
        + "\nExecute: " + step.description
        + "\nTool input should be: " + step.input
      );
      const result = await tool.execute(enrichedInput.content as string);
      results.set(step.id, result);
      step.status = "completed";
      step.result = result;
    }
  }

  // Phase 3: Synthesize results
  const synthesis = await planner.invoke(
    "Original query: " + query
    + "\nResults:\n"
    + Array.from(results.entries())
        .map(([id, r]) => "Step " + id + ": " + r)
        .join("\n")
    + "\nProvide a comprehensive answer."
  );

  return synthesis.content as string;
}
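
The `createPlan` call above is elided. One way to implement it, sketched here under the assumption that the planner is asked for a JSON array of steps (the `Planner` interface is a structural stand-in so the sketch does not depend on LangChain types, and the JSON shape mirrors `PlanStep`):

```typescript
// PlanStep and ExecutionPlan repeated so the sketch is
// self-contained.
interface PlanStep {
  id: number;
  description: string;
  tool: string;
  input: string;
  dependsOn: number[];
  status: "pending" | "running" | "completed" | "failed";
  result?: string;
}

interface ExecutionPlan {
  goal: string;
  steps: PlanStep[];
}

interface Tool { name: string; description: string }

// Structural stand-in for the planner LLM.
interface Planner {
  invoke(prompt: string): Promise<{ content: unknown }>;
}

// Extracts the JSON array from the model's reply (which may
// include surrounding prose), validates dependency references,
// and marks every step pending.
function parsePlan(goal: string, raw: string): ExecutionPlan {
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end === -1) return { goal, steps: [] };
  const parsed = JSON.parse(raw.slice(start, end + 1)) as Array<{
    id: number; description: string; tool: string;
    input: string; dependsOn?: number[];
  }>;
  const ids = new Set(parsed.map((s) => s.id));
  const steps: PlanStep[] = parsed.map((s) => ({
    ...s,
    // Drop references to step ids that do not exist.
    dependsOn: (s.dependsOn ?? []).filter((d) => ids.has(d)),
    status: "pending",
  }));
  return { goal, steps };
}

async function createPlan(
  planner: Planner,
  query: string,
  tools: Tool[]
): Promise<ExecutionPlan> {
  const response = await planner.invoke(
    "Create a step-by-step plan for: " + query
    + "\nAvailable tools:\n"
    + tools.map((t) => t.name + ": " + t.description).join("\n")
    + '\nRespond with a JSON array of steps: '
    + '[{"id": 1, "description": "...", "tool": "...", '
    + '"input": "...", "dependsOn": []}]'
  );
  return parsePlan(query, response.content as string);
}
```

Dropping invalid dependency references (rather than throwing) is a design choice: it lets execution proceed with a partial plan instead of failing the whole task on one malformed step.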

Performance Characteristics:

Plan-and-Execute agents handle complex tasks 40% more reliably than ReAct for tasks requiring more than 5 steps. The upfront planning adds 1-2 seconds of latency, but total execution time is often lower because the plan avoids dead-end reasoning paths. We have seen 30% cost reduction by using gpt-4o-mini for execution steps while reserving gpt-4o for planning.

Pattern 3: Multi-Agent Orchestration

Multi-Agent systems use multiple specialized agents that collaborate to complete a task. Each agent has a focused role - one might be a researcher, another a writer, another a critic. An orchestrator coordinates their work.

How It Works:

  1. An orchestrator agent receives the task and decomposes it into sub-tasks
  2. Each sub-task is routed to a specialized agent
  3. Specialized agents complete their work and return results
  4. The orchestrator synthesizes results, potentially routing follow-up tasks
  5. A final response is assembled from all agent outputs

When to Use Multi-Agent:

  • Tasks requiring diverse expertise (research + analysis + writing)
  • When you want to enforce separation of concerns
  • Long-running workflows where different phases have different requirements
  • When you need specialized system prompts and tool sets per role

When Multi-Agent Breaks Down:

  • Simple tasks (the coordination overhead is not justified)
  • When latency is critical (multiple agent calls add up)
  • When the task does not have natural role boundaries
  • When budget is very tight (more agents means more LLM calls)

// Multi-Agent Orchestrator
interface AgentConfig {
  name: string;
  role: string;
  systemPrompt: string;
  tools: Tool[];
  model: string;
}

interface TaskAssignment {
  agentName: string;
  task: string;
  context: string;
}

class MultiAgentOrchestrator {
  private agents: Map<string, AgentConfig>;
  private orchestratorLLM: ChatOpenAI;

  constructor(agents: AgentConfig[]) {
    this.agents = new Map(agents.map((a) => [a.name, a]));
    this.orchestratorLLM = new ChatOpenAI({
      model: "gpt-4o",
      temperature: 0,
    });
  }

  async execute(query: string): Promise<string> {
    // Step 1: Decompose task
    const assignments = await this.decompose(query);
    console.log("Decomposed into "
      + assignments.length + " agent tasks");

    // Step 2: Execute agent tasks (parallel where possible)
    const results = await Promise.all(
      assignments.map((assignment) =>
        this.runAgent(assignment)
      )
    );

    // Step 3: Synthesize
    const finalResponse = await this.synthesize(
      query, assignments, results
    );
    return finalResponse;
  }

  private async runAgent(
    assignment: TaskAssignment
  ): Promise<string> {
    const config = this.agents.get(assignment.agentName);
    if (!config) throw new Error("Unknown agent: "
      + assignment.agentName);

    // Each agent runs its own ReAct loop with its own tools.
    // The role's system prompt and context are folded into the
    // task. Note: reactAgent as written above uses a fixed model,
    // so config.model would need to be threaded through in a
    // production version.
    return reactAgent(
      config.systemPrompt
        + "\nContext: " + assignment.context
        + "\nTask: " + assignment.task,
      config.tools,
      5
    );
  }
}

// Example: Research and Analysis pipeline
const agents: AgentConfig[] = [
  {
    name: "researcher",
    role: "Information Gathering",
    systemPrompt: "You are a research agent. Find relevant data.",
    tools: [searchTool, databaseTool, apiTool],
    model: "gpt-4o",
  },
  {
    name: "analyst",
    role: "Data Analysis",
    systemPrompt: "You analyze data and extract insights.",
    tools: [calculatorTool, chartTool],
    model: "gpt-4o",
  },
  {
    name: "writer",
    role: "Report Generation",
    systemPrompt: "You write clear, concise reports.",
    tools: [formatterTool, templateTool],
    model: "gpt-4o-mini",
  },
];
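
The orchestrator's `decompose` and `synthesize` methods are elided above. `decompose` typically asks the orchestrator LLM for a JSON array of assignments and parses the reply defensively; a hypothetical parsing helper (the names and JSON shape are assumptions, not a fixed contract):

```typescript
// TaskAssignment repeated so the sketch is self-contained.
interface TaskAssignment {
  agentName: string;
  task: string;
  context: string;
}

// Extracts the JSON array of assignments from the orchestrator's
// reply. Assignments naming unknown agents are dropped rather
// than thrown on, so one hallucinated agent name does not fail
// the whole decomposition.
function parseAssignments(
  raw: string,
  knownAgents: Set<string>
): TaskAssignment[] {
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end === -1) return [];
  const parsed = JSON.parse(raw.slice(start, end + 1)) as Array<{
    agentName: string; task: string; context?: string;
  }>;
  return parsed
    .filter((a) => knownAgents.has(a.agentName))
    .map((a) => ({ ...a, context: a.context ?? "" }));
}
```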

Performance Characteristics:

Multi-agent systems excel at complex, multi-faceted tasks. In our deployments, a three-agent research pipeline produces 35% higher quality outputs (measured by human evaluation) compared to a single-agent approach on tasks requiring research, analysis, and report generation. The trade-off is 2-3x higher cost and 2-4x higher latency.

Choosing the Right Pattern: A Decision Framework

Here is the framework we use at Obaro Labs to select an architecture:

Start with ReAct if the task is well-defined, requires fewer than 5 tool calls, and the user expects a quick response. This covers most chatbot, Q&A, and simple automation use cases.

Graduate to Plan-and-Execute when tasks regularly require more than 5 steps, when you want to show users a plan before execution, or when you need to optimize cost by using different models for planning versus execution.

Adopt Multi-Agent when the task naturally decomposes into distinct roles, when you need specialized tool sets per role, or when quality is more important than speed and cost.

Hybrid Approaches in Production

In practice, we often combine patterns. Our most common production architecture is a Plan-and-Execute outer loop where each execution step runs a ReAct agent. This gives us the reliability of upfront planning with the flexibility of reactive execution at each step.
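
Stripped of the LLM calls, the skeleton of that hybrid looks like the sketch below: a planned outer loop where each step is delegated to an injected `runStep` function (in production, a wrapper around the ReAct loop shown earlier; the names here are illustrative):

```typescript
interface HybridStep {
  id: number;
  description: string;
  dependsOn: number[];
}

// Outer Plan-and-Execute loop. Each step's execution strategy is
// injected, so the inner agent (a ReAct loop, in our case) stays
// decoupled from the planning logic.
async function hybridAgent(
  steps: HybridStep[],
  runStep: (description: string, context: string) => Promise<string>
): Promise<Map<number, string>> {
  const results = new Map<number, string>();
  for (const step of steps) {
    // Skip steps whose dependencies failed or were skipped.
    if (!step.dependsOn.every((id) => results.has(id))) continue;
    const context = step.dependsOn
      .map((id) => "Step " + id + ": " + results.get(id))
      .join("\n");
    results.set(step.id, await runStep(step.description, context));
  }
  return results;
}
```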

For one healthcare client, we built a clinical document processing pipeline using a multi-agent system where the orchestrator uses Plan-and-Execute and each specialist agent uses ReAct. The researcher agent finds relevant clinical guidelines, the extractor agent pulls structured data from documents, and the validator agent checks for compliance. This system processes over 10,000 clinical documents per day with a 96% accuracy rate.

Observability Across All Patterns

Regardless of which pattern you choose, invest heavily in observability. Every agent decision, tool call, and intermediate result should be logged with trace IDs that allow you to reconstruct the full execution path. We use OpenTelemetry with custom spans for agent steps, and we have built dashboards that show step-by-step reasoning traces for any agent execution.
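
As an illustration of the idea (not our actual instrumentation), a minimal tracing wrapper that records each tool call under a shared trace ID might look like this; with OpenTelemetry, each traced call would become a span instead:

```typescript
interface TraceEvent {
  traceId: string;
  step: number;
  name: string;
  input: string;
  output: string;
  durationMs: number;
}

// Wraps any async call so its input, (truncated) output, and
// duration are recorded against one trace ID, letting you
// reconstruct the full execution path after the fact.
function makeTracer(traceId: string) {
  const events: TraceEvent[] = [];
  let step = 0;
  return {
    events,
    async traced(
      name: string,
      input: string,
      fn: () => Promise<string>
    ): Promise<string> {
      const startedAt = Date.now();
      const output = await fn();
      events.push({
        traceId,
        step: ++step,
        name,
        input,
        output: output.substring(0, 200), // truncate, as in the logs above
        durationMs: Date.now() - startedAt,
      });
      return output;
    },
  };
}
```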

Key metrics to track across all patterns:

  • Task completion rate: What percentage of tasks does the agent complete successfully?
  • Average tool calls per task: Are your agents efficient?
  • Latency distribution: Track p50, p95, and p99 separately
  • Cost per task: Break down by LLM calls, tool calls, and infrastructure
  • Escalation rate: How often does the agent hand off to a human?
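
Given a log of task records, these metrics reduce to straightforward aggregation. A sketch, with field names that are assumptions rather than a fixed schema:

```typescript
interface TaskRecord {
  completed: boolean;
  escalated: boolean;
  toolCalls: number;
  latencyMs: number;
  costUsd: number;
}

// Nearest-rank percentile over a pre-sorted array.
function percentile(sorted: number[], p: number): number {
  const idx = Math.min(
    sorted.length - 1,
    Math.ceil((p / 100) * sorted.length) - 1
  );
  return sorted[Math.max(0, idx)];
}

// Aggregates the key metrics listed above from raw task records.
function agentMetrics(records: TaskRecord[]) {
  const latencies = records
    .map((r) => r.latencyMs)
    .sort((a, b) => a - b);
  const n = records.length;
  return {
    completionRate: records.filter((r) => r.completed).length / n,
    escalationRate: records.filter((r) => r.escalated).length / n,
    avgToolCalls: records.reduce((s, r) => s + r.toolCalls, 0) / n,
    avgCostUsd: records.reduce((s, r) => s + r.costUsd, 0) / n,
    latencyP50: percentile(latencies, 50),
    latencyP95: percentile(latencies, 95),
    latencyP99: percentile(latencies, 99),
  };
}
```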

Conclusion

There is no single best agent architecture. The right choice depends on your task complexity, latency requirements, cost constraints, and quality expectations. Start simple with ReAct, measure its limitations, and graduate to more sophisticated patterns only when the data tells you to. The worst mistake we see is teams starting with a complex multi-agent system for a task that a simple ReAct agent could handle in two tool calls.

At Obaro Labs, we have reference implementations for all three patterns that we adapt to each client's specific needs. If you are building an agent system and want to discuss architecture, reach out - we are happy to share what we have learned.
