
AI Agent Architecture Reference

Reviewed by Josh Ausmus · Updated April 2026


ReAct (thought-action-observation loop)

The agent outputs a thought, picks an action or tool, receives an observation, then repeats. It interleaves reasoning with tool use in one loop. This grounds responses in real results and lets the agent adapt when things go sideways. [1] https://arxiv.org/html/2601.01743v1
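The loop can be sketched in a few lines. This is a minimal illustration, not any framework's API: the `model_step` policy here is a hypothetical stub that stands in for an LLM call, and `add_one` is a toy tool.

```python
# Minimal ReAct-style loop: reason over accumulated observations,
# pick an action, run it, observe, repeat until the policy stops.
def run_tool(name, arg):
    tools = {"add_one": lambda x: x + 1}  # toy tool registry
    return tools[name](arg)

def react_loop(task, model_step, max_steps=5):
    observations = []
    for _ in range(max_steps):
        thought, action = model_step(task, observations)  # reason over history
        if action is None:                                # policy decides to stop
            return thought
        observations.append(run_tool(*action))            # act, then observe

# Stub policy: keep incrementing until the value reaches the target.
def model_step(task, obs):
    current = obs[-1] if obs else task["start"]
    if current >= task["target"]:
        return f"done: {current}", None
    return "need to increment", ("add_one", current)

result = react_loop({"start": 0, "target": 3}, model_step)  # "done: 3"
```

The key property is that each action depends on all prior observations, which is what lets a real agent adapt mid-task.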

Plan-and-Execute (upfront planning, then execution)

The agent generates a full step-by-step plan first, then runs each step in sequence. Planning happens in one shot before any tools get called. It works better on predictable tasks but struggles when the world changes mid-execution. [2] https://redis.io/blog/ai-agent-architecture/
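The contrast with ReAct shows up clearly in code: the plan is fixed before execution starts. A rough sketch, where `plan` is a hypothetical stand-in for an LLM planning call and the step handlers are toy stages:

```python
# Plan-then-execute sketch: the full step list is produced up front,
# then the executor runs each step in order with no replanning.
def plan(task):
    # One-shot plan; a real system would ask the model for this list.
    return [("fetch", task), ("summarize", None), ("format", None)]

def execute(steps):
    state = None
    for action, arg in steps:
        if action == "fetch":
            state = f"data for {arg}"
        elif action == "summarize":
            state = state.upper()
        elif action == "format":
            state = f"[{state}]"
    return state

out = execute(plan("q3 report"))  # plan once, then run blindly
```

If a step's assumptions break mid-run, nothing in this loop notices, which is exactly the brittleness the paragraph describes.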

Tool Calling / Function Calling (structured tool use)

The model outputs structured calls with exact names and JSON arguments instead of free-form text. The runtime parses the call, runs the tool, and feeds the result back. This delivers clean integration and fewer parsing errors than raw text actions. [1] https://arxiv.org/html/2601.01743v1
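A minimal runtime-side sketch of that handshake, assuming a hypothetical `get_weather` tool and a model that emits the call as a JSON string:

```python
import json

# The model emits a structured call; the runtime validates the name,
# dispatches with keyword arguments, and returns the result as a message.
TOOLS = {"get_weather": lambda city: f"22C in {city}"}  # hypothetical tool

def handle_model_output(raw):
    call = json.loads(raw)                      # exact name + JSON arguments
    fn = TOOLS[call["name"]]                    # fails loudly on unknown tools
    result = fn(**call["arguments"])
    return {"role": "tool", "content": result}  # fed back to the model

msg = handle_model_output('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
```

Because the arguments arrive as typed JSON rather than prose, the runtime never has to regex-scrape a tool name out of free text.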

Code Execution Agents (sandboxed code running)

The agent writes Python or similar, ships it to a restricted interpreter, and gets stdout plus any artifacts back. It iterates by inspecting outputs and patching the code. Great for math, data, or automation. The sandbox keeps it from melting the host. [3] https://www.linkedin.com/posts/skphd_ai-aiagents-agenticai-activity-7438947025439801344-sGLk
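A very rough sketch of the execution side, using a subprocess with a timeout to capture stdout. This is a simplification, not a real sandbox: production setups also restrict the filesystem, network, and memory (containers, seccomp, gVisor, and the like).

```python
import subprocess
import sys
import tempfile

# Run model-written code in a separate process with a timeout and
# capture its stdout; the agent inspects the output and iterates.
def run_generated_code(code, timeout=5):
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    proc = subprocess.run([sys.executable, path],
                          capture_output=True, text=True, timeout=timeout)
    return proc.stdout, proc.returncode

stdout, rc = run_generated_code("print(sum(range(10)))")  # stdout "45\n"
```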

Multi-Agent: Supervisor pattern

One coordinator agent receives the task, routes pieces to specialized workers, collects outputs, and synthesizes the final answer. It centralizes decision making and logging. The supervisor becomes a bottleneck if you scale past a handful of workers. [4] https://www.agilesoftlabs.com/blog/2026/03/multi-agent-ai-systems-enterprise-guide
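The control flow reduces to route, delegate, synthesize. A toy sketch with hypothetical stub workers and a keyword-based routing rule standing in for the supervisor's LLM judgment:

```python
# Supervisor sketch: one coordinator routes subtasks to specialized
# worker callables and synthesizes their outputs into a final answer.
WORKERS = {
    "code": lambda t: f"patch for {t}",
    "docs": lambda t: f"summary of {t}",
}

def supervisor(subtasks):
    outputs = []
    for task in subtasks:
        kind = "code" if "bug" in task else "docs"  # routing decision
        outputs.append(WORKERS[kind](task))          # delegate to a worker
    return " | ".join(outputs)                       # synthesize the answer

answer = supervisor(["fix login bug", "release notes"])
```

Every task flows through the single `supervisor` function, which is the centralization benefit and the scaling bottleneck in one place.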

Multi-Agent: Debate pattern

Multiple agents argue different angles or solutions, critique each other, and converge on a better answer through rounds of discussion. It surfaces blind spots that one model misses. Communication volume climbs fast and debugging gets messy. [1] https://arxiv.org/html/2601.01743v1
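The round structure can be sketched with two stub agents that propose a value, see the other's proposal, and revise toward it. The averaging agents are hypothetical stand-ins for LLM critics; only the debate loop shape is the point.

```python
# Debate sketch: agents exchange positions each round until they agree
# or the round budget runs out.
def debate(agent_a, agent_b, rounds=5):
    a, b = agent_a(None), agent_b(None)   # opening positions
    for _ in range(rounds):
        if a == b:
            break
        a, b = agent_a(b), agent_b(a)     # critique and revise in parallel
    return a

# Stub agents that move toward the opposing view by averaging.
def make_agent(start):
    return lambda other: start if other is None else (start + other) // 2

consensus = debate(make_agent(10), make_agent(20))  # converges to 15
```

Note the communication cost: every round is two cross-agent messages, so volume grows with both rounds and agent count.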

Multi-Agent: Pipeline pattern

Agents sit in a fixed sequence. Each takes the previous output, does its job, and passes the result downstream. Simple to reason about and debug. The slowest stage blocks everything, and the flow handles unexpected branches poorly. [4] https://www.agilesoftlabs.com/blog/2026/03/multi-agent-ai-systems-enterprise-guide
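Structurally this is just function composition. A toy sketch with three hypothetical stages standing in for agents:

```python
# Pipeline sketch: each stage is a function of the previous stage's
# output; the stages run in a fixed order with no branching.
def extract(raw):
    return raw.split(",")                      # stage 1: split fields

def clean(items):
    return [s.strip() for s in items]          # stage 2: normalize

def report(items):
    return f"{len(items)} items: {items}"      # stage 3: summarize

def pipeline(stages, data):
    for stage in stages:
        data = stage(data)  # a slow or failing stage blocks everything after it
    return data

out = pipeline([extract, clean, report], " a , b ,c")
```

The single-threaded `for` loop makes the weakness literal: there is no parallelism, and one raised exception halts the whole flow.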

Memory: Context window (short-term)

Everything the agent needs right now lives inside the model's token limit. Recent messages, thoughts, and tool results stay in the prompt. It forgets everything the moment you exceed the window or start a new session. [2] https://redis.io/blog/ai-agent-architecture/
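A common minimal policy is to keep only the newest messages that fit a token budget. A sketch, with token cost crudely approximated as a whitespace word count rather than a real tokenizer:

```python
# Short-term memory sketch: walk the history newest-first and keep
# messages until the token budget is spent; everything older is dropped.
def fit_to_window(messages, max_tokens):
    kept, used = [], 0
    for msg in reversed(messages):
        cost = len(msg.split())          # crude token estimate
        if used + cost > max_tokens:
            break                        # older messages are simply forgotten
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order

window = fit_to_window(
    ["old old old old", "tool result here", "user asks now"], max_tokens=6
)  # the oldest message no longer fits
```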

Memory: Vector store (long-term semantic)

You embed past documents, experiences, or summaries and query by similarity at runtime. The retriever pulls the most relevant chunks into context. It scales to thousands of items but can return stale or low-quality matches without good metadata filtering. [1] https://arxiv.org/html/2601.01743v1
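At its core, retrieval is cosine similarity over stored vectors. A toy sketch where the 2-dimensional embeddings are hypothetical precomputed values; a real system embeds with a model and uses an approximate-nearest-neighbor index rather than a linear scan:

```python
import math

# Toy semantic retrieval: rank stored chunks by cosine similarity
# to the query vector and return the top-k texts for the prompt.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(store, query_vec, k=1):
    scored = sorted(store, key=lambda it: cosine(it["vec"], query_vec),
                    reverse=True)
    return [it["text"] for it in scored[:k]]

store = [
    {"text": "billing FAQ",   "vec": [1.0, 0.0]},
    {"text": "refund policy", "vec": [0.9, 0.4]},
]
top = retrieve(store, [1.0, 0.1])  # closest chunk wins
```

Nothing here checks freshness or source quality, which is why metadata filtering matters in practice.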

Memory: Conversation history (session tracking)

The system keeps a running log of the full chat or key summaries across turns. It feeds the relevant slice back on each call. Without summarization the history grows until the context window explodes. [2] https://redis.io/blog/ai-agent-architecture/
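The usual fix is to keep recent turns verbatim and collapse older ones. A sketch where the summary is a placeholder string; a real system would generate it with an LLM call:

```python
# Session-tracking sketch: keep the last N turns verbatim and replace
# everything older with a single summary line so history stays bounded.
def compact_history(turns, keep_recent=2):
    if len(turns) <= keep_recent:
        return turns
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = f"[summary of {len(older)} earlier turns]"  # stand-in for an LLM summary
    return [summary] + recent

history = compact_history(["hi", "hello", "order status?", "it shipped"])
```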

Pattern Comparison

| Pattern | Best For | Weakness | Example Framework |
| ReAct | Dynamic tasks, real-time adaptation | High token use, variable latency | LangGraph, LangChain |
| Plan-and-Execute | Predictable, long-horizon work | Brittle if plan is wrong or env changes | LangGraph |
| Tool Calling | Reliable, structured integrations | Limited reasoning between calls | OpenAI SDK, Anthropic |
| Code Execution | Math, data analysis, automation | Sandbox limits, security overhead | Cursor, custom Python |
| Supervisor Multi-Agent | Task routing, mixed expertise | Supervisor bottleneck | CrewAI, LangGraph |
| Debate Multi-Agent | Complex reasoning, error catching | High comms cost, hard to debug | AutoGen |
| Pipeline Multi-Agent | Linear workflows, content/data | No parallelism, stalls on failure | LangGraph, CrewAI |
| Context Window | Short sessions | Forgets everything beyond limit | Any LLM |
| Vector Store | Large knowledge bases | Retrieval quality varies | LlamaIndex, LangChain |
| Conversation History | Multi-turn sessions | Context bloat without summarization | LangGraph checkpoints |

Use the pattern that matches your failure modes. Most production systems combine two or three instead of betting on one.
