
AI Agent Architecture Reference

Reviewed by Josh Ausmus · Updated April 2026


ReAct (thought-action-observation loop)

The agent outputs a thought, picks an action or tool, receives an observation, then repeats. It interleaves reasoning with tool use in one loop. This grounds responses in real results and lets the agent adapt when things go sideways. [1] https://arxiv.org/html/2601.01743v1
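The loop can be sketched in a few lines. This is a minimal illustration, not any framework's API: the `model_step` policy here is a hypothetical stub that stands in for an LLM call, and `add_one` is a toy tool.

```python
# Minimal ReAct-style loop: reason over accumulated observations,
# pick an action, run it, observe, repeat until the policy stops.
def run_tool(name, arg):
    tools = {"add_one": lambda x: x + 1}  # toy tool registry
    return tools[name](arg)

def react_loop(task, model_step, max_steps=5):
    observations = []
    for _ in range(max_steps):
        thought, action = model_step(task, observations)  # reason over history
        if action is None:                                # policy decides to stop
            return thought
        observations.append(run_tool(*action))            # act, then observe

# Stub policy: keep incrementing until the value reaches the target.
def model_step(task, obs):
    current = obs[-1] if obs else task["start"]
    if current >= task["target"]:
        return f"done: {current}", None
    return "need to increment", ("add_one", current)

result = react_loop({"start": 0, "target": 3}, model_step)  # "done: 3"
```

The key property is that each action depends on all prior observations, which is what lets a real agent adapt mid-task.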

Plan-and-Execute (upfront planning, then execution)

The agent generates a full step-by-step plan first, then runs each step in sequence. Planning happens in one shot before any tools get called. It works better on predictable tasks but struggles when the world changes mid-execution. [2] https://redis.io/blog/ai-agent-architecture/
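The contrast with ReAct shows up clearly in code: the plan is fixed before execution starts. A rough sketch, where `plan` is a hypothetical stand-in for an LLM planning call and the step handlers are toy stages:

```python
# Plan-then-execute sketch: the full step list is produced up front,
# then the executor runs each step in order with no replanning.
def plan(task):
    # One-shot plan; a real system would ask the model for this list.
    return [("fetch", task), ("summarize", None), ("format", None)]

def execute(steps):
    state = None
    for action, arg in steps:
        if action == "fetch":
            state = f"data for {arg}"
        elif action == "summarize":
            state = state.upper()
        elif action == "format":
            state = f"[{state}]"
    return state

out = execute(plan("q3 report"))  # plan once, then run blindly
```

If a step's assumptions break mid-run, nothing in this loop notices, which is exactly the brittleness the paragraph describes.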

Tool Calling / Function Calling (structured tool use)

The model outputs structured calls with exact names and JSON arguments instead of free-form text. The runtime parses the call, runs the tool, and feeds the result back. This delivers clean integration and fewer parsing errors than raw text actions. [1] https://arxiv.org/html/2601.01743v1
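A minimal runtime-side sketch of that handshake, assuming a hypothetical `get_weather` tool and a model that emits the call as a JSON string:

```python
import json

# The model emits a structured call; the runtime validates the name,
# dispatches with keyword arguments, and returns the result as a message.
TOOLS = {"get_weather": lambda city: f"22C in {city}"}  # hypothetical tool

def handle_model_output(raw):
    call = json.loads(raw)                      # exact name + JSON arguments
    fn = TOOLS[call["name"]]                    # fails loudly on unknown tools
    result = fn(**call["arguments"])
    return {"role": "tool", "content": result}  # fed back to the model

msg = handle_model_output('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
```

Because the arguments arrive as typed JSON rather than prose, the runtime never has to regex-scrape a tool name out of free text.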

Code Execution Agents (sandboxed code running)

The agent writes Python or similar, ships it to a restricted interpreter, and gets stdout plus any artifacts back. It iterates by inspecting outputs and patching the code. Great for math, data, or automation. The sandbox keeps it from melting the host. [3] https://www.linkedin.com/posts/skphd_ai-aiagents-agenticai-activity-7438947025439801344-sGLk
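A very rough sketch of the execution side, using a subprocess with a timeout to capture stdout. This is a simplification, not a real sandbox: production setups also restrict the filesystem, network, and memory (containers, seccomp, gVisor, and the like).

```python
import subprocess
import sys
import tempfile

# Run model-written code in a separate process with a timeout and
# capture its stdout; the agent inspects the output and iterates.
def run_generated_code(code, timeout=5):
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    proc = subprocess.run([sys.executable, path],
                          capture_output=True, text=True, timeout=timeout)
    return proc.stdout, proc.returncode

stdout, rc = run_generated_code("print(sum(range(10)))")  # stdout "45\n"
```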

Multi-Agent: Supervisor pattern

One coordinator agent receives the task, routes pieces to specialized workers, collects outputs, and synthesizes the final answer. It centralizes decision making and logging. The supervisor becomes a bottleneck if you scale past a handful of workers. [4] https://www.agilesoftlabs.com/blog/2026/03/multi-agent-ai-systems-enterprise-guide
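The control flow reduces to route, delegate, synthesize. A toy sketch with hypothetical stub workers and a keyword-based routing rule standing in for the supervisor's LLM judgment:

```python
# Supervisor sketch: one coordinator routes subtasks to specialized
# worker callables and synthesizes their outputs into a final answer.
WORKERS = {
    "code": lambda t: f"patch for {t}",
    "docs": lambda t: f"summary of {t}",
}

def supervisor(subtasks):
    outputs = []
    for task in subtasks:
        kind = "code" if "bug" in task else "docs"  # routing decision
        outputs.append(WORKERS[kind](task))          # delegate to a worker
    return " | ".join(outputs)                       # synthesize the answer

answer = supervisor(["fix login bug", "release notes"])
```

Every task flows through the single `supervisor` function, which is the centralization benefit and the scaling bottleneck in one place.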

Multi-Agent: Debate pattern

Multiple agents argue different angles or solutions, critique each other, and converge on a better answer through rounds of discussion. It surfaces blind spots that one model misses. Communication volume climbs fast and debugging gets messy. [1] https://arxiv.org/html/2601.01743v1
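The round structure can be sketched with two stub agents that propose a value, see the other's proposal, and revise toward it. The averaging agents are hypothetical stand-ins for LLM critics; only the debate loop shape is the point.

```python
# Debate sketch: agents exchange positions each round until they agree
# or the round budget runs out.
def debate(agent_a, agent_b, rounds=5):
    a, b = agent_a(None), agent_b(None)   # opening positions
    for _ in range(rounds):
        if a == b:
            break
        a, b = agent_a(b), agent_b(a)     # critique and revise in parallel
    return a

# Stub agents that move toward the opposing view by averaging.
def make_agent(start):
    return lambda other: start if other is None else (start + other) // 2

consensus = debate(make_agent(10), make_agent(20))  # converges to 15
```

Note the communication cost: every round is two cross-agent messages, so volume grows with both rounds and agent count.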

Multi-Agent: Pipeline pattern

Agents sit in a fixed sequence. Each takes the previous output, does its job, and passes the result downstream. Simple to reason about and debug. The slowest stage blocks everything, and the flow handles unexpected branches poorly. [4] https://www.agilesoftlabs.com/blog/2026/03/multi-agent-ai-systems-enterprise-guide
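Structurally this is just function composition. A toy sketch with three hypothetical stages standing in for agents:

```python
# Pipeline sketch: each stage is a function of the previous stage's
# output; the stages run in a fixed order with no branching.
def extract(raw):
    return raw.split(",")                      # stage 1: split fields

def clean(items):
    return [s.strip() for s in items]          # stage 2: normalize

def report(items):
    return f"{len(items)} items: {items}"      # stage 3: summarize

def pipeline(stages, data):
    for stage in stages:
        data = stage(data)  # a slow or failing stage blocks everything after it
    return data

out = pipeline([extract, clean, report], " a , b ,c")
```

The single-threaded `for` loop makes the weakness literal: there is no parallelism, and one raised exception halts the whole flow.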

Memory: Context window (short-term)

Everything the agent needs right now lives inside the model's token limit. Recent messages, thoughts, and tool results stay in the prompt. It forgets everything the moment you exceed the window or start a new session. [2] https://redis.io/blog/ai-agent-architecture/
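A common minimal policy is to keep only the newest messages that fit a token budget. A sketch, with token cost crudely approximated as a whitespace word count rather than a real tokenizer:

```python
# Short-term memory sketch: walk the history newest-first and keep
# messages until the token budget is spent; everything older is dropped.
def fit_to_window(messages, max_tokens):
    kept, used = [], 0
    for msg in reversed(messages):
        cost = len(msg.split())          # crude token estimate
        if used + cost > max_tokens:
            break                        # older messages are simply forgotten
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order

window = fit_to_window(
    ["old old old old", "tool result here", "user asks now"], max_tokens=6
)  # the oldest message no longer fits
```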

Memory: Vector store (long-term semantic)

You embed past documents, experiences, or summaries and query by similarity at runtime. The retriever pulls the most relevant chunks into context. It scales to thousands of items but can return stale or low-quality matches without good metadata filtering. [1] https://arxiv.org/html/2601.01743v1
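At its core, retrieval is cosine similarity over stored vectors. A toy sketch where the 2-dimensional embeddings are hypothetical precomputed values; a real system embeds with a model and uses an approximate-nearest-neighbor index rather than a linear scan:

```python
import math

# Toy semantic retrieval: rank stored chunks by cosine similarity
# to the query vector and return the top-k texts for the prompt.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(store, query_vec, k=1):
    scored = sorted(store, key=lambda it: cosine(it["vec"], query_vec),
                    reverse=True)
    return [it["text"] for it in scored[:k]]

store = [
    {"text": "billing FAQ",   "vec": [1.0, 0.0]},
    {"text": "refund policy", "vec": [0.9, 0.4]},
]
top = retrieve(store, [1.0, 0.1])  # closest chunk wins
```

Nothing here checks freshness or source quality, which is why metadata filtering matters in practice.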

Memory: Conversation history (session tracking)

The system keeps a running log of the full chat or key summaries across turns. It feeds the relevant slice back on each call. Without summarization the history grows until the context window explodes. [2] https://redis.io/blog/ai-agent-architecture/
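The usual fix is to keep recent turns verbatim and collapse older ones. A sketch where the summary is a placeholder string; a real system would generate it with an LLM call:

```python
# Session-tracking sketch: keep the last N turns verbatim and replace
# everything older with a single summary line so history stays bounded.
def compact_history(turns, keep_recent=2):
    if len(turns) <= keep_recent:
        return turns
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = f"[summary of {len(older)} earlier turns]"  # stand-in for an LLM summary
    return [summary] + recent

history = compact_history(["hi", "hello", "order status?", "it shipped"])
```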

Pattern Comparison

| Pattern | Best For | Weakness | Example Framework |
| ReAct | Dynamic tasks, real-time adaptation | High token use, variable latency | LangGraph, LangChain |
| Plan-and-Execute | Predictable, long-horizon work | Brittle if plan is wrong or env changes | LangGraph |
| Tool Calling | Reliable, structured integrations | Limited reasoning between calls | OpenAI SDK, Anthropic |
| Code Execution | Math, data analysis, automation | Sandbox limits, security overhead | Cursor, custom Python |
| Supervisor Multi-Agent | Task routing, mixed expertise | Supervisor bottleneck | CrewAI, LangGraph |
| Debate Multi-Agent | Complex reasoning, error catching | High comms cost, hard to debug | AutoGen |
| Pipeline Multi-Agent | Linear workflows, content/data | No parallelism, stalls on failure | LangGraph, CrewAI |
| Context Window | Short sessions | Forgets everything beyond limit | Any LLM |
| Vector Store | Large knowledge bases | Retrieval quality varies | LlamaIndex, LangChain |
| Conversation History | Multi-turn sessions | Context bloat without summarization | LangGraph checkpoints |

Use the pattern that matches your failure modes. Most production systems combine two or three instead of betting on one.
