Agent Architecture Fundamentals
Conceptual Explanation
An agent is an LLM equipped with:
- Tools — functions the model can call (search, database query, API call, file write)
- Memory — access to prior context beyond the current prompt
- A goal — an objective it pursues across multiple steps
- A loop — a mechanism to continue reasoning until the goal is reached or a stop condition is met
The critical distinction from a chain: in a chain, the developer decides the sequence of LLM calls in advance. In an agent, the LLM decides what to do next at each step based on what it observes. This autonomy is both the value proposition and the primary risk.
The Agent vs. Chain Distinction
Chain (developer controls flow):
Step 1: LLM summarizes document (hardcoded)
Step 2: LLM extracts entities (hardcoded)
Step 3: LLM formats output (hardcoded)
Agent (LLM controls flow):
Goal: "Research this patient's medication history and flag interactions"
Step 1: LLM decides to call get_patient_medications()
Step 2: LLM decides to call check_drug_interactions(medications)
Step 3: LLM decides result is sufficient → produces final answer
[Or: LLM decides to call get_allergy_history() first, then proceed]The agent dynamically selects tools and decides when it has enough information. This flexibility enables handling cases the developer did not anticipate at design time.
Core Architecture
The agent's execution model follows the Perceive → Reason → Act → Observe cycle:
- Perceive: The agent receives input (user query, system trigger, prior step result). This is combined with the system prompt, available tool schemas, and any memory retrieved from external stores into the context window.
- Reason: The LLM generates a response. If it determines a tool is needed, it outputs a structured tool call (JSON with tool name and arguments). If it has enough information, it outputs a final response.
- Act: The framework executes the tool call against the real system (API, database, file). The LLM does not execute code — it outputs a request; the framework executes it.
- Observe: The tool result is appended to the context as an observation. The agent loops back to Reason with updated context.
This loop continues until the LLM either produces a final answer (no tool call) or a stop condition is met (max iterations, timeout, error threshold).
Architecture Diagram
Standalone diagram: architecture/mermaid/02-agent-loop.mmd
Common Mistakes
Running agents where chains suffice. If the sequence of steps is known at design time and doesn't change based on input, use a chain. The agent overhead (latency, cost, unpredictability) is not justified.
Unbounded loops. Forgetting max_iterations causes runaway cost and resource consumption. Always set a limit; always log when it is reached.
Too many tools. Giving an agent 30 tools degrades tool selection accuracy. Keep the tool registry focused: 5–15 tools per agent. Specialize agents rather than creating a general-purpose agent with every possible tool.
Mutable tool side effects without idempotency. If a tool writes to a database and the agent fails mid-workflow, re-running the workflow executes the write again. All write-side tools must be idempotent or the agent must checkpoint state after each write.
Trusting tool output without validation. Tool results can contain errors, unexpected formats, or injection payloads. Validate tool output structure before appending it to the agent's context.
Best Practices
- Use agents only when the workflow cannot be expressed as a predetermined sequence
- Set
max_iterationson every agent; alert when it triggers in production - Keep tool registries focused: 5–15 tools maximum per agent
- Use small models for tool routing decisions; reserve frontier models for reasoning
- Implement all write-side tools as idempotent operations
- Checkpoint agent state after each successful tool call for resumability
- Require human-in-the-loop review before irreversible actions (sends, writes, external submissions)
- Log all tool calls with inputs, outputs, timestamps, and cost for auditability
Alternatives
| Approach | When to Choose | Trade-off |
|---|---|---|
| Sequential chain | Steps are known at design time; no branching needed | Less flexible; faster and more predictable |
| Parallel chain (fan-out) | Multiple independent tasks; aggregate results | No dynamic decision-making; requires all paths to be known |
| Router + chain | Small number of fixed paths based on input classification | More predictable than an agent; less flexible |
| Full agent (ReAct) | Branching logic is complex or data-driven; paths cannot be predetermined | Most flexible; highest cost, latency, and unpredictability |
| Human workflow + AI assist | Task requires human judgment throughout; AI augments but doesn't automate | Lower automation; more reliable for high-stakes decisions |
Trade-offs
| Dimension | Advantage | Cost |
|---|---|---|
| Flexibility | Handles unforeseen cases dynamically | Introduces unpredictability |
| Autonomy | Reduces human coordination overhead | Requires containment architecture |
| Capability | Completes multi-step workflows | 5–20x cost vs. single-turn calls |
| Resumability | Can checkpoint and continue | Requires persistent state infrastructure |
| Debuggability | Flexible tool sequences | Harder to trace failures than deterministic chains |
Interview Questions
Q1: What distinguishes an agent from a chain in LLM-based systems?
Category: Architecture Difficulty: Senior Role: AI Architect
Answer Framework:
A chain is a developer-defined sequence of LLM calls and transformations where the flow is predetermined. The developer decides at build time what steps to execute and in what order. A chain is appropriate when the task has a fixed structure.
An agent is an LLM-powered system where the model itself decides what to do next at each step, based on its reasoning about the current state. The developer provides tools and a goal; the agent determines the sequence of tool calls required to achieve that goal. This makes agents appropriate for tasks where the path depends on data encountered at runtime.
The practical consequence: agents can handle cases the developer did not anticipate, but they also introduce unpredictability, higher cost, and new failure modes that chains do not have. The choice between them is an explicit architectural decision, not a preference.
Key Points to Hit: Developer controls flow (chain) vs. LLM controls flow (agent); dynamic tool selection; when each is appropriate; trade-offs are real.
Red Flags: "Agents are just better chains" — they are not. They are a different architectural pattern with different cost/benefit profiles.
Q2: A prior authorization workflow requires 8 sequential steps, each dependent on the previous. Should you use an agent or a chain?
Category: System Design Difficulty: Principal Role: AI Architect
Answer Framework:
The key question is whether the steps and their sequence are known at design time. If the prior auth workflow always executes the same 8 steps in the same order regardless of input, a chain is more appropriate — it is faster, cheaper, easier to debug, and more predictable than an agent.
However, prior auth is rarely that simple. Real workflows branch: different payers have different criteria; some requests require clinical literature lookup while others don't; some requests can be auto-approved while others require escalation. If these decision points depend on data retrieved at runtime, an agent or an agent-augmented state machine (LangGraph) is warranted.
The production pattern: a LangGraph state machine where fixed paths are expressed as deterministic edges and dynamic decisions are expressed as conditional edges routing to an LLM-powered decision node. This gives you the predictability of a chain for known paths and the flexibility of an agent for the branching logic.
Red Flags: "Use an agent for everything" — overcomplicated; "Use a chain for everything" — breaks when faced with realistic workflow variation.
Q3: What is the "agent paradox" and how does it affect system design?
Category: Architecture Difficulty: Principal Role: AI Architect / Engineering Manager
Answer Framework:
The agent paradox is the observation that the more autonomous and capable you make an agent, the more dangerous and expensive its failure modes become. A highly capable agent that can take consequential actions — send emails, submit to payers, write to patient records — is also an agent that can cause consequential harm if it makes a mistake.
This creates a design tension: the value of an agent scales with how much it can do autonomously, but the risk also scales. The architectural resolution is graduated autonomy: the agent operates autonomously up to a defined risk threshold, above which it requires human approval. Low-risk actions (reading records, querying guidelines) proceed automatically. High-risk actions (submitting prior auth decisions, updating medication records) require physician review.
This is why human-in-the-loop is an architectural requirement for enterprise clinical agents, not an optional feature. It is the mechanism that resolves the agent paradox.
Key Takeaways
- An agent is an LLM equipped with tools, memory, and a goal-directed loop — the LLM decides what to do next at each step
- The ReAct pattern (Reason → Act → Observe) is the foundation of all practical agent architectures
- Agents are appropriate when workflow paths cannot be predetermined; chains are appropriate when they can
- Tool calling via the API's native
toolsparameter is strongly preferred over prompt-engineered ReAct - Every agent must have a
max_iterationscircuit breaker; unbounded loops are a production risk - Agent cost is 5–20x single-turn cost; model tier selection per agent role is the primary cost lever
- Human-in-the-loop is not optional for enterprise agents that take consequential actions — it resolves the agent paradox
- The security perimeter of an agent is the union of the permissions of all its tools — minimize both