Agentic AI — Quick Reference
One-Line Definition
An agent is an LLM that perceives its environment, reasons about what to do, uses tools to act, and observes the results — repeating this loop until a task is complete or a defined limit is reached.
When to Use Agents / When NOT to Use
| Use Agents When | Do NOT Use Agents When |
|---|---|
| Task requires dynamic tool selection at runtime | Task is a fixed, deterministic sequence of calls |
| Workflow depends on results to decide next steps | Outcome is always the same regardless of inputs |
| Multiple sub-tasks must be coordinated | A single LLM call with good prompting is sufficient |
| Human-in-the-loop at specific decision points is required | Latency or cost constraints prohibit multi-turn execution |
| Workflow is too complex for a single context window | Task is simple enough for a chain |
Agent Paradox: The harder the task, the more an agent helps — and the harder it is to verify the agent's work. Plan evaluation before you deploy.
Core Concepts
| Concept | Definition |
|---|---|
| ReAct Loop | Perceive → Reason → Act (tool call) → Observe (tool result) → repeat |
| Tool | A function the model can call; defined by name, description, and input_schema |
| Orchestrator | Agent that decomposes goals and delegates to workers |
| Worker | Specialized agent with a narrow tool scope; does not delegate |
| Checkpointer | Persists workflow state between agent turns (enables HITL and resumption) |
| Thread ID | Unique identifier for a workflow run; the key for checkpointer state lookup |
| HITL Interrupt | Pause in execution at a defined node, awaiting human input before resuming |
Framework Comparison
| Dimension | Raw SDK (Anthropic) | LangGraph | CrewAI |
|---|---|---|---|
| Control | Maximum | High | Medium |
| Configuration effort | High | Medium | Low |
| Typed state | Manual | TypedDict built-in |
Implicit (task context) |
| Checkpointing | Manual | Built-in (PostgresSaver) |
Limited |
| HITL | Manual | interrupt_before |
Not native |
| Best for | Custom topology; learning | Production stateful workflows | Role-based team patterns |
| Choose when | No framework overhead needed | Complex routing + HITL + persistence | Rapid role-based prototyping |
Tool Design Rules
python
{
"name": "action_verb_noun", # Unambiguous action name
"description": (
"One sentence: what it does. "
"Second sentence: when to use it vs. similar tools. "
"Third sentence: what it returns or error behavior."
),
"input_schema": {
"type": "object",
"properties": {
"required_param": {
"type": "string",
"description": "Precise description with format, example if not obvious",
},
"optional_param": {
"type": "integer",
"description": "What it controls (default X, range Y–Z)",
"default": 3,
},
},
"required": ["required_param"],
},
}Tool Side-Effect Classification:
| Class | Examples | Authorization |
|---|---|---|
| Read | getpatientsummary, search_guidelines | Autonomous |
| Write | createdraft, updaterecord | Log + notify |
| External | submittopayer, send_email | HITL required |
| Delete | delete_record | HITL + confirmation |
Memory System Summary
| Type | Storage | Scope | Use Case |
|---|---|---|---|
| Working | Context window | One turn | Current reasoning state |
| Episodic | Progressive summarization | Session | Conversation continuity |
| Semantic | Vector store (tool call) | Cross-session | Knowledge retrieval |
| Procedural | System prompt | Always active | Behavioral constraints |
LangGraph Patterns
python
from typing import Annotated
import operator
from typing_extensions import TypedDict
# State schema — Annotated list accumulates; other fields replace (last-write-wins)
class WorkflowState(TypedDict):
messages: Annotated[list[dict], operator.add] # accumulates
patient_id: str # replaced
result: dict # replaced
# Node = pure function: state dict → partial state dict
def my_node(state: WorkflowState) -> dict:
# ... do work ...
return {"result": {"key": "value"}, "messages": [{"node": "my_node", "status": "done"}]}
# Conditional edge routing
def route_fn(state: WorkflowState) -> Literal["node_a", "node_b"]:
return "node_a" if state["result"].get("ok") else "node_b"
# Checkpointer choice
from langgraph.checkpoint.memory import MemorySaver # dev/test only
from langgraph.checkpoint.postgres import PostgresSaver # production
# HITL pattern
graph = builder.compile(
checkpointer=PostgresSaver.from_conn_string(CONN),
interrupt_before=["human_review_node"],
)
# Phase 1: Run to interrupt
result = graph.invoke(initial_state, {"configurable": {"thread_id": "run-001"}})
# Phase 2: Resume after human input
result = graph.invoke(
{"human_decision": "approved"},
{"configurable": {"thread_id": "run-001"}}
)Multi-Agent Topology Quick Selection
| Topology | Structure | Best For | Risk |
|---|---|---|---|
| Orchestrator-Worker | One orchestrator → N workers | Well-defined sub-tasks | Orchestrator is single point of failure |
| Hierarchical | Orchestrator → Sub-orchestrators → Workers | Very complex, nested tasks | Latency and cost multiply with depth |
| Peer-to-Peer | Agents call each other | Debate / negotiation patterns | Hard to trace; control flow complex |
HITL Design Checklist
- [ ] Identify which workflow nodes require human review (policy-based, risk-based, confidence-based, anomaly-based)
- [ ] Design interrupt point before any external action is taken
- [ ] Persist state to durable storage (PostgreSQL) before notifying reviewer
- [ ] Define reviewer role and authorization for each HITL trigger type
- [ ] Define SLA for review; implement escalation path for SLA breach
- [ ] Write all decisions to immutable audit log (reviewer ID, timestamp, decision, rationale)
- [ ] Test the resume path explicitly — the interrupt is only half the pattern
MCP Quick Reference
| Primitive | Definition | Example |
|---|---|---|
| Tool | Callable function with side effects | get<em>patient</em>summary(patient_id) |
| Resource | Read-only data identified by URI | clinical://guidelines/index |
| Prompt | Reusable parameterized prompt template | prior-auth-evaluation-template |
Transport: stdio (local subprocess) vs. HTTP+SSE (remote, multi-user)
Security rule: Tool authorization enforced at the MCP server — the client is not a security boundary.
Security Quick Reference
| Threat | Defense Layer | Priority |
|---|---|---|
| Prompt injection (direct) | Input validation before agent | High |
| Prompt injection (indirect) | Validate retrieved content; prompt hardening | High |
| Excessive agency | Tool allowlist per workflow | Critical |
| Privilege escalation | Trust level enforcement on inter-agent messages | High |
| PHI exfiltration | Tool scope enforcement; audit logging | Critical |
Defense hierarchy (most to least reliable):
- Tool authorization enforcement (independent of model reasoning)
- Input/output validation (detects known patterns)
- System prompt hardening (reduces model compliance with injection)
Observability Checklist
- [ ] Span-based tracing: one root span per workflow, child spans per node and tool call
- [ ] LLM call instrumentation: model, input tokens, output tokens, latency
- [ ] Compliance audit log: separate, immutable; every tool call and HITL decision logged
- [ ] PHI masked before transmission to third-party trace stores
- [ ] Quality gate defined and monitored (automated evaluation pipeline)
- [ ] Key metrics: success rate, HITL trigger rate, quality score, P95 latency, token cost
Common Interview Questions
- When would you choose an agent over a chain? — When the task requires dynamic tool selection based on results; when control flow cannot be determined in advance.
- What is the "agent paradox"? — Complex tasks benefit most from agents, but are also hardest to evaluate. Always define evaluation before deployment.
- How does LangGraph's checkpointing enable HITL? —
interrupt_beforepauses execution and returns control;PostgresSaverpersists state durably; the caller resumes by invokinggraph.invoke()with the human decision in state.
- Why enforce tool authorization at the MCP server rather than relying on the agent? — The LLM is not a reliable security boundary — it can be manipulated by prompt injection. The MCP server is code; it enforces authorization independent of the model's reasoning.
- What are the four HITL trigger categories? — Confidence-based (agent uncertainty), risk-based (action side-effect level), policy-based (business rule mandates review regardless of confidence), anomaly-based (agent detects unexpected state).
- What is the M×N problem MCP solves? — Without a standard protocol, M AI applications × N backend systems = M×N integrations. With MCP: M clients + N servers; each addition requires one implementation, not M or N.