AI Agents the Definitive Guide (Early Release) - Design, Deployment, and Evaluation (Nicole Koenigstein)（Z-Library）

(This page has no text content)

AI Agents: The Definitive Guide Design, Deployment, and Evaluation for Production With Early Release ebooks, you get books in their earliest form—the author’s raw and unedited content as they write—so you can take advantage of these technologies long before the official release of these titles. Nicole Koenigstein JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

AI Agents: The Definitive Guide by Nicole Koenigstein Copyright © 2026 Nicole Koenigstein. All rights reserved. Published by O’Reilly Media, Inc., 141 Stony Circle, Suite 195, Santa Rosa, CA 95401. O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (https://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com. Acquisitions Editor: Nicole Butterfield Development Editor: Michele Cronin Production Editor: Elizabeth Faerm Copyeditor: Emily Wydeven Proofreader: TO COME Indexer: TO COME Cover Designer: Susan Brown Cover Illustrator: José Marzan Jr. Interior Designer: David Futato Interior Illustrator: Kate Dullea September 2026: First Edition Revision History for the Early Release 2025-09-12: First Release 2025-11-24: Second Release 2025-12-15: Third Release 2026-02-24: Fourth Release 2026-03-06: Fifth Release 2026-04-02: Sixth Release 2026-05-06: Seventh Release 2026-06-02: Eighth Release JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

2026-06-18: Final Release See https://oreilly.com/catalog/errata.csp?isbn=9798341666931 for release details. The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. AI Agents: The Definitive Guide, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc. The views expressed in this work are those of the author and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights. 979-8-341-66693-1 [LSI] JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Chapter 1. From LLMs to Agents: The Foundational Blueprint One of the defining traits of human intelligence is the way we combine inner reasoning with concrete actions, and this maps surprisingly well to how large language model (LLM) agents operate when paired with tools. Take the process of building a coding project. As a human developer you begin with a prompt, the client’s request. First comes reasoning, where you sketch out a plan of how to approach it. Then comes action, such as searching for documentation, writing functions, or debugging errors. Feedback enters the loop when tests fail, a peer review points out gaps, or the client tests the application. Each step is not static but iterative, with reasoning adapting to new insights and actions changing in response. The same applies to a single LLM agent. Given a task, the agent first reasons about what is missing or what step comes next, then takes actions by calling tools to retrieve data, run code, or check results. Like a developer’s feedback loop, the environment provides signals such as errors, gaps, or confirmations that guide the next iteration. External support, such as code libraries or documentation, maps to tools like retrieval systems or web searches, which refine the output beyond the model’s own raw ability. An LLM agent is a large language model embedded in a loop of reasoning, acting, and feedback, where it can call external tools and adapt its behavior based on results. Unlike a standalone LLM, which is limited to static text generation, an agent operates as a decision-making entity within a workflow. Figure 1-1 illustrates this flow. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Figure 1-1. An LLM agent cycles through reasoning, acting, and observing feedback until the goal is met. In this book, I will use AI agents, agents, and LLM agents interchangeably. Strictly speaking, however, an AI agent does not need to involve a language model at all. In the classical sense, an agent perceives an environment, makes decisions, and takes actions to achieve a goal. It does not need to understand or generate human language. For example, in pure reinforcement learning, agents can be trained directly through trial-and-error interaction with an environment. The benefit of using an LLM as part of an agent system is that it adds strong generalization, reasoning, and natural language capabilities, which makes agent behavior more flexible and broadly applicable beyond narrowly defined environments. Just as individual skills eventually meet their limits, a single agent can only take a workflow so far before complexity demands coordination. If you think about it, in a small company, a single contributor might handle everything end-to-end: designing, coding, testing, and deploying. But as the project grows, the team needs to expand. One person might focus on backend, another on frontend, another on testing. To keep everything aligned, a tech lead or VP of engineering steps in, coordinating the workload, assigning tasks, and integrating results. The difference between a lone contributor and a coordinated team maps directly onto the difference between a single agent and a multi-agent system (MAS). Just as a tech lead coordinates multiple contributors, a supervisor agent can orchestrate multiple specialized agents to collaborate on solving a complex goal. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

ACCOMPANYING WEBSITE AND BOOK COMMUNITY You can find additional material for this book on the accompanying website. The website includes resources such as code examples, use case examples, and supplementary material that extends the topics covered in the chapters. There is also a Discord channel for the book where readers can ask questions, discuss examples, and follow updates. It is also important to recognize that neither humans nor agents operate with unbounded autonomy. Developers are guided by coding standards, project requirements, and organizational processes, and their freedom is bounded by these structures. Agents are similarly constrained by their workflows, operating only within the scope of the tools, permissions, and safeguards they are provided. Far from being a weakness, this bounded autonomy is what makes both systems viable. For humans, structure ensures code quality, maintainability, and client satisfaction. For agents, constraints ensure safety, reliability, and alignment with the goals set by their developers and users. This parallel highlights why extending pure LLMs with tools makes sense. A language model on its own is like a developer cut off from resources like StackOverflow, IDEs, or GitHub, capable of reasoning but unable to gather new data, verify results, or improve their task output without outside help. Just as software teams scale their capabilities by adding better tools, resources, and leadership, LLMs extend their capability when embedded in agentic workflows and even multiple specialized agents within a MAS. Without these upgrades, both remain confined to the limits of their initial training and quickly become overwhelmed when faced with complex tasks. With tools, feedback, and a divide and conquer approach, LLMs can iteratively refine, adapt, and solve complex tasks that would otherwise exceed their standalone capacity. I assume in this book that you have a working knowledge of using LLMs. Maybe you’ve read the book Hands-On Large Language Models, or a similar work. You don’t need to know how to deploy LLMs to make your agents work, since I’ll guide you through those steps in later chapters. However, I expect you are comfortable reading and modifying Python code, and that you are familiar with core ML and deep learning concepts such as backpropagation and training loops. While you don’t need detailed knowledge of backpropagation, it helps you to understand how models learn by adjusting parameters based on training signals. Familiarity with linear algebra and calculus is helpful, since they underpin much of modern deep learning and optimization. It is also useful if you understand how transformers process sequences and have at least a basic sense of how embeddings work or vector stores support retrieval. If you haven’t touched these topics in a while, DON’T PANIC! Just like The Hitchhiker’s Guide to the Galaxy advises not to panic, I’ll say the same here in my guide. That said, every concept will be grounded in code, so your knowledge becomes usable again without requiring you to dust off old textbooks. And no, before you ask, the answer to everything about AI agents is not 42, but I promise you’ll get your answers throughout this book. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

CODE FOR THE BOOK All code examples from this book are available in the GitHub accompanying repository. The repository is organized by chapter, so you can easily find the code that belongs to each section. If you would like a closer look at transformers across different domains and how embeddings are used for text and other modalities, my book Transformers: The Definitive Guide can serve as a complementary source. It also explains the fundamentals of reinforcement learning, which will help you build a stronger foundation for understanding the advanced agent concepts introduced later in this book. Therefore, this book is written for readers who want to go beyond curiosity, which is why understanding these fundamentals is important. This includes data scientists, software and ML engineers who are building real systems in production, and technical leaders who need to understand the trade-offs, costs, and guardrails involved. This book is not about explaining LLMs from scratch, nor is it about abstract promises, or thought experiments. It is about the practical realities of building and running AI agents that can perform and be useful outside a PowerPoint presentation or a lab demo. This chapter explains how LLMs evolve from static prompts to dynamic agentic systems, why tool use is essential, and what autonomy means when agents operate within defined workflows, permissions, and safeguards. Finite and Hierarchical State Machines: The Base Paradigm of Agents LangGraph, CrewAI, and similar frameworks expose different levels of abstraction for structuring agent behavior. LangGraph makes the stateful, graph based model explicit, while CrewAI presents a higher level framework around agents, crews, and flows. A framework may not expose this structure as a formal finite state machines (FSMs) or hierarchical state machines (HSMs) API. However, any agentic workflow still needs a control model: it must track state, decide which action to take, observe the result, update its context, and eventually terminate. At a high level, a state machine is a computational model that exists in one of a finite number of states at any given time. It transitions between states in response to specific events or inputs, with the next state determined by the current state and the input. This section is meant to help you build a mental model of agents by showing how state-machine patterns help explain the control flow of modern AI agent systems. In simple terms, think of it as structured transitions, a pattern long used in compilers, GUIs, embedded systems, and networking protocols, now applied to reasoning and tool use in LLM-based agents. My intention in grounding your understanding first in FSMs and HSMs is more than a loose analogy. I want to give you a blueprint for building LLM agents that are robust, reliable, and adaptable. It will empower you to not only understand the concepts but to apply them effectively as you learn about JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

more complex topics later in the book. However, if you haven’t worked with state machines recently, don’t worry. You’ll see how the patterns translate directly in the code examples. You can think of the following vocabulary as the control grammar behind the tool-use and multi-agent examples that appear throughout the book. Finite State Machines A finite state machine describes the modes your system can be in and the allowed transitions between them. Each transition is triggered by an event, may be guarded by a predicate, and may perform actions that update state. Start and end are distinguished states with special meaning. The core vocabulary is: State A compact snapshot that captures what the system knows at a given moment. For an agent, this might be the message list, routing hints, or progress markers. Event Something that happens since the last decision, such as a tool being invoked or returning a result. Guard A check that decides which transition to take, based on the current state and latest event. Action Work performed during a transition, like invoking a tool, appending a message, saving a checkpoint, or requesting human input. Termination A condition that signals the process has reached the end. Figure 1-2 illustrates a finite state machine for tool use. Each transition is defined by an event, a guard, and an action. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Figure 1-2. High level overview of a finite state machine for tool use. An FSM is ideal when behavior alternates among a few stable modes and when recoverability matters, since implementations can checkpoint at each node or state boundary and resume after a crash from the last completed step. Thinking in terms of states lets you decompose a complex task into distinct steps. For example, instead of a single prompt for “write a blog post”, you might define states such as topic ideation, outline generation, drafting, editing, and SEO optimization. Each state can link to specific tools your agent will use. Transitions make you specify the conditions under which one task hands off to the next. For instance, what output from the outline generation state triggers the drafting state. This structure counters unstructured, one-shot prompts that often lead to inconsistent results with LLMs alone. As soon as you add more modes, such as planning, reflection, approval gates, and retries, an FSM can force duplication or create spaghetti routing. You need to keep control of the edges, since they are decision points: each one encodes the event, guard, and action that drives functionality. This is where hierarchy helps. Hierarchical State Machines Hierarchical state machines let states contain other states. Superstates capture shared entry/exit behavior and shared guards. Substates inherit those rules and add their own. History nodes remember which substate you were in when you left, so you can resume exactly there later. These new additions to the state machine are explained below. Superstate JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

A state that groups a set of child states and their common policies. Entering the superstate runs shared logic once; exiting it runs shared cleanup once. Substate A concrete mode that lives inside a superstate. It has its own edges but inherits the superstate’s guards and actions. History A “remember where I was” marker. Shallow history resumes at the last active child; deep history resumes inside nested grandchildren. In agent terms, this is a checkpoint for a portion of the graph. Parallel region Two or more child regions of a superstate that advance independently. Useful for fanning out tool calls or agent roles and then joining. Figure 1-3 shows a hierarchical state machine with a WORKING superstate. PLAN, ACT, and REFLECT are substates. H is a shallow history marker that records the last active substate so execution can resume there if the superstate is re-entered. Figure 1-3. Overview of a hierarchical state machine with history marker hat records the last active substate. HSMs are particularly relevant to more advanced MAS. An HSM allows for “states within states”, which reduces duplication and makes policies obvious. For example, you can attach rate limits, safety filters, or circuit breakers to a WORKING superstate, so planning, acting, and reflecting all inherit the same safeguards. You define a guard or router once and reuse it across nodes. Consider a research agent. Instead of treating research as a single monolithic step, you can break it into a sub- JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

machine with states such as searching for keywords, gathering URLs, scraping data, and summarizing findings. Each of these states can even map to a dedicated agent with its own capabilities, contributing to the larger team of agents. This nested structure helps manage complexity, keeps policies consistent, and makes the overall system easier to reason about and maintain. Mapping FSM/HSM to Agent Frameworks LangGraph already speaks the language of state machines and hierarchy. You implement it with subgraphs and shared keys. The router or decision-making agent that selects the next action (such as calling the next agent or tool) is analogous to the event handler in a classic state machine. It’s the part of the system that takes the current state and the LLM’s output (the event or payload) and determines the next logical step. The direct mapping is shown below. State schema The parent graph’s TypedDict defines the shared “bus”. Subgraphs declare which keys they read and write. Overlapping keys are the superstate’s interface. class State(TypedDict): messages: List[BaseMessage] working_last: str | None Shared bus across parent and subgraphs. Optional shallow history marker. Nodes and subgraphs A subgraph is an HSM superstate. Its internal nodes are substates. Entering the subgraph runs its entry node; leaving it runs its exit edge back to the parent. def plan_node(state: State) -> State: ai = AIMessage(content="Plan next step") return {"messages": [ai], "working_last": "plan"} def act_node(state: State) -> State: ai = AIMessage(content="Act on plan") return {"messages": [ai], "working_last": "act"} subgraph_builder = StateGraph(State) subgraph_builder.add_node("PLAN", plan_node) subgraph_builder.add_node("ACT", act_node) subgraph_builder.add_edge(START, "PLAN") subgraph = subgraph_builder.compile() Subgraph is the superstate. Guards and conditional edges JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Parent-level guards enforce global policy (budget, safety). Child-level guards route among PLAN/ACT/REFLECT. Because guards are plain Python, they remain deterministic and auditable. def route_within_working(state: State): last = state.get("working_last") return "ACT" if last == "plan" else END subgraph_builder.add_conditional_edges("PLAN", route_within_working, {"ACT": "ACT", END: END}) parent = StateGraph(State) parent.add_node("WORKING", subgraph) def global_guard(state: State): tail = "".join([ getattr(m, "content", "") for m in state["messages"][-3:] ]).lower() return END if "stop" in tail else "WORKING" parent.add_edge(START, "WORKING") parent.add_conditional_edges("WORKING", global_guard, {"WORKING": "WORKING", END: END}) Add conditional edge inside subgraph. Parent graph with a global guard. Example policy: end if user said stop. History and checkpointing MemorySaver gives you shallow and deep history for free: resuming a thread recreates the subgraph at the last completed node. If you want explicit “return to where I was inside WORKING,” persist a small marker (for example, state["working_last"] = “reflect”) and branch on it when re-entering. def reflect_node(state: State) -> State: ai = AIMessage(content="Reflect on result") return {"messages": [ai], "working_last": "reflect"} checkpointer = MemorySaver() app = parent.compile(checkpointer=checkpointer) cfg = {"configurable": {"thread_id": "session-1"}} app.invoke({"messages": [HumanMessage(content="start")], "working_last": None}, config=cfg) Substates set working_last on exit. Compile with checkpointing. Run in a thread. MemorySaver restores last completed node on resume. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

FSMs give you a way to structure tasks into stable, recoverable steps. HSMs extend this by reducing duplication and centralizing policy when workflows grow more complex. Both patterns prepare you for the next step: turning static LLM predictions into dynamic, stateful agents. Foundations: From Static Models to Dynamic Agents As you saw in the previous section, adding states and control flows is essential for automation. The same principle applies to agents. To move from static LLM usage to agentic systems, you need explicit state and control flow. State turns one-off tool calls into a stateful process where prior actions, tool results, and routing decisions can be preserved across steps. Control flow adds guards and transitions that govern when to plan, when to act, and when to stop. The reason to add this to LLMs is simple: LLMs are static predictors. They generate the next token given a prompt, drawing only on patterns encoded in their training data. This makes them powerful reasoners but leaves them confined to what they already know. They cannot update knowledge, verify claims, or interact with an environment. Agency emerges when reasoning is coupled with action. Reasoning is the internal process of planning or deciding the next step. Action extends this by invoking tools, retrieving data, executing code, or calling external services. Together they form an iterative loop: reason, act, receive feedback, and adjust. This loop is the foundation of agentic behavior. Table 1-1 compares static LLMs to agentic systems. These contrasts show not only why agents are necessary, but also how their design shifts traditional machine learning workflows into dynamic, tool augmented systems. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Table 1-1. Comparison of Static LLMs and Agentic Systems Static LLM (Traditional Use) Agentic System Single-pass token generator Iterative reasoning–action–feedback system Confined to training data Can retrieve, verify, and update knowledge No memory, stateless Stateful with short- and long-term memory Linear prompt–response Iterative and adaptive workflow No tool use Tool-augmented (retrieval, code, APIs) Fixed interpretation of prompt Can refine or reinterpret goals dynamically No external validation Actively checks, corrects, and improves output Figure 1-4 illustrates how modules turn a static LLM into a dynamic agent. The user request enters the core, planning and memory shape its reasoning, and tools enable concrete action. The cycle of reasoning, acting, and adapting emerges from the coordination of these elements. Concrete memory architectures, storage choices, retrieval strategies, and persistence mechanisms are covered in Chapter 10. Figure 1-4. Baseline agent architecture: an LLM alone can only predict text, but with planning, memory, and tools it becomes a usable agentic system. Planning modules guide the agent’s reasoning, ranging from simple chain-of-thought1 traces to more advanced approaches such as trees of thought2 or self-critique. Memory modules provide continuity, allowing the agent to recall past steps, reuse prior knowledge, or persist information across sessions. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Tools connect the agent to its environment, whether through search, retrieval, code execution, or custom APIs. Moreover, a key difference between traditional LLM usage and agentic systems is whether the workflow is stateless or stateful. A stateless call treats the model like a black box: it takes a prompt, predicts the next tokens, and returns a result. Nothing is remembered, no actions are taken, and no feedback loop exists. Example 1-1 illustrates a stateless LLM call. Example 1-1. Stateless LLM call with LangChain llm = ChatOpenAI(model="gpt-5-mini") response = llm.invoke("What are AI agents?") print(response.content) In addition, the moment you need facts from outside training data, or you want the model to act (search, compute), you introduce tools. In the following code (Examples 1-2 - 1-4) is a minimal stateless tool-using run: first you bind two tools to the LLM, and then run a tiny loop that executes any tool calls the model needs to fulfill its task. First, you need to define your tools. Example 1-2 shows an implementation of tools. Example 1-2. Defining tools for a stateless run @tool("internet_search") def internet_search(query: str) -> str: """Search Google via SerpAPI for up to date information.""" serp_api_key = os.environ["SERPAPI_API_KEY"] params = {"engine": "google", "gl": "us", "hl": "en"} search = SerpAPIWrapper(params=params, serpapi_api_key=serp_api_key) return search.run(query) @tool("calculator") def calculator(expression: str) -> str: """Evaluate a single line mathematical expression with numexpr.""" local_dict = {"pi": math.pi, "e": math.e} out = numexpr.evaluate( expression.strip(), global_dict={}, local_dict=local_dict, ) return str(out) tools = [internet_search, calculator] tool_map: Dict[str, Any] = {t.name: t for t in tools} LangChain’s @tool decorator uses the function name, typed signature, and docstring to build the schema the model sees. The docstring is mandatory: if it is missing, LangChain will raise an error when constructing the tool schema or binding tools to the model. Constrain evaluation: deny globals and expose only the constants you want. This keeps the calculator deterministic and safe. With LangChain’s @tool decorator, the function signature and docstring are used to build the schema the model sees. You need to keep these precise, consistent and as explanatory as possible for the task-to-be-done efficiently. The docstring acts as the tool description and is a hard requirement JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

from LangChain, if it’s missing, you get an error. In addition, if the docstring is vague the model is more likely to produce invalid calls. Now, to give the LLM access to the tool, you just bind the tools to the LLM as shown in Example 1- 3. Example 1-3. Bind tools to LLM llm = ChatOpenAI(model="gpt-4o", temperature=0, max_tokens=800).bind_tools(tools, tool_choice="auto") Bind tools to the model. tool_choice="auto” lets the model decide whether and which tool to call. MODERN LLMS ARE BUILT FOR AGENTIC TASKS Modern LLMs are not only capable of calling tools, they are increasingly trained and optimized specifically for agentic tasks. For example, models such as Kimi K2, Llama 4, and Qwen3.6 are explicitly tuned for coding, tool use, and powering agentic systems. These models go beyond answering questions in a chat window. They are instruction tuned to act, invoking APIs, running code, or retrieving data as part of their core design. With tools bound, you can run a minimal stateless tool loop. The code in Example 1-4 executes any tool calls the model requests and feeds the tool results back as observations. In a tool-calling workflow, the assistant may produce a tool call instead of a final answer. After your code executes that tool, the latest message is the tool observation, and the model still needs one more turn to interpret the result and write the final response. For this reason, the loop includes a short wrap-up instruction that forces a final assistant message if the run reaches the step limit after a tool observation. Example 1-4. Stateless single run with tool loop def run_once(prompt: str, max_steps: int = 8) -> str: messages: List[HumanMessage | AIMessage | ToolMessage] = [ HumanMessage(content=prompt)] last_ai: AIMessage | None = None for _ in range(max_steps): ai: AIMessage = llm.invoke(messages) messages.append(ai) last_ai = ai calls = getattr(ai, "tool_calls", None) or [] if not calls: return messages[-1].content for call in calls: name = call["name"] args = call.get("args", {}) or {} result = tool_map[name].invoke(args) messages.append(ToolMessage( content=str(result), name=name, tool_call_id=call.get("id") )) messages.append(HumanMessage(content=""" Finish now. Give a short final answer in this exact format: Current temperature: JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Square of current temperature: """.strip())) final_ai: AIMessage = llm.invoke(messages) return final_ai.content print(run_once("""Two step task. Step 1: Use internet_search to get the current air temperature in New York City today. Show the exact query you used, the top source title and snippet, and extract a numeric temperature in Celsius. Return this temperature as feedback for Step 2. Step 2: Using the Celsius value from Step 1, compute its square with calculator. Show the exact expression you used and the numeric result. Important: Give a short final answer in this format: Current temperature: Square of current temperature:""")) Local short term memory for this run only. This makes the workflow stateless across runs. A simple step bound protects against runaway loops: LLMs cycling with no termination. One model step. The assistant may choose to call tools. Read structured tool calls. Attach the observation with the matching tool_call_id so the model can correlate results to calls. Tool cycles can end on an observation. This final instruction guarantees a closing assistant message in your requested format. Giving the LLM a State The first step in turning an LLM into an agent is to place it inside a stateful workflow. Instead of treating each model call as an isolated prompt and response, the workflow records where the agent is, what has already happened, and what information should carry forward. In this book, I use state to mean the structured runtime snapshot of that workflow, not just the text context passed into the model. The model’s context may be one part of the state, but state can also include routing information, tool results, checkpoints, branch identifiers, and other control data. Figure 1-5 compares stateless vs. stateful agent flows. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Figure 1-5. A stateless agent treats every call independently, while a stateful agent reads from and writes to shared state at every step. Based on input and logic, it transitions to the next state, and each state has an associated function or behavior. In LangGraph, this can be any Python type, but is typically a TypedDict or Pydantic BaseModel. The following outlines LangGraph’s key concepts to build stateful, adaptable AI agents. State A shared data structure that represents the current snapshot of an application. Nodes Functions that encode the agent’s logic. They take the current state, perform computation or side effects, and return an updated state. Edges Rules that select the next node based on the current state. They can be conditional branches or fixed transitions and should also detect when a finish condition is met. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Command An object that combines control flow and state updates to support multi-actor communication. A node can both update the state and choose the next node by returning a Command. So far, the tools you created earlier have only been used in stateless loops. Each run was independent: the model could call tools, get answers, and wrap up, but as soon as the run ended, all context was lost. Now, if you want to combine LangGraph’s concepts to build stateful agents, you need to embed the tools into a stateful workflow. Instead of discarding messages after each turn, you keep them in a structured state. This makes the system agentic: The LLM node reasons about the next step. The tool node executes external actions. Edges route control flow based on whether tools were requested. Memory checkpoints persist the conversation, so the agent can reuse earlier results in later turns. This transforms tool use from being a one-off helper into a continuous reasoning–acting–adapting cycle. In the next code listings (Examples 1-5 – 1-13), you’ll build such a stateful agent with LangGraph, starting from the same tools and LLM binding introduced in Examples 1-3 - 1-4. Example 1-5 shows how a state object can be implemented in LangGraph to carry the conversation and coordinate control flow. You can reuse the tools and the bound llm from Example 1-3 and add only what is new: a typed state, an LLM node that appends assistant messages, a tool node that executes calls, and a router that decides whether to continue or stop based on tool calls. Example 1-5. Build a minimal LangGraph with state, nodes, and routing class AgentState(TypedDict): messages: Annotated[List[BaseMessage], add_messages] def llm_node(state: AgentState) -> AgentState: ai = llm.invoke(state["messages"]) return {"messages": [ai]} tool_node = ToolNode(tools=tools) graph = StateGraph(AgentState) graph.add_node("llm", llm_node) graph.add_node("tools", tool_node) graph.add_edge(START, "llm") def route(state: AgentState): last = state["messages"][-1] calls = getattr(last, "tool_calls", None) or [] return "tools" if calls else END graph.add_conditional_edges("llm", route, {"tools": "tools", END: END}) graph.add_edge("tools", "llm") The state carries the conversation as a message list. add_messages handles safe merging across node updates. JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh JOIN THIS DISCORD SERVER IF YOU WANT THE COMPLETE PDF BOOK: https://discord.gg/Bk3Qz8u5kh

Statistics

Uploader

AI Agents the Definitive Guide (Early Release) - Design, Deployment, and Evaluation (Nicole Koenigstein)（Z-Library）

AI Reading Assistant

Passage locations

Tags

Text Preview (First 20 pages)

Registered users can read the full content for free

Recommended for You

Statistics

Uploader

AI Agents the Definitive Guide (Early Release) - Design, Deployment, and Evaluation (Nicole Koenigstein)（Z-Library）

AI Reading Assistant

Passage locations

Tags

Text Preview (First 20 pages)

Registered users can read the full content for free

Reply to Comment

Edit Comment

Recommended for You