PATTERNS FOR BUILDING AI AGENTS

SAM BHAGWAT
MICHELLE GIENOW
Copyright © 2025 by Sam Bhagwat & Michelle Gienow

All rights reserved. No part of this book may be reproduced in any form or by any electronic or mechanical means, including information storage and retrieval systems, without written permission from the author, except for the use of brief quotations in a book review.

Formatted with Vellum
CONTENTS

Introduction

PART I CONFIGURE YOUR AGENTS
From wishlist to working agent
1. Whiteboard Agent Capabilities
2. Evolve Your Agent Architecture
3. Dynamic Agents
4. Human-in-the-Loop

PART II ENGINEER AGENT CONTEXT
Intro to context engineering
5. Parallelize Carefully
6. Share Context Between Subagents
7. Avoid Context Failure Modes
8. Compress Context
9. Feed Errors Into Context

PART III EVALUATE AGENT RESPONSES
From MVP to production
10. List Failure Modes
11. List Critical Business Metrics
12. Cross-Reference Failure Modes and Success Metrics
13. Iterate Against Your Evals
14. Create an Eval Test Suite
15. Have SMEs Label Data
16. Create Datasets from Production Data
17. Evaluate Production Data

PART IV SECURE YOUR AGENTS
Autonomy is a two-edged sword
18. Prevent the Lethal Trifecta
19. Sandbox Code Execution
20. Granular Agent Access Control
21. Agent Guardrails

PART V THE FUTURE OF AGENTS
22. What’s Next(ish)

Notes
Also by Sam Bhagwat
INTRODUCTION

Here at Mastra, the open source TypeScript framework for building AI agents, we’ve had a front-row seat to how people are building agents.

Back in February 2025 (practically ancient times, in AI world), we published the popular guide Principles of Building AI Agents. In May, we updated the guide to include MCP, agentic RAG, and a few other emerging principles. But our work was far from finished.

2025 is the year of agents and, over the summer, we began to see a set of stories and guides emerging from prominent AI companies, model labs, and early-stage AI startups. The people pushing agents into production were publicly describing their successes (and failures).

Principles was textbook-style knowledge. This was messier and rougher, expressed in ah-ha! moments, lessons learned, and retrospectives — knowledge that sprawled across social media posts, Substacks, eng blogs, and Git repos.

Collecting and wrangling these lessons for our users eventually led to this book: Patterns for Building AI Agents — now Volume 2 of an eventual trilogy.

Principles are conceptual; patterns are pragmatic. While Principles of Building AI Agents covered what to build, Patterns covers how to build. Principles will get you through the first few weeks of building, but Patterns should be on your desk until its contents are imprinted in your mind.

We start off by sharing patterns for agent design and architecture, then dive into the art and science of context engineering. Next, we dig into the discipline of evals, the standard way for iterating on and refining agent quality. Finally, we talk about agent security, a field evolving in response to novel attack patterns. Agents are in the hands of early adopters — and attackers are enthusiastic early adopters!

Thanks for coming along as we all learn together. This book is a work in progress and a living document. Perhaps as you build, you’ll discover a new pattern that makes it into our next edition!

Sam Bhagwat
San Francisco, CA
October 2025
PART I CONFIGURE YOUR AGENTS
From wishlist to working agent

Building AI agents often starts with a whiteboard full of possibilities: dozens of processes to automate and tasks to offload amid grand visions of “AI can solve everything!” efficiency. How do you go from wishlist to working agent?

Teams struggle not because an agent can’t handle their use cases, but because they didn’t break down the problem in a way that maps to buildable systems. The agent design patterns in Part I address the fundamental configuration challenges that determine whether your agent will succeed or stall:

- Organizing dozens of capabilities into a coherent agent architecture.
- Building everything at once vs. discovering your system iteratively.
- Facing the reality that different users need different agent behaviors.
- Letting agents run autonomously vs. with human checkpoints.

These patterns may seem simple, but if you follow them you can build agents that are not just powerful but also reliable, maintainable, and trusted by their users.
1. WHITEBOARD AGENT CAPABILITIES

We’re in the decade of agents. There’s a huge number of valuable processes, currently performed by humans, that could be performed with some amount of AI assistance and automation. Deciding where to start, and how to build, is where the rubber meets the road.

Problem: Agent feature overload

There are two ways of specing out an agent: outside-in and inside-out.[1] The outside-in view is the grand view that generates enthusiasm: seeing dozens or hundreds of business problems and processes that you could potentially build or automate. The inside-out one, though, is the one that gets the job done. Often, it arrives by way of the exec team putting a massive wish list in front of an engineer who’s like, Yo, hold your horses.
Solution: Organizational design, for your agents

Imagine you were hiring a human team. What are the tasks you want performed, and how would you map them to the distinct roles that you need to fill? You’d end up doing some sort of organizational design, where you listed, sorted, and grouped capabilities, and wrote job specs from there.

It turns out that designing an agent architecture works the same way! We’ve done over 50 of these exercises, and they go roughly like this:[2]

- Write down everything you want your agent to do. Comprehensiveness is important. Keep asking, “What are we missing?”
- Group similar capabilities together:
  - Pulling from the same data sources
  - Could be performed by the same job title
  - Returned by the same API call
- Figure out the natural divisions:
  - Different responsible departments
  - Type of task (e.g., data fetching vs. synthesis vs. triggering actions)
  - Different steps in a business process
- Group related capabilities into agents.

There’s something magical to this exercise; we’ve done it with dozens of teams on whiteboards both physical and virtual. When the exercise starts, the structure isn’t clear. By the end, you typically have a list of agents with tools, rank ordered by priority.
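One way to capture the output of this exercise is as a plain data structure checked into your repo, so the grouping can be reviewed and re-sorted as the list grows. The sketch below is illustrative only; the capability names, grouping keys, and priorities are hypothetical, not taken from this book.

// capability-map.ts: a sketch of the whiteboard exercise output.
// All capabilities, groupings, and priorities here are hypothetical examples.

type Capability = {
  name: string;
  dataSources: string[];                // where the information lives
  kind: "fetch" | "synthesize" | "act"; // type of task
};

type AgentSpec = {
  name: string;
  priority: number;           // rank order from the whiteboard session
  capabilities: Capability[];
};

export const agentMap: AgentSpec[] = [
  {
    name: "support-agent",
    priority: 1,
    capabilities: [
      { name: "answer support tickets", dataSources: ["helpdesk"], kind: "act" },
      { name: "search product knowledge base", dataSources: ["docs"], kind: "fetch" },
    ],
  },
  {
    name: "sales-research-agent",
    priority: 2,
    capabilities: [
      { name: "query accounts in Salesforce", dataSources: ["salesforce"], kind: "fetch" },
      { name: "summarize conversation transcripts", dataSources: ["call transcripts"], kind: "synthesize" },
    ],
  },
];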
Example: Building a sales agent

An exec approaches the new head of AI: “We need to build agents to help our customer-facing staff spend less time in the CRM.”

The head of AI goes around the org and pulls out a dozen desired capabilities: This agent should help do customer research, query data in Salesforce, search conversation transcripts, categorize accounts based on a specific sales methodology, search product knowledge bases, answer support tickets, and a few others.

They sit down with their tech lead and a whiteboard. The support functionality is easy to pull out first. The sales functionality breaks down into three different stages of the process: customer discovery and research, account synthesis, and determining next steps.

They now have their agent architecture: a support agent, and a sales agent with three subagents. This design enables a focused toolset per agent while also boosting agent maintainability and scalability.

Related patterns

- Evolve Your Agent Architecture
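Written down in code, the architecture from this example might look like the sketch below: a support agent plus a sales agent whose three subagents each get a focused toolset. This is plain TypeScript with hypothetical names and tool identifiers, not any particular framework’s API.

// A sketch of the whiteboarded architecture: one support agent and a sales
// agent with three stage-specific subagents. Names and tools are placeholders.

interface AgentDef {
  name: string;
  instructions: string;
  tools: string[]; // identifiers of the tools this agent may call
}

const supportAgent: AgentDef = {
  name: "support",
  instructions: "Answer support tickets using the product knowledge base.",
  tools: ["searchKnowledgeBase", "answerTicket"],
};

const salesSubagents: AgentDef[] = [
  {
    name: "customer-research",
    instructions: "Do customer discovery and research.",
    tools: ["querySalesforce", "searchTranscripts"],
  },
  {
    name: "account-synthesis",
    instructions: "Categorize and summarize accounts per the sales methodology.",
    tools: ["querySalesforce", "categorizeAccount"],
  },
  {
    name: "next-steps",
    instructions: "Recommend concrete next steps for an account.",
    tools: ["searchKnowledgeBase", "draftFollowUpEmail"],
  },
];

// The parent sales agent mostly routes work to its subagents.
const salesAgent = { name: "sales", subagents: salesSubagents };

console.log([supportAgent.name, salesAgent.name, ...salesSubagents.map((a) => a.name)]);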
2. EVOLVE YOUR AGENT ARCHITECTURE

Complex AI workflows benefit from the same "divide and conquer" principles that work well in traditional software engineering.

Problem: Monolithic mega-agents

As you add more tasks to your agent, over time you can end up with a mega-agent that attempts to handle every conceivable task. Like Michael Scott in The Office, it will perform poorly and be prone to failure. The likelihood your agent will choose a wrong tool goes up with the number of tools. And the more complex a task, the greater number of choices the agent has to make — again increasing odds of failure.
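To get a feel for how those odds compound, here is a back-of-the-envelope calculation. The 95% per-decision accuracy is an assumed number, purely for illustration.

// If each tool or step choice is independently correct with probability p,
// a task that needs n correct choices in a row succeeds with probability p^n.
const p = 0.95; // assumed per-decision accuracy (illustrative)
for (const n of [1, 5, 10, 20]) {
  console.log(`${n} decisions -> ${(Math.pow(p, n) * 100).toFixed(0)}% end-to-end success`);
}
// 1 -> 95%, 5 -> 77%, 10 -> 60%, 20 -> 36%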
Solution: Group agent functionality together

The best agent architectures are discovered by iterating:[1]

1. List the tasks you want your agent to perform.
2. Start with the one burning problem.
3. Build that agent really well.
4. Notice what users ask for next.
5. If it’s separate, build a new agent.
6. If your agent becomes unwieldy, split it.
7. If you have multiple agents, add routing logic.
8. Repeat.

Each specialized agent has a cohesive toolchain, focused only on a specific domain, specific tools for its job, and clear success criteria. Most agents in production today have been built not as one mega-agent, but as orchestrated specialists.

Example: Content creation agent

Let’s say you’re an engineer working for a software company.

Iteration 1: Your product marketing manager (PMM) asks you to write LinkedIn posts about the latest feature you shipped. You hate writing these, so instead you start hacking on a LinkedIn post writer agent that creates LinkedIn posts in your org’s voice. You give it a brand guidelines doc and a tone analyzer. It writes your posts. Your PMM is jazzed.
Iteration 2: One Tuesday, you accidentally mention what you did. On Wednesday, your PMM asks: Can you also write social media posts to promote these features? Rather than overload your LinkedIn post writer, you build a social media agent that writes more casual short-form posts. Tools include a hashtag database, character counter, and emoji library.

Iteration 3: Rather than two different agents with two different frontends, you figure you’ll add a router agent that reads requests (“I need social posts for our product launch”), asks "Which platforms?" (or detects from context), and then routes to the LinkedIn agent OR social media agent OR both.

Iteration 4: After proudly presenting your social media agent to your PMM, they immediately give you another request: “Can you write a blog post to link from these social posts?” You sigh, then begin work on a blog writer agent. This agent researches keywords, references your docs, and writes long form. Tools include an SEO keyword API, competitor content analyzer (because you’re getting confident now!), and a style guide.

Blog traffic increases, but there’s a hiccup. Every other post, the agent will hallucinate some feature you didn’t actually ship.
Iteration 5: So you add a content coordination step in front of the router agent. It extracts key messages/features from product briefs, then passes consistent talking points to the router, which passes them to specialist agents. Now you have sequential chaining: Coordinator → Router → Specialists.

Your PMM thanks you and then promptly decamps to Burning Man, where they organize a decentralized autonomous art collective on The Playa.

Your final architecture: Given a request, the content coordinator adds context and hands off to the router agent, which directs the task to the appropriate LinkedIn, social media, and/or blog agents for them to work in parallel or in sequence.

The magic: Each agent is really good at its format. You never built some "master content agent" that writes mediocre everything. You discovered the architecture by solving one problem at a time.

Related patterns

- Whiteboard Agent Capabilities
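The Coordinator → Router → Specialists chain can be sketched in plain TypeScript. This is a minimal, framework-agnostic illustration rather than the book’s implementation: the function names and keyword-based routing are hypothetical, and in a real system each step would call an LLM-backed agent instead of returning canned strings.

// Sequential chaining sketch: Coordinator -> Router -> Specialists.
// Hypothetical stand-ins for LLM-backed agents.

type TalkingPoints = { feature: string; keyMessages: string[] };
type Channel = "linkedin" | "social" | "blog";

// Coordinator: extract consistent talking points from a product brief.
function coordinate(productBrief: string): TalkingPoints {
  // In a real system this would be an agent call; here it is stubbed.
  return { feature: productBrief, keyMessages: [`Why ${productBrief} matters`] };
}

// Router: decide which specialist(s) should handle the request.
function route(request: string): Channel[] {
  const channels: Channel[] = [];
  if (/linkedin/i.test(request)) channels.push("linkedin");
  if (/social|tweet/i.test(request)) channels.push("social");
  if (/blog/i.test(request)) channels.push("blog");
  return channels.length ? channels : ["social"];
}

// Specialists: each one only knows its own format.
const specialists: Record<Channel, (tp: TalkingPoints) => string> = {
  linkedin: (tp) => `LinkedIn post about ${tp.feature}: ${tp.keyMessages.join(" ")}`,
  social: (tp) => `Short post: ${tp.feature} is live!`,
  blog: (tp) => `Long-form draft covering ${tp.keyMessages.join(", ")}`,
};

// Wire the chain together: coordinate, route, then fan out to specialists.
function createContent(request: string, productBrief: string): string[] {
  const talkingPoints = coordinate(productBrief);
  return route(request).map((channel) => specialists[channel](talkingPoints));
}

console.log(createContent("I need social posts for our product launch", "Realtime collaboration"));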
3. DYNAMIC AGENTS

Agents need to be both scalable *and* personal. Your agent’s configuration may need to change based on things like the user’s query or the user’s identity.

Problem: User realities change (constantly)

Agents have system prompts, tools, memory, and model configurations. In some cases, these should hold for all queries. In others, the configurations should be more responsive. You have a full spectrum of different user types and agent scenarios, but you may not want to create new agents, or maintain multiple versions, for each one.

Solution: Agents that can adapt

Allow agents’ capabilities to adapt dynamically at runtime. A dynamic agent[1] can adjust things like how it reasons, which tools it uses, how much memory it keeps, and which model it invokes — all based on runtime signals like user roles, preferences, or system state. (Some agent frameworks offer tools to help with this.)

This reduces redundancy, increases customization potential, and allows cost/behavior trade-offs, but introduces complexity in logic, testing, and consistency.

Example: User differentiation

A support agent capable of differentiating between free tier users, “pro” users, and enterprise accounts can dynamically scale customer support:

- Free tier users get basic support with documentation links.
- Pro users receive detailed technical support.
- Enterprise users get priority support escalated to humans with custom solutions.

The same agent can dynamically prioritize tool selection and scale model access:

- Free tier and pro users get semanticRecall topK = 8, but enterprise users get topK = 15.
- If userTier is “enterprise” use GPT-5, else use GPT-3.5.

Mastra’s runtimeContext API allows access to external values like user metadata, session state, and environment variables, which are passed into the agent so it can make decisions.
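To make the tier logic concrete, here is a minimal, framework-agnostic sketch of the selection step as a pure function. The tier names, model strings, and option shape mirror the example above but are otherwise hypothetical; wiring the result into Mastra’s runtimeContext (or another framework’s equivalent) should follow that framework’s documentation.

// Tier-based dynamic configuration expressed as a pure function.
// Tiers, model names, and the option shape are illustrative placeholders.

type UserTier = "free" | "pro" | "enterprise";

interface AgentOptions {
  model: string;              // which model to invoke
  semanticRecallTopK: number; // how much memory to retrieve
  escalateToHuman: boolean;   // priority handling for enterprise accounts
}

function optionsForTier(tier: UserTier): AgentOptions {
  switch (tier) {
    case "enterprise":
      return { model: "gpt-5", semanticRecallTopK: 15, escalateToHuman: true };
    default: // free and pro tiers share the cheaper configuration
      return { model: "gpt-3.5", semanticRecallTopK: 8, escalateToHuman: false };
  }
}

// At request time, read the tier from wherever your framework exposes
// per-request state (e.g., Mastra’s runtimeContext) and build the agent’s
// configuration from it.
console.log(optionsForTier("enterprise")); // { model: "gpt-5", semanticRecallTopK: 15, escalateToHuman: true }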