Uploader

高宏飞

Shared on 2025-11-22

Author: Micheal Lanham

Create LLM-powered autonomous agents and intelligent assistants tailored to your business and personal needs.

From script-free customer service chatbots to fully independent agents operating seamlessly in the background, AI-powered assistants represent a breakthrough in machine intelligence. In AI Agents in Action, you'll master a proven framework for developing practical agents that handle real-world business and personal tasks. Author Micheal Lanham combines cutting-edge academic research with hands-on experience to help you:

- Understand and implement AI agent behavior patterns
- Design and deploy production-ready intelligent agents
- Leverage the OpenAI Assistants API and complementary tools
- Implement robust knowledge management and memory systems
- Create self-improving agents with feedback loops
- Orchestrate collaborative multi-agent systems
- Enhance agents with speech and vision capabilities

You won't find toy examples or fragile assistants that require constant supervision. AI Agents in Action teaches you to build trustworthy AI capable of handling high-stakes negotiations. You'll master prompt engineering to create agents with distinct personas and profiles, and develop multi-agent collaborations that thrive in unpredictable environments. Beyond just learning a new technology, you'll discover a transformative approach to problem-solving.

About the book
In AI Agents in Action, you’ll learn how to build production-ready assistants, multi-agent systems, and behavioral agents. You’ll master the essential parts of an agent, including retrieval-augmented knowledge and memory, while you create multi-agent applications that can use software tools, plan tasks autonomously, and learn from experience. As you explore the many interesting examples, you’ll work with state-of-the-art tools like OpenAI Assistants API, GPT Nexus, LangChain, Prompt Flow, AutoGen, and CrewAI.

About the reader
For intermediate Python programmers.

ISBN: 1633436349
Publish Year: 2025
Language: English
Pages: 345
File Format: PDF
File Size: 30.0 MB
Text Preview (First 20 pages)

MANNING
Micheal Lanham
[Figure: The differences between the LLM interactions from direct action compared to using proxy agents, agents, and autonomous agents. Four panels:
- No agent or assistant, direct connection to LLM: the user asks "Please explain the definition of agent." and the large language model (ChatGPT) answers directly.
- Agent/assistant proxy for image generator: the user asks "Show an illustration of an agent."; the large language model (ChatGPT) writes an image prompt for the image generation model (DALL-E 3).
- Agent/assistant acting on behalf of user: the user asks "What is the temperature in Calgary today?"; the LLM identifies an external function API to call and parameters to connect to a weather service, asks the user if it's okay to execute the function on their behalf, executes the function once the user confirms, then reformulates the weather information and responds to the user.
- Autonomous agent making decisions on behalf of user: the user asks "Filter my emails by importance and notify me of the top 5 most important emails."; the LLM identifies an external function API to call and parameters to connect to an email service, reads and sorts emails by what it deems to be important (decision step), and notifies the user of important emails.]
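The assistant and autonomous-agent flows described in the figure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the book: the LLM is replaced by a hard-coded stub so the example runs offline, and every name here (`FunctionCall`, `get_weather`, `stub_llm_pick_function`, `run_assistant`, `run_autonomous_agent`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FunctionCall:
    """A function the (stubbed) LLM has decided to call, with its arguments."""
    name: str
    arguments: Dict[str, str]


def get_weather(city: str) -> str:
    # Hypothetical stand-in for a real weather service; returns canned text.
    return f"It is -5 C and snowing in {city}."


# Registry of functions the assistant is allowed to call on the user's behalf.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def stub_llm_pick_function(prompt: str) -> FunctionCall:
    # A real LLM would choose the function and parameters from the prompt.
    # This stub hard-codes the choice (assumption made for the sketch).
    return FunctionCall(name="get_weather", arguments={"city": "Calgary"})


def run_assistant(prompt: str, confirm: Callable[[FunctionCall], bool]) -> str:
    call = stub_llm_pick_function(prompt)
    # Assistant behavior: ask the user before acting on their behalf.
    if not confirm(call):
        return "Okay, I will not run that function."
    result = TOOLS[call.name](**call.arguments)
    # A real LLM would reformulate the raw result; here we only add a prefix.
    return f"Here is what I found: {result}"


def run_autonomous_agent(prompt: str) -> str:
    # Autonomous behavior: the same flow with the confirmation step removed.
    return run_assistant(prompt, confirm=lambda call: True)


if __name__ == "__main__":
    question = "What is the temperature in Calgary today?"
    print(run_assistant(question, confirm=lambda call: False))
    print(run_autonomous_agent(question))
```

In this sketch the only difference between the two modes is whether a confirmation callback can veto the call, which mirrors the distinction the figure draws between an agent acting on behalf of the user and one making decisions for the user.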
AI Agents in Action
AI Agents in Action
MICHEAL LANHAM
MANNING
SHELTER ISLAND
For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact Special Sales Department Manning Publications Co. 20 Baldwin Road PO Box 761 Shelter Island, NY 11964 Email: orders@manning.com ©2025 by Manning Publications Co. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps. Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine. The authors and publisher have made every effort to ensure that the information in this book was correct at press time. The authors and publisher do not assume and hereby disclaim any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause, or from any usage of the information herein. Manning Publications Co. 
20 Baldwin Road, PO Box 761, Shelter Island, NY 11964
Development editor: Becky Whitney
Technical editor: Ross Turner
Review editor: Kishor Rit
Production editor: Keri Hales
Copy editor: Julie McNamee
Proofreader: Katie Tennant
Technical proofreader: Ross Turner
Typesetter: Dennis Dalinnik
Cover designer: Marija Tudor
ISBN: 9781633436343
Printed in the United States of America
I dedicate this book to all the readers who embark on this journey with me. Books are a powerful way for an author to connect with readers on a deeply personal level, chapter by chapter, page by page. In that shared experience of learning, exploring, and growing together, I find true meaning. May this book inspire you and challenge you, and help you see the incredible potential that AI agents hold, not just for the future but also for today.
brief contents
1 Introduction to agents and their world
2 Harnessing the power of large language models
3 Engaging GPT assistants
4 Exploring multi-agent systems
5 Empowering agents with actions
6 Building autonomous assistants
7 Assembling and using an agent platform
8 Understanding agent memory and knowledge
9 Mastering agent prompts with prompt flow
10 Agent reasoning and evaluation
11 Agent planning and feedback
contents
preface
acknowledgments
about this book
about the author
about the cover illustration
1 Introduction to agents and their world
  1.1 Defining agents
  1.2 Understanding the component systems of an agent
  1.3 Examining the rise of the agent era: Why agents?
  1.4 Peeling back the AI interface
  1.5 Navigating the agent landscape
2 Harnessing the power of large language models
  2.1 Mastering the OpenAI API
    Connecting to the chat completions model
    Understanding the request and response
  2.2 Exploring open source LLMs with LM Studio
    Installing and running LM Studio
    Serving an LLM locally with LM Studio
  2.3 Prompting LLMs with prompt engineering
    Creating detailed queries
    Adopting personas
    Using delimiters
    Specifying steps
    Providing examples
    Specifying output length
  2.4 Choosing the optimal LLM for your specific needs
  2.5 Exercises
3 Engaging GPT assistants
  3.1 Exploring GPT assistants through ChatGPT
  3.2 Building a GPT that can do data science
  3.3 Customizing a GPT and adding custom actions
    Creating an assistant to build an assistant
    Connecting the custom action to an assistant
  3.4 Extending an assistant’s knowledge using file uploads
    Building the Calculus Made Easy GPT
    Knowledge search and more with file uploads
  3.5 Publishing your GPT
    Expensive GPT assistants
    Understanding the economics of GPTs
    Releasing the GPT
  3.6 Exercises
4 Exploring multi-agent systems
  4.1 Introducing multi-agent systems with AutoGen Studio
    Installing and using AutoGen Studio
    Adding skills in AutoGen Studio
  4.2 Exploring AutoGen
    Installing and consuming AutoGen
    Enhancing code output with agent critics
    Understanding the AutoGen cache
  4.3 Group chat with agents and AutoGen
  4.4 Building an agent crew with CrewAI
    Creating a jokester crew of CrewAI agents
    Observing agents working with AgentOps
  4.5 Revisiting coding agents with CrewAI
  4.6 Exercises
5 Empowering agents with actions
  5.1 Defining agent actions
  5.2 Executing OpenAI functions
    Adding functions to LLM API calls
    Actioning function calls
  5.3 Introducing Semantic Kernel
    Getting started with SK semantic functions
    Semantic functions and context variables
  5.4 Synergizing semantic and native functions
    Creating and registering a semantic skill/plugin
    Applying native functions
    Embedding native functions within semantic functions
  5.5 Semantic Kernel as an interactive service agent
    Building a semantic GPT interface
    Testing semantic services
    Interactive chat with the semantic service layer
  5.6 Thinking semantically when writing semantic services
  5.7 Exercises
6 Building autonomous assistants
  6.1 Introducing behavior trees
    Understanding behavior tree execution
    Deciding on behavior trees
    Running behavior trees with Python and py_trees
  6.2 Exploring the GPT Assistants Playground
    Installing and running the Playground
    Using and building custom actions
    Installing the assistants database
    Getting an assistant to run code locally
    Investigating the assistant process through logs
  6.3 Introducing agentic behavior trees
    Managing assistants with assistants
    Building a coding challenge ABT
    Conversational AI systems vs. other methods
    Posting YouTube videos to X
    Required X setup
  6.4 Building conversational autonomous multi-agents
  6.5 Building ABTs with back chaining
  6.6 Exercises
7 Assembling and using an agent platform
  7.1 Introducing Nexus, not just another agent platform
    Running Nexus
    Developing Nexus
  7.2 Introducing Streamlit for chat application development
    Building a Streamlit chat application
    Creating a streaming chat application
  7.3 Developing profiles and personas for agents
  7.4 Powering the agent and understanding the agent engine
  7.5 Giving an agent actions and tools
  7.6 Exercises
8 Understanding agent memory and knowledge
  8.1 Understanding retrieval in AI applications
  8.2 The basics of retrieval augmented generation (RAG)
  8.3 Delving into semantic search and document indexing
    Applying vector similarity search
    Vector databases and similarity search
    Demystifying document embeddings
    Querying document embeddings from Chroma
  8.4 Constructing RAG with LangChain
    Splitting and loading documents with LangChain
    Splitting documents by token with LangChain
  8.5 Applying RAG to building agent knowledge
  8.6 Implementing memory in agentic systems
    Consuming memory stores in Nexus
    Semantic memory and applications to semantic, episodic, and procedural memory
  8.7 Understanding memory and knowledge compression
  8.8 Exercises
9 Mastering agent prompts with prompt flow
  9.1 Why we need systematic prompt engineering
  9.2 Understanding agent profiles and personas
  9.3 Setting up your first prompt flow
    Getting started
    Creating profiles with Jinja2 templates
    Deploying a prompt flow API
  9.4 Evaluating profiles: Rubrics and grounding
  9.5 Understanding rubrics and grounding
  9.6 Grounding evaluation with an LLM profile
  9.7 Comparing profiles: Getting the perfect profile
    Parsing the LLM evaluation output
    Running batch processing in prompt flow
    Creating an evaluation flow for grounding
    Exercises
10 Agent reasoning and evaluation
  10.1 Understanding direct solution prompting
    Question-and-answer prompting
    Implementing few-shot prompting
    Extracting generalities with zero-shot prompting
  10.2 Reasoning in prompt engineering
    Chain of thought prompting
    Zero-shot CoT prompting
    Step by step with prompt chaining
  10.3 Employing evaluation for consistent solutions
    Evaluating self-consistency prompting
    Evaluating tree of thought prompting
  10.4 Exercises
11 Agent planning and feedback
  11.1 Planning: The essential tool for all agents/assistants
  11.2 Understanding the sequential planning process
  11.3 Building a sequential planner
  11.4 Reviewing a stepwise planner: OpenAI Strawberry
  11.5 Applying planning, reasoning, evaluation, and feedback to assistant and agentic systems
    Application of assistant/agentic planning
    Application of assistant/agentic reasoning
    Application of evaluation to agentic systems
    Application of feedback to agentic/assistant applications
  11.6 Exercises
appendix A Accessing OpenAI large language models
appendix B Python development environment
index
preface

My journey into the world of intelligent systems began back in the early 1980s. Like many people then, I believed artificial intelligence (AI) was just around the corner. It always seemed like one more innovation and technological leap would lead us to the intelligence we imagined. But that leap never came.

Perhaps the promise of HAL, from Stanley Kubrick’s 2001: A Space Odyssey, captivated me with the idea of a truly intelligent computer companion. After years of effort, trial, and countless errors, I began to understand that creating AI was far more complex than we humans had imagined. In the early 1990s, I shifted my focus, applying my skills to more tangible goals in other industries.

Not until the late 1990s, after experiencing a series of challenging and transformative events, did I realize my passion for building intelligent systems. I knew these systems might never reach the superintelligence of HAL, but I was okay with that. I found fulfillment in working with machine learning and data science, creating models that could learn and adapt. For more than 20 years, I thrived in this space, tackling problems that required creativity, precision, and a sense of possibility.

During that time, I worked on everything from genetic algorithms for predicting unknown inputs to developing generative learning models for horizontal drilling in the oil-and-gas sector. These experiences led me to write, where I shared my knowledge by way of books on various topics: reverse-engineering Pokémon Go, building augmented and virtual reality experiences, designing audio for games, and applying reinforcement learning to create intelligent agents. I spent years knuckles-deep in code, developing agents in Unity ML-Agents and deep reinforcement learning.

Even then, I never imagined that one day I could simply describe what I wanted to an AI model, and it would make it happen. I never imagined that, in my lifetime, I would be able to collaborate with an AI as naturally as I do today. And I certainly never imagined how fast, and simultaneously how slow, this journey would feel.

In November 2022, the release of ChatGPT changed everything. It changed the world’s perception of AI, and it changed the way we build intelligent systems. For me, it also altered my perspective on the capabilities of these systems. Suddenly, the idea of agents that could autonomously perform complex tasks wasn’t just a far-off dream but instead a tangible, achievable reality. In some of my earlier books, I had described agentic systems that could undertake specific tasks, but now, those once-theoretical ideas were within reach.

This book is the culmination of my decades of experience in building intelligent systems, but it’s also a realization of the dreams I once had about what AI could become. AI agents are here, poised to transform how we interact with technology, how we work, and, ultimately, how we live.

Yet, even now, I see hesitation from organizations when it comes to adopting agentic systems. I believe this hesitation stems not from fear of AI but rather from a lack of understanding and expertise in building these systems. I hope that this book helps to bridge that gap. I want to introduce AI agents as tools that can be accessible to everyone: tools we shouldn’t fear but instead respect, manage responsibly, and learn to work with in harmony.
acknowledgments

I want to extend my deepest gratitude to the machine learning and deep learning communities for their tireless dedication and incredible work. Just a few short years ago, many questioned whether the field was headed for another AI winter, a period of stagnation and doubt. But thanks to the persistence, brilliance, and passion of countless individuals, the field not only persevered but also flourished. We’re standing on the threshold of an AI-driven future, and I am endlessly grateful for the contributions of this talented community.

Writing a book, even with the help of AI, is no small feat. It takes dedication, collaboration, and a tremendous amount of support. I am incredibly thankful to the team of editors and reviewers who made this book possible. I want to express my heartfelt thanks to everyone who took the time to review and provide feedback. In particular, I want to thank Becky Whitney, my content editor, and Ross Turner, my technical editor and chief production and technology officer at OpenSC, for their dedication, as well as the whole production team at Manning for their insight and unwavering support throughout this journey.

To my partner, Rhonda: your love, patience, and encouragement mean the world to me. You’ve been the cornerstone of my support system, not just for this book but for all the books that have come before. I truly couldn’t have done any of this without you. Thank you for being my rock, my partner, and my inspiration.

Many of the early ideas for this book grew out of my work at Symend. It was during my time there that I first began developing the concepts and designs for agentic systems that laid the foundation for this book. I am deeply grateful to my colleagues at Symend for their collaboration and contributions, including Peh Teh, Andrew Wright, Ziko Rajabali, Chris Garrett, Kouros, Fatemeh Torabi Asr, Sukh Singh, and Hanif Joshaghani. Your insights and hard work helped bring these ideas to life, and I am honored to have worked alongside such an incredible group of people.

Finally, I would like to thank all the reviewers: Anandaganesh Balakrishnan, Aryan Jadon, Chau Giang, Dan Sheikh, David Curran, Dibyendu Roy Chowdhury, Divya Bhargavi, Felipe Provezano Coutinho, Gary Pass, John Williams, Jose San Leandro, Laurence Giglio, Manish Jain, Maxim Volgin, Michael Wang, Mike Metzger, Piti Champeethong, Prashant Dwivedi, Radhika Kanubaddhi, Rajat Kant Goel, Ramaa Vissa, Richard Vaughan, Satej Kumar Sahu, Sergio Gtz, Siva Dhandapani, Annamaneni Sriharsha, Sri Ram Macharla, Sumit Bhattacharyya, Tony Holdroyd, Vidal Graupera, Vidhya Vinay, and Vinoth Nageshwaran. Your suggestions helped make this a better book.
about this book

AI Agents in Action is about building and working with intelligent agent systems: not just creating autonomous entities but also developing agents that can effectively tackle and solve real-world problems. The book starts with the basics of working with large language models (LLMs) to build assistants, multi-agent systems, and agentic behavioral agents. From there, it explores the key components of agentic systems: retrieval systems for knowledge and memory augmentation, action and tool usage, reasoning, planning, evaluation, and feedback. The book demonstrates how these components empower agents to perform a wide range of complex tasks through practical examples.

This journey isn’t just about technology; it’s about reimagining how we approach problem solving. I hope this book inspires you to see intelligent agents as partners in innovation, capable of transforming ideas into actions in ways that were once thought impossible. Together, we’ll explore how AI can augment human potential, enabling us to achieve far more than we could alone.

Who should read this book

This book is for anyone curious about intelligent agents and how to develop agentic systems, whether you’re building your first helpful assistant or diving deeper into complex multi-agent systems. No prior experience with agents, agentic systems, prompt engineering, or working with LLMs is required. All you need is a basic understanding of Python and familiarity with GitHub repositories. My goal is to make these concepts accessible and engaging, empowering anyone who wants to explore the world of AI agents to do so with confidence.

Whether you’re a developer, researcher, or hobbyist or are simply intrigued by the possibilities of AI, this book is for you. I hope that in these pages you’ll find inspiration, practical guidance, and a new appreciation for the remarkable potential of intelligent agents. Let this book guide you in understanding, creating, and unleashing the power of AI agents in action.

How this book is organized: A road map

This book has 11 chapters. Chapter 1, “Introduction to agents and their world,” begins by laying a foundation with fundamental definitions of large language models, chat systems, assistants, and autonomous agents. As the book progresses, the discussion shifts to the key components that make up an agent and how these components work together to create truly effective systems. Here is a quick summary of chapters 2 through 11:

- Chapter 2, “Harnessing the power of large language models”: We start by exploring how to use commercial LLMs, such as those from OpenAI. We then examine tools, such as LM Studio, that provide the infrastructure and support for running various open source LLMs, enabling anyone to experiment and innovate.
- Chapter 3, “Engaging GPT assistants”: This chapter dives into the capabilities of the GPT Assistants platform from OpenAI. Assistants are foundational agent types, and we explore how to create practical and diverse assistants, from culinary helpers to intern data scientists and even a book learning assistant.
- Chapter 4, “Exploring multi-agent systems”: Agentic tools have advanced significantly and quickly. Here, we explore two sophisticated multi-agent systems: CrewAI and AutoGen. We demonstrate AutoGen’s ability to develop code autonomously and see how CrewAI can bring together a group of joke researchers to create humor collaboratively.
- Chapter 5, “Empowering agents with actions”: Actions are fundamental to any agentic system. This chapter discusses how agents can use tools and functions to execute actions, ranging from database and application programming interface (API) queries to generating images. We focus on enabling agents to take meaningful actions autonomously.
- Chapter 6, “Building autonomous assistants”: We explore the behavior tree, a staple in robotics and game systems, as a mechanism to orchestrate multiple coordinated agents. We’ll use behavior trees to tackle challenges such as code competitions and social media content creation.
- Chapter 7, “Assembling and using an agent platform”: This chapter introduces Nexus, a sophisticated platform for orchestrating multiple agents and LLMs. We discuss how Nexus facilitates agentic workflows and enables complex interactions between agents, providing an example of a fully functioning multi-agent environment.
- Chapter 8, “Understanding agent memory and knowledge”: Retrieval-augmented generation (RAG) has become an essential tool for extending the capabilities