
The Developer's Playbook for Large Language Model Security: Building Secure AI Applications

Author: Steve Wilson

Category: Science

Large language models (LLMs) aren't just shaping the trajectory of AI; they're also unveiling a new era of security challenges. This practical book takes you straight to the heart of these threats. Author Steve Wilson, chief product officer at Exabeam, focuses exclusively on LLMs, eschewing generalized AI security to delve into the unique characteristics and vulnerabilities inherent in these models. Complete with collective wisdom gained from the creation of the OWASP Top 10 for LLMs list—a feat accomplished by more than 400 industry experts—this guide delivers real-world guidance and practical strategies to help developers and security teams grapple with the realities of LLM applications. Whether you're architecting a new application or adding AI features to an existing one, this book is your go-to resource for mastering the security landscape of the next frontier in AI.

You'll learn:

• Why LLMs present unique security challenges
• How to navigate the many risk conditions associated with using LLM technology
• The threat landscape pertaining to LLMs and the critical trust boundaries that must be maintained
• How to identify the top risks and vulnerabilities associated with LLMs
• Methods for deploying defenses to protect against attacks on top vulnerabilities
• Ways to actively manage critical trust boundaries on your systems to ensure secure execution and risk minimization

📄 File Format: PDF
💾 File Size: 4.9 MB

📄 Text Preview (First 20 pages)


📄 Page 1
(This page has no text content)
📄 Page 2
COMPUTER NETWORK SECURITY

“Steve Wilson’s playbook is essential for AI developers and red teamers. It transforms the enormous risks into manageable challenges, providing the expertise to secure LLM-based apps.”
—Marten Mickos, CEO, HackerOne

“A must-read for innovators, delivered by the father of LLM security, Steve Wilson.”
—Sherri Douville, CEO, Medigram

“Steve Wilson’s invaluable industry expertise, paired with his unique dynamic approach to a rapidly shifting landscape, makes this a must-read.”
—Ads Dawson, senior security engineer, Cohere

The Developer’s Playbook for Large Language Model Security

Large language models (LLMs) aren’t just shaping the trajectory of AI; they’re also unveiling a new era of security challenges. This practical book takes you straight to the heart of these threats. Author Steve Wilson, the project lead for the OWASP Top 10 for LLM applications, focuses on the unique characteristics and vulnerabilities you must handle when building software using LLMs. This handbook for developers and security teams delivers real-world guidance and actionable strategies to help you grapple with LLM applications. Whether you’re architecting a new application or adding AI features to an existing one, this book is your go-to resource for mastering the security landscape of the next frontier in AI.

You’ll learn:
• Why LLMs present unique security challenges
• How to navigate risks associated with using LLM technology
• The threat landscape pertaining to LLMs and the critical trust boundaries that must be maintained
• Methods for deploying defenses to protect against attacks on top vulnerabilities
• Ways to improve your software development process to ensure you’re building safe and secure AI applications

Steve Wilson is the chief product officer at Exabeam and a recognized leader in the fields of AI and cybersecurity. He has over 25 years of experience building software platforms at multibillion-dollar technology companies such as Citrix, Oracle, and Sun Microsystems. Steve is the author of Java Platform Performance: Strategies and Tactics.

linkedin.com/company/oreilly-media | youtube.com/oreillymedia
US $79.99 | CAN $99.99
ISBN: 978-1-098-16220-7
📄 Page 3
Praise for The Developer’s Playbook for Large Language Model Security

Steve Wilson’s playbook is essential for AI developers and red teamers. It transforms the enormous risks into manageable challenges, providing the expertise to secure customer-facing and internal LLM-based apps.
—Marten Mickos, CEO, HackerOne

A must-read for innovators, delivered by the father of LLM security, Steve Wilson. Essential for leaders, this book delivers crucial insights into securing LLM technologies.
—Sherri Douville, CEO, Medigram

Steve Wilson’s invaluable industry expertise, paired with his unique dynamic approach to a rapidly shifting landscape, makes this a must-read. Drawing from my experience in AI red teaming, I wholeheartedly advocate for this book’s pinnacle full-stack approach and rigorous, multi-faceted insights.
—Ads Dawson, senior security engineer, Cohere

The Developer’s Playbook for Large Language Model Security is a critical and comprehensive guide for the security industry as we race to keep pace with the rapid adoption of GenAI and LLMs and ensure secure organizational outcomes.
—Chris Hughes, president, Aquia & founder, Resilient Cyber
📄 Page 4
This book is insightful, clear, crisp, and succinct, yet detailed. It explores the spectrum of crucial topics, including LLM architectures, trust boundaries, RAG, prompt injection, and excessive agency. If you are working with LLMs, you need to read and understand this book.
—Krishna Sankar, Distinguished AI engineer & NIST AI Safety Institute principal investigator

In The Developer’s Playbook for Large Language Model Security, readers embark on an entertaining and exciting journey to the LLM security frontier. Steve Wilson provides a compass to navigate LLM security, where the thrill of innovation meets high stakes and real-world consequences.
—Sandy Dunn, CISO, Brand Engagement Networks
📄 Page 5
Steve Wilson

The Developer’s Playbook for Large Language Model Security
Building Secure AI Applications

Beijing • Boston • Farnham • Sebastopol • Tokyo
📄 Page 6
The Developer’s Playbook for Large Language Model Security
by Steve Wilson

Copyright © 2024 Stephen Wilson. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisition Editor: Nicole Butterfield
Development Editor: Jeff Bleiel
Production Editor: Aleeya Rahman
Copyeditor: Penelope Perkins
Proofreader: Piper Editorial Consulting, LLC
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

September 2024: First Edition

Revision History for the First Edition
2024-09-03: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098162207 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. The Developer’s Playbook for Large Language Model Security, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the author and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

ISBN: 978-1-098-16220-7 [LSI]
📄 Page 7
Table of Contents

Preface  xi

1. Chatbots Breaking Bad  1
    Let’s Talk About Tay  1
    Tay’s Rapid Decline  2
    Why Did Tay Break Bad?  3
    It’s a Hard Problem  4

2. The OWASP Top 10 for LLM Applications  7
    About OWASP  8
    The Top 10 for LLM Applications Project  9
    Project Execution  9
    Reception  10
    Keys to Success  10
    This Book and the Top 10 List  12

3. Architectures and Trust Boundaries  13
    AI, Neural Networks, and Large Language Models: What’s the Difference?  13
    The Transformer Revolution: Origins, Impact, and the LLM Connection  14
    Origins of the Transformer  15
    Transformer Architecture’s Impact on AI  15
    Types of LLM-Based Applications  16
    LLM Application Architecture  18
    Trust Boundaries  19
    The Model  21
    User Interaction  22
    Training Data  23
    Access to Live External Data Sources  24
📄 Page 8
    Access to Internal Services  25
    Conclusion  26

4. Prompt Injection  27
    Examples of Prompt Injection Attacks  28
    Forceful Suggestion  28
    Reverse Psychology  29
    Misdirection  29
    Universal and Automated Adversarial Prompting  31
    The Impacts of Prompt Injection  31
    Direct Versus Indirect Prompt Injection  32
    Direct Prompt Injection  33
    Indirect Prompt Injection  33
    Key Differences  34
    Mitigating Prompt Injection  34
    Rate Limiting  35
    Rule-Based Input Filtering  35
    Filtering with a Special-Purpose LLM  36
    Adding Prompt Structure  36
    Adversarial Training  38
    Pessimistic Trust Boundary Definition  39
    Conclusion  40

5. Can Your LLM Know Too Much?  41
    Real-World Examples  41
    Lee Luda  42
    GitHub Copilot and OpenAI’s Codex  43
    Knowledge Acquisition Methods  44
    Model Training  45
    Foundation Model Training  45
    Security Considerations for Foundation Models  46
    Model Fine-Tuning  47
    Training Risks  47
    Retrieval-Augmented Generation  49
    Direct Web Access  50
    Accessing a Database  54
    Learning from User Interaction  58
    Conclusion  60

6. Do Language Models Dream of Electric Sheep?  61
    Why Do LLMs Hallucinate?  62
    Types of Hallucinations  63
📄 Page 9
    Examples  63
    Imaginary Legal Precedents  63
    Airline Chatbot Lawsuit  65
    Unintentional Character Assassination  66
    Open Source Package Hallucinations  67
    Who’s Responsible?  68
    Mitigation Best Practices  69
    Expanded Domain-Specific Knowledge  70
    Chain of Thought Prompting for Increased Accuracy  71
    Feedback Loops: The Power of User Input in Mitigating Risks  72
    Clear Communication of Intended Use and Limitations  74
    User Education: Empowering Users Through Knowledge  75
    Conclusion  77

7. Trust No One  79
    Zero Trust Decoded  80
    Why Be So Paranoid?  81
    Implementing a Zero Trust Architecture for Your LLM  81
    Watch for Excessive Agency  83
    Securing Your Output Handling  85
    Building Your Output Filter  88
    Looking for PII with Regex  88
    Evaluating for Toxicity  89
    Linking Your Filters to Your LLM  90
    Sanitize for Safety  91
    Conclusion  92

8. Don’t Lose Your Wallet  93
    DoS Attacks  94
    Volume-Based Attacks  94
    Protocol Attacks  95
    Application Layer Attacks  95
    An Epic DoS Attack: Dyn  96
    Model DoS Attacks Targeting LLMs  96
    Scarce Resource Attacks  97
    Context Window Exhaustion  98
    Unpredictable User Input  99
    DoW Attacks  100
    Model Cloning  101
    Mitigation Strategies  101
    Domain-Specific Guardrails  102
    Input Validation and Sanitization  102
📄 Page 10
    Robust Rate Limiting  102
    Resource Use Capping  103
    Monitoring and Alerts  103
    Financial Thresholds and Alerts  103
    Conclusion  104

9. Find the Weakest Link  105
    Supply Chain Basics  106
    Software Supply Chain Security  107
    The Equifax Breach  107
    The SolarWinds Hack  108
    The Log4Shell Vulnerability  110
    Understanding the LLM Supply Chain  111
    Open Source Model Risk  112
    Training Data Poisoning  113
    Accidentally Unsafe Training Data  113
    Unsafe Plug-ins  114
    Creating Artifacts to Track Your Supply Chain  114
    Importance of SBOMs  115
    Model Cards  115
    Model Cards Versus SBOMs  117
    CycloneDX: The SBOM Standard  118
    The Rise of the ML-BOM  119
    Building a Sample ML-BOM  121
    The Future of LLM Supply Chain Security  123
    Digital Signing and Watermarking  123
    Vulnerability Classifications and Databases  124
    Conclusion  128

10. Learning from Future History  129
    Reviewing the OWASP Top 10 for LLM Apps  129
    Case Studies  130
    Independence Day: A Celebrated Security Disaster  131
    2001: A Space Odyssey of Security Flaws  133
    Conclusion  137

11. Trust the Process  139
    The Evolution of DevSecOps  139
    MLOps  140
    LLMOps  141
    Building Security into LLMOps  141
    Security in the LLM Development Process  142
📄 Page 11
    Securing Your CI/CD  142
    LLM-Specific Security Testing Tools  143
    Managing Your Supply Chain  145
    Protect Your App with Guardrails  146
    The Role of Guardrails in an LLM Security Strategy  147
    Open Source Versus Commercial Guardrail Solutions  148
    Mixing Custom and Packaged Guardrails  148
    Monitoring Your App  149
    Logging Every Prompt and Response  149
    Centralized Log and Event Management  149
    User and Entity Behavior Analytics  149
    Build Your AI Red Team  150
    Advantages of AI Red Teaming  151
    Red Teams Versus Pen Tests  152
    Tools and Approaches  153
    Continuous Improvement  154
    Establishing and Tuning Guardrails  154
    Managing Data Access and Quality  154
    Leveraging RLHF for Alignment and Security  155
    Conclusion  156

12. A Practical Framework for Responsible AI Security  157
    Power  158
    GPUs  159
    Cloud  160
    Open Source  161
    Multimodal  163
    Autonomous Agents  164
    Responsibility  165
    The RAISE Framework  165
    The RAISE Checklist  172
    Conclusion  173

Index  175
📄 Page 12
(This page has no text content)
📄 Page 13
Preface

Everywhere in the world, we’re riding the large language model (LLM) wave, and it’s exhilarating! When ChatGPT burst onto the scene, it didn’t just walk into the record books; it smashed them, becoming the fastest-adopted application in history. Now, it’s as if every software vendor on the planet is racing to embed generative AI and LLM technologies into their stack, pushing us into uncharted territories. The buzz is real, the hype is justified, and the possibilities seem limitless.

But hold on because there’s a twist. As we marvel at these technological wonders, their security scaffolding is, to put it mildly, a work in progress. The hard truth? Many developers are stepping into this new era without a map, largely unaware of the security and safety quicksand beneath the surface. It’s almost routine now: every week, we’re hit with another headline screaming about an LLM hiccup. The fallout from these individual incidents has been moderate so far, but make no mistake—we’re flirting with disaster. The risks aren’t just hypothetical; they’re as real as it gets, and the clock is ticking.

Without a deep dive into the murky waters of LLM security risks and how to navigate them, we’re not just risking minor glitches; we’re courting major catastrophes. It’s time for developers to gear up, get informed, and get ahead of the curve. Fast!

Who Should Read This Book

The primary audience for this book is development teams that are building custom applications that embed LLM technologies. Through my recent work in this area, I’ve come to understand that these teams are often large and their members include an incredibly diverse set of backgrounds. These include software developers skilled in “web app” technologies who are taking their first steps with AI. These teams may also consist of AI experts who are bringing their craft out of the back office for the first time and into the limelight, where the security risks are much different. They also include application security pros and data science specialists.
📄 Page 14
Beyond that core audience, I’ve learned that others have found much of this information useful. This includes the extended teams involved in these projects, who want to understand the underpinnings of the technologies to help mitigate the critical risks of adopting these new technologies. These include software development executives, chief information security officers (CISOs), quality engineers, and security operations teams.

Why I Wrote This Book

I’ve always been fascinated by artificial intelligence. As a preteen, I fondly remember writing video games on my Atari 400 home computer. Circa 1980, this little machine had only 8 kilobytes of RAM. But I still managed to cram a complete clone of the Tron Lightcycles game onto that machine, complete with a simple but effective AI to drive one of the cycles when you were playing in single-player mode.

In my professional career, I’ve been involved with several AI-related projects. After college, my best friend Tom Santos and I started an AI software company based on a few thousand lines of handcrafted C++ code that solved seemingly intractable problems with genetic algorithms. I’d later help build a large-scale machine learning system at Citrix with my friends Kedar Poduri and Ebenezer Schubert. But when I saw ChatGPT for the first time, I knew everything had changed.

When I first encountered LLMs, I worked at a company that built cybersecurity software. My job was helping large companies find and track vulnerabilities in their software. It quickly became apparent that LLMs offered unique and serious security vulnerabilities. Over the next few months, I retooled my career to go after this disruption. I started a popular open source project around LLM security, which you’ll hear more about later. I even switched jobs to join Exabeam, a company that works at the intersection of AI and cybersecurity. When an editor from O’Reilly approached me about writing a book on this topic, I knew I had to jump at the chance.

Navigating This Book

This book has 12 chapters that are divided into three logical sections. I’ll sketch out each section and chapter here to give you an idea of the approach and so you’ll know what’s coming as you read.

Section 1: Laying the Foundation (Chapters 1–3)

The initial chapters of this book establish the groundwork for understanding the security posture of LLM-based applications. They should give you the grounding you can use to confidently unpack the issues facing the development of apps using LLMs:
📄 Page 15
• Chapter 1, “Chatbots Breaking Bad”, walks through a real-world case study whereby amateur hackers destroyed an expensive and promising chatbot project from one of the world’s largest software companies. This will set the stage for your forthcoming battles in this arena.

• Chapter 2, “The OWASP Top 10 for LLM Applications”, introduces a project I founded in 2023 that aims to identify and address the unique security challenges posed by LLMs. The knowledge gained working on that directly led to my writing this book.

• Chapter 3, “Architectures and Trust Boundaries”, explores the structure of applications using LLMs, emphasizing the importance of controlling the various data flows within the application.

Section 2: Risks, Vulnerabilities, and Remediations (Chapters 4–9)

In these chapters, we’ll break down the significant risk areas you face when developing LLM applications. These risks include issues with flavors familiar to any application security practitioner, such as injection attacks, sensitive information leakage, and software supply chain risk. You’ll also be introduced to classes of vulnerabilities well known to machine learning aficionados but less familiar in web development, such as training data poisoning. Along the way, you’ll also learn about all-new security and safety concerns plaguing these new generative AI systems, such as hallucinations, overreliance, and excessive agency.

I’ll walk you through real-world case studies to help you understand the risks and implications and advise you on how to prevent or mitigate these risks on a case-by-case basis:

• Chapter 4, “Prompt Injection”, explores how attackers can manipulate LLMs by crafting specific inputs that cause them to perform unintended actions.

• Chapter 5, “Can Your LLM Know Too Much?”, dives into the risks of sensitive information leakage, showcasing how LLMs can inadvertently expose data they’ve been trained on and how to safeguard against this vulnerability.

• Chapter 6, “Do Language Models Dream of Electric Sheep?”, examines the unique phenomenon of “hallucinations” in LLMs—instances where models generate false or misleading information.

• Chapter 7, “Trust No One”, focuses on the principle of zero trust, explaining the importance of not taking any output at face value and ensuring rigorous validation processes are in place to handle LLM outputs.

• Chapter 8, “Don’t Lose Your Wallet”, tackles the economic risks of deploying LLM technologies, focusing on denial-of-service (DoS), denial-of-wallet (DoW),
📄 Page 16
and model cloning attacks. These threats exploit similar vulnerabilities to impose financial burdens, disrupt services, or steal intellectual property.

• Chapter 9, “Find the Weakest Link”, highlights the vulnerabilities within the software supply chain and the critical steps needed to secure it from potential breaches that could compromise the entire application.

By understanding and addressing these risks, developers can better secure their applications against an evolving landscape of threats.

Section 3: Building a Security Process and Preparing for the Future (Chapters 10–12)

The chapters in Section 2 will give you the tools you need to understand and address the various individual threats you’ll see in this space. This last section is about bringing it all together:

• In Chapter 10, “Learning from Future History”, I’ll use some famous science fiction anecdotes to illustrate how multiple weaknesses and design issues can stitch together to spell disaster. By explaining these futuristic case studies, I hope to help you prevent a future like this from ever occurring.

• In Chapter 11, “Trust the Process”, we’ll get down to the serious business of building LLM-savvy security practices into your software factory—without this, I do not believe you can successfully secure this type of software at scale.

• Finally, in Chapter 12, “A Practical Framework for Responsible AI Security”, we’ll examine the trajectory of LLM and AI technologies to see where they’re taking us and the likely implications to security and safety requirements. I’ll also introduce you to the Responsible Artificial Intelligence Software Engineering (RAISE) framework that will give you a simple, checklist-based approach to ensuring you’re putting into practice the most important tools and lessons to keep your software safe and secure.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
📄 Page 17
Constant width bold
Shows commands or other text that should be typed literally by the user.

Constant width italic
Shows text that should be replaced with user-supplied values or by values determined by context.

This element signifies a tip or suggestion.

This element signifies a general note.

This element indicates a warning or caution.

O’Reilly Online Learning

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.
📄 Page 18
How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-889-8969 (in the United States or Canada)
707-827-7019 (international or local)
707-829-0104 (fax)
support@oreilly.com
https://www.oreilly.com/about/contact.html

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/the-developers-playbook.

For news and information about our books and courses, visit https://oreilly.com.

Find us on LinkedIn: https://linkedin.com/company/oreilly-media.

Watch us on YouTube: https://youtube.com/oreillymedia.

Acknowledgments

I’d like to thank all the friends, family, and colleagues who encouraged me or provided feedback on various aspects of the project: Will Chilcutt, Fabrizio Cilli, Ads Dawson, Ron Del Rosario, Sherri Douville, Sandy Dunn, Ken Huang, Gavin Klondike, Marko Lihter, Marten Mickos, Eugene Neelou, Chase Peterson, Karla Roland, Jason Ross, Tom Santos, Robert Simonoff, Yuvraj Singh, Rachit Sood, Seth Summersett, Darcie Tuuri, Ashish Verma, Jeff Williams, Alexa Wilson, Dave Wilson, and Zoe Wilson.

I want to thank the team at O’Reilly for supporting and guiding me on this project. I also owe a tremendous debt of gratitude to Nicole Butterfield, who approached me with the idea for this book and guided me through the proposal phase. I also want to express my appreciation for Jeff Bleiel, my editor, whose patience, skills, and expertise significantly impacted the book.

Special thanks to our technical reviewers: Pamela Isom, Chenta Lee, Thomas Nield, and Matteo Dora.
📄 Page 19
CHAPTER 1
Chatbots Breaking Bad

Large language models and generative AI jumped to the forefront of public consciousness with the release of ChatGPT on November 30, 2022. Within five days, it went viral on social media and attracted its first million users. By January, ChatGPT surpassed one hundred million users, making it the fastest-growing internet service in history. However, a steady stream of security concerns emerged in the following months. These included privacy and security issues that caused companies like Samsung and countries like Italy to ban its usage.

In this book, we’ll explore what underlies these concerns and how you can mitigate these issues. However, to best understand what’s going on here and why these problems are so challenging to solve, in this chapter, we will briefly rewind further in time. In doing so, we’ll see these types of issues aren’t new and understand why they will be so hard to fix permanently.

Let’s Talk About Tay

In March 2016, Microsoft announced a new project called Tay. Microsoft intended Tay to be “a chatbot created for 18- to 24-year-olds in the U.S. for entertainment purposes.” It was a cute name for a fluffy, early experiment in AI. Tay was designed to mimic a 19-year-old American girl’s language patterns and learn from interacting with human users of Twitter, Snapchat, and other social apps. It was built to conduct real-world research on conversational understanding.

While the original announcement of this project seems impossible to find now on the internet, a TechCrunch article from its launch date does an excellent job of summarizing the goals of the project:
📄 Page 20
For example, you can ask Tay for a joke, play a game with Tay, ask for a story, send a picture to receive a comment back, ask for your horoscope, and more. Plus, Microsoft says the bot will get smarter the more you interact with it via chat, making for an increasingly personalized experience as time goes on.

A big part of the experiment was that Tay could “learn” from conversations and extend her knowledge based on these interactions. Tay was designed to use these chat interactions to capture user input and integrate it as training data to make herself more capable—a laudable research goal. However, this experiment quickly went wrong. Tay’s life was tragically cut short after less than 24 hours. Let’s look at what happened and see what we can learn.

Tay’s Rapid Decline

Tay’s lifetime started off simply enough with a tweet following the well-known Hello World pattern that new software systems have been using to introduce themselves since the beginning of time:

hellooooooo w rld!!! (TayTweets [@TayandYou] March 23, 2016)

But within hours of Tay’s release, it became clear that maybe something wasn’t right. TechCrunch noted, “As for what it’s like to interact with Tay? Well, it’s a little bizarre. The bot certainly is opinionated, not afraid to curse.” Tweets like this started to appear in public in just the first hours of Tay’s lifetime:

@AndrewCosmo kanye west is is one of the biggest dooshes of all time, just a notch below cosby (TayTweets [@TayandYou] March 23, 2016)

It’s often said that the internet isn’t safe for children. With Tay being less than a day old, the internet once again confirmed this, and pranksters began chatting with Tay about political, sexual, and racist topics. As she was designed to learn from such exchanges, Tay delivered on her design goals. She learned very quickly—maybe just not what her designers wanted her to learn. In less than a day, Tay’s tweets started to skew to extremes, including sexism, racism, and even calls to violence.

By the next day, articles appeared all over the internet, and these headlines would not make Microsoft, Tay’s corporate benefactor, happy. A sampling of the highly visible, mainstream headlines included:

• Microsoft Shuts Down AI Chatbot After it Turned into a Nazi (CBS News)
• Microsoft Created a Twitter Bot to Learn from Users. It Quickly Became a Racist Jerk (New York Times)