Developing Apps with GPT-4 and ChatGPT
SECOND EDITION
Build Intelligent Chatbots, Content Generators, and More
Olivier Caelen and Marie-Alice Blete
Developing Apps with GPT-4 and ChatGPT
by Olivier Caelen and Marie-Alice Blete

Copyright © 2024 Olivier Caelen and Marie-Alice Blete. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Nicole Butterfield
Development Editor: Corbin Collins
Production Editor: Jonathon Owen
Copyeditor: Arthur Johnson
Proofreader: Kim Cofer
Indexer: Judith McConville
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

September 2023: First Edition
July 2024: Second Edition

Revision History for the Second Edition
2024-07-09: First Release

See https://oreilly.com/catalog/errata.csp?isbn=9781098168100 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Developing Apps with GPT-4 and ChatGPT, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the authors and do not represent the publisher’s views. While the publisher and
the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights. 978-1-098-16810-0 [LSI]
Preface

Within a mere five days of its release, ChatGPT reached an impressive one million users, sending shock waves throughout the tech industry and beyond. As a side effect, the OpenAI API for AI-powered text generation was suddenly brought to light, despite having been available for three years. The ChatGPT interface showcased the potential of such language models, and suddenly developers and inventors began to realize the incredible possibilities at their fingertips.

Since the release of this book’s first edition, OpenAI has updated its API, adding vision capabilities to GPT-4. But more importantly, coders, developers, software engineers, and architects have wholeheartedly embraced large language model (LLM) technologies, and in recent months we have witnessed a surge in the number of tools, frameworks, design patterns, best practices, and the like. These innovations are empowering everyone to bring ideas and research concepts to fruition as robust projects that can effectively bring value to businesses.

The field of natural language processing (NLP) has made incredible technical progress over the years, but until recently, use of the technology was limited to an elite few. The OpenAI API and its accompanying libraries provide a ready-to-use solution for anyone seeking to build AI-powered applications. There is no need to have powerful hardware or deep knowledge of artificial intelligence; with just a few lines of code, developers can integrate incredible features into their projects at a reasonable cost.

This edition builds upon the foundation laid in the first edition, incorporating the latest advancements in AI technology and the progress that has been built collectively by researchers, developers, and enthusiasts who continue to push the boundaries of innovation.

We combine our knowledge and experience (Olivier as a data scientist, and Marie-Alice as a software engineer) to give you a broad understanding of how to develop applications with GPT-4 and ChatGPT. In these pages, you will find clear and detailed explanations of AI concepts, as well as user-friendly guidelines on how to integrate the OpenAI services effectively, securely, and cost-consciously.

This book is designed to be accessible to all, though some basic Python knowledge will be helpful. Through clear explanations, example projects, and step-by-step instructions, we invite you to
discover how GPT-4 and ChatGPT can transform the way we interact with machines.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

NOTE
This element signifies a general note.

TIP
This element signifies a tip or suggestion.

WARNING
This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://oreil.ly/DevAppsGPT_GitHub.

If you have a technical question or a problem using the code examples, please send email to support@oreilly.com.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.
We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Developing Apps with GPT-4 and ChatGPT, 2nd ed., by Olivier Caelen and Marie-Alice Blete (O’Reilly). Copyright 2024 Olivier Caelen and Marie-Alice Blete, 978-1-098-16810-0.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

O’Reilly Online Learning

NOTE
For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast
collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.

How to Contact Us

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-889-8969 (in the United States or Canada)
707-827-7019 (international or local)
707-829-0104 (fax)
support@oreilly.com
https://www.oreilly.com/about/contact.html

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/devAppsGPT2e.

For news and information about our books and courses, visit https://oreilly.com.

Find us on LinkedIn: https://linkedin.com/company/oreilly-media
Watch us on YouTube: https://youtube.com/oreillymedia

Acknowledgments

Writing a book on one of the fastest-moving AI topics would not have been possible without the help of many people. We would like to thank the incredible O’Reilly team for their support, advice, and on-point comments, especially Corbin Collins, Nicole Butterfield, Jonathon Owen, Clare Laylock, and Arthur Johnson.

The book also benefited from the help of exceptional reviewers who took a lot of time to provide invaluable feedback. Many thanks to Guillaume Coter, Lucas Soares, and Leonie Monigatti.

Many thanks to our Worldline Labs colleagues for their insights and never-ending discussions on ChatGPT and the OpenAI services, especially Liyun He Guelton, Guillaume Coter, Luxin Zhang, and Patrik De Boe. A huge thank you as well to Worldline’s team of developer advocates who provided support and encouragement from the start, especially Jean-Francois James, Raphaël Semeteys, and Fanilo Andrianasolo.

We would also like to acknowledge the extraordinary tech community, open source contributors, researchers, and LLM
enthusiasts who have contributed to the community’s knowledge and resources and who have pushed together to make LLM-based application development easier and faster, with superior outcomes. And finally, we thank our friends and family for bearing with us during our ChatGPT craze, allowing us to release another edition of this book in such a short time.
Chapter 1. GPT-4 and ChatGPT Essentials

The ability to unlock the power of artificial intelligence has never been more accessible for developers. Large language models (LLMs) such as GPT-4 and GPT-3.5 Turbo have showcased their capabilities through ChatGPT. Now we find ourselves in a whirlwind of progress, with a pace that has never been seen before in the software world. OpenAI has made these technological innovations readily available; what transformative applications will you craft with the tools now at your disposal?

The implications of these AI models go far beyond chatbots. Thanks to LLMs, developers can now exploit the power of natural language processing (NLP) to create applications that understand users’ needs, transforming what once was science fiction into tangible reality. Moreover, thanks to the new vision capabilities of GPT-4, it is now possible to build software that can interpret and generate text based on images. From innovative customer support systems that learn and adapt to personalized educational tools that understand each student’s
unique learning style, GPT language models open up a whole new world of possibilities.

But what are these GPT models? The goal of this chapter is to take a deep dive into their foundations, origins, and key features. By understanding the basics of these AI models, you will be well on your way to building the next generation of LLM-powered applications.

Introducing Large Language Models

This section lays down the fundamental building blocks that have shaped the development of GPT models. We aim to provide a comprehensive understanding of language models and NLP, the role of Transformer architectures, and the tokenization and prediction processes within these models.

However, as we will see, this journey does not stop at text processing. The introduction of GPT-4 Vision marks an extension of the capabilities of LLMs beyond text to include the processing of multimodal input. This means that GPT-4 not only is good at text processing but also can interpret images.

Exploring the Foundations of Language Models and NLP

As LLMs, GPT models are among the latest types of models released in the field of NLP, which is itself a subfield of machine learning (ML) and AI. Before we delve into GPT models, it is essential to take a look at NLP and its related fields.

There are different definitions of AI, but the consensus, more or less, is that AI is the development of computer systems that can perform tasks that typically require human intelligence. With this definition, many algorithms fall under the AI umbrella. Consider, for example, the traffic prediction task in GPS applications, or the rule-based systems used in strategic video games. In these examples, as seen from the outside, the machine seems to require intelligence to accomplish these tasks.

ML, as mentioned, is a subset of AI. In ML, we do not try to directly implement the decision rules used by the AI system. Instead, we try to develop algorithms that allow the system to learn by itself. Since the 1950s, when ML research began, many ML algorithms have been proposed in the scientific literature. Among them, deep learning algorithms have come to the fore. Deep learning is a branch of ML that focuses on algorithms
inspired by the structure of the brain. These algorithms are called artificial neural networks. They can handle very large amounts of data and perform very well on tasks such as image and speech recognition and NLP.

The GPT models are based on the Transformer architecture, introduced in the 2017 paper “Attention Is All You Need” by Vaswani et al. from Google. Transformers are like reading machines. They use an attention mechanism to weigh different parts of the text against one another, which gives them a deeper grasp of context and lets them produce coherent outputs. This approach enables them to capture the meaning of words within sentences, improving their performance in language translation, question answering, and text generation. Figure 1-1 shows how these concepts nest within one another, from AI down to Transformers.
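As a rough illustration of the attention mechanism described above, the following minimal sketch computes scaled dot-product attention, the formulation used in “Attention Is All You Need,” for a single query in plain Python. The token list and every vector value are invented for illustration; real Transformers learn these vectors during training and process many queries and attention heads in parallel.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the attention weights (one per token) and the
    weighted average of the value vectors.
    """
    dim = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    # Blend the value vectors according to the weights.
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output


# Toy 2-dimensional vectors for three tokens (made-up numbers).
tokens = ["the", "cat", "sat"]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [0.0, 2.0]  # hypothetical query vector

weights, output = attention(query, keys, values)
for token, weight in zip(tokens, weights):
    print(f"{token}: {weight:.3f}")
```

The weights show how much each token contributes to the result: tokens whose key vectors align with the query receive more weight, which is the sense in which attention prioritizes some parts of the text over others.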
Figure 1-1. A nested set of technologies from AI to transformers

NLP is a subfield of AI focused on enabling computers to process, interpret, and generate natural human language. Modern NLP solutions are based on ML algorithms. NLP covers a wide range of tasks:
Text classification
Categorizing input text into predefined groups. This includes, for example, sentiment analysis and topic categorization. Companies can use sentiment analysis to understand customers’ opinions about their services. Email filtering is an example of topic categorization in which email can be put into categories such as “Personal,” “Social,” “Promotions,” and “Spam.”

Automatic translation
Translating text from one language to another. Note that this can include translating code from one programming language to another, such as from Python to C++.

Question answering
Answering questions based on a given text. For example, an online customer service portal could use an NLP model to answer FAQs about a product, or educational software could use NLP to provide answers to students’ questions about the topic being studied.

Text generation
Generating a coherent and relevant output text based on a given input text, called a prompt.
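To make the first task in this list concrete, here is a toy sketch of text classification applied to the email-filtering example. It is deliberately not an ML model: the keyword lists and the `classify_email` function are hypothetical stand-ins, chosen only to show the shape of the task (text in, one predefined category out). A real system would learn this mapping from labeled examples.

```python
import re

# Hypothetical keyword lists, invented for illustration only.
CATEGORY_KEYWORDS = {
    "Social": {"friend", "invitation", "tagged", "followed"},
    "Promotions": {"sale", "discount", "offer", "coupon"},
    "Spam": {"winner", "lottery", "prize", "urgent"},
}


def classify_email(text, default="Personal"):
    """Assign an email to one predefined category by keyword overlap."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    # Score each category by how many of its keywords appear in the text.
    scores = {
        category: len(words & keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to the default category when no keyword matches.
    return best if scores[best] > 0 else default


print(classify_email("Huge discount: summer sale ends today!"))
print(classify_email("Lunch on Sunday?"))
```

Hand-written rules like these break down quickly on real email, which is the motivation for the ML-based approaches the chapter describes.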