Mastering LLM Applications with LangChain and Hugging Face: Practical insights into LLM deployment and use cases
Authors: Hunaidkhan Pathan, Nayankumar Gajjar
Mastering LLM Applications with LangChain and Hugging Face
Practical insights into LLM deployment and use cases
Hunaidkhan Pathan
Nayankumar Gajjar
www.bpbonline.com
First Edition 2025
Copyright © BPB Publications, India
ISBN: 978-93-65891-041
All Rights Reserved. No part of this publication may be reproduced, distributed or transmitted in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher, with the exception of the program listings, which may be entered, stored and executed in a computer system but cannot be reproduced by means of publication, photocopy, recording, or any electronic or mechanical means.
LIMITS OF LIABILITY AND DISCLAIMER OF WARRANTY
The information contained in this book is true and correct to the best of the author’s and publisher’s knowledge. The author has made every effort to ensure the accuracy of these publications, but the publisher cannot be held responsible for any loss or damage arising from any information in this book.
All trademarks referred to in the book are acknowledged as properties of their respective owners, but BPB Publications cannot guarantee the accuracy of this information.
www.bpbonline.com
Dedicated to
I dedicate this book to my treasured parents, my beloved wife, my wonderful kids, and my esteemed mentor, Mr. Amit Saraswat. Your unwavering support and guidance have been the cornerstone of my journey.
– Hunaidkhan Pathan
Almighty, Dr. Amit Saraswat, and My Family
– Nayankumar Gajjar
About the Authors
Hunaidkhan Pathan is a Data Science Lead at a leading consulting firm and has over a decade of experience in the field, specializing in machine learning and artificial intelligence. He holds a PGDM in Data Science from Shanti Business School in Ahmedabad and a degree in Electronics and Communication Engineering from Gujarat Technological University. He has contributed to the data science community through research papers selected and presented at the SAS Analytics Conference 2013 in Orlando: “Marketing Mix Modeling” as author and “Predicting market uncertainty with Kalman filter” as co-author. He was also a LinkedIn Top Voice for Data Science and Artificial Intelligence in 2023, and he posts regularly on LinkedIn about Generative AI. Hunaidkhan is an acknowledged Subject Matter Expert (SME) in Generative AI and Natural Language Processing. His experience spans various LLM services such as OpenAI, Nvidia Nemo, Anthropic, GCP Generative AI, and AWS Bedrock, in addition to numerous open-source LLMs.
Nayankumar Gajjar has a rich background in Data Science, Machine Learning, and Generative AI, with nine years of experience as a Data Scientist, Machine Learning Engineer, and Python Developer. Over the years, he has contributed to various high-impact projects, applying his expertise in statistical modeling, Generative AI, MLOps, and Cloud Computing. He holds a master’s degree in Decision Science. In addition to his professional work, he is a YouTuber and blogger who shares his experience and knowledge, covering statistics and providing detailed coding tutorials. His commitment to education extends to his role as a visiting faculty member, where he has taught Python, SQL, Data Science, and NLP courses. He also co-authored a research paper titled “Thiessen Polygon, A GIS approach for Retail Industry in SAS,” which was presented at the SAS Analytics Conference 2013 in Orlando.
About the Reviewer
Vijender Singh is a multi-cloud professional with over six years of expertise, currently working in Luxembourg. He holds an MSc with distinction from Liverpool John Moores University, where his research centered on keyphrase extraction. Vijender boasts an impressive collection of cloud certifications, including Google MLPE, five Azure certifications, two AWS certifications, and TensorFlow certification. His role as a technical reviewer for numerous books reflects his commitment to improving the future.
Acknowledgements
We would like to express our sincere gratitude to all those who contributed to the completion of this book.
First and foremost, we extend our heartfelt appreciation to our mentor, Dr. Amit Saraswat, our family and friends for their unwavering support and encouragement throughout this journey. Their love and encouragement have been a constant source of motivation.
We are immensely grateful to BPB Publications for their guidance and expertise in bringing this book to fruition. Their support and assistance were invaluable in navigating the complexities of the publishing process.
We would also like to acknowledge the reviewers, technical experts, and editors who provided valuable feedback and contributed to the refinement of this manuscript. Their insights and suggestions have significantly enhanced the quality of the book.
Last but not least, we want to express our gratitude to the readers who have shown interest in our book. Your support and encouragement have been deeply appreciated.
Thank you to everyone who has played a part in making this book a reality.
Preface
In the early days of AI, we worked mainly with statistical modeling, using models such as regression, random forests, and decision trees. At that time, we worked with numerical data only and could gain little from textual data. Gradually, the Bag of Words (BoW) approach gave us a way to work with text. The main idea was to convert textual data into numerical data using methods such as count vectors and TF-IDF vectors. These methods create a matrix that records the occurrence of each word in a given document. They helped us perform sentiment analysis and other prediction-based tasks with the algorithms mentioned above, but they did not help ML models capture the context or intent of the text.
Fast forward to today: advanced techniques such as transformers, which are built on neural networks, allow ML models to capture both the context and the intent of the text. This has opened up new opportunities in Natural Language Processing (NLP) and Natural Language Generation (NLG). Both are very important fields in the current era of AI, as they give machines the power to understand and generate text like human beings. Some of
the readers will have heard of “ChatGPT,” one of the well-known chatbot platforms from OpenAI. If you have ever used ChatGPT, you will know that it can write code, provide medical advice, and attempt predictions, and that you can chat with it much as you would with a person who answers your questions. As time passes, these text generation and understanding models are becoming more advanced and able to perform almost all text-related tasks.
To drive such advances in NLP and NLG, we need people who not only know the terminology and concepts of NLP and NLG but also understand them well. They should also be aware of the steps and phases involved in developing and deploying ML models so that they can be served to end users.
As we (the authors) interact with different people in our day-to-day lives, we have found that there is no one-stop resource that provides readers with all of the above in one place. If readers find the terminology and concepts, they do not find the steps. If they find the steps, there is no practical exposure. If they have practical exposure, then how to deploy on the cloud is another question. This book addresses such scenarios.
This book has been written for beginners and for people who are stuck at one of the stages mentioned in the previous paragraph and do not know the next steps. It can be divided into three parts. The first part comprises the first three chapters, where we show the installation of Python, different ways of running Python scripts, the basic concepts of Python, the installation of editors, and the usage and importance of the virtual environment. The second part comprises chapters 4 and 5, which cover the basic and important concepts of NLP and NLG. From chapters 6 to 11, we have shown the
usage of important packages like LangChain and Hugging Face. We then show how to create a chatbot with custom data and integrate it with an application like Telegram. Finally, we show deployment to an AWS cloud environment. The remaining chapters cover future directions and include some useful tips and references.
In this book, we have not only discussed theory but also provided practical implementations. In the practical implementation, you will learn all the steps required to make things work. We hope that this book will be helpful to anyone looking to start their journey in the NLP and NLG fields. We also hope that it will provide complete guidance and help readers gain the required understanding through practical exposure.
Chapter 1: Introduction to Python and Code Editors – In this chapter, readers will learn about Python as a programming language and its history. Readers will get an idea of Python’s features and why it is an important language from an AI/ML perspective. Readers will also learn the difference between a code editor and an Integrated Development Environment (IDE).
Chapter 2: Installation of Python, Required Packages, and Code Editors – In this chapter, readers will install Python, all the packages used throughout the book, and an IDE to start coding. Apart from the installation, readers will learn about the virtual environment, its importance, and its usage. Readers will also gain knowledge of and practical exposure to Python programming basics.
Chapter 3: Ways to Run Python Scripts – In this chapter, readers will create their first Python script, and
then they will get practical hands-on experience with different ways to run any Python script.
Chapter 4: Introduction of NLP and its concepts – In this chapter, readers will be introduced to the theoretical concepts and terminologies of NLP that are essential to start with. Readers will also get practical hands-on experience with all the important terminologies and concepts.
Chapter 5: Introduction to Large Language Models – This chapter contains theoretical concepts. Readers will acquire knowledge of LLM history and its evaluation. Apart from the history, readers will also learn important terminologies and concepts of LLMs.
Chapter 6: Introduction to LangChain, Usage and Importance – In this chapter, readers will learn about the LangChain package, which is mainly used for Extract, Transform, Load (ETL) tasks on text data that LLMs later use for further processing, understanding, and text generation. Readers will learn how LangChain integrates with Hugging Face and how to use the LLMs available from Hugging Face. The chapter also includes practical exercises to help readers practice and gain confidence.
Chapter 7: Introduction to Hugging Face, its Usage and Importance – In this chapter, readers will get practical exposure to the different LLMs available on Hugging Face Hub and how to use them. Readers will also explore Hugging Face Hub, which provides a complete ecosystem for LLM deployment.
Chapter 8: Creating Chatbots using Custom Data with Langchain and Hugging Face Hub – In this chapter, readers will create chatbots using the RAG mechanism on custom data using LangChain and Hugging Face
combinations. Readers will also be introduced to the Gradio framework of Hugging Face, through which they can interact with the chatbot they create.
Chapter 9: Hyperparameter Tuning and Fine Tuning Pre-Trained Models – In this chapter, readers will learn about the different hyperparameters available for any LLM, their usage, and how they impact the LLM’s performance.
Chapter 10: Integrating LLMs into Real-World Applications: Case Studies – In this chapter, readers will create a Telegram chatbot with custom data and interact with it. Readers will get a step-by-step guide to the implementation.
Chapter 11: Deploying LLMs in Cloud Environments for Scalability – In this chapter, readers will get a step-by-step guide to deploying chatbots and LLMs in an AWS cloud environment. Readers will also get an idea about GCP.
Chapter 12: Future Directions: Advances in LLMs and Beyond – In this chapter, readers will learn about future directions and where to go once they have completed the book.
Appendix A: Useful Tips for Efficient LLM Experimentation – In this appendix, we share some tips for using LLMs more efficiently.
Appendix B: Resources and References – In this appendix, we provide resources and references for readers who want more detailed knowledge of different models and packages.
Code Bundle and Coloured Images
Please follow the link to download the Code Bundle and the Coloured Images of the book: https://rebrand.ly/bf9408
The code bundle for the book is also hosted on GitHub at https://github.com/bpbpublications/Mastering-LLM-Applications-with-LangChain-and-Hugging-Face. If the code is updated, the changes will be made in the existing GitHub repository.
We have code bundles from our rich catalogue of books and videos available at https://github.com/bpbpublications. Check them out!
Errata
We take immense pride in our work at BPB Publications and follow best practices to ensure the accuracy of our content and to provide an engaging reading experience to our subscribers. Our readers are our mirrors, and we use their inputs to reflect and improve upon human errors, if any, that may have occurred during the publishing processes involved. To let us maintain the quality and help us reach out to any readers who might be having difficulties due to any unforeseen errors, please write to us at: errata@bpbonline.com
Your support, suggestions and feedback are highly appreciated by the BPB Publications’ Family.
Did you know that BPB offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.bpbonline.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at: business@bpbonline.com for more details.
At www.bpbonline.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on BPB books and eBooks.
Piracy
If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at business@bpbonline.com with a link to the material.
If you are interested in becoming an author
If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit www.bpbonline.com. We have worked with thousands of developers and tech professionals, just like you, to help them share their insights with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Reviews
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions. We at BPB can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about BPB, please visit www.bpbonline.com.
Join our book’s Discord space
Join the book’s Discord Workspace for Latest updates, Offers, Tech happenings around the world, New Release and Sessions with the Authors:
https://discord.bpbonline.com
Table of Contents
1. Introduction to Python and Code Editors
    Introduction
    Structure
    Objectives
    Introduction to Python
    Introduction to code editors
    Conclusion
    References
    Further reading
2. Installation of Python, Required Packages, and Code Editors
    Introduction
    Structure
    Objectives
    General instructions
    Installation of Python on Windows
    Installation of Python on Linux
    Installation of Python on MacOS
    Using Docker for Python
    Installation of IDE
    Installation of PyCharm
    Installation of required packages
    Virtual environment
    virtualenv
    pipenv
    Folder structure
    Creating a virtual environment
    PEP 8 standards
    Following PEP 8 in PyCharm
    Object-Oriented Programming concepts in Python
    Classes in Python
    Functions in Python
    For loop in Python
    While loop in Python
    If-else in Python
    Conclusion
3. Ways to Run Python Scripts
    Introduction
    Structure
    Objectives
    Setting up the project
    Running Python scripts from PyCharm
    Running Python Scripts from Terminal
    Running Python scripts from Jupyter Lab and Notebook
    Running Python Scripts from Docker
    Conclusion
4. Introduction of NLP and its concepts
    Introduction
    Structure
    Objectives
    Natural Language Processing overview
    Key concepts
    Corpus
    N-grams
    Tokenization
    Difference in tokens and n-grams
    Stop words removal
    Stemming
    Lemmatization
    Lowercasing
    Part-of-speech tagging
    Named Entity Recognition
    Bag of words
    Word embeddings
    Topic modeling
    Sentiment analysis
    Large language models
    Transfer learning
    Text classification
    Prompt engineering
    Hallucination
    Syntactic relationship
    Semantic relationship
    Conclusion
5. Introduction to Large Language Models
    Introduction
    Structure
    Objectives