Ajay Thampi
Building explainable machine learning systems
MANNING
[Inside front cover figure: The process of building a robust AI system. Historical data flows through six stages: (1) learning (training and cross-validation on training and dev sets), (2) testing (evaluation on a test set), (3) understanding (interpretation of the learned model), (4) deploying (the model in production on new data), (5) explaining (prediction, interpretation, and explanation), and (6) monitoring.]
Interpretable AI
Building explainable machine learning systems

Ajay Thampi

MANNING
Shelter Island
For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact

Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964
Email: orders@manning.com

©2022 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

The author and publisher have made every effort to ensure that the information in this book was correct at press time. The author and publisher do not assume and hereby disclaim any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause, or from any usage of the information herein.

Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964

Development editor: Lesley Trites
Technical development editor: Kostas Passadis
Review editor: Mihaela Batinić
Production editor: Deirdre Hiam
Copy editor: Pamela Hunt
Proofreader: Melody Dolab
Technical proofreader: Vishwesh Ravi Shrimali
Typesetter: Gordan Salinovic
Cover designer: Marija Tudor

ISBN 9781617297649
Printed in the United States of America
To Achan, Amma, Ammu, and my dear Miru
brief contents

PART 1  INTERPRETABILITY BASICS  1
  1 ■ Introduction  3
  2 ■ White-box models  21

PART 2  INTERPRETING MODEL PROCESSING  55
  3 ■ Model-agnostic methods: Global interpretability  57
  4 ■ Model-agnostic methods: Local interpretability  89
  5 ■ Saliency mapping  126

PART 3  INTERPRETING MODEL REPRESENTATIONS  165
  6 ■ Understanding layers and units  167
  7 ■ Understanding semantic similarity  200

PART 4  FAIRNESS AND BIAS  235
  8 ■ Fairness and mitigating bias  237
  9 ■ Path to explainable AI  270
contents

preface  xiii
acknowledgments  xv
about this book  xvii
about the author  xx
about the cover illustration  xxi

PART 1  INTERPRETABILITY BASICS  1

1  Introduction  3
   1.1  Diagnostics+ AI—an example AI system  4
   1.2  Types of machine learning systems  4
        Representation of data  5 ■ Supervised learning  6 ■ Unsupervised learning  7 ■ Reinforcement learning  8 ■ Machine learning system for Diagnostics+ AI  9
   1.3  Building Diagnostics+ AI  9
   1.4  Gaps in Diagnostics+ AI  11
        Data leakage  11 ■ Bias  11 ■ Regulatory noncompliance  12 ■ Concept drift  12
   1.5  Building a robust Diagnostics+ AI system  12
   1.6  Interpretability vs. explainability  14
        Types of interpretability techniques  15
   1.7  What will I learn in this book?  16
        What tools will I be using in this book?  18 ■ What do I need to know before reading this book?  19

2  White-box models  21
   2.1  White-box models  22
   2.2  Diagnostics+—diabetes progression  24
   2.3  Linear regression  27
        Interpreting linear regression  30 ■ Limitations of linear regression  33
   2.4  Decision trees  33
        Interpreting decision trees  35 ■ Limitations of decision trees  39
   2.5  Generalized additive models (GAMs)  40
        Regression splines  42 ■ GAM for Diagnostics+ diabetes  46 ■ Interpreting GAMs  48 ■ Limitations of GAMs  51
   2.6  Looking ahead to black-box models  52

PART 2  INTERPRETING MODEL PROCESSING  55

3  Model-agnostic methods: Global interpretability  57
   3.1  High school student performance predictor  58
        Exploratory data analysis  59
   3.2  Tree ensembles  65
        Training a random forest  67
   3.3  Interpreting a random forest  71
   3.4  Model-agnostic methods: Global interpretability  74
        Partial dependence plots  74 ■ Feature interactions  80

4  Model-agnostic methods: Local interpretability  89
   4.1  Diagnostics+ AI: Breast cancer diagnosis  90
   4.2  Exploratory data analysis  91
   4.3  Deep neural networks  95
        Data preparation  100 ■ Training and evaluating DNNs  101
   4.4  Interpreting DNNs  104
   4.5  LIME  105
   4.6  SHAP  115
   4.7  Anchors  119

5  Saliency mapping  126
   5.1  Diagnostics+ AI: Invasive ductal carcinoma detection  127
   5.2  Exploratory data analysis  128
   5.3  Convolutional neural networks  130
        Data preparation  135 ■ Training and evaluating CNNs  138
   5.4  Interpreting CNNs  140
        Probability landscape  140 ■ LIME  141 ■ Visual attribution methods  147
   5.5  Vanilla backpropagation  148
   5.6  Guided backpropagation  153
   5.7  Other gradient-based methods  156
   5.8  Grad-CAM and guided Grad-CAM  157
   5.9  Which attribution method should I use?  161

PART 3  INTERPRETING MODEL REPRESENTATIONS  165

6  Understanding layers and units  167
   6.1  Visual understanding  168
   6.2  Convolutional neural networks: A recap  169
   6.3  Network dissection framework  171
        Concept definition  173 ■ Network probing  175 ■ Quantifying alignment  177
   6.4  Interpreting layers and units  178
        Running network dissection  179 ■ Concept detectors  183 ■ Concept detectors by training task  189 ■ Visualizing concept detectors  195 ■ Limitations of network dissection  198

7  Understanding semantic similarity  200
   7.1  Sentiment analysis  201
   7.2  Exploratory data analysis  203
   7.3  Neural word embeddings  206
        One-hot encoding  207 ■ Word2Vec  208 ■ GloVe embeddings  212 ■ Model for sentiment analysis  213
   7.4  Interpreting semantic similarity  215
        Measuring similarity  217 ■ Principal component analysis (PCA)  220 ■ t-distributed stochastic neighbor embedding (t-SNE)  225 ■ Validating semantic similarity visualizations  231

PART 4  FAIRNESS AND BIAS  235

8  Fairness and mitigating bias  237
   8.1  Adult income prediction  240
        Exploratory data analysis  241 ■ Prediction model  244
   8.2  Fairness notions  246
        Demographic parity  248 ■ Equality of opportunity and odds  251 ■ Other notions of fairness  255
   8.3  Interpretability and fairness  256
        Discrimination via input features  256 ■ Discrimination via representation  260
   8.4  Mitigating bias  261
        Fairness through unawareness  261 ■ Correcting label bias through reweighting  262
   8.5  Datasheets for datasets  266

9  Path to explainable AI  270
   9.1  Explainable AI  272
   9.2  Counterfactual explanations  275

appendix A  Getting set up  281
appendix B  PyTorch  284

index  299
preface

I’ve been fortunate to have worked with data and machine learning for about a decade now. My background is in machine learning, and my PhD was focused on applying machine learning in wireless networks. I have published papers (http://mng.bz/zQR6) at leading conferences and journals on the topic of reinforcement learning, convex optimization, and classical machine learning techniques applied to 5G cellular networks. After completing my PhD, I began working in the industry as a data scientist and machine learning engineer and gained experience deploying complex AI solutions for customers across multiple industries, such as manufacturing, retail, and finance.

It was during this time that I realized the importance of interpretable AI and started researching it heavily. I also started to implement and deploy interpretability techniques in real-world scenarios for data scientists, business stakeholders, and experts to get a deeper understanding of machine-learned models.

I wrote a blog post (http://mng.bz/0wnE) on interpretable AI and coming up with a principled approach to building robust, explainable AI systems. The post got a surprisingly large response from data scientists, researchers, and practitioners from a wide range of industries. I also presented on this subject at various AI and machine learning conferences. By putting my content in the public domain and speaking at leading conferences, I learned the following:

■ I wasn’t the only one interested in this subject.
■ I was able to get a better understanding of what specific topics are of interest to the community.
These learnings led to the book that you are reading now. A few resources are available to help you stay abreast of interpretable AI, such as survey papers, blog posts, and one book, but no single resource covers all the important interpretability techniques that would be valuable for AI practitioners. There is also no practical guide on how to implement these cutting-edge techniques. This book aims to fill that gap by first providing a structure to this active area of research and then covering a broad range of interpretability techniques. Throughout this book, we will look at concrete real-world examples and see how to build sophisticated models and interpret them using state-of-the-art techniques.

I strongly believe that as complex machine learning models are being deployed in the real world, understanding them is extremely important. The lack of a deep understanding can result in models propagating bias, and we’ve seen examples of this in criminal justice, politics, retail, facial recognition, and language understanding. All of this has a detrimental effect on trust, and, from my experience, this is one of the main reasons why companies are resisting the deployment of AI. I’m excited that you also realize the importance of this deep understanding, and I hope you learn a lot from this book.
acknowledgments

Writing a book is harder than I thought, and it requires a lot of work—really! None of this would have been possible without the support and understanding of my parents, Krishnan and Lakshmi Thampi; my wife, Shruti Menon; and my brother, Arun Thampi. My parents put me on the path of lifelong learning and have always given me the strength to chase my dreams. I’m also eternally grateful to my wife for supporting me throughout the difficult journey of writing this book, patiently listening to my ideas, reviewing my rough drafts, and believing that I could finish this. My brother deserves my wholehearted thanks as well for always having my back!

Next, I’d like to acknowledge the team at Manning: Brian Sawyer, who read my blog post and suggested that there might be a book there; my editors, Matthew Spaur, Lesley Trites, and Kostas Passadis, for working with me, providing high-quality feedback, and for being patient when things got rough; and Marjan Bace, for green-lighting this whole project. Thanks as well to all the other folks at Manning who worked with me on the production and promotion of the book: Deirdre Hiam, my production editor; Pamela Hunt, my copyeditor; and Melody Dolab, my page proofer.

I’d also like to thank the reviewers who took the time to read my manuscript at various stages during its development and who provided invaluable feedback: Al Rahimi, Alain Couniot, Alejandro Bellogin Kouki, Ariel Gamiño, Craig E. Pfeifer, Djordje Vukelic, Domingo Salazar, Dr. Kanishka Tyagi, Izhar Haq, James J. Byleckie, Jonathan Wood, Kai Gellien, Kim Falk Jorgensen, Marc Paradis, Oliver Korten, Pablo Roccatagliata, Patrick Goetz, Patrick Regan, Raymond Cheung, Richard Vaughan, Sergio Govoni, Shashank Polasa Venkata, Sriram Macharla, Stefano Ongarello, Teresa Fontanella De Santis, Tiklu Ganguly, Vidhya Vinay, Vijayant Singh, Vishwesh Ravi Shrimali, and Vittal Damaraju.

Special thanks to James Byleckie and Vishwesh Ravi Shrimali, technical proofreaders, for carefully reviewing the code one last time shortly before the book went into production.
about this book

Interpretable AI is written to help you implement state-of-the-art interpretability techniques for complex machine learning models and to build fair and explainable AI systems. Interpretability is a hot topic in research, and only a few resources and practical guides cover all the important techniques that would be valuable for practitioners in the real world. This book aims to address that gap.

Who should read this book

Interpretable AI is for data scientists and engineers who are interested in gaining a deeper understanding of how their models work and how to build fair and unbiased models. The book should also be useful for architects and business stakeholders who want to understand models powering AI systems to ensure fairness and protect the business’s users and brand.

How this book is organized: a roadmap

The book has four parts that cover nine chapters.

Part 1 introduces you to the world of interpretable AI:

■ Chapter 1 covers different types of AI systems, defines interpretability and its importance, discusses white-box and black-box models, and explains how to build interpretable AI systems.
■ Chapter 2 covers white-box models and how to interpret them, specifically focusing on linear regression, decision trees, and generalized additive models (GAMs).
Part 2 focuses on black-box models and understanding how the model processes the inputs and arrives at the final prediction:

■ Chapter 3 covers a class of black-box models called tree ensembles and how to interpret them using post hoc model-agnostic methods that are global in scope, such as partial dependence plots (PDPs) and feature interaction plots.
■ Chapter 4 covers deep neural networks and how to interpret them using post hoc model-agnostic methods that are local in scope, such as local interpretable model-agnostic explanations (LIME), SHapley Additive exPlanations (SHAP), and anchors.
■ Chapter 5 covers convolutional neural networks and how to visualize what the model is focusing on using saliency maps, specifically focusing on techniques such as gradients, guided backpropagation, gradient-weighted class activation mapping (Grad-CAM), guided Grad-CAM, and smooth gradients (SmoothGrad).

Part 3 continues to focus on black-box models but moves to understanding what features or representations have been learned by them:

■ Chapter 6 covers convolutional neural networks and how to dissect them to understand representations of the data that are learned by the intermediate or hidden layers in the neural network.
■ Chapter 7 covers language models and how to visualize high-dimensional representations learned by them using techniques like principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE).

Part 4 focuses on fairness and bias and paves the way for explainable AI:

■ Chapter 8 covers various definitions of fairness and ways to check whether models are biased. It also discusses techniques for mitigating bias and a standardized approach to documenting datasets using datasheets, which will help improve transparency and accountability with the stakeholders and users of the AI system.
■ Chapter 9 paves the way for explainable AI by showing how to build such systems and also covers contrastive explanations using counterfactual examples.

About the code

This book contains many examples of source code. In most cases, source code is formatted in a fixed-width font like this to separate it from ordinary text. In many cases, the original source code has been reformatted; we’ve added line breaks and reworked indentation to accommodate the available page space in the book. In rare cases, even this was not enough, and listings include line-continuation markers (➥). Additionally, comments in the source code have often been removed from the listings when the code is described in the text. Code annotations accompany many of the listings, highlighting important concepts.

You can get executable snippets of code from the liveBook (online) version of this book at https://livebook.manning.com/book/interpretable-ai. The complete code