Uploader

高宏飞

Shared on 2026-05-30

Author: Brian Buzzelli

Data quality will either make you or break you in the financial services industry. Missing prices, wrong market values, trading violations, client performance restatements, and incorrect regulatory filings can all lead to harsh penalties, lost clients, and financial disaster. This practical guide gives data analysts, data scientists, and data practitioners in financial services firms a framework to apply manufacturing principles to financial data management, understand data dimensions, engineer precise data quality tolerances at the datum level, and integrate those tolerances into data processing pipelines. You'll get invaluable advice on how to:
• Evaluate data dimensions and how they apply to different data types and use cases
• Determine data quality tolerances for your data quality specification
• Choose the points along the data processing pipeline where data quality should be assessed and measured
• Apply tailored data governance frameworks within a business or technical function or across an organization
• Precisely align data with applications and data processing pipelines
• And more

ISBN: 1098136934
Publisher: O'Reilly Media
Publish Year: 2022
Language: English
Pages: 177
File Format: PDF
File Size: 7.5 MB
Text Preview (First 20 pages)

Brian Buzzelli
Data Quality Engineering in Financial Services
Applying Manufacturing Techniques to Data
DATA SCIENCE

“This book presents a clear how-to guide for the finance professional to motivate, design, and implement a comprehensive data quality framework.”
-- Matthew Lyberg, CFA, Quantitative Researcher, NDVR Inc.

“A definitive guide on how to best ensure that data processes, from sourcing and ingestion to firmwide utilization, are properly monitored, measured, and controlled.”
-- Barry S. Raskin, Head of Data Practice, Relevate Data Monetization Corp.

Data Quality Engineering in Financial Services
US $59.99 CAN $74.99
ISBN: 978-1-098-13693-2
Twitter: @oreillymedia
linkedin.com/company/oreilly-media
youtube.com/oreillymedia

Data quality will either make you or break you in the financial services industry. Missing prices, wrong market values, trading violations, and incorrect regulatory filings can all lead to harsh penalties, lost clients, and financial disaster. With this practical book, you’ll learn how to apply manufacturing principles to financial data management, understand data dimensions, and integrate data quality tolerances into your data processing pipelines.

Author Brian Buzzelli teaches you how to define, engineer, and use data validation checks and quality tolerances to deliver high-quality data that meets consumers’ data quality specifications. You’ll learn how to think like a manufacturer, understand data dimensions, use techniques to define data quality specifications, and assess data quality by applying quantitative tolerances to data dimensions.
You’ll get invaluable advice to help you:
• Understand how data dimensions apply to different data types and use cases
• Apply manufacturing principles to data using DQS
• Generate data quality metrics based on standardized data quality measurements
• Recognize differences between data quality tolerances for multiple consumers
• Determine control points along data pipelines where quality should be measured
• Apply data governance frameworks, concepts, policies, and data catalogs

Brian Buzzelli is senior VP and head of enterprise data management for Acadian, a quantitative institutional asset management firm. He has defined a systematic and rigorous approach to data quality engineering based on manufacturing principles.
Praise for Data Quality Engineering in Financial Services

This book is an essential reading not only for the data management specialists but for anyone who works with and relies on data. Brian Buzzelli harnesses his many years of practical, “been there, done that, have scars to prove it” experience to teach the reader how to apply manufacturing quality control principles to “find a needle in a haystack”: that one erroneous attribute that will have an outsized impact.
-- Julia Bardmesser, SVP, Head of Data, Architecture and Salesforce Development, Voya Financial

This is the perfect playbook that, if implemented, will allow any financial services company to put their data on an offensive footing to drive alpha and insights without sacrificing quality, governance, or compliance.
-- Michael McCarthy, Principal Investment Data Architect, Investment Data Management Office, MFS

The approach to data quality expressed in this book is based on an original idea of using quality and standardization principles applied from manufacturing. It provides insights into a pragmatic and tested data quality framework that will be useful to any data practitioner.
-- Predrag Dizdarevic, Partner, Element22

This book clearly explains how to apply a manufacturing approach to data quality, provides an easy framework to capture data quality requirements, and has high-impact data quality metrics and visualization.
-- Alag Solaiappan, VP, Data Engineering, Acadian Asset Management
This book is a must for any data professional, regardless of industry. Brian has provided a definitive guide on how to best ensure that data processes, from sourcing and ingestion to firmwide utilization, are properly monitored, measured and controlled. The insights that he illustrates are born out of a long history of working with content and enabling financial professionals to perform their jobs. The principles presented herein are applicable to any organization that needs to build proper and efficient data governance and data management. Finally, here is a tool that can help everyone from chief data officers to data engineers in the performance of their roles.
-- Barry S. Raskin, Head of Data Practice, Relevate Data Monetization Corp.

Brian Buzzelli presents a clear how-to guide for the finance professional to motivate, design, and implement a comprehensive data quality framework. Even in early stages, the data quality program will improve efficiency, reduce risk, and build trust with clients and across functions. Brian demonstrates the connection between data integrity and fiduciary obligation with relevant examples. Borrowing unabashedly from concepts in high precision manufacturing, Brian provides a step-by-step plan to engineer an enterprise level data quality program with solutions designed for specific functions. The code examples are especially applicable, providing the reader with a set of practical tools. I believe these concepts are an important contribution to the field.
-- Matthew Lyberg, CFA, Quantitative Researcher, NDVR Inc.
978-1-098-13687-1 [LSI] Data Quality Engineering in Financial Services by Brian Buzzelli Copyright © 2023 Brian Buzzelli. All rights reserved. Printed in the United States of America. Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com. Acquisitions Editor: Michelle Smith Development Editor: Corbin Collins Production Editor: Beth Kelly Copyeditor: Nicole Taché Proofreader: Shannon Turlington Indexer: Potomac Indexing, LLC Interior Designer: David Futato Cover Designer: Karen Montgomery Illustrator: Kate Dullea October 2022: First Edition Revision History for the First Edition 2022-10-19: First Release See http://oreilly.com/catalog/errata.csp?isbn=9781098136932 for release details. The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Data Quality Engineering in Financial Services, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc. The views expressed in this work are those of the author and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
Table of Contents

Preface

1. Thinking Like a Manufacturer
    Operational Efficiency
    Lessons from Lean Manufacturing
    Coca-Cola: Excellence in Manufacturing Quality
    DASANI®: Purifying Water
    Manufacturing Control Specifications
    Water Quality Specifications
    Quality Control and Anomaly Detection
    Summary

2. The Shape of Data
    Data as Physical Asset
    Data Shape Concept Model
    Data Element
    Datum
    Data Universe
    Time Series Data
    Cross-Section Data
    Panel Data
    Data Volumes
    Data Dimensions and Attributes
    Data Attributes
    Data Dimensions
    Summary

3. Data Quality Specifications
    Manufacturing Controls
    DQS Overview
    Data Quality Tolerances
    Completeness
    Timeliness
    Accuracy
    Precision
    Conformity
    Congruence
    Collection
    Cohesion
    Summary

4. DQS Model Example
    Completeness DQS
    Timeliness DQS
    Accuracy DQS
    Precision DQS
    Conformity DQS
    Congruence DQS
    Collection DQS
    Example
    Cohesion DQS
    Example
    Fit for Purpose
    Summary

5. Data Quality Metrics and Visualization
    Data Quality Metrics
    Data Quality Visualization
    Summary

6. Operational Efficiency Cost Model
    Model Details
    Model Cost Assumptions
    Pre-Use Data Validations Versus Reconciliation
    Summary

7. Data Governance
    Establishing a Data Governance Function
    Principles of Data Governance
    Data Governance Function
    Data Governance Models
    Creating a Data Governance Program
    Organizing the Program
    Establishing the Data Governance Council
    Engaging the Data Management Function
    Engaging Business Functions
    Enhanced Data Governance Operating Model
    Data Governance Program Activities and Deliverables
    Data Governance Business Value
    Data Management Maturity
    Summary

8. Master Data Management
    Mastering Data
    Data Governance Synergies
    Data Management Synergies
    Summary

9. Data Project Methodology
    Business Requirements
    Defining the Business Use Case
    Mapping Business Processes and Data Flows
    Impact Analysis
    Defining Data Quality Scorecards
    Data Usage Policies
    Technology Requirements
    Defining the Application Data Processing Use Case
    Mapping Application Functions and Data Flows
    Data Governance Requirements
    Data Definition Tasks
    Data Integrity Tasks
    Data Management Tasks
    Summary

10. Enterprise Data Management
    Where to Begin?
    Understanding Data Volumes
    Engineering Data Quality
    Improving Efficiency
    Scaling Data Architectures and Pipelines
    Achieving a Data-Quality-First Culture
    Making It Happen

Index
Preface

Most people would say we live in a world where we trust in the manufacturing discipline and quality standards used to provide the food we eat, the water we drink, the medications we take, and the sophisticated technology products we use in our daily lives. We can appreciate the years of evolution in science, refinement in manufacturing techniques, and codification of product specifications that form the basis of the trust we enjoy in consuming and using physical products today. Given the monumental achievements in science, technology, and manufacturing, what then is so different about the data used in the financial industry, where data and information must be constantly checked, rechecked, and reconciled to ensure accuracy and quality?

Data is the fundamental raw material used in the financial industry to manage your retirement and family’s wealth assets, provide operating and growth capital to companies, and drive the global financial system as the lifeblood of the global economy. Unlike the manufacturing industry, data flows in the financial industry have evolved from being based on open outcry, telephone, paper trails, and ticker tapes, to being grounded in sophisticated and complex computational, artificial intelligence, and machine learning applications. We capture, store, and pass along data through complex applications, and we use data in business processes with a general assumption that the data is reliable and suitable for use.

However, data has no physical form and has the capacity to be infinitely malleable. By contrast, the raw materials in manufacturing have physical form. The physical properties can be measured and assessed for suitability based on the specification for the physical properties and tolerances for which the raw material is certified compliant for use.
This is one of the key concepts: we will apply a similar manufacturing framework to data and define the properties of data that can be measured against a specification. Examples of data will be presented as if data had mass and physical form, but in the context of measurable data dimensions: completeness, timeliness, accuracy, precision, conformity, congruence, collection, and cohesion.
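To make the idea of measuring data against a specification concrete, the sketch below lists the eight dimensions and measures one of them, completeness, for a small volume of data. It is an illustration under stated assumptions, not the book's method: the `Dimension` enum, the `completeness` function, the sample prices, and the choice to treat `None` as a missing datum are all hypothetical.

```python
from enum import Enum


class Dimension(Enum):
    """The eight measurable data dimensions named in the preface."""
    COMPLETENESS = "completeness"
    TIMELINESS = "timeliness"
    ACCURACY = "accuracy"
    PRECISION = "precision"
    CONFORMITY = "conformity"
    CONGRUENCE = "congruence"
    COLLECTION = "collection"
    COHESION = "cohesion"


def completeness(values):
    """Return the fraction of values that are populated (not None).

    Assumption: a missing datum is represented as None.
    """
    if not values:
        raise ValueError("cannot measure an empty data volume")
    populated = sum(1 for v in values if v is not None)
    return populated / len(values)


# Hypothetical price data element with one missing datum.
prices = [101.25, 99.80, None, 100.10]
metric = completeness(prices)
print(f"{Dimension.COMPLETENESS.value}: {metric:.2%}")
```

A ratio like this is only a measurement. Whether it is acceptable depends on the tolerance the consumer of the data specifies.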
The premise in this book is that data has shape, has measurable dimensions, and can be inspected and measured relative to data quality specifications and tolerances that yield data quality metrics. The results can then be analyzed using data quality specifications to derive data quality metrics. The data quality processes in manufacturing include evaluating specific measurements of physical materials relative to control specifications. The results are analyzed to determine whether the materials quality measurements and metrics are within design specifications and acceptable tolerances.

While the financial industry struggles with a lack of industry standards for data identification, definition, and representation, combined with the fast pace of financial product innovation, manufacturing evolution and maturity demonstrate highly robust methodologies, accuracy and purity in techniques, and precision in materials processing. Today we enjoy and perhaps take for granted the technical complexities that give us modern medicines, genetics, super crops, jets, satellites, smartphones, flat screen TVs, artificial intelligence, robotics, and wristwatch computers. The list is endless.

The financial industry can learn a great deal from the application of mature manufacturing techniques to our immature data management discipline. The primary benefit of applying similar precision in data quality validations is high quality data. However, from a business perspective, additional benefits include the following:
• Operational efficiency
• Lower cost of operations
• Less wasted effort
• Higher data precision
• More accurate business decision making

This book is intended to provide useful frameworks and techniques that can be introduced into your data structures and data management operations.
The expectation is that applying these techniques and frameworks will improve data processing efficiency, data identification, and data quality; reduce operational data issues; and increase trust that the data is business ready and fit for purpose.

My Journey and a Brief History of Data in the Financial Services Industry

I have worked in the financial industry for 27+ years in both technical and business roles, wrestling with and managing data. Over the years, organizations such as the Data Warehousing Institute (TDWI) and Data Management Association International (DAMA) led the early focus on data and data management. Many other industries such as pharma, aerospace, and technical manufacturing have significantly advanced in physical materials quality management, and society has reaped the
benefits of Lean, total quality management (TQM), Six Sigma, and so on. Comparatively, the financial industry has lagged far behind in the definition and adoption of unified data definition and data quality standards. Though the industry has employed certain standards such as the Society for Worldwide Interbank Financial Telecommunication (SWIFT), the Financial Information Exchange (FIX) protocol, and International Organization for Standardization (ISO) country and currency standards, the industry has traditionally lacked a global, common, standardized, unified data taxonomy and ontology that defines and identifies securities and investment instruments in the global financial system.

The Software Publishers Association was established in 1984, merged with the Information Industry Association in 1999, and was renamed the Software and Information Industry Association (SIIA). The Financial Information Services Division (FISD), a division of SIIA, was formed to focus specifically on the financial industry and primarily market data and information. At that time, more than two decades ago, most financial information revolved around the markets, and the primary distributors Reuters and Bloomberg delivered the market ticks, issuer and security information, and company fundamentals and filings. Generally, the equities markets used Committee on Uniform Securities Identification Procedures (CUSIP), Stock Exchange Daily Official List (SEDOL), and International Securities Identification Number (ISIN) for securities identification. As fixed income, synthetics, derivatives, and securitized products entered the financial system, with neither standardized definitions nor national securities exchanges, the industry saw tremendous divergence in data definitions, valuations, identification schemes, and so on.
The primary data vendors such as Reuters, Bloomberg, Interactive Data, Thomson Financial, and Telekurs were the powerhouses that provided the extended reference data and related analytics, and each had its own unique way of curating the data and packaging the data into data products. They were competitors, so the lack of industry-adopted standards for interoperability and substitution, combined with the lack of a regulator-mandated common taxonomy, hindered any major improvement in common data definition and data management practices.

Then came the rise of the enterprise data management (EDM) platforms, including Eagle PACE, Asset Control, Cadis, and GoldenSource. I have extensive experience implementing and managing Eagle PACE as well as driving data integrations to several EDM platforms. That’s how I gained significant insight into data architecture, data structuring, data quality validations, data provisioning, curation, and more.

Mike Atkin, former president of the FISD, left in 2004 and formed the Enterprise Data Management Council in 2005 with participation from several data vendors, including Reuters. The EDM Council’s initial mission focused on improving methods and frameworks for data definitions and data management practices with special emphasis on reference and analytical data. Today, the EDM Council and the Global Legal Entity Identifier Foundation (GLEIF) are exceptional leaders in driving
global data management and standards initiatives to improve data management discipline in the financial industry. I have been a member of SIIA and FISD, a contributing member of the EDM Council, and an industry participatory member of the GLEIF during its formation. I have had the privilege to collaborate with other industry data experts and practitioners, and collectively we have contributed to the evolution and maturity of how we define data, instruments, attributes, analytics, entities, and the like. You will now find many variations of the data dimensions frameworks illustrated by various working groups, associations, authors, vendors, and so on, but there is no common, single, agreed reference to data dimensions.

It became apparent to me that data quality is not only multi-faceted, in that it reflects the individual and combined tolerances for each dimension, but also that the individual tolerances across dimensions for each datum differ according to the target use case. Thus, I designed a data quality specification (DQS) that embodies the alignment of data dimensions, data quality tolerances and validations, and the data quality expectation of the consumer or consuming system. This definition of the DQS differs from other practitioner definitions in two ways: it focuses on a specific set of eight quantitatively measurable data dimensions, and each data quality tolerance definition includes the concept of suspect. This provides three main categories:

Valid
    Within tolerance
Suspect
    Approaching out of tolerance
Invalid
    Out of tolerance

This approach leads to greater delineation between within tolerance and out of tolerance conditions.

For example, consider global economic data used by quantitative researchers to understand trends and patterns versus a portfolio holdings file that contains null values in the price data element. The use case might be to generate a portfolio net asset value and return.
The DQS is remarkably different for each use case, yet both datasets may contain a price data element.

This book provides useful frameworks and techniques for implementing data governance, master data management, and data quality engineering. The application of manufacturing principles and techniques to data management, used in combination with these frameworks, is intended to promote structured data architectures and a more disciplined and precise data management operation, yielding higher quality fit-for-purpose data and lower operational cost and risk.
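The valid, suspect, and invalid categories and the price example described earlier in this preface can be sketched in code. The following is a minimal illustration, not the book's implementation: the `Tolerance` class, the `RESEARCH` and `NAV` names, and every threshold value are assumptions chosen for illustration. It treats completeness of the price data element as a "higher is better" metric and shows how the same measurement can be valid for one consumer and invalid for another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    """Hypothetical tolerance band for a 'higher is better' metric."""
    valid_min: float    # metric at or above this value is valid
    suspect_min: float  # metric at or above this, but below valid_min, is suspect

    def classify(self, metric: float) -> str:
        """Map a measured metric to valid, suspect, or invalid."""
        if metric >= self.valid_min:
            return "valid"
        if metric >= self.suspect_min:
            return "suspect"
        return "invalid"


# Illustrative thresholds only; they are not taken from the book.
# A research consumer studying trends may tolerate a few missing prices.
RESEARCH = Tolerance(valid_min=0.95, suspect_min=0.90)
# A net asset value calculation needs every price, so there is no
# suspect band: anything short of full completeness is invalid.
NAV = Tolerance(valid_min=1.0, suspect_min=1.0)

price_completeness = 0.97  # hypothetical measurement for one dataset
print("research:", RESEARCH.classify(price_completeness))
print("nav:", NAV.classify(price_completeness))
```

Keeping the tolerance separate from the measurement is the design point here: the measurement is taken once, and each consumer applies its own specification to it.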
This book aims to teach you how to define, engineer, and use data validation checks and quality tolerances to deliver high quality data that meets the consumer’s data quality specification. The book will introduce you to data governance and the role it plays in driving best practices in data management. You will learn about manufacturing and how manufacturers use precise material and processing control specifications to ensure the physical properties of raw and semifinished materials meet the manufacturing requirements. Data in the financial industry is like the raw materials used in the manufacturing industry. You will be challenged to think about a volume of data as if it were a volume of raw material to be used in manufacturing. You will learn about the shape of data, time series data, cross-section data, panel data volumes, and the dimensions of data. You will be able to specify precise data quality checks and tolerances at the datum level to ensure only high quality data is used that meets the consumer or application use case. You will learn how to generate data quality metrics and analytics based on standardized data quality measurements that support a consistent approach to data quality engineering. A master data management approach will be used to demonstrate how the concepts of raw, staged, and mastered data provide the architecture in which you can apply the data quality engineering techniques and prevent data that does not meet the consumer data quality specification from being provisioned and used. Finally, you will see how enterprise data management is the combination of data governance, data quality engineering, and master data management.

These concepts and techniques are intertwined within many of the tasks commonly performed across different data intensive business and technical engineering functions in the financial industry.
Generally, most professionals in the financial industry are data practitioners and include business professionals, data scientists, data analysts, data engineers, and data architects. Business and technical professionals who operate in data intensive functions, such as data management, data analytics, research, portfolio management, portfolio construction, trading, accounting operations, compliance, and performance measurement, to name a few, will benefit from these data quality frameworks. Data quality is determined by the consumer or application based on data quality specifications. The consumer can use these frameworks to convey the precise data quality tolerances about the data intended for their use or to be used by an application. The data analysts and technicians responsible for implementing data management architectures and data management processes can use these frameworks to understand the data quality requirements of the consumer and engineer data quality measurements into the data ecosystem.

You can directly apply these techniques to understanding the quality of the data you are using or about to use. The application of data quality engineering frameworks will empower you with deep insight about the shape and the quality of your data, and the ability and language to precisely convey to others the data quality required.
Conventions Used in This Book

The following typographical conventions are used in this book:
• This element signifies a tip or suggestion.
• This element signifies a general note.
• This element indicates a warning or caution.

Online Figures

You can find larger, color versions of some figures at https://oreil.ly/DQEF-figures. Links to each figure also appear in their captions.

Email bookquestions@oreilly.com if you have a technical question.

O’Reilly Online Learning

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit http://oreilly.com.
How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/DQE.

Email bookquestions@oreilly.com to comment or ask technical questions about this book.

For news and information about our books and courses, visit http://oreilly.com.
Find us on LinkedIn: https://linkedin.com/company/oreilly-media.
Follow us on Twitter: https://twitter.com/oreillymedia.
Watch us on YouTube: https://youtube.com/oreillymedia.

Acknowledgments

There is no “I” in the word data, nor in the word success, and the same is true for the development and production of this book, which has been made possible by the contributions and support of many. I would like to express my deepest thanks and appreciation for the professional and personal support, contributions, and encouragement I have received from Acadian, O’Reilly Media, industry colleagues, fellow data practitioners, friends, and family. The successful achievement of producing this book is to be shared by all.

I would like to recognize and thank the team at Acadian Asset Management: Executive Vice President and Chief Investment Officer Brendan Bradley, Senior Vice President and Director of Investment Analytics and Data Jim Dufort, and the incredible Enterprise Data Management and Information Technology teams for their support and commitment to the application of manufacturing principles and the continuous improvement of the firm’s data architectures, data management operations, data quality validations, and data governance discipline.
The frameworks and techniques provided in this book have been proven to work due to the collective and successful implementation efforts of my Acadian colleagues. I am truly grateful to them for their willingness to think differently about data quality and data management. “Acadianites” have embraced architectural and procedural changes that deliver high-quality
data for use across the firm and that support Acadian’s innate commitment to delivering exceptional investment products and client services.

My sincere thanks to O’Reilly Media: Content Acquisition Editor Michelle Smith, Content Development Editor Corbin Collins, Production Editor Elizabeth Kelly, and Copy Editor Nicole Taché. Thank you for the tremendous opportunity to bring this book to fruition. I am grateful for the privilege to contribute to O’Reilly Media’s content and success. This book would not be possible without Michelle’s keen insight and recognition of the importance and relevance of data quality in the financial industry. Michelle understood the foundational frameworks in this book would be helpful to all data practitioners. I cannot thank Corbin enough for his patience, expertise, recommendations, suggestions, feedback, and steady and clear guidance while working with me to develop the content in this book. My gratitude extends to Elizabeth, Nicole, Suzanne Huston, and the production team for their expertise in the preparation, presentation, and production of this book. I have a newfound appreciation and highest respect for content editors, copy editors, and publishing professionals. The exceptional expertise demonstrated by these individuals contributes to O’Reilly Media’s success.

The successful development of key concepts, and the accuracy of the content and examples in this book, were made possible by multiple technical reviewers. I wish to thank Predrag Dizdarevic, founder and partner of Element22, for his industry leadership, his many years of experience, and his expertise that he graciously imparted during the technical reviews of this book. My heartfelt thanks to Abdullah Karasan, PhD, senior data science consultant at TFI TAB Food Investments and author of Machine Learning for Financial Risk Management with Python (O’Reilly), for the deep expertise and sharp insights conveyed in his technical review.
I would also like to thank fellow data quality warrior and practitioner Alagappan Solaiappan, vice president, senior data analyst, EDM data quality engineer at Acadian, for his many years of experience, his expertise in data quality, and his exceptional collaboration with me and our fellow Acadian colleagues. Data is a team sport, and his feedback and meticulous technical review of the book have contributed to its success.

I wish to recognize and thank Matthew Lyberg, CFA, quantitative researcher at NDVR, Inc. and former director of performance attribution at Acadian, for his insights and feedback that drove many improvements in the definition and demonstration of the data quality concepts and the DQS framework. His recognition of the business value embodied in this work led to my introducing it to the CFA Institute. They now include portions of this work in their training curriculum. My thanks to all data quality warriors, data practitioners, and industry colleagues who strive to deliver the highest quality data in our financial services and asset management firms.

On a personal note, my deepest thanks to Barbara Buzzelli (Mom), Richard Buzzelli (Dad), and Claudine Wagenfuehr (sister) for their love, support, and unending
encouragement. Mom always said, “You can do anything you set your mind to.” I can…and I did. I also wish to thank Dr. Andrew T. Revel, who is my best friend, greatest supporter, fiercest critic, and through the many years, my partner in life. My thanks to “The Foundation” that includes Robert Davis, Peggy Walther, and Chuck Wesley (IM), for their unending friendship, support, and encouragement. My thanks and appreciation to Matthew Szczepanski for his support and encouragement during the early, formative years at university and at the beginning of my career. Many thanks to Nancy Pribich who gave me a swift kick as motivation to pursue my dreams and aspirations, and to work hard to achieve them. Finally, my recognition and gratitude for the exceptional education I received at both Carnegie Mellon University and the University of Pittsburgh, which gave me the technical foundations to develop this book, and to build and achieve my career goals.