Deitel® Developer Series
Python for Programmers
Paul Deitel
Harvey Deitel
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our corporate sales department at corpsales@pearsoned.com or (800) 382-3419.

For government sales inquiries, please contact governmentsales@pearsoned.com.

For questions about sales outside the U.S., please contact intlcs@pearson.com.

Visit us on the Web: informit.com

Library of Congress Control Number: 2019933267

Copyright © 2019 Pearson Education, Inc. All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, request forms, and the appropriate contacts within the Pearson Education Global Rights & Permissions Department, please visit www.pearsoned.com/permissions/.
Deitel and the double-thumbs-up bug are registered trademarks of Deitel and Associates, Inc.

Python logo courtesy of the Python Software Foundation.

Cover design by Paul Deitel, Harvey Deitel, and Chuti Prasertsith
Cover art by Agsandrew/Shutterstock

ISBN-13: 978-0-13-522433-5
ISBN-10: 0-13-522433-0
Preface

"There's gold in them thar hills!"[1]

Welcome to Python for Programmers! In this book, you'll learn hands-on with today's most compelling, leading-edge computing technologies, and you'll program in Python—one of the world's most popular languages and the fastest growing among them.

Developers often quickly discover that they like Python. They appreciate its expressive power, readability, conciseness and interactivity. They like the world of open-source software development that's generating a rapidly growing base of reusable software for an enormous range of application areas.

For many decades, some powerful trends have been in place. Computer hardware has rapidly been getting faster, cheaper and smaller. Internet bandwidth has rapidly been getting larger and cheaper. And quality computer software has become ever more abundant and essentially free or nearly free through the "open source" movement. Soon, the "Internet of Things" will connect tens of billions of devices of every imaginable type. These will generate enormous volumes of data at rapidly increasing speeds and quantities.

In computing today, the latest innovations are "all about the data"—data science, data analytics, big data, relational databases (SQL), and NoSQL and NewSQL databases, each of which we address along with an innovative treatment of Python programming.

JOBS REQUIRING DATA SCIENCE SKILLS

In 2011, McKinsey Global Institute produced their report, "Big data: The next frontier for innovation, competition and productivity."[2] In it, they said, "The United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts to analyze big data and make decisions based on their findings." This continues to be the case. The August 2018 "LinkedIn Workforce Report" says the United States has a shortage of over 150,000 people with data science skills.[3] A 2017 report from IBM, Burning Glass Technologies and the Business-Higher Education Forum says that by 2020 in the United States there will be hundreds of thousands of new jobs requiring data science skills.[4]

[1] Source unknown, frequently misattributed to Mark Twain.
[2] https://www.mckinsey.com/~/media/McKinsey/Business%20Functions/McKinsey%20Digital/Our%20Insights/Big%20data%20The%20next%20frontier%20for%20innovation/MGI_big_data_full_report.ashx (page 3).
[3] https://economicgraph.linkedin.com/resources/linkedin-workforce-report-august-2018.
[4] https://www.burningglass.com/wp-content/uploads/The_Quant_Crunch.pdf (page 3).

MODULAR ARCHITECTURE

The book's modular architecture (please see the Table of Contents graphic on the book's inside front cover) helps us meet the diverse needs of various professional audiences.
Chapters 1–10 cover Python programming. These chapters each include a brief Intro to Data Science section introducing artificial intelligence, basic descriptive statistics, measures of central tendency and dispersion, simulation, static and dynamic visualization, working with CSV files, pandas for data exploration and data wrangling, time series and simple linear regression. These help you prepare for the data science, AI, big data and cloud case studies in Chapters 11–16, which present opportunities for you to use real-world datasets in complete case studies.

After covering Python Chapters 1–5 and a few key parts of Chapters 6–7, you'll be able to handle significant portions of the case studies in Chapters 11–16. The "Chapter Dependencies" section of this Preface will help trainers plan their professional courses in the context of the book's unique architecture.

Chapters 11–16 are loaded with cool, powerful, contemporary examples. They present hands-on implementation case studies on topics such as natural language processing, data mining Twitter, cognitive computing with IBM's Watson, supervised machine learning with classification and regression, unsupervised machine learning with clustering, deep learning with convolutional neural networks, deep learning with recurrent neural networks, big data with Hadoop, Spark and NoSQL databases, the Internet of Things and more. Along the way, you'll acquire a broad literacy of data science terms and concepts, ranging from brief definitions to using concepts in small, medium and large programs. Browsing the book's detailed Table of Contents and Index will give you a sense of the breadth of coverage.

KEY FEATURES

KIS (Keep It Simple), KIS (Keep It Small), KIT (Keep It Topical)

Keep it simple—In every aspect of the book, we strive for simplicity and clarity. For example, when we present natural language processing, we use the simple and intuitive TextBlob library rather than the more complex NLTK. In our deep learning presentation, we prefer Keras to TensorFlow. In general, when multiple libraries could be used to perform similar tasks, we use the simplest one.

Keep it small—Most of the book's 538 examples are small—often just a few lines of code, with immediate interactive IPython feedback. We also include 40 larger scripts and in-depth case studies.

Keep it topical—We read scores of recent Python-programming and data science books, and browsed, read or watched about 15,000 current articles, research papers, white papers, videos, blog posts, forum posts and documentation pieces. This enabled us to "take the pulse" of the Python, computer science, data science, AI, big data and cloud communities.

Immediate-Feedback: Exploring, Discovering and Experimenting with IPython

The ideal way to learn from this book is to read it and run the code examples in parallel. Throughout the book, we use the IPython interpreter, which provides a friendly, immediate-feedback interactive mode for quickly exploring, discovering and experimenting with Python and its extensive libraries. Most of the code is presented in small, interactive IPython sessions. For each code snippet you write, IPython immediately reads it, evaluates it and prints the results. This instant feedback keeps your attention, boosts learning, facilitates rapid prototyping and speeds the software-development process.

Our books always emphasize the live-code approach, focusing on complete, working programs with live inputs and outputs. IPython's "magic" is that it turns even snippets into code that "comes alive" as you enter each line. This promotes learning and encourages experimentation.
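For example, a brief interactive session might look like the following. This is a hypothetical fragment shown in IPython's In [ ]/Out[ ] prompt style, not one of the book's numbered examples; notice that each snippet is read, evaluated and displayed as soon as you enter it:

    In [1]: 7 * 6
    Out[1]: 42

    In [2]: numbers = [3, 1, 4, 1, 5]

    In [3]: sorted(numbers)
    Out[3]: [1, 1, 3, 4, 5]

    In [4]: f'The sum is {sum(numbers)}'
    Out[4]: 'The sum is 14'

An assignment such as the one in snippet [2] produces no Out[ ] line because it does not return a value; expressions like those in snippets [1], [3] and [4] display their results immediately.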
Python Programming Fundamentals

First and foremost, this book provides rich Python coverage. We discuss Python's programming models—procedural programming, functional-style programming and object-oriented programming. We use best practices, emphasizing current idiom. Functional-style programming is used throughout the book as appropriate. A chart in Chapter 4 lists most of Python's key functional-style programming capabilities and the chapters in which we initially cover most of them.

538 Code Examples

You'll get an engaging, challenging and entertaining introduction to Python with 538 real-world examples ranging from individual snippets to substantial computer science, data science, artificial intelligence and big data case studies. You'll attack significant tasks with AI, big data and cloud technologies like natural language processing, data mining Twitter, machine learning, deep learning, Hadoop, MapReduce, Spark, IBM Watson, key data science libraries (NumPy, pandas, SciPy, NLTK, TextBlob, spaCy, Textatistic, Tweepy, Scikit-learn, Keras), key visualization libraries (Matplotlib, Seaborn, Folium) and more.

Avoid Heavy Math in Favor of English Explanations

We capture the conceptual essence of the mathematics and put it to work in our examples. We do this by using libraries such as statistics, NumPy, SciPy, pandas and many others, which hide the mathematical complexity. So, it's straightforward for you to get many of the benefits of mathematical techniques like linear regression without having to know the mathematics behind them. In the machine-learning and deep-learning examples, we focus on creating objects that do the math for you "behind the scenes."

Visualizations

67 static, dynamic, animated and interactive visualizations (charts, graphs, pictures, animations etc.) help you understand concepts. Rather than including a treatment of low-level graphics programming, we focus on high-level visualizations produced by Matplotlib, Seaborn, pandas and Folium (for interactive maps). We use visualizations as a pedagogic tool. For example, we make the law of large numbers "come alive" in a dynamic die-rolling simulation and bar chart. As the number of rolls increases, you'll see each face's percentage of the total rolls gradually approach 16.667% (1/6th) and the sizes of the bars representing the percentages equalize (a simple text-only version of this simulation is sketched at the end of this section).

Visualizations are crucial in big data for data exploration and communicating reproducible research results, where the data items can number in the millions, billions or more. A common saying is that a picture is worth a thousand words[5]—in big data, a visualization could be worth billions, trillions or even more items in a database. Visualizations enable you to "fly 40,000 feet above the data" to see it "in the large" and to get to know your data. Descriptive statistics help but can be misleading. For example, Anscombe's quartet[6] demonstrates through visualizations that significantly different datasets can have nearly identical descriptive statistics.

[5] https://en.wikipedia.org/wiki/A_picture_is_worth_a_thousand_words.
[6] https://en.wikipedia.org/wiki/Anscombe%27s_quartet.

We show the visualization and animation code so you can implement your own. We also provide the animations in source-code files and as Jupyter Notebooks, so you can conveniently customize the code and animation parameters, re-execute the animations and see the effects of the changes.
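To give a feel for that die-rolling demonstration without the graphics, here is a rough, text-only sketch of such a simulation. It is our own illustration for this preface, not one of the book's examples, and the roll counts are arbitrary; as the number of rolls grows, each face's percentage drifts toward 16.667%:

    import random
    from collections import Counter

    for rolls in (600, 60_000, 6_000_000):
        # roll a six-sided die 'rolls' times and tally how often each face appears
        frequencies = Counter(random.randrange(1, 7) for _ in range(rolls))
        print(f'\n{rolls:,} rolls:')
        for face in range(1, 7):
            percentage = frequencies[face] / rolls * 100
            print(f'  face {face}: {percentage:6.3f}%')  # approaches 16.667% (1/6)

The book's versions of this simulation add Matplotlib and Seaborn bar charts and animation so you can watch the bars equalize as the rolls accumulate.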
Data Experiences

Our Intro to Data Science sections and case studies in Chapters 11–16 provide rich data experiences. You'll work with many real-world datasets and data sources. There's an enormous variety of free open datasets available online for you to experiment with. Some of the sites we reference list hundreds or thousands of datasets. Many libraries you'll use come bundled with popular datasets for experimentation. You'll learn the steps required to obtain data and prepare it for analysis, analyze that data using many techniques, tune your models and communicate your results effectively, especially through visualization.

GitHub

GitHub is an excellent venue for finding open-source code to incorporate into your projects (and to contribute your code to the open-source community). It's also a crucial element of the software developer's arsenal, with version control tools that help teams of developers manage open-source (and private) projects.

You'll use an extraordinary range of free and open-source Python and data science libraries, and free, free-trial and freemium offerings of software and cloud services. Many of the libraries are hosted on GitHub.

Hands-On Cloud Computing

Much of big data analytics occurs in the cloud, where it's easy to scale dynamically the amount of hardware and software your applications need. You'll work with various cloud-based services (some directly and some indirectly), including Twitter, Google Translate, IBM Watson, Microsoft Azure, OpenMapQuest, geopy, Dweet.io and PubNub.

We encourage you to use free, free-trial or freemium cloud services. We prefer those that don't require a credit card because you don't want to risk accidentally running up big bills. If you decide to use a service that requires a credit card, ensure that the tier you're using for free will not automatically jump to a paid tier.

Database, Big Data and Big Data Infrastructure

According to IBM (Nov. 2016), 90% of the world's data was created in the last two years.[7] Evidence indicates that the speed of data creation is rapidly accelerating. According to a March 2016 AnalyticsWeek article, within five years there will be over 50 billion devices connected to the Internet and by 2020 we'll be producing 1.7 megabytes of new data every second for every person on the planet![8]

[7] https://public.dhe.ibm.com/common/ssi/ecm/wr/en/wrl12345usen/watson-customer-engagement-watson-marketing-wr-other-papers-and-reports-wrl12345usen-20170719.pdf.
[8] https://analyticsweek.com/content/big-data-facts/.

We include a treatment of relational databases and SQL with SQLite. Databases are critical big data infrastructure for storing and manipulating the massive amounts of data you'll process. Relational databases process structured data—they're not geared to the unstructured and semi-structured data in big data applications. So, as big data evolved, NoSQL and NewSQL databases were created to handle such data efficiently. We include a NoSQL and NewSQL overview and a hands-on case study with a MongoDB JSON document database. MongoDB is the most popular NoSQL database. We discuss big data hardware and software infrastructure in Chapter 16, "Big Data: Hadoop, Spark, NoSQL and IoT (Internet of Things)."
Artificial Intelligence Case Studies

In case study Chapters 11–15, we present artificial intelligence topics, including natural language processing, data mining Twitter to perform sentiment analysis, cognitive computing with IBM Watson, supervised machine learning, unsupervised machine learning and deep learning. Chapter 16 presents the big data hardware and software infrastructure that enables computer scientists and data scientists to implement leading-edge AI-based solutions.

Built-In Collections: Lists, Tuples, Sets, Dictionaries

There's little reason today for most application developers to build custom data structures. The book features a rich two-chapter treatment of Python's built-in data structures—lists, tuples, dictionaries and sets—with which most data-structuring tasks can be accomplished.

Array-Oriented Programming with NumPy Arrays and Pandas Series/DataFrames

We also focus on three key data structures from open-source libraries—NumPy arrays, pandas Series and pandas DataFrames. These are used extensively in data science, computer science, artificial intelligence and big data. NumPy offers as much as two orders of magnitude higher performance than built-in Python lists. We include in Chapter 7 a rich treatment of NumPy arrays. Many libraries, such as pandas, are built on NumPy. The Intro to Data Science sections in Chapters 7–9 introduce pandas Series and DataFrames, which along with NumPy arrays are then used throughout the remaining chapters.
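As a rough, self-contained illustration of that performance claim, the following sketch (ours, not one of the book's examples) uses the Python Standard Library's timeit module to compare summing one million integers stored in a list with summing the same values in a NumPy array. In the book we use IPython's %timeit magic for such comparisons; exact timings vary by machine, but the array version is typically far faster:

    import timeit

    # set up one million values as a Python list and as a NumPy array
    setup = 'import numpy as np; data = list(range(1_000_000)); arr = np.arange(1_000_000)'

    list_seconds = timeit.timeit('sum(data)', setup=setup, number=100)
    array_seconds = timeit.timeit('arr.sum()', setup=setup, number=100)

    print(f'Python list sum : {list_seconds:.3f} seconds (100 runs)')
    print(f'NumPy array sum : {array_seconds:.3f} seconds (100 runs)')

Running this requires NumPy, which the Anaconda Python distribution we use in the book includes.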
File Processing and Serialization

Chapter 9 presents text-file processing, then demonstrates how to serialize objects using the popular JSON (JavaScript Object Notation) format. JSON is used frequently in the data science chapters. Many data science libraries provide built-in file-processing capabilities for loading datasets into your Python programs. In addition to plain text files, we process files in the popular CSV (comma-separated values) format using the Python Standard Library's csv module and capabilities of the pandas data science library.

Object-Based Programming

We emphasize using the huge number of valuable classes that the Python open-source community has packaged into industry-standard class libraries. You'll focus on knowing what libraries are out there, choosing the ones you'll need for your apps, creating objects from existing classes (usually in one or two lines of code) and making them "jump, dance and sing." This object-based programming enables you to build impressive applications quickly and concisely, which is a significant part of Python's appeal. With this approach, you'll be able to use machine learning, deep learning and other AI technologies to quickly solve a wide range of intriguing problems, including cognitive-computing challenges like speech recognition and computer vision.

Object-Oriented Programming

Developing custom classes is a crucial object-oriented programming skill, along with inheritance, polymorphism and duck typing. We discuss these in Chapter 10, which also includes a discussion of unit testing with doctest and a fun card shuffling-and-dealing simulation.

Chapters 11–16 require only a few straightforward custom class definitions. In Python, you'll probably use more of an object-based programming approach than full-out object-oriented programming.

Reproducibility

In the sciences in general, and data science in particular, there's a need to reproduce the results of experiments and studies, and to communicate those results effectively. Jupyter Notebooks are a preferred means for doing this. We discuss reproducibility throughout the book in the context of programming techniques and software such as Jupyter Notebooks and Docker.

Performance

We use the %timeit profiling tool in several examples to compare the performance of different approaches to performing the same tasks. Other performance-related discussions include generator expressions, NumPy arrays vs. Python lists, performance of machine-learning and deep-learning models, and Hadoop and Spark distributed-computing performance.

Big Data and Parallelism

In this book, rather than writing your own parallelization code, you'll let libraries like Keras running over TensorFlow, and big data tools like Hadoop and Spark, parallelize operations for you. In this big data/AI era, the sheer processing requirements of massive data applications demand taking advantage of true parallelism provided by multicore processors, graphics processing units (GPUs), tensor processing units (TPUs) and huge clusters of computers in the cloud. Some big data tasks could have thousands of processors working in parallel to analyze massive amounts of data expeditiously.

CHAPTER DEPENDENCIES

If you're a trainer planning your syllabus for a professional training course or a developer deciding which chapters to read, this section will help you make the best decisions. Please read the one-page color Table of Contents on the book's inside front cover—this will quickly familiarize you with the book's unique architecture. Teaching or reading the chapters in order is easiest. However, much of the content in the Intro to Data Science sections at the ends of Chapters 1–10 and the case studies in Chapters 11–16 requires only Chapters 1–5 and small portions of Chapters 6–10, as discussed below.

Part 1: Python Fundamentals Quickstart

We recommend that you read all the chapters in order:

Chapter 1, Introduction to Computers and Python, introduces concepts that lay the groundwork for the Python programming in Chapters 2–10 and the big data, artificial-intelligence and cloud-based case studies in Chapters 11–16. The chapter also includes test-drives of the IPython interpreter and Jupyter Notebooks.

Chapter 2, Introduction to Python Programming, presents Python programming fundamentals with code examples illustrating key language features.

Chapter 3, Control Statements, presents Python's control statements and introduces basic list processing.

Chapter 4, Functions, introduces custom functions, presents simulation techniques with random-number generation and introduces tuple fundamentals.

Chapter 5, Sequences: Lists and Tuples, presents Python's built-in list and tuple collections in more detail and begins introducing functional-style programming.
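As a small taste of the functional-style list processing that Chapters 3–5 begin introducing, the following sketch (our own illustration, not one of the book's examples) favors declarative built-ins and comprehensions over hand-written loops:

    numbers = [10, 3, 7, 1, 9, 4]

    total = sum(numbers)                           # reduction
    ordered = sorted(numbers)                      # returns a new sorted list
    squares = [n ** 2 for n in numbers]            # list comprehension (mapping)
    evens = [n for n in numbers if n % 2 == 0]     # filtering with a comprehension

    print(total, ordered, squares, evens)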
Part 2: Python Data Structures, Strings and Files

The following summarizes inter-chapter dependencies for Python Chapters 6–9 and assumes that you've read Chapters 1–5.

Chapter 6, Dictionaries and Sets—The Intro to Data Science section in this chapter is not dependent on the chapter's contents.

Chapter 7, Array-Oriented Programming with NumPy—The Intro to Data Science section requires dictionaries (Chapter 6) and arrays (Chapter 7).

Chapter 8, Strings: A Deeper Look—The Intro to Data Science section requires raw strings and regular expressions (Sections 8.11–8.12), and pandas Series and DataFrame features from Section 7.14's Intro to Data Science.

Chapter 9, Files and Exceptions—For JSON serialization, it's useful to understand dictionary fundamentals (Section 6.2). Also, the Intro to Data Science section requires the built-in open function and the with statement (Section 9.3), and pandas DataFrame features from Section 7.14's Intro to Data Science.

Part 3: Python High-End Topics

The following summarizes inter-chapter dependencies for Python Chapter 10 and assumes that you've read Chapters 1–5.

Chapter 10, Object-Oriented Programming—The Intro to Data Science section requires pandas DataFrame features from Intro to Data Science Section 7.14. Trainers wanting to cover only classes and objects can present Sections 10.1–10.6. Trainers wanting to cover more advanced topics like inheritance, polymorphism and duck typing can present Sections 10.7–10.9. Sections 10.10–10.15 provide additional advanced perspectives.

Part 4: AI, Cloud and Big Data Case Studies

The following summary of inter-chapter dependencies for Chapters 11–16 assumes that you've read Chapters 1–5. Most of Chapters 11–16 also require dictionary fundamentals from Section 6.2.

Chapter 11, Natural Language Processing (NLP), uses pandas DataFrame features from Section 7.14's Intro to Data Science.

Chapter 12, Data Mining Twitter, uses pandas DataFrame features from Section 7.14's Intro to Data Science, string method join (Section 8.9), JSON fundamentals (Section 9.5), TextBlob (Section 11.2) and word clouds (Section 11.3). Several examples require defining a class via inheritance (Chapter 10).

Chapter 13, IBM Watson and Cognitive Computing, uses built-in function open and the with statement (Section 9.3).

Chapter 14, Machine Learning: Classification, Regression and Clustering, uses NumPy array fundamentals and method unique (Chapter 7), pandas DataFrame features from Section 7.14's Intro to Data Science and Matplotlib function subplots (Section 10.6).

Chapter 15, Deep Learning, requires NumPy array fundamentals (Chapter 7), string method join (Section 8.9), general machine-learning concepts from Chapter 14 and features from Chapter 14's Case Study: Classification with k-Nearest Neighbors and the Digits Dataset.
Chapter 16, Big Data: Hadoop, Spark, NoSQL and IoT, uses string method split (Section 6.2.7), Matplotlib FuncAnimation from Section 6.4's Intro to Data Science, pandas Series and DataFrame features from Section 7.14's Intro to Data Science, string method join (Section 8.9), the json module (Section 9.5), NLTK stop words (Section 11.2.13) and, from Chapter 12, Twitter authentication, Tweepy's StreamListener class for streaming tweets, and the geopy and folium libraries. A few examples require defining a class via inheritance (Chapter 10), but you can simply mimic the class definitions we provide without reading Chapter 10.

JUPYTER NOTEBOOKS

For your convenience, we provide the book's code examples in Python source code (.py) files for use with the command-line IPython interpreter and as Jupyter Notebooks (.ipynb) files that you can load into your web browser and execute.

Jupyter Notebooks is a free, open-source project that enables you to combine text, graphics, audio, video, and interactive coding functionality for entering, editing, executing, debugging, and modifying code quickly and conveniently in a web browser. According to the article "What Is Jupyter?"[9]:

Jupyter has become a standard for scientific research and data analysis. It packages computation and argument together, letting you build "computational narratives"; and it simplifies the problem of distributing working software to teammates and associates.

In our experience, it's a wonderful learning environment and rapid prototyping tool. For this reason, we use Jupyter Notebooks rather than a traditional IDE, such as Eclipse, Visual Studio, PyCharm or Spyder. Academics and professionals already use Jupyter extensively for sharing research results. Jupyter Notebooks support is provided through the traditional open-source community mechanisms[10] (see "Getting Jupyter Help" later in this Preface). See the Before You Begin section that follows this Preface for software installation details and see the test-drives in Section 1.5 for information on running the book's examples.

Collaboration and Sharing Results

Working in teams and communicating research results are both important for developers in or moving into data-analytics positions in industry, government or academia:

The notebooks you create are easy to share among team members simply by copying the files or via GitHub.

Research results, including code and insights, can be shared as static web pages via tools like nbviewer (https://nbviewer.jupyter.org) and GitHub—both automatically render notebooks as web pages.

Reproducibility: A Strong Case for Jupyter Notebooks

In data science, and in the sciences in general, experiments and studies should be reproducible. This has been written about in the literature for many years, including Donald Knuth's 1992 computer science publication—Literate Programming.[11] The article "Language-Agnostic Reproducible Data Analysis Using Literate Programming"[12] says, "Lir (literate, reproducible computing) is based on the idea of literate programming as proposed by Donald Knuth." Essentially, reproducibility captures the complete environment used to produce results—hardware, software, communications, algorithms (especially code), data and the data's provenance (origin and lineage).

[9] https://www.oreilly.com/ideas/what-is-jupyter.
[10] https://jupyter.org/community.
[11] Knuth, D., "Literate Programming" (PDF), The Computer Journal, British Computer Society, 1992.
[12] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0164023.
DOCKER

In Chapter 16, we'll use Docker—a tool for packaging software into containers that bundle everything required to execute that software conveniently, reproducibly and portably across platforms. Some software packages we use in Chapter 16 require complicated setup and configuration. For many of these, you can download free preexisting Docker containers. These enable you to avoid complex installation issues and execute software locally on your desktop or notebook computers, making Docker a great way to help you get started with new technologies quickly and conveniently.

Docker also helps with reproducibility. You can create custom Docker containers that are configured with the versions of every piece of software and every library you used in your study. This would enable other developers to recreate the environment you used, then reproduce your work, and will help you reproduce your own results. In Chapter 16, you'll use Docker to download and execute a container that's preconfigured for you to code and run big data Spark applications using Jupyter Notebooks.

SPECIAL FEATURE: IBM WATSON ANALYTICS AND COGNITIVE COMPUTING

Early in our research for this book, we recognized the rapidly growing interest in IBM's Watson. We investigated competitive services and found Watson's "no credit card required" policy for its "free tiers" to be among the most friendly for our readers.

IBM Watson is a cognitive-computing platform being employed across a wide range of real-world scenarios. Cognitive-computing systems simulate the pattern-recognition and decision-making capabilities of the human brain to "learn" as they consume more data.[13][14][15] We include a significant hands-on Watson treatment. We use the free Watson Developer Cloud: Python SDK, which provides APIs that enable you to interact with Watson's services programmatically. Watson is fun to use and a great platform for letting your creative juices flow. You'll demo or use the following Watson APIs: Conversation, Discovery, Language Translator, Natural Language Classifier, Natural Language Understanding, Personality Insights, Speech to Text, Text to Speech, Tone Analyzer and Visual Recognition.[16]

Watson's Lite Tier Services and a Cool Watson Case Study

IBM encourages learning and experimentation by providing free lite tiers for many of its APIs.[17] In Chapter 13, you'll try demos of many Watson services. Then, you'll use the lite tiers of Watson's Text to Speech, Speech to Text and Translate services to implement a "traveler's assistant" translation app. You'll speak a question in English, then the app will transcribe your speech to English text, translate the text to Spanish and speak the Spanish text. Next, you'll speak a Spanish response (in case you don't speak Spanish, we provide an audio file you can use). Then, the app will quickly transcribe the speech to Spanish text, translate the text to English and speak the English response. Cool stuff!

[13] http://whatis.techtarget.com/definition/cognitive-computing.
[14] https://en.wikipedia.org/wiki/Cognitive_computing.
[15] https://www.forbes.com/sites/bernardmarr/2016/03/23/what-everyone-should-know-about-cognitive-computing.
[16] Always check the latest terms on IBM's website, as the terms and services may change.
[17] https://console.bluemix.net/catalog/.
TEACHING APPROACH

Python for Programmers contains a rich collection of examples drawn from many fields. You'll work through interesting, real-world examples using real-world datasets. The book concentrates on the principles of good software engineering and stresses program clarity.

Using Fonts for Emphasis

We place the key terms and the index's page reference for each defining occurrence in bold text for easier reference. We refer to on-screen components in the bold Helvetica font (for example, the File menu) and use the Lucida font for Python code (for example, x = 5).

Syntax Coloring

For readability, we syntax color all the code. Our syntax-coloring conventions are as follows:

comments appear in green
keywords appear in dark blue
constants and literal values appear in light blue
errors appear in red
all other code appears in black

538 Code Examples

The book's 538 examples contain approximately 4000 lines of code. This is a relatively small amount for a book this size and is due to the fact that Python is such an expressive language. Also, our coding style is to use powerful class libraries to do most of the work wherever possible.

160 Tables/Illustrations/Visualizations

We include abundant tables, line drawings, and static, dynamic and interactive visualizations.

Programming Wisdom

We integrate into the discussions programming wisdom from the authors' combined nine decades of programming and teaching experience, including:

Good programming practices and preferred Python idioms that help you produce clearer, more understandable and more maintainable programs.

Common programming errors to reduce the likelihood that you'll make them.

Error-prevention tips with suggestions for exposing bugs and removing them from your programs. Many of these tips describe techniques for preventing bugs from getting into your programs in the first place.

Performance tips that highlight opportunities to make your programs run faster or minimize the amount of memory they occupy.

Software engineering observations that highlight architectural and design issues for proper software construction, especially for larger systems.

SOFTWARE USED IN THE BOOK

The software we use is available for Windows, macOS and Linux and is free for download from the Internet. We wrote the book's examples using the free Anaconda Python distribution. It includes most of the Python, visualization and data science libraries you'll need, as well as the IPython interpreter, Jupyter Notebooks and Spyder, considered one of the best Python data science IDEs. We use only IPython and Jupyter Notebooks for program development in the book. The Before You Begin section following this Preface discusses installing Anaconda and a few other items you'll need for working with our examples.

PYTHON DOCUMENTATION

You'll find the following documentation especially helpful as you work through the book:

The Python Language Reference: https://docs.python.org/3/reference/index.html
The Python Standard Library: https://docs.python.org/3/library/index.html

Python documentation list: https://docs.python.org/3/

GETTING YOUR QUESTIONS ANSWERED

Popular Python and general programming online forums include:

python-forum.io
https://www.dreamincode.net/forums/forum/29-python/
StackOverflow.com

Also, many vendors provide forums for their tools and libraries. Many of the libraries you'll use in this book are managed and maintained at github.com. Some library maintainers provide support through the Issues tab on a given library's GitHub page. If you cannot find an answer to your questions online, please see our web page for the book at

http://www.deitel.com

GETTING JUPYTER HELP

Jupyter Notebooks support is provided through:

Project Jupyter Google Group: https://groups.google.com/forum/#!forum/jupyter
Jupyter real-time chat room: https://gitter.im/jupyter/jupyter
GitHub: https://github.com/jupyter/help
StackOverflow: https://stackoverflow.com/questions/tagged/jupyter
Jupyter for Education Google Group (for instructors teaching with Jupyter): https://groups.google.com/forum/#!forum/jupyter-education

SUPPLEMENTS

To get the most out of the presentation, you should execute each code example in parallel with reading the corresponding discussion in the book. On the book's web page at http://www.deitel.com[18] we provide:

Downloadable Python source code (.py files) and Jupyter Notebooks (.ipynb files) for the book's code examples.

Getting Started videos showing how to use the code examples with IPython and Jupyter Notebooks. We also introduce these tools in Section 1.5.

Blog posts and book updates.

[18] Our website is undergoing a major upgrade. If you do not find something you need, please write to us directly at deitel@deitel.com.
For download instructions, see the Before You Begin section that follows this Preface.

KEEPING IN TOUCH WITH THE AUTHORS

For answers to questions or to report an error, send an email to us at deitel@deitel.com or interact with us via social media:

Facebook (http://www.deitel.com/deitelfan)
Twitter (@deitel)
LinkedIn (http://linkedin.com/company/deitel&associates)
YouTube (http://youtube.com/DeitelTV)

ACKNOWLEDGMENTS

We'd like to thank Barbara Deitel for long hours devoted to Internet research on this project. We're fortunate to have worked with the dedicated team of publishing professionals at Pearson. We appreciate the efforts and 25-year mentorship of our friend and colleague Mark L. Taub, Vice President of the Pearson IT Professional Group. Mark and his team publish our professional books, LiveLessons video products and Learning Paths in the Safari service (https://learning.oreilly.com/). They also sponsor our Safari live online training seminars. Julie Nahil managed the book's production. We selected the cover art and Chuti Prasertsith designed the cover.

We wish to acknowledge the efforts of our reviewers. Patricia Byron-Kimball and Meghan Jacoby recruited the reviewers and managed the review process. Adhering to a tight schedule, the reviewers scrutinized our work, providing countless suggestions for improving the accuracy, completeness and timeliness of the presentation.

Reviewers

Book Reviewers

Daniel Chen, Data Scientist, Lander Analytics
Garrett Dancik, Associate Professor of Computer Science/Bioinformatics, Eastern Connecticut State University
Pranshu Gupta, Assistant Professor, Computer Science, DeSales University
David Koop, Assistant Professor, Data Science Program Co-Director, UMass Dartmouth
Ramon Mata-Toledo, Professor, Computer Science, James Madison University
Shyamal Mitra, Senior Lecturer, Computer Science, University of Texas at Austin
Alison Sanchez, Assistant Professor in Economics, University of San Diego
José Antonio González Seco, IT Consultant
Jamie Whitacre, Independent Data Science Consultant
Elizabeth Wickes, Lecturer, School of Information Sciences, University of Illinois
Proposal Reviewers

Dr. Irene Bruno, Associate Professor in the Department of Information Sciences and Technology, George Mason University
Lance Bryant, Associate Professor, Department of Mathematics, Shippensburg University
Daniel Chen, Data Scientist, Lander Analytics
Garrett Dancik, Associate Professor of Computer Science/Bioinformatics, Eastern Connecticut State University
Dr. Marsha Davis, Department Chair of Mathematical Sciences, Eastern Connecticut State University
Roland DePratti, Adjunct Professor of Computer Science, Eastern Connecticut State University
Shyamal Mitra, Senior Lecturer, Computer Science, University of Texas at Austin
Dr. Mark Pauley, Senior Research Fellow, Bioinformatics, School of Interdisciplinary Informatics, University of Nebraska at Omaha
Sean Raleigh, Associate Professor of Mathematics, Chair of Data Science, Westminster College
Alison Sanchez, Assistant Professor in Economics, University of San Diego
Dr. Harvey Siy, Associate Professor of Computer Science, Information Science and Technology, University of Nebraska at Omaha
Jamie Whitacre, Independent Data Science Consultant

As you read the book, we'd appreciate your comments, criticisms, corrections and suggestions for improvement. Please send all correspondence to:

deitel@deitel.com

We'll respond promptly. Welcome again to the exciting open-source world of Python programming. We hope you enjoy this look at leading-edge computer-applications development with Python, IPython, Jupyter Notebooks, data science, AI, big data and the cloud. We wish you great success!

Paul and Harvey Deitel

ABOUT THE AUTHORS

Paul J. Deitel, CEO and Chief Technical Officer of Deitel & Associates, Inc., is an MIT graduate with 38 years of experience in computing. Paul is one of the world's most experienced programming-languages trainers, having taught professional courses to software developers since 1992. He has delivered hundreds of programming courses to industry clients internationally, including Cisco, IBM, Siemens, Sun Microsystems (now Oracle), Dell, Fidelity, NASA at the Kennedy Space Center, the National Severe Storm Laboratory, White Sands Missile Range, Rogue Wave Software, Boeing, Nortel Networks, Puma, iRobot and many more. He and his co-author, Dr. Harvey M. Deitel, are the world's best-selling programming-language textbook/professional book/video authors.

Dr. Harvey M. Deitel, Chairman and Chief Strategy Officer of Deitel & Associates, Inc., has 58 years of experience in computing. Dr. Deitel earned B.S. and M.S. degrees in Electrical Engineering from MIT and a Ph.D. in Mathematics from Boston University—he studied computing in each of these programs before they spun off Computer Science programs. He has extensive college teaching experience, including earning tenure and serving as the Chairman of the Computer Science Department at Boston College before founding Deitel & Associates, Inc., in 1991 with his son, Paul. The Deitels' publications have earned international recognition, with more than 100 translations published in Japanese, German, Russian, Spanish, French, Polish, Italian, Simplified Chinese, Traditional Chinese, Korean, Portuguese, Greek, Urdu and Turkish. Dr. Deitel has delivered hundreds of programming courses to academic, corporate, government and military clients.

ABOUT DEITEL & ASSOCIATES, INC.®
Deitel & Associates, Inc., founded by Paul Deitel and Harvey Deitel, is an internationally recognized authoring and corporate training organization, specializing in computer programming languages, object technology, mobile app development and Internet and web software technology. The company's training clients include some of the world's largest companies, government agencies, branches of the military and academic institutions. The company offers instructor-led training courses delivered at client sites worldwide on major programming languages.

Through its 44-year publishing partnership with Pearson/Prentice Hall, Deitel & Associates, Inc., publishes leading-edge programming textbooks and professional books in print and e-book formats, LiveLessons video courses (available for purchase at https://www.informit.com), Learning Paths and live online training seminars in the Safari service (https://learning.oreilly.com) and Revel™ interactive multimedia courses.

To contact Deitel & Associates, Inc. and the authors, or to request a proposal for on-site, instructor-led training, write to:

deitel@deitel.com

To learn more about Deitel on-site corporate training, visit

http://www.deitel.com/training

Individuals wishing to purchase Deitel books can do so at

https://www.amazon.com

Bulk orders by corporations, the government, the military and academic institutions should be placed directly with Pearson. For more information, visit

https://www.informit.com/store/sales.aspx
Before You Begin

This section contains information you should review before using this book. We'll post updates at: http://www.deitel.com.

FONT AND NAMING CONVENTIONS

We show Python code and commands and file and folder names in a sans-serif font, and on-screen components, such as menu names, in a bold sans-serif font. We use italics for emphasis and bold occasionally for strong emphasis.

GETTING THE CODE EXAMPLES

You can download the examples.zip file containing the book's examples from our Python for Programmers web page at:

http://www.deitel.com

Click the Download Examples link to save the file to your local computer. Most web browsers place the file in your user account's Downloads folder. When the download completes, locate it on your system, and extract its examples folder into your user account's Documents folder:

Windows: C:\Users\YourAccount\Documents\examples
macOS or Linux: ~/Documents/examples

Most operating systems have a built-in extraction tool. You also may use an archive tool such as 7-Zip (www.7-zip.org) or WinZip (www.winzip.com).

STRUCTURE OF THE EXAMPLES FOLDER

You'll execute three kinds of examples in this book: