📄 Page
2
Mastering R

R is a programming language for statistical computing and graphics that you can use to clean, analyze, and graph data. It is widely used by researchers from various disciplines to estimate and display results and by teachers of statistics and research methods. This book is a detailed guide that helps beginners understand R, with an explanation of core statistical and research ideas.

One of the most powerful characteristics of R is that it is open-source, which means that anyone can access the underlying code used to run the program and add their own code for free. As a result, new statistical analyses tend to become available in R soon after they are developed. Mistakes are corrected quickly and transparently, and R has built a community of programming and statistical experts that you can turn to for help.

Mastering R: A Beginner’s Guide explains not only how to program but also how to use R for visualization and modeling. The fundamental principles of R explained here are helpful to beginner and intermediate users interested in learning this versatile language.
📄 Page
3
About the Series

The Mastering Computer Science series covers a wide range of topics, spanning programming languages as well as modern-day technologies and frameworks. The series has a special focus on beginner-level content and is presented in an easy-to-understand manner, comprising:

• Crystal-clear text, spanning various topics sorted by relevance.
• Special focus on practical exercises, with numerous code samples and programs.
• A guided approach to programming, with step-by-step tutorials for absolute beginners.
• Keen emphasis on real-world utility of skills, thereby cutting redundant and seldom-used concepts and focusing instead on industry-prevalent coding paradigms.
• A wide range of references and resources to help both beginner and intermediate-level developers gain the most out of the books.

The books in the Mastering Computer Science series start from the core concepts and then quickly move on to industry-standard coding practices, to help learners gain efficient and crucial skills in as little time as possible. The books assume no prior knowledge of coding, so even absolute newcomers to coding can benefit from this series.

The Mastering Computer Science series is edited by Sufyan bin Uzayr, a writer and educator with over a decade of experience in the computing field.

For more information about this series, please visit: https://www.routledge.com/Mastering-Computer-Science/book-series/MCS
📄 Page
4
Mastering R
A Beginner’s Guide
Edited by Sufyan bin Uzayr
📄 Page
5
First Edition published 2024
by CRC Press
2385 NW Executive Center Drive, Suite 320, Boca Raton, FL 33431
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

CRC Press is an imprint of Taylor & Francis Group, LLC

© 2024 Sufyan bin Uzayr

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk

Trademark Notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe.

ISBN: 9781032415215 (hbk)
ISBN: 9781032415185 (pbk)
ISBN: 9781003358480 (ebk)

DOI: 10.1201/9781003358480

Typeset in Minion
by KnowledgeWorks Global Ltd.
📄 Page
8
Contents

About the Editor, xviii
Acknowledgments, xix
Zeba Academy – Mastering Computer Science, xx

Chapter 1 ◾ Introduction to R 1
  DATA SCIENCE 1
    Data Science Life Cycle 2
    Data Science Process 2
  INTRODUCTION TO R 3
  ENVIRONMENT OF R 3
  ORIGIN OF R 4
  HISTORY 4
  R VERSUS PYTHON 5
    How Does R Work? 6
    R Is Software 7
    Usage 7
    Use of the R Programming Language in Various Areas 7
    R in Research and Academics 8
    R Use Cases in Research and Academia 8
    R in IT Sectors 8
    R in Finance 9
    R for Social Media 10
    R in Banking 10
    R in Healthcare 10
📄 Page
9
    R in Manufacturing 11
    R in the Governmental Department 11
  SOME OTHER APPLICATIONS OF R 11
  THE MOST POPULAR R PACKAGES 13
  R PACKAGE 13
  COMPREHENSIVE R ARCHIVE NETWORK 14
  IDE AND EDITORS 15
    RStudio 16
    R Tools for Visual Studio 16
    Rattle 16
    Installing Rattle 17
    Installing on Ubuntu 22.04 18
    Installing on Macintosh OS X (Leopard and Lion) 18
  DESIGN OF THE R SYSTEM 19
  INTRODUCTION TO RSTUDIO 20
    How to Install RStudio 20
    Installing R 20
    Installing RStudio 20
  DOWNLOAD AND INSTALL R 20
  INSTALLING RSTUDIO ON WINDOWS 21
  RSTUDIO 27
  RSTUDIO INSTALLATION 27
    RStudio Desktop 28
    RStudio Server 29
  RSTUDIO USER INTERFACE 34
    StatET for R 34
    Installation of StatET 35
    For Windows 35
    For Mac 36
    RKWard 37
    Tinn-R 38
    R AnalyticFlow 38
    Why AnalyticFlow? 39
📄 Page
10
  ADVANTAGES OF R PROGRAMMING 39
  DISADVANTAGES OF R PROGRAMMING 42
  CHAPTER SUMMARY 43
  NOTES 43

Chapter 2 ◾ Handling Data with R 45
  GETTING STARTED WITH THE RSTUDIO APPLICATION 45
  THE CONSOLE 47
    The Text Editor 49
    File Browser Tab 49
    “Environment” Tab 50
    Code Completion 51
    Retrieving Previous Commands 52
    Console Title Bar 52
    Keyboard Shortcuts 52
    Basic Syntax in R Programming 52
    R Syntax 53
    Variables in R 53
    Creating Variables in R 53
    Print/Output Variables 54
    Comments 54
    Types of Comments 54
    Comments in R 55
    Single-Line Comments in R 55
    Keywords in R 56
    Quick Navigation between Windows 69
    Console Keyboard Shortcuts 69
    Completions (Console and Source) 70
    Help 70
    Debug 70
    Plots 70
    Git/SVN 70
    Session 71
📄 Page
11
    Terminal 71
    Accessibility 71
    Main Menu (Server) 71
  DATA TYPES IN R 72
    Logical Data Type 74
    Numeric Data Type 75
    Integer Data Type 75
    Complex Data Type 76
    Character Data Type 77
    Raw Data Type 77
    Find Data Type of an Object 78
    Type Verification 79
    Convert an Object’s Data Type to Another 80
  NOMENCLATURE OF R VARIABLES 80
  IMPORTANT METHODS FOR VARIABLES 81
    class() Function 81
    ls() Function 81
    rm() Function 82
    Scope of Variable in R 83
  VARIABLE NAMING CONVENTION 84
    The Scope of a Variable 84
    How to Access Local Variable Globally 86
    How Does the Super Assignment Operator Work? 86
    R Programming Environment 87
  DATA STRUCTURE IN R 88
  GETTING DATA IN R 89
  CHAPTER SUMMARY 89
  NOTES 89

Chapter 3 ◾ Variables and Operators 91
  INTRODUCTION 91
    Rules for Writing Identifiers in R 91
    Constants in R 92
📄 Page
12
    Valid Identifiers in R 92
    Invalid Identifiers in R 93
    Numerical Constants 93
    Character Constants 94
    Constants: Built-in Constants 94
  BASIC OPERATIONS IN R 95
    <- 95
    R Is Case-Sensitive 95
    Function 96
    Defined in Base R 96
  R OPERATORS 96
    Types of Operators 96
  R ARITHMETIC OPERATORS 97
  MATRIX OPERATIONS IN R 98
    Addition and Subtraction 98
  TRANSPOSE A MATRIX IN R 101
    Matrix Multiplication in R 101
    Multiplication by a Scalar 101
    Element-Wise Multiplication 103
    Matrix Multiplication in R 103
    Matrix Multiplication Using Three Matrix 104
  MATRIX CROSS-PRODUCT 105
    Exterior Product 107
    Kronecker Product 109
  POWER OF A MATRIX IN R 110
    Determinant of a Matrix in R 115
    Inverse of a Matrix in R 116
    Rank of a Matrix in R 117
  R RELATIONAL OPERATORS 118
    Less Than (<) and Greater Than (>) 118
    Greater Than (>) 119
    Less Than Equal (<=) 120
    Greater Than Equal (>=) 120
📄 Page
13
    Not Equal to (!=) 121
    Assignment Operators 122
  R LOGICAL OPERATORS 124
    AND Operator “&” 124
    R Miscellaneous Operators 125
  CHAPTER SUMMARY 126
  NOTES 126

Chapter 4 ◾ Loops and Decision-Making in R 127
  CONTROL STRUCTURES IN R PROGRAMMING 128
    If Statement in R 129
  NESTED IF STATEMENTS 131
    If–Else Statement 132
    Nested If–Else Statements 133
    R If–Else–If Statement 133
    R Switch Statement 134
    R Repeat Statement 136
    Next Statement 137
    Return Statement 137
    Break Statement 138
  LOOPING IN R 139
    R ifelse() Function 140
  CHAPTER SUMMARY 140
  NOTES 140

Chapter 5 ◾ Functions and Strings 141
  INTRODUCTION 141
    Defining a Function 142
    Functional Components 142
    Arguments 143
    Number of Arguments 143
    Default Parameter Value 144
📄 Page
14
  FUNCTION TYPES 144
    Primitive Function 145
    Infix Functions 154
  INFIX OPERATORS IN R 154
  USER-DEFINED INFIX OPERATOR 155
  THE gR GLOBAL VARIABLES 155
  THE GLOBAL ASSIGNMENT OPERATOR 155
  TYPES OF FUNCTIONS 156
    Function Call 156
  BUILT-IN FUNCTIONS 157
    Numeric Functions 157
    Character Functions 158
    Statistical Probability Functions 160
    Other Statistical Functions 160
  R STRINGS 161
    String Basics 162
    c() Function in R 163
  BUILT-IN STRING FUNCTION IN R 164
    String Manipulation 169
  CHAPTER SUMMARY 171
  NOTES 171

Chapter 6 ◾ Lists and Arrays 173
  INTRODUCTION 173
  LIST OPERATIONS 173
    Creating a List 174
    Naming List Elements in R 176
    Access to R List Elements 176
  INDEXING LISTS 178
  SLICING THE LIST 179
    Letters 180
    LETTERS 180
📄 Page
15
    month.abb 181
    month.name 181
  ADDING, DELETING, AND UPDATING ELEMENTS OF A LIST 181
  CONVERTING A LIST TO VECTOR 182
  DOUBLE VERSUS SINGLE PARENTHESES 183
  LIST LENGTH 183
    Check If Item Exists 183
  ADD LIST IN ITEMS 184
  REMOVE LIST ITEMS 184
  RANGE OF INDEXES IN LIST 185
  LOOP THROUGH A LIST 185
    Join Two Lists 186
  CHAPTER SUMMARY 187
  NOTES 187

Chapter 7 ◾ Data Structures 189
  INTRODUCTION 189
  VECTOR 190
    Single Element Vector 191
    How to Create Vector in R? 191
    Atomic Vectors 191
    Types of Vectors 192
    Numeric Vectors 192
    Character Vectors 193
    Logical Vectors 193
  CREATION OF VECTOR IN R 193
    Create Vector Using c() Function 193
    Create Vector Using seq() Function 194
    Create Vector Using : Operator 194
  R MATRIX 194
    Creating a Matrix in R 194
    Access Elements of a Matrix 195
📄 Page
16
  DATA FRAME 196
  CHAPTER SUMMARY 196
  NOTES 196

Chapter 8 ◾ Error Handling and File Handling 197
  INTRODUCTION 197
    Warning 198
    Message 198
  HANDLING ERROR FUNCTIONS 199
  MANIPULATION OF CONDITIONS IN R 204
    withCallingHandlers() in R 208
  PROCESSING CONDITIONS IN R PROGRAMMING 209
  CHANGING OF CONDITIONS PROGRAMMATICALLY 209
    Custom Signal Classes 212
  DEBUGGING IN R PROGRAMMING 213
    Editor Breakpoints 213
    The traceback() 214
    browser() Function 214
    recover() Function 215
  DEBUGGING TOOLS IN R 216
    Using debug() 216
    Using recover() 217
  FILE HANDLING IN R 217
    Creation of a File 217
    Writing to Files 218
    Parameters 218
    Reading Data from a File 219
    Parameters 219
    Check an Existing File 219
  CHAPTER SUMMARY 220
  NOTES 220
📄 Page
17
Chapter 9 ◾ Graphics in R 221
  INTRODUCTION GRAPHIC 221
    Types of R Charts 222
  DATA SET 222
    Information-Related Data Set 223
    Get Information 224
    Print Variable Values 225
    Sort Variable Values 226
    Analyzing the Data 226
    Max Min 226
    Mean, Median, and Mode 227
    Mean 227
    Median 227
    Mode 227
    Visualize the mtcars Data Set 228
    To Create a Scatter Plot 228
    Loading the mtcars Data Set in R Code 228
  BAR CHART 229
    Syntax 229
  R PLOTTING 236
  R PLOT TYPE 237
  R PLOT PCH 238
    Plot Title in R 240
  SUBTITLE IN R PLOT 241
  AXIS IN R 242
  R LINE CHART AND GRAPH 243
  ADDING TITLE, COLOR, AND LABELS TO LINE CHARTS IN R 244
  MULTIPLE LINES IN A LINE CHART 245
    Color 246
    Width 247
    Style 248
📄 Page
18
  MULTIPLE LINES 249
  R SCATTER PLOT 249
    Creating the Scatterplot 251
    Scatter Matrices 251
  PIE CHARTS 252
    Pie Chart Title and Colors 253
    3D Pie Chart 254
    Pie Chart Height in R 255
    Pie Chart Angle in R 255
  BOXPLOT 256
    Histograms and Density Plots in R 256
    Parameters 257
    Density Plot 258
  HEATMAP 259
    Using the Heatmap() 259
  PAIRS PLOT 259
  VENN DIAGRAM 260
  CHAPTER SUMMARY 261
  NOTES 261

APPRAISAL, 263
INDEX, 267
📄 Page
19
About the Editor

Sufyan bin Uzayr is a writer, coder, and entrepreneur with over a decade of experience in the industry. He has authored several books on a diverse range of topics, from History to Computers/IT.

Sufyan is the Director of Parakozm, a multinational IT company specializing in EdTech solutions. He also runs Zeba Academy, an online learning and teaching vertical with a focus on STEM fields.

Sufyan specializes in a wide variety of technologies, such as JavaScript, Dart, WordPress, Drupal, Linux, and Python. He holds multiple degrees, including ones in management, IT, literature, and political science.

Sufyan is a digital nomad, dividing his time between four countries. He has lived and taught in universities and educational institutions around the globe. Sufyan takes a keen interest in technology, politics, literature, history, and sports, and in his spare time, he enjoys teaching coding and English to young students.

Learn more at sufyanism.com
📄 Page
20
Acknowledgments

There are many people who deserve to be on this page, for this book would not have come into existence without their support. That said, some names deserve a special mention, and I am genuinely grateful to:

• My parents, for everything they have done for me.
• The Parakozm team, especially Divya Sachdeva, Jaskiran Kaur, and Simran Rao, for offering great amounts of help and assistance during the book-writing process.
• The CRC team, especially Sean Connelly and Danielle Zarfati, for ensuring that the book’s content, layout, formatting, and everything else remain perfect throughout.
• Reviewers of this book, for going through the manuscript and providing their insight and feedback.
• Typesetters, cover designers, printers, and everyone else, for their part in the development of this book.
• All the folks associated with Zeba Academy, either directly or indirectly, for their help and support.
• The programming community in general, and the web development community in particular, for all their hard work and efforts.

Sufyan bin Uzayr