📄 Page
1
Michael Hausenblas
Learning Modern Linux
A Handbook for the Cloud Native Practitioner
📄 Page
3
Michael Hausenblas
Learning Modern Linux
A Handbook for the Cloud Native Practitioner
Boston Farnham Sebastopol Tokyo Beijing
📄 Page
4
978-1-098-10894-6
[LSI]

Learning Modern Linux
by Michael Hausenblas

Copyright © 2022 Michael Hausenblas. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: John Devins
Development Editor: Jeff Bleiel
Production Editor: Gregory Hyman
Copyeditor: Piper Editorial Consulting, LLC
Proofreader: Amnet Systems, LLC
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

May 2022: First Edition

Revision History for the First Edition
2022-04-15: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098108946 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Learning Modern Linux, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the author and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
📄 Page
5
Table of Contents

Preface ix

1. Introduction to Linux 1
    What Are Modern Environments? 1
    The Linux Story (So Far) 3
    Why an Operating System at All? 3
    Linux Distributions 5
    Resource Visibility 5
    A Ten-Thousand-Foot View of Linux 8
    Conclusion 9

2. The Linux Kernel 11
    Linux Architecture 12
    CPU Architectures 14
    x86 Architecture 15
    ARM Architecture 15
    RISC-V Architecture 16
    Kernel Components 16
    Process Management 17
    Memory Management 19
    Networking 20
    Filesystems 21
    Device Drivers 21
    syscalls 22
    Kernel Extensions 26
    Modules 26
    A Modern Way to Extend the Kernel: eBPF 27
    Conclusion 29
📄 Page
6
3. Shells and Scripting 31
    Basics 32
    Terminals 33
    Shells 33
    Modern Commands 41
    Common Tasks 45
    Human-Friendly Shells 48
    Fish Shell 49
    Z-shell 53
    Other Modern Shells 54
    Which Shell Should I Use? 55
    Terminal Multiplexer 55
    screen 56
    tmux 56
    Other Multiplexers 60
    Which Multiplexer Should I Use? 61
    Scripting 62
    Scripting Basics 62
    Writing Portable bash Scripts 64
    Linting and Testing Scripts 67
    End-to-End Example: GitHub User Info Script 68
    Conclusion 70

4. Access Control 73
    Basics 74
    Resources and Ownership 74
    Sandboxing 75
    Types of Access Control 75
    Users 76
    Managing Users Locally 77
    Centralized User Management 80
    Permissions 80
    File Permissions 81
    Process Permissions 85
    Advanced Permission Management 87
    Capabilities 87
    seccomp Profiles 89
    Access Control Lists 89
    Good Practices 89
    Conclusion 90
📄 Page
7
5. Filesystems 93
    Basics 94
    The Virtual File System 97
    Logical Volume Manager 99
    Filesystem Operations 101
    Common Filesystem Layouts 103
    Pseudo Filesystems 104
    procfs 104
    sysfs 106
    devfs 107
    Regular Files 108
    Common Filesystems 109
    In-Memory Filesystems 110
    Copy-on-Write Filesystems 111
    Conclusion 112

6. Applications, Package Management, and Containers 115
    Basics 116
    The Linux Startup Process 117
    systemd 119
    Units 120
    Management with systemctl 121
    Monitoring with journalctl 122
    Example: scheduling greeter 122
    Linux Application Supply Chains 124
    Packages and Package Managers 126
    RPM Package Manager 126
    Debian deb 129
    Language-Specific Package Managers 131
    Containers 131
    Linux Namespaces 133
    Linux cgroups 135
    Copy-on-Write Filesystems 138
    Docker 138
    Other Container Tooling 142
    Modern Package Managers 143
    Conclusion 143

7. Networking 145
    Basics 146
    The TCP/IP Stack 147
    The Link Layer 149
📄 Page
8
    The Internet Layer 152
    The Transport Layer 160
    Sockets 164
    DNS 165
    DNS Records 168
    DNS Lookups 170
    Application Layer Networking 173
    The Web 173
    Secure Shell 177
    File Transfer 178
    Network File System 181
    Sharing with Windows 181
    Advanced Network Topics 181
    whois 181
    Dynamic Host Configuration Protocol 182
    Network Time Protocol 183
    Wireshark and tshark 183
    Other Advanced Tooling 184
    Conclusion 185

8. Observability 187
    Basics 188
    Observability Strategy 188
    Terminology 189
    Signal Types 190
    Logging 191
    Syslog 194
    journalctl 196
    Monitoring 197
    Device I/O and Network Interfaces 199
    Integrated Performance Monitors 201
    Instrumentation 204
    Advanced Observability 205
    Tracing and Profiling 205
    Prometheus and Grafana 207
    Conclusion 210

9. Advanced Topics 213
    Interprocess Communication 214
    Signals 214
    Named Pipes 216
    UNIX Domain Sockets 217
📄 Page
9
    Virtual Machines 217
    Kernel-Based Virtual Machine 218
    Firecracker 219
    Modern Linux Distros 220
    Red Hat Enterprise Linux CoreOS 221
    Flatcar Container Linux 221
    Bottlerocket 221
    RancherOS 222
    Selected Security Topics 222
    Kerberos 222
    Pluggable Authentication Modules 223
    Other Modern and Future Offerings 223
    NixOS 224
    Linux on the Desktop 224
    Linux on Embedded Systems 225
    Linux in Cloud IDE 225
    Conclusion 225

A. Helpful Recipes 229

B. Modern Linux Tools 235

Index 237
📄 Page
11
Preface

A warm welcome to Learning Modern Linux! I’m glad that we will walk this journey together for a bit. This book is for you if you’ve already been using Linux and are looking for a structured, hands-on approach to dive in deeper, or if you already have experience and want to get some tips and tricks to improve your flow when working with Linux, for example in a professional setup such as development or operations.

We’ll focus on using Linux for your everyday needs, from development to office-related tasks, rather than on the system administration side of things. Also, we’ll focus on the command line, not visual UIs. So, while 2022 might be the year of Linux on the desktop after all, we’ll use the terminal as the main way to interact with Linux. This has the additional advantage that you can equally apply your knowledge in many different setups, from a Raspberry Pi to the virtual machine of your cloud provider of choice.

Before we start, I’d like to provide some context by sharing my own journey: my first hands-on experience with an operating system was not with Linux. The first operating system I used was AmigaOS (in the late 80s), and after that, in technical high school, I mainly used Microsoft DOS and the then-new Microsoft Windows, specifically around the event system and user interface–related development. Then, in the mid- to late 1990s, during my studies at university, I mainly used Unix-based Solaris and Silicon Graphics machines in the university labs. I really only got into Linux in the mid-2000s in the context of big data and then when I started working with containers, first in 2015 in the context of Apache Mesos (working at Mesosphere), and then with Kubernetes (initially at Red Hat on the OpenShift team and then at AWS on the container service team). That’s where I realized that one needs to master Linux to be effective in this space.

Linux is different. Its background, worldwide community of users, and versatility and flexibility make it unique.

Linux is an interesting, ever-growing ecosystem of open source individuals and vendors. It runs on pretty much anything under the sun, from the $50 Raspberry Pi to the virtual machines of your favorite cloud provider to a Mars vehicle. After 30 years
📄 Page
12
in the making, Linux will likely stick around for some time, so now is a good time to get into Linux a bit deeper.

Let’s first set some ground rules and expectations. In the preface, I’ll share how you can get the most out of this book as well as some administrative things, like where and how you can try out the topics we’ll work through together.

About You

This book is for those who want or need to use Linux in a professional setup, such as software developers, software architects, QA testing engineers, DevOps and SRE roles, and similar roles. If you’re a hobbyist encountering Linux when pursuing an activity such as 3D printing or home improvement, I’ll assume you have very little to no knowledge about operating systems in general or Linux/UNIX in particular.

You will get the most out of the book if you work through it from beginning to end, as the chapters tend to build on one another; however, you can also use it as a reference if you’re already familiar with Linux.

How to Use the Book

The focus of this book is enabling you to use Linux, not administer it. There are plenty of great books about Linux administration out there.

By the end of this book, you will understand what Linux is (Chapter 1) and what its critical components are (Chapters 2 and 3). You’ll be able to enumerate and use essential access control mechanisms (Chapter 4). You’ll also understand the role of filesystems as a fundamental building block in Linux (Chapter 5) and know what apps are (Chapter 6).

Then, you’ll get some hands-on experience with the Linux networking stack and tooling (Chapter 7). Further, you’ll learn about modern operating system observability (Chapter 8) and how to apply it to manage your workloads.

You’ll understand how to run Linux applications in modern ways by using containers as well as immutable distros such as Bottlerocket and also how to securely communicate (download files, etc.) and share data using Secure Shell (SSH) and advanced tooling like peer-to-peer and cloud sync mechanisms (Chapter 9).

Following are suggestions for ways you can try things out and follow along (and I strongly recommend you do; learning Linux is like learning a language: you want to practice a lot):
📄 Page
13
• Get a Linux desktop or laptop. For example, I have a very nice machine called StarBook from Star Labs. Alternatively, you could use a desktop or laptop that no longer runs a recent Windows version and install Linux on it.
• If you want to experiment on a different (host) operating system, say your MacBook or iMac, you could use a virtual machine (VM). For example, on macOS you could use the excellent Linux-on-Mac.
• Use your cloud provider of choice to spin up a Linux-based VM.
• If you’re into tinkering and want to try out a non-Intel processor architecture such as ARM, you could buy a single-board computer such as the wonderful Raspberry Pi.

In any case, you should have an environment at hand and practice a lot. Don’t just read: try out commands and experiment. Try to “break” things, for example, by providing nonsensical or deliberately strange inputs. Before you execute the command, form a hypothesis about the outcome.

Another tip: always ask why. When you see a command or a certain output, try to figure out where it came from and what the underlying component responsible for it is.

Conventions

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.
📄 Page
14
This element signifies a tip or suggestion.

This element signifies a general note.

This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://oreil.ly/learning-modern-linux-code.

If you have a technical question or a problem using the code examples, please send an email to bookquestions@oreilly.com.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Learning Modern Linux by Michael Hausenblas (O’Reilly). Copyright 2022 Michael Hausenblas, 978-1-098-10894-6.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.
📄 Page
15
O’Reilly Online Learning

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

    O’Reilly Media, Inc.
    1005 Gravenstein Highway North
    Sebastopol, CA 95472
    800-998-9938 (in the United States or Canada)
    707-829-0515 (international or local)
    707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/learning-modern-linux.

Email bookquestions@oreilly.com to comment or ask technical questions about this book.

For news and information about our books and courses, visit http://oreilly.com.

Find us on LinkedIn: https://linkedin.com/company/oreilly-media
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://youtube.com/oreillymedia
📄 Page
16
Acknowledgments

First off, I’d like to thank the fabulous reviewers of the book: Chris Negus, John Bonesio, and Pawel Krupa. Without their feedback, this book wouldn’t be half as good or useful.

I want to thank my parents, who enabled my education and laid the foundations for who I am and what I do today. Big kudos to my big sister, Monika, who was my inspiration to get into tech in the first place.

I would like to express my deepest gratitude to my very awesome and supportive family: my kids, Saphira, Ranya, and Iannis; my wicked smart and fun wife, Anneliese; our bestest of all dogs, Snoopy; and our newest family member, Charlie the tomcat.

In the context of my Unix and Linux journey, there are way too many people who influenced my thinking and from whom I learned a lot. I had the pleasure and privilege of working with or interacting with many of them, including but not limited to Jérôme Petazzoni, Jessie Frazelle, Brendan Gregg, Justin Garrison, Michael Kerrisk, and Douglas McIlroy.

Last, but most certainly not least, I’d like to thank the O’Reilly team, especially my development editor, Jeff Bleiel, for shepherding me through the process of writing this book.
📄 Page
17
CHAPTER 1
Introduction to Linux

Linux is the most widely used operating system, used in everything from mobile devices to the cloud.

You might not be familiar with the concept of an operating system. Or you might be using an operating system such as Microsoft Windows without giving it too much thought. Or maybe you are new to Linux. To set the scene and get you in the right mindset, we’ll take a bird’s-eye view of operating systems and Linux in this chapter.

We’ll first discuss what modern means in the context of the book. Then we’ll review a high-level Linux backstory, looking at important events and phases over the past 30 years. Further, in this chapter you’ll learn what the role of an operating system is in general and how Linux fills this role. We also take a quick look at what Linux distributions are and what resource visibility means.

If you’re new to operating systems and Linux, you’ll want to read the entire chapter. If you’re already experienced with Linux, you might want to jump to “A Ten-Thousand-Foot View of Linux” on page 8, which provides a visual overview as well as mapping to the book’s chapters.

But before we get into the technicalities, let’s first step back a bit and focus on what we mean when we say “modern Linux.” This is, surprisingly, a nontrivial matter.

What Are Modern Environments?

The book title specifies modern, but what does that really mean? Well, in the context of this book, it can mean anything from cloud computing to a Raspberry Pi. In addition, the recent rise of Docker and related innovations in infrastructure has dramatically changed the landscape for developers and infrastructure operators alike.
📄 Page
18
Let’s take a closer look at some of these modern environments and the prominent role Linux plays in them:

Mobile devices
    When I say “mobile phone” to our kids, they say, “In contrast to what?” In all fairness and seriousness, these days many phones (depending on who you ask, up to 80% or more) as well as tablets run Android, which is a Linux variant. These environments have aggressive requirements around power consumption and robustness, as we depend on them on a daily basis. If you’re interested in developing Android apps, consider visiting the Android developer site for more information.

Cloud computing
    With the cloud, we see at scale a similar pattern as in the mobile and micro space. There are new, powerful, secure, and energy-saving CPU architectures such as the successful ARM-based AWS Graviton offerings, as well as the established heavy-lifting outsourcing to cloud providers, especially in the context of open source software.

Internet of (Smart) Things
    I’m sure you’ve seen a lot of Internet of Things (IoT)-related projects and products, from sensors to drones. Many of us have already been exposed to smart appliances and smart cars. These environments have even more challenging requirements around power consumption than mobile devices. In addition, they might not even be running all the time but, for example, only wake up once a day to transmit some data. Another important aspect of these environments is real-time capabilities. If you’re interested in getting started with Linux in the IoT context, consider the AWS IoT EduKit.

Diversity of processor architectures
    For the past 30 years or so, Intel has been the leading CPU manufacturer, dominating the microcomputer and personal computer space. Intel’s x86 architecture was considered the gold standard. The open approach that IBM took (publishing the specifications and enabling others to offer compatible devices) was promising, resulting in x86 clones that also used Intel chips, at least initially.

    While Intel is still widely used in desktop and laptop systems, with the rise of mobile devices we’ve seen the increasing uptake of the ARM architecture and recently RISC-V. At the same time, multi-arch programming languages and tooling, such as Go or Rust, are becoming more and more widespread, creating a perfect storm.

All of these environments are examples of what I consider modern environments. And most, if not all of them, run on or use Linux in one form or another.
📄 Page
19
Now that we know about the modern (hardware) systems, you might wonder how we got here and how Linux came into being.

The Linux Story (So Far)

Linux celebrated its 30th birthday in 2021. With billions of users and thousands of developers, the Linux project is, without doubt, a worldwide (open source) success story. But how did this all start, and how did we get here?

1990s
    We can consider Linus Torvalds’s email on August 25, 1991, to the comp.os.minix newsgroup as the birth of the Linux project, at least in terms of the public record. This hobby project soon took off, both in terms of lines of code (LOC) and in terms of adoption. For example, after less than three years, Linux 1.0.0 was released with over 176,000 LOCs. By that time, the original goal of being able to run most Unix/GNU software was already well reached. Also, the first commercial offering appeared in the 1990s: Red Hat Linux.

2000 to 2010
    As a “teenager,” Linux was not only maturing in terms of features and supported hardware but was also growing beyond what UNIX could do. In this time period, we also witnessed a huge and ever-increasing buy-in of Linux by the big players, that is, adoption by Google, Amazon, IBM, and so on. It was also the peak of the distro wars, resulting in businesses changing their directions.

2010s to now
    Linux established itself as the workhorse in data centers and the cloud, as well as for all types of IoT devices and phones. In a sense, one can consider the distro wars as being over (nowadays, most commercial systems are either Red Hat or Debian based), and in a sense, the rise of containers (from 2014/15 on) is responsible for this development.

With this super-quick historic review, necessary to set the context and understand the motivation for the scope of this book, we move on to a seemingly innocent question: Why does anyone need Linux, or an operating system at all?

Why an Operating System at All?

Let’s say you do not have an operating system (OS) available or cannot use one for whatever reason. You would then end up doing pretty much everything yourself: memory management, interrupt handling, talking with I/O devices, managing files, configuring and managing the network stack; the list goes on.
📄 Page
20
Technically speaking, an OS is not strictly needed. There are systems out there that do not have an OS. These are usually embedded systems with a tiny footprint: think of an IoT beacon. They simply do not have the resources available to keep anything else around other than one application. For example, with Rust you can use its Core and Standard Library to run any app on bare metal.

An operating system takes on all this undifferentiated heavy lifting, abstracting away the different hardware components and providing you with a (usually) clean and nicely designed Application Programming Interface (API), such as is the case with the Linux kernel that we will have a closer look at in Chapter 2. We usually call these APIs that an OS exposes system calls, or syscalls for short. Higher-level programming languages such as Go, Rust, Python, or Java build on top of those syscalls, potentially wrapping them in libraries.

All of this allows you to focus on the business logic rather than having to manage the resources yourself, and also takes care of the different hardware you want to run your app on.

Let’s have a look at a concrete example of a syscall. Let’s say we want to identify (and print) the ID of the current user.

First, we look at the Linux syscall getuid(2):

    ...
    getuid() returns the real user ID of the calling process.
    ...

OK, so this getuid syscall is what we could use programmatically, from a library. We will discuss Linux syscalls in greater detail in “syscalls” on page 22.

You might be wondering what the (2) means in getuid(2). It’s a terminology that the man utility (think built-in help pages) uses to indicate the section of the command assigned in man, akin to a postal or country code. This is one example where the Unix legacy is apparent; you can find its origin in the Unix Programmer’s Manual, seventh edition, volume 1 from 1979.
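To make the step from a syscall to a higher-level language more tangible, here is a minimal sketch in Python, chosen purely for illustration. It assumes a Linux (or other Unix-like) system and uses os.getuid() from the Python standard library, which returns the real user ID of the current process; the helper name current_user_id is hypothetical and not part of any library.

```python
import os


def current_user_id() -> int:
    """Return the real user ID of the calling process.

    current_user_id is a hypothetical helper for this sketch. It simply
    delegates to os.getuid(), which is only available on Unix-like systems.
    """
    return os.getuid()


if __name__ == "__main__":
    # Prints the numeric user ID; the value depends on the user running it.
    print(current_user_id())
```

The sketch deliberately stays thin: the program asks the operating system for the user ID through a library call instead of tracking that information itself, which is the division of labor described above.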
On the command line (shell), we would be using the equivalent id command that in turn uses the getuid syscall:

    $ id --user
    638114

Now that you have a basic idea of why using an operating system, in most cases, makes sense, let’s move on to the topic of Linux distributions.