Previous Next

Concurrency in Go Tools and Techniques for Developers (Katherine Cox-Buday) (z-library.sk, 1lib.sk, z-lib.sk)

Author: Katherine Cox-Buday

GO

Concurrency can be notoriously difficult to get right, but fortunately, the Go open source programming language makes working with concurrency tractable and even easy. If you’re a developer familiar with Go, this practical book demonstrates best practices and patterns to help you incorporate concurrency into your systems. Author Katherine Cox-Buday takes you step-by-step through the process. You’ll understand how Go chooses to model concurrency, what issues arise from this model, and how you can compose primitives within this model to solve problems. Learn the skills and tooling you need to confidently write and implement concurrent systems of any size. Understand how Go addresses fundamental problems that make concurrency difficult to do correctly Learn the key differences between concurrency and parallelism Dig into the syntax of Go’s memory synchronization primitives Form patterns with these primitives to write maintainable concurrent code Compose patterns into a series of practices that enable you to write large, distributed systems that scale Learn the sophistication behind goroutines and how Go’s runtime stitches everything together

📄 File Format: PDF
💾 File Size: 1.7 MB
12
Views
0
Downloads
0.00
Total Donations

📄 Text Preview (First 20 pages)

ℹ️

Registered users can read the full content for free

Register as a Gaohf Library member to read the complete e-book online for free and enjoy a better reading experience.

📄 Page 1
Katherine Cox-Buday Concurrency in Go Tools and Techniques for Developers Boston Farnham Sebastopol TokyoBeijing
📄 Page 2
978-1-491-94119-5 [LSI] Concurrency in Go August 2017: First Edition Revision History for the First Edition 2017-07-18: First Release by Katherine Cox-Buday Copyright © 2017 Katherine Cox-Buday Printed in the United States of America http://oreilly.com/catalog/errata.csp?isbn=9781491941195
📄 Page 3
Preface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii 1. An Introduction to Concurrency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Moore’s Law, Web Scale, and the Mess We’re In 2 Why Is Concurrency Hard? 4 Race Conditions 4 Atomicity 6 Memory Access Synchronization 8 Deadlocks, Livelocks, and Starvation 10 Determining Concurrency Safety 18 Simplicity in the Face of Complexity 20 2. Modeling Your Code: Communicating Sequential Processes. . . . . . . . . . . . . . . . . . . . . . . 23 The Difference Between Concurrency and Parallelism 23 What Is CSP? 26 How This Helps You 29 Go’s Philosophy on Concurrency 31 3. Go’s Concurrency Building Blocks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Goroutines 37 The sync Package 47 WaitGroup 47 Mutex and RWMutex 49 Cond 52 Once 57 Pool 59 Channels 64 The select Statement 78 Contents
📄 Page 4
The GOMAXPROCS Lever 83 Conclusion 83 4. Concurrency Patterns in Go. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 Confinement 85 The for-select Loop 89 Preventing Goroutine Leaks 90 The or-channel 94 Error Handling 97 Pipelines 100 Best Practices for Constructing Pipelines 104 Some Handy Generators 109 Fan-Out, Fan-In 114 The or-done-channel 119 The tee-channel 120 The bridge-channel 122 Queuing 124 The context Package 131 Summary 145 5. Concurrency at Scale. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 Error Propagation 147 Timeouts and Cancellation 155 Heartbeats 161 Replicated Requests 172 Rate Limiting 174 Healing Unhealthy Goroutines 188 Summary 194 6. Goroutines and the Go Runtime. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 Work Stealing 197 Stealing Tasks or Continuations? 204 Presenting All of This to the Developer 212 Conclusion 212 A. Appendix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
📄 Page 5
Preface Hey, welcome to Concurrency in Go! I’m delighted that you’ve picked up this book and excited to join you in exploring the topic of concurrency in Go over the next six chapters! Go is a wonderful language. When it was first announced and birthed into the world, I remember exploring it with great interest: it was terse, compiled incredibly fast, per‐ formed well, supported duck typing, and—to my delight—I found working with its concurrency primitives to be intuitive. The first time I used the go keyword to create a goroutine (something we’ll cover, I promise!) I got this silly grin on my face. I had worked with concurrency in several languages, but I had never worked in a language that made concurrency so easy (which is not to say they don’t exist; I just hadn’t used any). I had found my way to Go. Over the years I moved from writing personal scripts in Go, to personal projects, until I found myself working on a many-hundreds-of-thousands-of-lines project pro‐ fessionally. Along the way the community was growing with the language, and we were collectively discovering best practices for working with concurrency in Go. A few people gave talks on patterns they had discovered. But there still weren’t many comprehensive guides on how to wield concurrency in Go in the community. It was with this in mind that I set out to write this book. I wanted the community to have access to high-quality and comprehensive information about concurrency in Go: how to use it, best practices and patterns for incorporating it into your systems, and how it all works under the covers. I have done my best to strike a balance between these concerns. I hope this book proves useful!
📄 Page 6
Who Should Read This Book This book is meant for developers who have some experience with Go; I make no attempt to explain the basic syntax of the language. Knowledge of how concurrency is presented in other languages is useful, but not necessary. By the end of this book we will have discussed the entire stack of Go concurrency concerns: common concurrency pitfalls, motivation behind the design of Go’s con‐ currency, the basic syntax of Go’s concurrency primitives, common concurrency pat‐ terns, patterns of patterns, and various tooling that will help you along the way. Because of the breadth of topics we’ll cover, this book will be useful to various cross- sections of people. The next section will help you navigate this book depending on what needs you have. Navigating This Book When I read technical books, I usually hop around to the areas that pique my inter‐ est. Or, if I’m trying to ramp up on a new technology for work, I frantically skim for the bits that are immediately relevant to my work. Whatever your use case is, here’s a roadmap for the book with the hopes that it help guide you to where you need to be! Chapter 1, An Introduction to Concurrency This chapter will give you a broad historical perspective on why concurrency is an important concept, and also discuss some of the fundamental problems that make concurrency difficult to get correct. It also briefly touches on how Go helps ease some of this burden. If you have a working knowledge of concurrency or just want to get to the tech‐ nical aspects of how to use Go’s concurrency primitives, it’s safe to skip this chapter. Chapter 2, Modeling Your Code: Communicating Sequential Processes This chapter deals with some of the motivational factors that contributed to Go’s design. This will help give you some context for conversations with others in the Go community and help to frame your understanding of why things work the way they do in the language. Chapter 3, Go’s Concurrency Building Blocks Here we’ll start to dig into the syntax of Go’s concurrency primitives. We’ll also cover the sync package, which is responsible for handling Go’s memory access synchronization. If you haven’t used concurrency within Go before and are look‐ ing to hop right in, this is the place to start.
📄 Page 7
Interspersed with the basics of writing concurrent code in Go are comparisons of concepts to other languages and concurrency models. Strictly speaking, it’s not necessary to understand these things, but these concepts help you to achieve a complete understanding on concurrency in Go. Chapter 4, Concurrency Patterns in Go In this chapter, we begin to look at how Go’s concurrency primitives are com‐ posed together to form useful patterns. These patterns will both help us solve problems and avoid issues that can come up when combining concurrency prim‐ itives. If you’ve already been writing some concurrent code in Go, this chapter should still prove useful. Chapter 5, Concurrency at Scale In this chapter, we take the patterns we have learned and compose these into larger patterns commonly employed in larger programs, services, and distributed systems. Chapter 6, Goroutines and the Go Runtime This chapter describes how the Go runtime handles scheduling goroutines. This is for those of you who want to understand the internals of Go’s runtime. Appendix The appendix simply enumerates various tools and commands that can help make writing and debugging concurrent programs easier. Online Resources Go has a very active and passionate community! For those newer to Go, take heart, it will be easy to find friendly, helpful people to guide you along on your path to Go. Here are a few of my favorite community-oriented resources for reading, getting help, and interacting with your fellow gophers: • https://golang.org/ • https://golang.org/play • https://go.googlesource.com/go • https://groups.google.com/group/golang-nuts • https://github.com/golang/go/wiki
📄 Page 8
Conventions Used in This Book The following typographical conventions are used in this book: Italic Indicates new terms, URLs, email addresses, filenames, and file extensions. Constant width Used for program listings, as well as within paragraphs to refer to program ele‐ ments such as variable or function names, databases, data types, environment variables, statements, and keywords. Constant width bold Shows commands or other text that should be typed literally by the user. Constant width italic Shows text that should be replaced with user-supplied values or by values deter‐ mined by context. This icon signifies a tip, suggestion, or general note. This icon indicates a warning or caution. Using Code Examples All of the code contained in this book can be found on the landing page for the book, http://katherine.cox-buday.com/concurrency-in-go. It is released under the MIT license and may be used under those terms.
📄 Page 9
CHAPTER 1 An Introduction to Concurrency Concurrency is an interesting word because it means different things to different peo‐ ple in our field. In addition to “concurrency,” you may have heard the words, “asyn‐ chronous,” “parallel,” or “threaded” bandied about. Some people take these words to mean the same thing, and other people very specifically delineate between each of those words. If we’re to spend an entire book’s worth of time discussing concurrency, it would be beneficial to first spend some time discussing what we mean when we say “concurrency.” We’ll spend some time on the philosophy of concurrency in Chapter 2, but for now let’s adopt a practical definition that will serve as the foundation of our understand‐ ing. When most people use the word “concurrent,” they’re usually referring to a process that occurs simultaneously with one or more processes. It is also usually implied that all of these processes are making progress at about the same time. Under this defini‐ tion, an easy way to think about this are people. You are currently reading this sen‐ tence while others in the world are simultaneously living their lives. They are existing concurrently to you. Concurrency is a broad topic in computer science, and from this definition spring all kinds of topics: theory, approaches to modeling concurrency, correctness of logic, practical issues—even theoretical physics! We’ll touch on some of the ancillary topics throughout the book, but we’ll mostly stick to the practical issues that involve under‐ standing concurrency within the context of Go, specifically: how Go chooses to model concurrency, what issues arise from this model, and how we can compose primitives within this model to solve problems. In this chapter, we’ll take a broad look at some of the reasons concurrency became such an important topic in computer science, why concurrency is difficult and war‐ 1
📄 Page 10
rants careful study, and—most importantly—the idea that despite these challenges, Go can make programs clearer and faster by using its concurrency primitives. As with most paths toward understanding, we’ll begin with a bit of history. Let’s first take a look at how concurrency became such an important topic. Moore’s Law, Web Scale, and the Mess We’re In In 1965, Gordon Moore wrote a three-page paper that described both the consolida‐ tion of the electronics market toward integrated circuits, and the doubling of the number of components in an integrated circuit every year for at least a decade. In 1975, he revised this prediction to state that the number of components on an inte‐ grated circuit would double every two years. This prediction more or less held true until just recently—around 2012. Several companies foresaw this slowdown in the rate Moore’s law predicted and began to investigate alternative ways to increase computing power. As the saying goes, necessity is the mother of innovation, and so it was in this way that multicore processors were born. This looked like a clever way to solve the bounding problems of Moore’s law, but computer scientists soon found themselves facing down the limits of another law: Amdahl’s law, named after computer architect Gene Amdahl. Amdahl’s law describes a way in which to model the potential performance gains from implementing the solution to a problem in a parallel manner. Simply put, it states that the gains are bounded by how much of the program must be written in a sequential manner. For example, imagine you were writing a program that was largely GUI based: a user is presented with an interface, clicks on some buttons, and stuff happens. This type of program is bounded by one very large sequential portion of the pipeline: human interaction. No matter how many cores you make available to this program, it will always be bounded by how quickly the user can interact with the interface. Now consider a different example, calculating digits of pi. Thanks to a class of algo‐ rithms called spigot algorithms, this problem is called embarrassingly parallel, which —despite sounding made up—is a technical term which means that it can easily be divided into parallel tasks. In this case, significant gains can be made by making more cores available to your program, and your new problem becomes how to combine and store the results. Amdahl’s law helps us understand the difference between these two problems, and can help us decide whether parallelization is the right way to address performance concerns in our system. 2 | Chapter 1: An Introduction to Concurrency
📄 Page 11
For problems that are embarrassingly parallel, it is recommended that you write your application so that it can scale horizontally. This means that you can take instances of your program, run it on more CPUs, or machines, and this will cause the runtime of the system to improve. Embarrassingly parallel problems fit this model so well because it’s very easy to structure your program in such a way that you can send chunks of a problem to different instances of your application. Scaling horizontally became much easier in the early 2000s when a new paradigm began to take hold: cloud computing. Although there are indications that the phrase had been used as early as the 1970s, the early 2000s are when the idea really took root in the zeitgeist. Cloud computing implied a new kind of scale and approach to appli‐ cation deployments and horizontal scaling. Instead of machines that you carefully curated, installed software on, and maintained, cloud computing implied access to vast pools of resources that were provisioned into machines for workloads on- demand. Machines became something that were almost ephemeral, and provisioned with characteristics specifically suited to the programs they would run. Usually (but not always) these resource pools were hosted in data centers owned by other compa‐ nies. This change encouraged a new kind of thinking. Suddenly, developers had relatively cheap access to vast amounts of computing power that they could use to solve large problems. Solutions could now trivially span many machines and even global regions. Cloud computing made possible a whole new set of solutions to problems that were previously only solvable by tech giants. But cloud computing also presented many new challenges. Provisioning these resour‐ ces, communicating between machine instances, and aggregating and storing the results all became problems to solve. But among the most difficult was figuring out how to model code concurrently. The fact that pieces of your solution could be run‐ ning on disparate machines exacerbated some of the issues commonly faced when modeling a problem concurrently. Successfully solving these issues soon led to a new type of brand for software, web scale. If software was web scale, among other things, you could expect that it would be embarrassingly parallel; that is, web scale software is usually expected to be able to handle hundreds of thousands (or more) of simultaneous workloads by adding more instances of the application. This enabled all kinds of properties like rolling upgrades, elastic horizontally scalable architecture, and geographic distribution. It also intro‐ duced new levels of complexity both in comprehension and fault tolerance. And so it is in this world of multiple cores, cloud computing, web scale, and problems that may or may not be parallelizable that we find the modern developer, maybe a bit overwhelmed. The proverbial buck has been passed to us, and we are expected to rise to the challenge of solving problems within the confines of the hardware we’ve been handed. In 2005, Herb Sutter authored an article for Dr. Dobb’s, titled, “The free lunch Moore’s Law, Web Scale, and the Mess We’re In | 3
📄 Page 12
is over: A fundamental turn toward concurrency in software”. The title is apt, and the article prescient. Toward the end, Sutter states, “We desperately need a higher-level programming model for concurrency than languages offer today.” To know why Sutter used such strong language, we have to look at why concurrency is so hard to get right. Why Is Concurrency Hard? Concurrent code is notoriously difficult to get right. It usually takes a few iterations to get it working as expected, and even then it’s not uncommon for bugs to exist in code for years before some change in timing (heavier disk utilization, more users logged into the system, etc.) causes a previously undiscovered bug to rear its head. Indeed, for this very book, I’ve gotten as many eyes as possbile on the code to try and mitigate this. Fortunately everyone runs into the same issues when working with concurrent code. Because of this, computer scientists have been able to label the common issues, which allows us to discuss how they arise, why, and how to solve them. So let’s get started. Following are some of the most common issues that make working with concurrent code both frustrating and interesting. Race Conditions A race condition occurs when two or more operations must execute in the correct order, but the program has not been written so that this order is guaranteed to be maintained. Most of the time, this shows up in what’s called a data race, where one concurrent operation attempts to read a variable while at some undetermined time another con‐ current operation is attempting to write to the same variable. Here’s a basic example: 1 var data int 2 go func() { 3 data++ 4 }() 5 if data == 0 { 6 fmt.Printf("the value is %v.\n", data) 7 } In Go, you can use the go keyword to run a function concurrently. Doing so cre‐ ates what’s called a goroutine. We’ll discuss this in detail in the section, “Gorou‐ tines” on page 37. 4 | Chapter 1: An Introduction to Concurrency
📄 Page 13
Here, lines 3 and 5 are both trying to access the variable data, but there is no guaran‐ tee what order this might happen in. There are three possible outcomes to running this code: • Nothing is printed. In this case, line 3 was executed before line 5. • “the value is 0” is printed. In this case, lines 5 and 6 were executed before line 3. • “the value is 1” is printed. In this case, line 5 was executed before line 3, but line 3 was executed before line 6. As you can see, just a few lines of incorrect code can introduce tremendous variability into your program. Most of the time, data races are introduced because the developers are thinking about the problem sequentially. They assume that because a line of code falls before another that it will run first. They assume the goroutine above will be scheduled and execute before the data variable is read in the if statement. When writing concurrent code, you have to meticulously iterate through the possible scenarios. Unless you’re utilizing some of the techniques we’ll cover later in the book, you have no guarantees that your code will run in the order it’s listed in the source‐ code. I sometimes find it helpful to imagine a large period of time passing between operations. Imagine an hour passes between the time when the goroutine is invoked, and when it is run. How would the rest of the program behave? What if it took an hour between the goroutine executing successfully and the program reaching the if statement? Thinking in this manner helps me because to a computer, the scale may be different, but the relative time differentials are more or less the same. Indeed, some developers fall into the trap of sprinkling sleeps throughout their code exactly because it seems to solve their concurrency problems. Let’s try that in the pre‐ ceding program: 1 var data int 2 go func() { data++ }() 3 time.Sleep(1*time.Second) // This is bad! 4 if data == 0 { 5 fmt.Printf("the value is %v.\n" data) 6 } Have we solved our data race? No. In fact, it’s still possible for all three outcomes to arise from this program, just increasingly unlikely. The longer we sleep in between invoking our goroutine and checking the value of data, the closer our program gets to achieving correctness—but this probability asymptotically approaches logical correct‐ ness; it will never be logically correct. In addition to this, we’ve now introduced an inefficiency into our algorithm. We now have to sleep for one second to make it more likely we won’t see our data race. If we Why Is Concurrency Hard? | 5
📄 Page 14
utilized the correct tools, we might not have to wait at all, or the wait could be only a microsecond. The takeaway here is that you should always target logical correctness. Introducing sleeps into your code can be a handy way to debug concurrent programs, but they are not a solution. Race conditions are one of the most insidious types of concurrency bugs because they may not show up until years after the code has been placed into production. They are usually precipitated by a change in the environment the code is executing in, or an unprecedented occurrence. In these cases, the code seems to be behaving correctly, but in reality, there’s just a very high chance that the operations will be executed in order. Sooner or later, the program will have an unintended consequence. Atomicity When something is considered atomic, or to have the property of atomicity, this means that within the context that it is operating, it is indivisible, or uninterruptible. So what does that really mean, and why is this important to know when working with concurrent code? The first thing that’s very important is the word “context.” Something may be atomic in one context, but not another. Operations that are atomic within the context of your process may not be atomic in the context of the operating system; operations that are atomic within the context of the operating system may not be atomic within the con‐ text of your machine; and operations that are atomic within the context of your machine may not be atomic within the context of your application. In other words, the atomicity of an operation can change depending on the currently defined scope. This fact can work both for and against you! When thinking about atomicity, very often the first thing you need to do is to define the context, or scope, the operation will be considered to be atomic in. Everything follows from this. Fun Fact In 2006, the gaming company Blizzard successfully sued MDY Industries for $6,000,000 USD for making a program called “Glider,” which would automatically play their game, World of Warcraft, without user intervention. These types of pro‐ grams are commonly referred to as “bots” (short for robots). At the time, World of Warcraft had an anti-cheating program called “Warden,” which would run anytime you played the game. Among other things, Warden would scan the memory of the host machine and run a heuristic to look for programs that appeared to be used for cheating. 6 | Chapter 1: An Introduction to Concurrency
📄 Page 15
Glider successfully avoided this check by taking advantage of the concept of atomic context. Warden considered scanning the memory on the machine as an atomic oper‐ ation, but Glider utilized hardware interrupts to hide itself before this scanning started! Warden’s scan of memory was atomic within the context of the process, but not within the context of the operating system. Now let’s look at the terms “indivisible” and “uninterruptible.” These terms mean that within the context you’ve defined, something that is atomic will happen in its entirety without anything happening in that context simultaneously. That’s still a mouthful, so let’s look at an example: i++ This is about as simple an example as anyone can contrive, and yet it easily demon‐ strates the concept of atomicity. It may look atomic, but a brief analysis reveals several operations: • Retrieve the value of i. • Increment the value of i. • Store the value of i. While each of these operations alone is atomic, the combination of the three may not be, depending on your context. This reveals an interesting property of atomic opera‐ tions: combining them does not necessarily produce a larger atomic operation. Mak‐ ing the operation atomic is dependent on which context you’d like it to be atomic within. If your context is a program with no concurrent processes, then this code is atomic within that context. If your context is a goroutine that doesn’t expose i to other goroutines, then this code is atomic. So why do we care? Atomicity is important because if something is atomic, implicitly it is safe within concurrent contexts. This allows us to compose logically correct pro‐ grams, and—as we’ll later see—can even serve as a way to optimize concurrent pro‐ grams. Most statements are not atomic, let alone functions, methods, and programs. If atom‐ icity is the key to composing logically correct programs, and most statements aren’t atomic, how do we reconcile these two statements? We’ll go into more depth later, but in short we can force atomicity by employing various techniques. The art then becomes determining which areas of your code need to be atomic, and at what level of granularity. We discuss some of these challenges in the next section. Why Is Concurrency Hard? | 7
📄 Page 16
Memory Access Synchronization Let’s say we have a data race: two concurrent processes are attempting to access the same area of memory, and the way they are accessing the memory is not atomic. Our previous example of a simple data race will do nicely with a few modifications: var data int go func() { data++}() if data == 0 { fmt.Println("the value is 0.") } else { fmt.Printf("the value is %v.\n", data) } We’ve added an else clause here so that regardless of the value of data we’ll always get some output. Remember that as it is written, there is a data race and the output of the program will be completely nondeterministic. In fact, there’s a name for a section of your program that needs exclusive access to a shared resource. This is called a critical section. In this example, we have three critical sections: • Our goroutine, which is incrementing the data variables. • Our if statement, which checks whether the value of data is 0. • Our fmt.Printf statement, which retrieves the value of data for output. There are various ways to guard your program’s critical sections, and Go has some better ideas on how to deal with this, but one way to solve this problem is to syn‐ chronize access to the memory between your critical sections. Let’s see what that looks like. The following code is not idiomatic Go (and I don’t suggest you attempt to solve your data race problems like this), but it very simply demonstrates memory access syn‐ chronization. If any of the types, functions, or methods in this example are foreign to you, that’s OK. Focus on the concept of synchronizing access to the memory by fol‐ lowing the callouts. var memoryAccess sync.Mutex var value int go func() { memoryAccess.Lock() value++ memoryAccess.Unlock() }() memoryAccess.Lock() if value == 0 { fmt.Printf("the value is %v.\n", value) 8 | Chapter 1: An Introduction to Concurrency
📄 Page 17
} else { fmt.Printf("the value is %v.\n", value) } memoryAccess.Unlock() Here we add a variable that will allow our code to synchronize access to the data variable’s memory. We’ll go over the sync.Mutex type in detail in “The sync Pack‐ age” on page 47. Here we declare that until we declare otherwise, our goroutine should have exclusive access to this memory. Here we declare that the goroutine is done with this memory. Here we once again declare that the following conditional statements should have exclusive access to the data variable’s memory. Here we declare we’re once again done with this memory. In this example we’ve created a convention for developers to follow. Anytime devel‐ opers want to access the data variable’s memory, they must first call Lock, and when they’re finished they must call Unlock. Code between those two statements can then assume it has exclusive access to data; we have successfully synchronized access to the memory. Also note that if developers don’t follow this convention, we have no guar‐ antee of exclusive access! We’ll return to this idea in the section “Confinement” on page 85. You may have noticed that while we have solved our data race, we haven’t actually solved our race condition! The order of operations in this program is still nondeter‐ ministic; we’ve just narrowed the scope of the nondeterminism a bit. In this example, either the goroutine will execute first, or both our if and else blocks will. We still don’t know which will occur first in any given execution of this program. Later, we’ll explore the tools to solve this kind of issue properly. On its face this seems pretty simple: if you find you have critical sections, add points to synchronize access to the memory! Easy, right? Well…sort of. It is true that you can solve some problems by synchronizing access to the memory, but as we just saw, it doesn’t automatically solve data races or logical correctness. Fur‐ ther, it can also create maintenance and performance problems. Note that earlier we mentioned that we had created a convention for declaring we needed exclusive access to some memory. Conventions are great, but they’re also easy to ignore—especially in software engineering where the demands of business some‐ times outweigh prudence. By synchronizing access to the memory in this manner, you are counting on all other developers to follow the same convention now and into Why Is Concurrency Hard? | 9
📄 Page 18
1 There is an accepted proposal to allow the runtime to detect partial deadlocks, but it has not been imple‐ mented. For more information, see https://github.com/golang/go/issues/13759. the future. That’s a pretty tall order. Thankfully, later in this book we’ll also look at some ways we can help our colleagues be more successful. Synchronizing access to the memory in this manner also has performance ramifac‐ tions. We’ll save the details for later when we examine the sync package in the section “The sync Package” on page 47, but the calls to Lock you see can make our program slow. Every time we perform one of these operations, our program pauses for a period of time. This brings up two questions: • Are my critical sections entered and exited repeatedly? • What size should my critical sections be? Answering these two questions in the context of your program is an art, and this adds to the difficulty in synchronizing access to the memory. Synchronizing access to the memory also shares some problems with other techni‐ ques of modeling concurrent problems, and we’ll discuss those in the next section. Deadlocks, Livelocks, and Starvation The previous sections have all been about discussing program correctness in that if these issues are managed correctly, your program will never give an incorrect answer. Unfortunately, even if you successfully handle these classes of issues, there is another class of issues to contend with: deadlocks, livelocks, and starvation. These issues all concern ensuring your program has something useful to do at all times. If not han‐ dled properly, your program could enter a state in which it will stop functioning alto‐ gether. Deadlock A deadlocked program is one in which all concurrent processes are waiting on one another. In this state, the program will never recover without outside intervention. If that sounds grim, it’s because it is! The Go runtime attempts to do its part and will detect some deadlocks (all goroutines must be blocked, or “asleep”1), but this doesn’t do much to help you prevent deadlocks. To help solidify what a deadlock is, let’s first look at an example. Again, it’s safe to ignore any types, functions, methods, or packages you don’t know and just follow the code callouts. type value struct { mu sync.Mutex 10 | Chapter 1: An Introduction to Concurrency
📄 Page 19
value int } var wg sync.WaitGroup printSum := func(v1, v2 *value) { defer wg.Done() v1.mu.Lock() defer v1.mu.Unlock() time.Sleep(2*time.Second) v2.mu.Lock() defer v2.mu.Unlock() fmt.Printf("sum=%v\n", v1.value + v2.value) } var a, b value wg.Add(2) go printSum(&a, &b) go printSum(&b, &a) wg.Wait() Here we attempt to enter the critical section for the incoming value. Here we use the defer statement to exit the critical section before printSum returns. Here we sleep for a period of time to simulate work (and trigger a deadlock). If you were to try and run this code, you’d probably see: fatal error: all goroutines are asleep - deadlock! Why? If you look carefully, you’ll see a timing issue in this code. Following is a graph‐ ical representation of what’s going on. The boxes represent functions, the horizontal lines calls to these functions, and the vertical bars lifetimes of the function at the head of the graphic (Figure 1-1). Figure 1-1. Demonstration of a timing issue giving rise to a deadlock Why Is Concurrency Hard? | 11
📄 Page 20
2 We actually have no guarantee what order the goroutines will run in, or how long it will take them to start. It’s plausible, although unlikely, that one goroutine could acquire and release both locks before the other begins, thus avoiding the deadlock! Essentially, we have created two gears that cannot turn together: our first call to print Sum locks a and then attempts to lock b, but in the meantime our second call to print Sum has locked b and has attempted to lock a. Both goroutines wait infinitely on each other. Irony To keep this example simple, I use a time.Sleep to trigger the deadlock. However, this introduces a race condition! Can you find it? A logically “perfect” deadlock would require correct synchronization.2 It seems pretty obvious why this deadlock is occurring when we lay it out graphically like that, but we would benefit from a more rigorous definition. It turns out there are a few conditions that must be present for deadlocks to arise, and in 1971, Edgar Coff‐ man enumerated these conditions in a paper. The conditions are now known as the Coffman Conditions and are the basis for techniques that help detect, prevent, and correct deadlocks. The Coffman Conditions are as follows: Mutual Exclusion A concurrent process holds exclusive rights to a resource at any one time. Wait For Condition A concurrent process must simultaneously hold a resource and be waiting for an additional resource. No Preemption A resource held by a concurrent process can only be released by that process, so it fulfills this condition. Circular Wait A concurrent process (P1) must be waiting on a chain of other concurrent pro‐ cesses (P2), which are in turn waiting on it (P1), so it fulfills this final condition too. 12 | Chapter 1: An Introduction to Concurrency
The above is a preview of the first 20 pages. Register to read the complete e-book.

💝 Support Author

0.00
Total Amount (¥)
0
Donation Count

Login to support the author

Login Now

Recommended for You

Loading recommended books...
Failed to load, please try again later
Back to List