Tutorial: Synchronizing State with Mutexes in Go
The Mutex (mutual exclusion lock) is an invaluable tool when synchronizing state across multiple goroutines, but I find that its usage is somewhat mystifying to new Go developers. The truth is that mutexes are incredibly simple to use, but they do come with a couple of caveats that can have a serious impact on your software - namely deadlocks, but we'll get into those in a minute.
So what is a mutex? At its core, it allows you to ensure that only one goroutine has access to a block of code at a time, with all other goroutines attempting to access the same code having to wait until the mutex has been unlocked before proceeding. Let’s look at a quick example:
var mu sync.Mutex
var sum = 0

func add(a int) {
    mu.Lock()
    sum = sum + a
    mu.Unlock()
}
In this example, if two goroutines call the add function, only one can proceed at a time. When Lock is called on the mutex, it ensures that no other goroutine can access the same block until Unlock is called.
Race Conditions
So why does this matter? The primary purpose of a mutex is to prevent race conditions, whereby two or more goroutines access and/or modify the same state with varying outcomes based on the order of execution.
Let’s take a look at an example:
package main

import (
    "fmt"
    "sync"
)

var wg sync.WaitGroup
var sum = 0

func process(n string) {
    wg.Add(1)
    go func() {
        defer wg.Done()
        for i := 0; i < 10000; i++ {
            sum = sum + 1
        }
        fmt.Println("From " + n + ":", sum)
    }()
}

func main() {
    processes := []string{"A", "B", "C", "D", "E"}
    for _, p := range processes {
        process(p)
    }
    wg.Wait()
    fmt.Println("Final Sum:", sum)
}
Note: You can safely ignore the WaitGroup logic here; it is simply there to ensure that we wait for the goroutines to complete before the program exits.
Here we run five separate goroutines (A, B, C, D, and E), each adding one to the shared sum variable ten thousand times. Basic math tells us that the Final Sum printed at the end should be 50,000 because 5 (goroutines) times 10,000 (executions) is equal to 50,000.
Running the code however gives us a different outcome:
$ go run sum.go
From E: 10000
From A: 20000
From D: 30188
From C: 41800
From B: 47166
Final Sum: 47166
Your outcome will vary; in fact, you'll likely get a different result on each execution, but the issue is the same: we didn't get the total of 50,000 we expected. So what happened? In the sample output above, the E and A goroutines appear to have run cleanly, but we started getting into trouble around D. Once several goroutines are running at the same time, they read and write sum simultaneously: two goroutines can read the same value, each add one, and write back the same result, so increments are lost. If we only looped 100 or 1,000 times, we likely wouldn't notice any issues, but the longer the goroutines run alongside each other, the more of these lost updates occur, and we get into trouble really quickly.
This example is contrived, but imagine this was banking or payment processing software - we’d be pretty hosed, out about 3000 units in the example above.
Adding a Mutex
So how do we fix this? One option, and the one we’ll be using today, is to add a mutex. We’re only going to make two changes, but it will completely change the outcome of our program:
- Define a Mutex
- Add Lock and Unlock calls around the addition to sum
var mu sync.Mutex

// In "process"
mu.Lock()
sum = sum + 1
mu.Unlock()
Here’s the full program again for clarity:
package main

import (
    "fmt"
    "sync"
)

var wg sync.WaitGroup
var mu sync.Mutex
var sum = 0

func process(n string) {
    wg.Add(1)
    go func() {
        defer wg.Done()
        for i := 0; i < 10000; i++ {
            mu.Lock()
            sum = sum + 1
            mu.Unlock()
        }
        fmt.Println("From " + n + ":", sum)
    }()
}

func main() {
    processes := []string{"A", "B", "C", "D", "E"}
    for _, p := range processes {
        process(p)
    }
    wg.Wait()
    fmt.Println("Final Sum:", sum)
}
With the changes in place, we can run the example again and verify that the output matches the 50,000 we expect:
$ go run mutex.go
From A: 38372
From C: 38553
From E: 42019
From D: 48251
From B: 50000
Final Sum: 50000
You'll notice that the final sum is correct, but the sum printed as each goroutine completes isn't a multiple of 10,000. This is because the goroutines still execute concurrently; all we've changed is that access to the sum variable inside process is now synchronized.
So how does this work? Each time we call Lock, all other goroutines must wait before executing the same code, until the processing goroutine unlocks the mutex by calling Unlock. Lock is a blocking operation, so the goroutine will sit idle until the lock can be acquired, ensuring that only one goroutine ever has the ability to add to sum at a time.
Tips and Tricks
Idiomatic Definition
Because a mutex doesn’t directly relate to a specific variable or function, it is idiomatic in Go to define the mutex above the variable it is applicable to. For instance, if we had a Processor struct for the example above, we’d define it like so:
type Processor struct {
    mu  sync.Mutex
    sum int
}
This also applies when the same mutex is used for multiple variables, like so:
type Processor struct {
    // Related
    mu            sync.Mutex
    sum           int
    anotherVar    int
    yetAnotherVar int

    // Not related
    somethingElse int
}
By defining the mutex directly above the variable(s) it relates to, we are signalling to other developers that the mutex is used to protect access to these variables.
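Putting this convention to work, here's a sketch of how such a Processor might use its own mutex through methods. The Add and Sum method names are hypothetical; the point is that the mutex travels with the state it guards, so each Processor value synchronizes independently:

```go
package main

import (
	"fmt"
	"sync"
)

// Processor bundles the mutex with the state it protects, following
// the field-ordering convention above.
type Processor struct {
	mu  sync.Mutex
	sum int
}

// Add locks this Processor's own mutex, so concurrent calls on the
// same value are serialized.
func (p *Processor) Add(n int) {
	p.mu.Lock()
	p.sum += n
	p.mu.Unlock()
}

// Sum reads the protected value under the same lock.
func (p *Processor) Sum() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.sum
}

func main() {
	var p Processor
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				p.Add(1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(p.Sum()) // always 5000
}
```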
Deferred Unlocks
In software more complex than the trivial examples above, where the function that calls Lock has multiple return points, or the entire function body must be locked, it is common to use a defer to ensure Unlock is called before the function returns, like so:
func process() {
    mu.Lock()
    defer mu.Unlock()

    // Process...
}
This ensures that no matter what branch the code takes inside the function, Unlock will always be called. As a bonus, developers can add code to the function without worrying that they may miss a case where Unlock must be called.
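Here's a sketch of what that looks like with multiple return points. The withdraw function and balance variable are made-up names for illustration; note that both early returns and the success path all pass through the single deferred Unlock:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	mu      sync.Mutex
	balance = 100
)

// withdraw has two early returns; the deferred Unlock covers both of
// them as well as the success path, so the mutex is always released.
func withdraw(amount int) error {
	mu.Lock()
	defer mu.Unlock()

	if amount <= 0 {
		return errors.New("amount must be positive")
	}
	if amount > balance {
		return errors.New("insufficient funds")
	}
	balance -= amount
	return nil
}

func main() {
	fmt.Println(withdraw(40), balance)  // <nil> 60
	fmt.Println(withdraw(500), balance) // insufficient funds 60
}
```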
Deadlocks
Forgetting to Unlock
It is absolutely crucial to call Unlock! If you don't, all other goroutines will wait indefinitely for the Unlock call, meaning they will never proceed and the program will grind to a halt.
If we remove the call to Unlock from the example above, we'll get the following:
$ go run mutex.go
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0x1f6b7c, 0x876e0)
    /usr/local/go/src/runtime/sema.go:47 +0x40
sync.(*WaitGroup).Wait(0x1f6b70, 0x1)
    /usr/local/go/src/sync/waitgroup.go:127 +0x100
main.main()
    /tmp/sandbox022115118/main.go:35 +0x160
...
This trace will go on for a while, but you get the point.
This is called a deadlock and is a surefire way to crash your programs. In more complicated codebases it can be hard to immediately recognize situations where deadlocks can occur, as you’ll see in the example below.
Multiple Calls to Lock
In this example, we'll see that calling Lock from multiple places on the same Mutex can also deadlock: if a goroutine that already holds the lock calls Lock again before unlocking, it ends up waiting on itself forever. Take a look:
package main

import (
    "fmt"
    "sync"
)

var mu sync.Mutex

func funcA() {
    mu.Lock()
    funcB()
    mu.Unlock()
}

func funcB() {
    mu.Lock()
    fmt.Println("Hello, World")
    mu.Unlock()
}

func main() {
    funcA()
}
If you were to run this program, you’d get the following:
$ go run deadlock.go
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0x1043411c, 0x1)
    /usr/local/go/src/runtime/sema.go:47 +0x40
sync.(*Mutex).Lock(0x10434118, 0x0)
    /usr/local/go/src/sync/mutex.go:83 +0x200
main.funcB()
    /tmp/sandbox352026507/main.go:17 +0x40
main.funcA()
    /tmp/sandbox352026507/main.go:12 +0x40
main.main()
    /tmp/sandbox352026507/main.go:23 +0x80
The reason for this is that funcB, running in the same goroutine as funcA, tries to acquire a Lock on the same Mutex that funcA already locked. Go's sync.Mutex is not reentrant, and because Lock blocks until the lock can be acquired, we'll never reach the Unlock in funcA, and the program halts.
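One common way to fix this is to decide, per function, whether it acquires the lock or expects its caller to already hold it. Here's a sketch of that approach applied to the example; the funcBLocked name is my own convention for illustration, and I've made the functions return the greeting rather than print it so the behavior is easy to check:

```go
package main

import (
	"fmt"
	"sync"
)

var mu sync.Mutex

// funcA acquires the lock, then calls a helper that assumes the lock
// is already held, so the mutex is only locked once per call chain.
func funcA() string {
	mu.Lock()
	defer mu.Unlock()
	return funcBLocked()
}

// funcBLocked documents its contract in its name: the caller must
// hold mu. It never touches the mutex itself.
func funcBLocked() string {
	return "Hello, World"
}

func main() {
	fmt.Println(funcA()) // prints without deadlocking
}
```

This convention keeps lock acquisition at one level of the call stack, which makes it much easier to audit where a mutex is held.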
Conclusion
While there are a plethora of ways to handle synchronization of state, and the sync package provides a number of options, mutexes are a simple and effective means of getting the job done, so long as you take some care to implement them safely and correctly in your codebase.