Kyle Banks

Distributed Locks using Golang and Redis

Written by @kylewbanks on Oct 5, 2016.

Maintaining locks across a cluster of application instances, whether multiple threads on the same server or different servers altogether, is an often underestimated part of developing clustered applications, in Go or in any other language or framework. It’s relatively straightforward, but there are some gotchas to look out for when implementing a distributed locking mechanism.

What’s Distributed Locking?

Distributed locks are a means to ensure that multiple processes can utilize a shared resource in a mutually exclusive way, meaning that only one can make use of the resource at a time. They are quite similar to a mutex; however, a distributed lock is intended for processes that may or may not be running on the same machine or in the same application.

Let’s say, for example, that you’re writing some software that has a daily job to charge customers whose subscription is renewing that day. You have multiple instances of your billing application running in order to process the payments quickly, but you need to make sure that two (or more) instances don’t pick up the same user and double bill them. In this case the user is the shared resource that must be accessed by only a single process.

What you need is something for the two billing application instances, let’s call them Billing A and Billing B, to coordinate with (we’ll call this the Coordinator) that can definitively say that one of them is good to go, and the other needs to skip that user. The process looks like this:

[Figure: Distributed Locking Workflow]

As you can see, Billing A and Billing B both ask the Coordinator if they can process the same user, User X. The Coordinator ensures that only one, Billing A, gets the go-ahead, and Billing B has to skip the user.
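To make the workflow concrete, here is a minimal in-memory sketch of the Coordinator's job. Nothing in it involves Redis yet: Coordinator, TryAcquire, and race are hypothetical names used only for illustration, and a mutex-guarded map stands in for the coordinating service.

```go
package main

import (
	"fmt"
	"sync"
)

// Coordinator is a hypothetical in-memory stand-in for the coordinating
// service: it records which worker currently holds each resource.
type Coordinator struct {
	mu      sync.Mutex
	holders map[string]string // resource -> worker holding it
}

func NewCoordinator() *Coordinator {
	return &Coordinator{holders: make(map[string]string)}
}

// TryAcquire grants the resource to worker only if nobody holds it yet.
func (c *Coordinator) TryAcquire(resource, worker string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, held := c.holders[resource]; held {
		return false
	}
	c.holders[resource] = worker
	return true
}

// race lets every worker ask for the same resource at once and returns
// the names of the workers that were granted it.
func race(c *Coordinator, resource string, workers []string) []string {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		winners []string
	)
	for _, w := range workers {
		wg.Add(1)
		go func(w string) {
			defer wg.Done()
			if c.TryAcquire(resource, w) {
				mu.Lock()
				winners = append(winners, w)
				mu.Unlock()
			}
		}(w)
	}
	wg.Wait()
	return winners
}

func main() {
	c := NewCoordinator()
	winners := race(c, "user-x", []string{"Billing A", "Billing B"})
	// Exactly one worker is granted the user; which one is not deterministic.
	fmt.Println("granted to:", winners)
}
```

Whichever instance asks first wins, and the other simply gets a "no" back rather than blocking.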

Using Redis

Redis is a prime (and popular) candidate for the Coordinator job due to its single-threaded nature and its ability to perform atomic operations. The official documentation contains a wealth of information on this topic, and provides us with the exact commands we need in order to turn our Redis instance into a Coordinator.

In order to request a lock on a particular resource, we do the following:

SET resource_name unique_value NX PX duration

The resource_name is a value that all instances of your application would share. In the case above it could be the user’s identifier, such as a primary key ID or username, for example. The unique_value is something that must be unique to each instance of your application, such as a UUID. The purpose of this unique value is to allow the application instance that successfully locked the resource to have exclusive access to modify, extend, or remove the lock (unlock), similar to a password or key. Finally we also provide a duration (in milliseconds), after which the lock will be automatically removed by Redis.
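One way to produce a suitable unique_value without pulling in a UUID library is to read random bytes from crypto/rand. This is only a sketch: newLockValue is a hypothetical helper, and 16 bytes is an assumed length rather than a requirement.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newLockValue returns 16 cryptographically random bytes, hex encoded
// (32 characters). The name and the length are illustrative assumptions.
func newLockValue() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	v, err := newLockValue()
	if err != nil {
		fmt.Println("could not generate lock value:", err)
		return
	}
	fmt.Println(v) // a different 32-character hex string on every run
}
```

Generating a fresh value for each lock attempt, rather than once per process, also keeps two goroutines in the same instance from sharing a key to each other's locks.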

Speaking of unlocking, when we no longer need a lock on the resource, we can unlock it like so:

if redis.call("get", resource_name) == unique_value then
    return redis.call("del", resource_name)
else
    return 0
end

Basically, what we do here is get the resource_name and ensure its value matches the unique_value we stored earlier. This ensures we don’t accidentally delete another instance’s lock. Running the check and the delete as a single Lua script makes them atomic, so no other client can change the key in between. If the unique_value does match, we go ahead and delete the lock, otherwise we simply return 0 to signal that the unlock attempt was unsuccessful.
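The case the value check guards against is easiest to see with a timeline. The sketch below is a single-goroutine, in-memory model of the two operations, not real Redis: fakeStore, setNX, and unlock are hypothetical names, and the clock is advanced by hand so that expiry can be simulated.

```go
package main

import (
	"fmt"
	"time"
)

type entry struct {
	value   string
	expires time.Time
}

// fakeStore is a hypothetical model of the lock key space with a manual clock.
type fakeStore struct {
	now  time.Time
	data map[string]entry
}

func newFakeStore() *fakeStore {
	return &fakeStore{now: time.Unix(0, 0), data: make(map[string]entry)}
}

// live returns the unexpired entry for key, dropping it if it has expired.
func (s *fakeStore) live(key string) (entry, bool) {
	e, ok := s.data[key]
	if !ok || !s.now.Before(e.expires) {
		delete(s.data, key)
		return entry{}, false
	}
	return e, true
}

// setNX models SET key value NX PX ttl: it only sets a key that is absent.
func (s *fakeStore) setNX(key, value string, ttl time.Duration) bool {
	if _, held := s.live(key); held {
		return false
	}
	s.data[key] = entry{value: value, expires: s.now.Add(ttl)}
	return true
}

// unlock models the compare-and-delete script: it deletes only on a match.
func (s *fakeStore) unlock(key, value string) bool {
	e, held := s.live(key)
	if !held || e.value != value {
		return false
	}
	delete(s.data, key)
	return true
}

func main() {
	s := newFakeStore()
	fmt.Println(s.setNX("user-x", "token-a", 100*time.Millisecond)) // true: A holds the lock
	s.now = s.now.Add(150 * time.Millisecond)                       // A is slow; its lock expires
	fmt.Println(s.setNX("user-x", "token-b", 100*time.Millisecond)) // true: B takes over
	fmt.Println(s.unlock("user-x", "token-a"))                      // false: A cannot remove B's lock
	fmt.Println(s.unlock("user-x", "token-b"))                      // true: B releases its own lock
}
```

Without the comparison, A's late unlock would have deleted B's lock and let a third instance in while B was still working.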

Implementation

Implementing this locking mechanism in Go is fairly straightforward once you understand the Redis scripts above. For the examples below, I’ll be using the popular Redigo Redis client for Go, but the concept is the same regardless of language or client.

The first thing we’ll do is define the scripts:

const (
	lockScript = `
		return redis.call('SET', KEYS[1], ARGV[1], 'NX', 'PX', ARGV[2])
	`
	unlockScript = `
		if redis.call("get",KEYS[1]) == ARGV[1] then
		    return redis.call("del",KEYS[1])
		else
		    return 0
		end
	`
)

And we can then utilize them like so:

// Lock attempts to put a lock on the key for a specified duration (in milliseconds).
// If the lock was successfully acquired, true will be returned.
func Lock(key, value string, timeoutMs int) (bool, error) {
	r := pool.Get()
	defer r.Close()

	cmd := redis.NewScript(1, lockScript)
	res, err := cmd.Do(r, key, value, timeoutMs)
	if err != nil {
		return false, err
	}

	// The script replies "OK" when the key was set and nil when it already exists.
	return res == "OK", nil
}

// Unlock attempts to remove the lock on a key so long as the value matches.
// If the lock cannot be removed, either because the key has already expired or
// because the value was incorrect, an error will be returned.
func Unlock(key, value string) error {
	r := pool.Get()
	defer r.Close()

	cmd := redis.NewScript(1, unlockScript)
	if res, err := redis.Int(cmd.Do(r, key, value)); err != nil {
		return err
	} else if res != 1 {
		return errors.New("unlock failed: key expired or value incorrect")
	}
	
	// Success
	return nil
}

In these two functions, key and value are synonymous with the resource_name and unique_value above. Lock will return a boolean indicating if the lock was acquired, allowing us to proceed with our application logic accordingly.
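It may help to spell out what Lock is checking. A SET with NX that succeeds comes back as the status reply OK, while one that loses to an existing key comes back as a nil reply. The sketch below mirrors that decision as a pure function, assuming the client surfaces OK as a Go string (or bytes) and the nil reply as nil; lockAcquired is a hypothetical helper and not part of Redigo.

```go
package main

import (
	"errors"
	"fmt"
)

// lockAcquired interprets the raw reply of the lock script.
// It is a hypothetical helper written for illustration only.
func lockAcquired(reply interface{}, err error) (bool, error) {
	if err != nil {
		return false, err // connection or script error: we know nothing
	}
	switch v := reply.(type) {
	case nil:
		return false, nil // key already existed, someone else holds the lock
	case string:
		return v == "OK", nil
	case []byte:
		return string(v) == "OK", nil
	default:
		return false, fmt.Errorf("unexpected reply type %T", reply)
	}
}

func main() {
	fmt.Println(lockAcquired("OK", nil))                            // true <nil>
	fmt.Println(lockAcquired(nil, nil))                             // false <nil>
	fmt.Println(lockAcquired(nil, errors.New("connection closed"))) // false connection closed
}
```

The distinction between "not acquired" and "error" matters to the caller: the first means skip the resource, the second means the state of the lock is unknown.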

Going back to the billing application example, when our billing service runs it simply calls Lock before processing the user, and only proceeds if Lock returns true. When it’s done processing, it can then call Unlock to release the lock:

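func Bill(u User) {
    // Must be unique to this instance, such as a UUID.
    uniqueValue := "a unique value"
    
    // Skip this user if another instance already holds the lock.
    if isLocked, _ := Lock(u.Username, uniqueValue, LockDuration); !isLocked {
        return
    }
    defer Unlock(u.Username, uniqueValue)
    
    // Perform billing logic...
}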

Note that I’m intentionally ignoring errors here for the sake of brevity.

As you can see, if the lock doesn’t succeed, we simply skip the user and move on to the next. If the lock is successful, we defer unlocking to ensure it’s run before the function returns, and proceed to bill the user. As long as the billing logic finishes before the lock expires, and records the user as billed before unlocking, this gives us peace of mind in knowing that the user will only be charged once no matter how many instances of our billing application are simultaneously attempting to bill the user.
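For completeness, here is roughly what the same flow might look like over a list of users once errors are handled instead of ignored. Treat it as a sketch under assumptions: Locker, memLocker, billAll, and the 30000 ms timeout are hypothetical, and the in-memory fake (no expiry, single goroutine) stands in for the Redis-backed Lock and Unlock so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Locker is a hypothetical interface with the same shape as Lock and Unlock.
type Locker interface {
	Lock(key, value string, timeoutMs int) (bool, error)
	Unlock(key, value string) error
}

// memLocker is an in-memory fake: no expiry, single goroutine only.
type memLocker struct{ held map[string]string }

func (m *memLocker) Lock(key, value string, timeoutMs int) (bool, error) {
	if _, ok := m.held[key]; ok {
		return false, nil
	}
	m.held[key] = value
	return true, nil
}

func (m *memLocker) Unlock(key, value string) error {
	if m.held[key] != value {
		return errors.New("unlock failed: key expired or value incorrect")
	}
	delete(m.held, key)
	return nil
}

// billAll charges every user it manages to lock and returns their names.
func billAll(l Locker, users []string, value string, charge func(string) error) ([]string, error) {
	var billed []string
	for _, u := range users {
		ok, err := l.Lock(u, value, 30000)
		if err != nil {
			return billed, fmt.Errorf("lock %s: %w", u, err)
		}
		if !ok {
			continue // another instance holds this user; skip to the next
		}
		chargeErr := charge(u)
		if err := l.Unlock(u, value); err != nil {
			return billed, fmt.Errorf("unlock %s: %w", u, err)
		}
		if chargeErr != nil {
			return billed, fmt.Errorf("charge %s: %w", u, chargeErr)
		}
		billed = append(billed, u)
	}
	return billed, nil
}

func main() {
	// "bob" is already locked by some other instance.
	l := &memLocker{held: map[string]string{"bob": "other-instance"}}
	noop := func(string) error { return nil }
	billed, err := billAll(l, []string{"alice", "bob", "carol"}, "this-instance", noop)
	fmt.Println(billed, err) // [alice carol] <nil>
}
```

Stopping at the first error is just one possible policy; logging the failure and carrying on with the remaining users would be equally reasonable.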

Multiple Redis Instances

Now there is one caveat to the implementation here, in that it only works when there is a single Redis instance. When running Redis in a cluster, a more sophisticated approach is necessary and you can read more about that in the official Redis documentation.
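The approach described in the Redis documentation is called Redlock. Its core idea is to request the same lock on several independent instances and to treat it as acquired only if a majority of them grant it before the lock's lifetime has elapsed. The following is a heavily simplified sketch of that one idea, with hypothetical names (instance, fakeInstance, lockMajority) and in-memory fakes instead of real servers; it leaves out details such as clock drift compensation and retries, so read the documentation before relying on anything like it.

```go
package main

import (
	"fmt"
	"time"
)

// instance is a hypothetical handle to one independent Redis server.
type instance interface {
	TryLock(key, value string, ttl time.Duration) bool
	Unlock(key, value string)
}

// fakeInstance grants a lock unless it is down or the key is already held.
type fakeInstance struct {
	down bool
	held map[string]string
}

func newFake(down bool) *fakeInstance {
	return &fakeInstance{down: down, held: make(map[string]string)}
}

func (f *fakeInstance) TryLock(key, value string, ttl time.Duration) bool {
	if f.down {
		return false
	}
	if _, ok := f.held[key]; ok {
		return false
	}
	f.held[key] = value
	return true
}

func (f *fakeInstance) Unlock(key, value string) {
	if !f.down && f.held[key] == value {
		delete(f.held, key)
	}
}

// lockMajority reports whether more than half of the instances granted the
// lock within ttl. On failure it releases whatever it managed to acquire.
func lockMajority(instances []instance, key, value string, ttl time.Duration) bool {
	start := time.Now()
	acquired := 0
	for _, in := range instances {
		if in.TryLock(key, value, ttl) {
			acquired++
		}
	}
	quorum := len(instances)/2 + 1
	if acquired >= quorum && time.Since(start) < ttl {
		return true
	}
	for _, in := range instances {
		in.Unlock(key, value)
	}
	return false
}

func main() {
	// Five instances, two of them unreachable.
	cluster := []instance{newFake(false), newFake(false), newFake(true), newFake(false), newFake(true)}
	fmt.Println(lockMajority(cluster, "user-x", "token-a", time.Second)) // true: 3 of 5 granted
	fmt.Println(lockMajority(cluster, "user-x", "token-b", time.Second)) // false: already held
}
```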
