Kyle Banks

Redeploying an Application in an AWS Auto Scaling Group, Behind an ELB, with Zero Downtime

Written by @kylewbanks on Aug 26, 2015.

A pretty standard setup these days is to have an application deployed on any number of EC2 instances in an Auto Scaling Group, behind an Elastic Load Balancer (ELB). The trouble is, how do you redeploy the application, with zero downtime, across the entire fleet of instances? In addition, how can we do this while maintaining the current number of healthy instances?

Note: I recommend reading Building and Deploying Self-Managed Applications with Amazon Web Services first if your application is not set up to deploy itself.

Depending on your setup, the answer is actually pretty simple. The basic premise is to:

  1. Launch an entire fleet of new instances (the same number as are currently running) that can manage updating source code and running the application themselves via a User Data script (a minimal sketch follows this list)
  2. Wait for the new instances to come online and become healthy in the ELB
  3. Terminate the old instances
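
For reference, a User Data script along these lines could look something like the sketch below. This is only a minimal, hypothetical example: the repository URL, application directory, and run.sh start script are all placeholders, and the post linked above goes into far more detail.

#!/bin/bash
# Hypothetical User Data script, executed once when the instance boots.
# Assumes the application lives in a Git repository and ships its own start script.
yum update -y
yum install -y git
git clone https://github.com/your-org/your-app.git /opt/your-app
cd /opt/your-app
./run.sh &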

Setup

With the AWS CLI installed and configured, we can write a simple script that does just that, with two prerequisites.
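
If you don't already have the CLI, one common way to install and configure it (assuming Python and pip are available on your machine) is:

pip install awscli
aws configure   # prompts for your Access Key, Secret Key, default region, and output format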

First, we'll need to add a Scaling Policy that simply adds 1 instance with no alarm, in addition to any existing Scaling Policies that you currently have. The configured scaling policy should look like so:

Scaling Policy: Launch 1 Instance
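
If you prefer the CLI over the console, a policy like this can also be created with put-scaling-policy; the group and policy names below are just the examples used throughout this post:

aws autoscaling put-scaling-policy \
    --auto-scaling-group-name "API Server AG" \
    --policy-name "Launch 1 Instance" \
    --adjustment-type ChangeInCapacity \
    --scaling-adjustment 1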

Second, we need to tell the Auto Scaling group that when it terminates instances, it should terminate the oldest instances first. We do this by setting the Termination Policy to OldestInstance. The reason for this will become apparent shortly.
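
Again, this can be done from the CLI rather than the console if you prefer:

aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "API Server AG" \
    --termination-policies "OldestInstance"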

With our new Scaling Policy, an OldestInstance Termination Policy, and the ability to simply launch an instance and have it manage updating source code, setting itself up, and running the application, we're ready to get started.

There are 3 variables we're going to need to collect prior to running the script.

  1. The name of the Auto Scaling group (ex. "API Server AG")
  2. The name of the Scaling Policy we created above (ex. "Launch 1 Instance")
  3. The name of the ELB (ex. "api.example.com ELB")
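
If you're not sure of the exact names, the CLI can list them for you:

# List Auto Scaling group names
aws autoscaling describe-auto-scaling-groups --query "AutoScalingGroups[].AutoScalingGroupName"

# List Scaling Policy names for a given group
aws autoscaling describe-policies --auto-scaling-group-name "API Server AG" --query "ScalingPolicies[].PolicyName"

# List (Classic) ELB names
aws elb describe-load-balancers --query "LoadBalancerDescriptions[].LoadBalancerName"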

The Script

Note: This script is available on GitHub and may be more up-to-date there.

With these variables defined, we're ready to go. Let's have a look at the script:

# Define some global variables
export AUTO_SCALING_GROUP_NAME="API Server AG"
export SCALING_POLICY="Launch 1 Instance"
export ELB_NAME="api.example.com ELB"

# Returns the number of instances currently in the AutoScaling group
function getNumInstancesInAutoScalingGroup() {
    local num=$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-name "$AUTO_SCALING_GROUP_NAME" --query "length(AutoScalingGroups[0].Instances)")    
    local __resultvar=$1
    eval $__resultvar=$num
}

# Returns the number of healthy instances currently in the ELB
function getNumHealthyInstancesInELB() {
    local num=$(aws elb describe-instance-health --load-balancer-name "$ELB_NAME" --query "length(InstanceStates[?State=='InService'])")
    local __resultvar=$1
    eval $__resultvar=$num
}

# Get the current number of desired instances to reset later
export existingNumDesiredInstances=$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-name "$AUTO_SCALING_GROUP_NAME" --query "AutoScalingGroups[0].DesiredCapacity")

# Determine the number of instances we expect to have online
getNumInstancesInAutoScalingGroup numInstances
numInstancesExpected=$(expr $numInstances \* 2)
echo "Expecting to have $numInstancesExpected instance(s) online."

echo "Will launch $numInstances Instance(s)..."
for i in `seq 1 $numInstances`;
do
    echo "Launching instance..."
    aws autoscaling execute-policy --no-honor-cooldown --auto-scaling-group-name "$AUTO_SCALING_GROUP_NAME" --policy-name "$SCALING_POLICY"
    sleep 5s
done

# Wait for the number of instances to increase
getNumInstancesInAutoScalingGroup newNumInstances
until [[ "$newNumInstances" == "$numInstancesExpected" ]]; 
do
    echo "Only $newNumInstances instance(s) online in $AUTO_SCALING_GROUP_NAME, waiting for $numInstancesExpected..."
    sleep 10s
    getNumInstancesInAutoScalingGroup newNumInstances
done

# Wait for the ELB to determine the instances are healthy
echo "All instances online, waiting for the Load Balancer to put them In Service..."
getNumHealthyInstancesInELB numHealthyInstances
until [[ "$numHealthyInstances" == "$numInstancesExpected" ]];
do
    echo "Only $numHealthyInstances instance(s) In Service in $ELB_NAME, waiting for $numInstancesExpected..."
    sleep 10s
    getNumHealthyInstancesInELB numHealthyInstances
done

# Update the desired capacity back to its previous value
echo "Resetting Desired Instances to $existingNumDesiredInstances"
aws autoscaling update-auto-scaling-group --auto-scaling-group-name "$AUTO_SCALING_GROUP_NAME" --desired-capacity $existingNumDesiredInstances

# Success!
echo "Deployment complete!"

So what's going on here? Essentially, we start by determining the current number of desired instances configured on the Auto Scaling group. The reason for this is that when we execute the Scaling Policy, it's going to increment the desired capacity by 1 each time, leaving it doubled once all the new instances are launched, which isn't something we want to keep. So, we fetch the existing value now so that we can reset it later on.

Next up, we determine how many instances are currently running in the Auto Scaling group, and multiply it by two. This is the total number of instances we should have healthy once the new instances launch, and before we terminate the old instances. After that, we run a loop that executes the Scaling Policy each time in order to launch the same number of instances that are currently running.

After that, we wait for the instances to come online in the Auto Scaling group, and then we wait for them to become healthy in the ELB. At this point, we have double the number of instances we need. Half of them are old, half of them are new.

Now that all of our instances are online and healthy, it's time to terminate the old ones. Remember how we kept track of the original number of desired instances so we can reset that later on? Well, that actually doubles as our means of terminating the old ones. When we reset the number of desired instances, the Auto Scaling group is going to notice we have double the amount we need, and begin terminating them. Because we configured the Auto Scaling group to kill the oldest instances first, the old instances are going to be the ones that are terminated.

And that's it! Our entire fleet has been redeployed, we have the same number of instances as we did prior to deploying, and there was zero downtime.

Room to Improve

Deployment strategies are unique to each developer, company, and application. This strategy may not work for everyone, and that's perfectly fine. Even if you can't use the script exactly as-is, hopefully it can serve as a starting point for you.

The script is available on GitHub and can certainly be improved upon. If I have time, I'd like to remove the dependencies on the Scaling Policy and the Termination Policy; if you have a potential improvement, please don't hesitate to submit a pull request.
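
For example, the Scaling Policy could likely be replaced with a direct call to set-desired-capacity, and the Termination Policy with an explicit terminate call per old instance. A rough, untested sketch (where $oldInstanceId is a placeholder for an instance you've identified as old):

# Double the capacity in one call, instead of executing a Scaling Policy N times
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name "$AUTO_SCALING_GROUP_NAME" \
    --desired-capacity $numInstancesExpected

# Terminate a specific old instance and decrement the desired capacity,
# instead of relying on the OldestInstance Termination Policy
aws autoscaling terminate-instance-in-auto-scaling-group \
    --instance-id "$oldInstanceId" \
    --should-decrement-desired-capacity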

Let me know if this post was helpful on Twitter @kylewbanks or down below!