June 1, 2022

Docker Swarm: how to move tasks to other nodes without downtime

Once in a while you need to apply a system update, change the Docker daemon settings, or otherwise take a node in the swarm out of service for a while. How do you do this without downtime?

The Docker CLI includes the docker node update command, whose --availability parameter controls whether an individual node in the swarm receives new tasks. If we want to empty a node, we can simply run docker node update NODE_NAME --availability=drain. Going from active to drain makes the new "goal" for that node zero running tasks. Swarm achieves this by simply terminating all tasks running on the node and scheduling them elsewhere. However, if those tasks are the only instances of their services, the result is a short downtime. So how do we make sure our rolling-update settings are respected when we move tasks?
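
If you are not sure what rolling-update settings a service currently has, you can inspect them (my-service below is just a placeholder):

$ docker service inspect my-service --format '{{ json .Spec.UpdateConfig }}'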

The key is not to force the relocation of tasks by setting availability=drain, but to use the third option, availability=pause. In this state the node does not accept new tasks, but it also does not force existing ones to move away. Then we simply move every service that has an instance running on the node we want to empty, the same way we would when deploying a new version.
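
The rolling-update settings that get respected are simply the ones configured on the service itself. For example, something along these lines tells Swarm to replace one task at a time and start the new task before stopping the old one (the values and the service name are only an illustration):

$ docker service update \
      --update-parallelism 1 \
      --update-delay 10s \
      --update-order start-first \
      my-service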

Step by step

1. First, we check the state of the cluster.

$ docker node ls
ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
o81cq30ogz2x34goarbs6xamo *   manager1   Ready     Active         Leader           20.10.7
kv9gwot1u35id1bzmaktl4lhw     worker1    Ready     Active                          20.10.7
px15r84ksn8uqe02wls5egopl     worker2    Ready     Active                          20.10.7
rrsb7disoubuckfz2h7e47hu6     worker3    Ready     Active                          20.10.7

2. We need to "empty" worker1, so we set availability to pause.

$ docker node update worker1 --availability=pause
worker1

3. We can see that the availability has changed, but all services are still running.

$ docker node ls
ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
o81cq30ogz2x34goarbs6xamo *   manager1   Ready     Active         Leader           20.10.7
kv9gwot1u35id1bzmaktl4lhw     worker1    Ready     Pause                           20.10.7
px15r84ksn8uqe02wls5egopl     worker2    Ready     Active                          20.10.7
rrsb7disoubuckfz2h7e47hu6     worker3    Ready     Active                          20.10.7
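
We can also double-check that nothing has been rescheduled yet, for example by confirming that the replica counts of our services are unchanged:

$ docker service ls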

4. We list the tasks running on worker1 and, for each of them, look up the ID of the service it belongs to.

$ SERVICE_IDS=($(docker node ps worker1 --filter desired-state=running --format '{{ .ID }}' | xargs docker inspect --format '{{ .ServiceID }}'))
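
If a service has more than one task on the node, its ID will appear in the list several times and would later be force-updated repeatedly. A small variation of the same command dedupes the list:

$ SERVICE_IDS=($(docker node ps worker1 --filter desired-state=running --format '{{ .ID }}' | xargs docker inspect --format '{{ .ServiceID }}' | sort -u))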

5. Run a force-update for each service. This reschedules its tasks onto nodes that are currently accepting new tasks, and unlike drain, it respects our rolling-update settings.

$ for s in "${SERVICE_IDS[@]}"; do
      docker service update --force "$s"
  done
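
Unless you pass --detach, docker service update waits for each service to converge before returning, so the loop handles one service at a time. You can also follow where the tasks end up by inspecting any of the services in another terminal (substitute a concrete service name or ID):

$ docker service ps SERVICE_NAME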

6. When the process finishes (depending on how fast the images are pulled, it may take a while), we can safely perform maintenance, restart the server for kernel updates, and so on.
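
Before taking the node down, it is worth verifying that no tasks are left running on it; the same filter from step 4 should now list nothing:

$ docker node ps worker1 --filter desired-state=running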

7. Finally, just switch the node back to the active state.

$ docker node update worker1 --availability=active
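
Keep in mind that Swarm does not rebalance existing tasks on its own; the node only becomes eligible for new tasks again. If you want to spread the load back onto it right away, one option is to repeat the force-update loop from step 5, which again respects the rolling-update settings while the scheduler redistributes the tasks.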

We can wrap the whole process in a simple script to save ourselves some work next time, and run it on a manager node like this: ssh manager1 "bash -s" -- NODE_NAME < drain.sh

#!/bin/bash

if [[ "$#" -ne 1 ]]; then
    echo "Invalid arguments. Usage: $0 NODE_NAME"
    exit 1
fi

NODE="$1"

# check if the node exists
if ! docker node ps "$NODE" > /dev/null; then
    echo "Invalid NODE_NAME"
    exit 1
fi

echo "Current swarm state:"
docker node ls

# set availability to pause - this prevents the node from receiving new tasks
docker node update "$NODE" --availability=pause
echo "Node availability updated. New swarm state:"
docker node ls

echo

# get tasks running on the node and find the services they belong to
echo "Getting services..."
SERVICE_IDS=($(docker node ps "$NODE" --filter desired-state=running --format '{{ .ID }}' | xargs docker inspect --format '{{ .ServiceID }}'))
echo "Found ${#SERVICE_IDS[@]} services."

# iterate through services and force-update each of them
for s in "${SERVICE_IDS[@]}"; do
    echo
    SERVICE_NAME=$(docker service inspect "$s" --format '{{ .Spec.Name }}')
    echo "Processing service: $SERVICE_NAME"
    docker service update --force "$s" > /dev/null
    echo "Done."
done
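
The script only handles the draining part. Once maintenance is finished, switch the node back to active manually, as in step 7:

$ ssh manager1 docker node update NODE_NAME --availability=active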