Docker

Automatically Update Container Images

The Watchtower container can be run alongside the other containers; it monitors the running Docker containers and, whenever a newer image is available, automatically stops the containers, updates the image and restarts the containers.
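
As a minimal sketch following Watchtower's documented quick-start, the container is run with the Docker socket mounted such that Watchtower can manage the other containers:

docker run --detach \
    --name watchtower \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    containrrr/watchtower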

Similarly, for swarms, shepherd can be deployed within a swarm in order to update containers; a Wizardry and Steamworks guide exists for shepherd.
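
As a sketch, and assuming the mazzolino/shepherd image published for the project, shepherd would be deployed as a swarm service constrained to a manager node, again with the Docker socket mounted:

docker service create --name shepherd \
    --constraint "node.role==manager" \
    --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
    mazzolino/shepherd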

Rebalance Swarm

The following command:

docker service ls -q | xargs -n1 docker service update --detach=false --force

will list all the services within a swarm and then force the services to be re-balanced and distributed between the nodes of the swarm.

After executing the command, all nodes can be checked with:

docker ps

to see which service got redistributed to which node.

One idea is to run the rebalancing command using crontab in order to periodically rebalance the swarm.
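
For instance, the following line in /etc/crontab format (with an arbitrarily chosen schedule) would rebalance the swarm daily at 4am when placed on a manager node:

0 4 * * *   root    docker service ls -q | xargs -n1 docker service update --detach=false --force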

Run a Shell in Container within a Swarm

Typically, to open a console, the user would write:

docker exec -it CONTAINER bash

where CONTAINER is the name or ID of a running container.

However, given that containers are distributed in a swarm, one should first locate on which node the container is running by issuing:

docker service ps SERVICE

where SERVICE is the name of the service deployed within the swarm.

The output will display, in one of the columns, the node that the container is currently executing on. Knowing the node, a shell on that node has to be accessed and then the command:

docker ps 

can be used to retrieve the container ID (first column).

Finally, the console can be started within the distributed container by issuing:

docker exec -it CONTAINER_ID sh

where CONTAINER_ID is the container ID retrieved at the previous step (the first column of the docker ps output).
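
The steps can be combined into a small sketch, assuming that the swarm node names are reachable over SSH and that the container name contains the service name (general_mosquitto is just an example):

#!/usr/bin/env bash
SERVICE=general_mosquitto

# find the node currently running the service
NODE=$(docker service ps "$SERVICE" \
    --filter "desired-state=running" \
    --format '{{.Node}}' | head -n 1)

# open a shell within the matching container on that node
ssh -t "$NODE" \
    "docker exec -it \$(docker ps --filter name=$SERVICE -q | head -n 1) sh"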

Pushing to Private Registry

The syntax is as follows:

docker login <REGISTRY_HOST>:<REGISTRY_PORT>
docker tag <IMAGE_ID> <REGISTRY_HOST>:<REGISTRY_PORT>/<APPNAME>:<APPVERSION>
docker push <REGISTRY_HOST>:<REGISTRY_PORT>/<APPNAME>:<APPVERSION>
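
For illustration, with a hypothetical registry at registry.example.com:5000, an image ID of 1e3c1f8dd6d2 and an application called myapp at version 1.0, the sequence becomes:

docker login registry.example.com:5000
docker tag 1e3c1f8dd6d2 registry.example.com:5000/myapp:1.0
docker push registry.example.com:5000/myapp:1.0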

Restart Docker if Worker cannot Find Swarm Manager

If a worker cannot find the swarm manager when it starts up, at the current time of writing, Docker simply terminates. This is problematic because the manager might come online after a while, such that the workers should just wait and then connect.

On some Linux distributions, such as Debian, Docker is started via a service file located at /lib/systemd/system/docker.service and it can be copied to /etc/systemd/system with some modifications in order to make SystemD restart Docker if it terminates.

On Debian, the service file is missing the RestartSec configuration line, such that it should be added to /etc/systemd/system/docker.service after being copied. Here is the full service file with the added line:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target docker.socket firewalld.service containerd.service
Wants=network-online.target containerd.service
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
EnvironmentFile=-/etc/default/docker
ExecStart=/usr/sbin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock $DOCKER_OPTS
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
RestartSec=10s

[Install]
WantedBy=multi-user.target

With this change, SystemD will try to bring Docker up 10s after it has failed. Unfortunately, this fix has to be applied on all nodes in a Docker swarm.
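
For reference, applying the change on a node boils down to copying the unit file, adding the RestartSec line and reloading systemd:

cp /lib/systemd/system/docker.service /etc/systemd/system/docker.service
# edit /etc/systemd/system/docker.service and add RestartSec=10s
systemctl daemon-reload
systemctl restart docker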

Achieving an Even Distribution of Services Across the Swarm

Unfortunately, services do not spread evenly through the swarm, such that re-balancing is necessary. In fact, the strategy of distributing services across the swarm is surprisingly bad, with the manager node taking most of the services upon itself and very few being left over for the last node of the swarm. It seems Docker spreads services with a bucket-fill-like strategy where services only spill over to other nodes once the current node is deemed somehow full.

To work around the lack of a sensible strategy, here is one constructed command:

docker service ls | \
    awk '{ print $1 }' | \
    tail -n +2 | \
    xargs docker service ps --format "{{.Node}}" --filter "desired-state=running" | \
    awk ' { node[$0]++ } END { for (i in node) print node[i] } ' | \
    awk '{ x += $0; y += $0 ^ 2 } END { print int(sqrt( y/NR - (x/NR) ^ 2)) }' | \
    xargs -I{} test {} -gt 2 && docker service ls -q | xargs -n1 docker service update --detach=false --force

that performs the following operations in order:

- lists all the services within the swarm, extracting the service IDs (first column) and dropping the header row,
- for every service, prints the node on which the service is currently running,
- counts the number of running services per node,
- computes the (integer-truncated) standard deviation of the per-node service counts,
- checks whether the standard deviation exceeds $2$ and, if so, forces all services to be updated, thereby re-balancing the swarm.

In other words, the distribution strategy imposed on the cluster is to place an equal share of services on each available node.

Intuitively, the command can be placed in a cron script and, compared to just calling the swarm re-distribution command unconditionally, the script will have no effect when the services are distributed evenly across the nodes because the standard deviation falls well under $2$ (with $0$ being the theoretical value of the standard deviation when the services are evenly spread out).

Building using The Distributed Compiler

Some packages have to be compiled manually, such that it is beneficial to use a distributed compiler in order to spread the compilation workload across multiple computers. However, the setup should be flexible enough to cover the edge case when a distributed compiler is not available.

To that end, here is a Dockerfile that is meant to define some variables such that "distcc" will be used to distribute the compilation across a range of computers:

FROM debian:latest AS builder

# define compilation variables
ARG DISTCC_HOSTS=""
ARG CC=gcc
ARG CXX=g++

# install required packages
RUN apt-get --assume-yes update && apt-get --assume-yes upgrade && \
    apt-get --assume-yes install \
        build-essential \
        gcc \
        g++ \
        automake \
        distcc

# ... compile ...
RUN DISTCC_HOSTS="${DISTCC_HOSTS}" CC=${CC} CXX=${CXX} make

and the invocation will be as follows:

docker build \
    -t TAG \
    --build-arg DISTCC_HOSTS="a:35001 b:35002" \
    --build-arg CC=distcc \
    --build-arg CXX=distcc \
    .

where TAG is the tag to assign to the built image, DISTCC_HOSTS is the space-separated list of distcc servers together with their ports and CC and CXX override the compilers (in this case, with distcc).

If you would like a ready-made container for distcc, you can use the Wizardry and Steamworks build.

Opening up a Port Across Multiple Replicas

Even though multiple replicas of a container can exist, even on the same system or spread out through a swarm, due to the nature of TCP/IP a single port can only be bound at any one time by a single process, such that when starting a series of clones of a program there must exist a way to specify a port range or a series of ports for the instances of the program being launched.

The syntax is as follows:

START_PORT-END_PORT:CONTAINER_PORT

where START_PORT-END_PORT is the range of ports to be published on the host and CONTAINER_PORT is the port that the program listens on inside the container.

Interestingly, this feature does not work as expected: whilst the ports will be used for all replicas of the service, all the ports within the range will be published by all nodes within the swarm, such that accessing one port within the port range successively will lead to a service on a different node within the Docker swarm. If stickiness is desired, the current solution at the time of writing is to either use jwilder/nginx-proxy or to just declare multiple services of the same image with the constraints set appropriately for each node in the swarm.

Restarting Containers on a Schedule

Depending on the application, in some rare cases containers must be restarted periodically. For example, the invidious documentation states that invidious should be restarted at least once per day or invidious will stop working. There are multiple ways to accomplish that, such as using the system scheduler (cron on Linux), but the most compact seems to be to use the docker:cli image and trigger a restart of the service. For example, the following additional service can be added to the invidious compose file in order to restart invidious at 8pm:

  invidious-restarter:
    image: docker:cli
    restart: unless-stopped
    volumes: ["/var/run/docker.sock:/var/run/docker.sock"]
    entrypoint: ["/bin/sh","-c"]
    command:
      - |
        while true; do
          if [ "$$(date +'%H:%M')" = '20:00' ]; then
            docker restart invidious
          fi
          sleep 60
        done

When running under a swarm, it gets a little more complicated due to services only being controllable from manager nodes, such that the supplementary service has to be deployed only on manager nodes in order to restart the service. Here is the modified snippet:

  invidious-restarter:
    image: docker:cli
    restart: unless-stopped
    volumes: ["/var/run/docker.sock:/var/run/docker.sock"]
    entrypoint: ["/bin/sh","-c"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager
    command:
      - |
        while true; do
          if [ "$$(date +'%H:%M')" = '20:00' ]; then
            docker service ls --filter name=general_invidious --format "{{.ID}}" | \
                head -n 1 | \
                xargs -I{} docker service update --force --with-registry-auth "{}"
          fi
          sleep 60
        done

that will make sure that the general_invidious service will be restarted every day at 8pm.

Docker Swarm - Server Contingencies

The following sections describe some useful changes that mitigate various issues with running a Docker swarm.

Dumping Running Container Process Identifier to Files

The following script was written in order to query the currently running containers on a machine running Docker and then create a directory and write, within that directory, PID files containing the PIDs of the services being run within the Docker containers.

The script was used for monitoring services on multiple machines in a Docker swarm where it was found necessary to retrieve the PID of the services within a Docker container without breaking container isolation.

#!/usr/bin/env bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2024 - License: MIT            ##
###########################################################################
 
# path to the swarm state directory where PID files will be stored
STATE_DIRECTORY=/run/swarm
 
if [ ! -d $STATE_DIRECTORY ]; then
    mkdir -p $STATE_DIRECTORY
fi
 
DOCKER_SWARM_SERVICES=$(docker container ls --format "{{.ID}}" | \
    xargs docker inspect -f '{{.State.Pid}} {{(index .Config.Labels "com.docker.stack.namespace")}} {{(index .Config.Labels "com.docker.swarm.service.name")}}')
while IFS= read -r LINE; do
    read -r PID NAMESPACE FULLNAME <<< "$LINE"
    IFS='_' read -r NAMESPACE NAME <<< "$FULLNAME"
    PIDFILE="$STATE_DIRECTORY/$NAME.pid"
    if [ ! -f "$PIDFILE" ]; then
        echo $PID >"$PIDFILE"
        continue
    fi
    test $(cat "$PIDFILE") -eq $PID || \
        echo $PID >"$PIDFILE"
done <<< "$DOCKER_SWARM_SERVICES"

Searching Logs on the Command Line

It seems that the Docker logs command will print out the logs on stderr, such that piping the output to grep or other tools will not work properly. In order to make piping work, stderr has to be redirected to stdout and then piped to whatever tool needs to be used:

docker service logs --follow general_mosquitto 2>&1 | grep PING

Repositories not Signed in Docker Container

Sometimes the reason behind errors claiming that repositories are not signed during a Docker build is a lack of space on the hard drive. The errors are along the lines of:

#7 0.692 Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
#7 0.771 Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
#7 0.814 Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
#7 0.869 Err:1 http://deb.debian.org/debian bookworm InRelease
#7 0.869   At least one invalid signature was encountered.
#7 0.954 Err:2 http://deb.debian.org/debian bookworm-updates InRelease
#7 0.954   At least one invalid signature was encountered.
#7 1.066 Err:3 http://deb.debian.org/debian-security bookworm-security InRelease
#7 1.066   At least one invalid signature was encountered.
#7 1.101 Reading package lists...
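
Under the assumption that disk space is indeed the culprit, the free space on the partition holding the Docker data root (/var/lib/docker by default) can be checked and unused Docker data pruned:

# check the free space on the partition holding the Docker data root
df -h /var/lib/docker

# reclaim space by deleting stopped containers, unused images,
# unused networks and the build cache (asks for confirmation)
docker system prune --all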

Healthcheck within Docker Compose file vs. Healthcheck within Dockerfile

Both Docker compose files and Dockerfiles allow the creation of health checks; in both cases the check command itself is executed inside the container, the difference being that a health check within a Dockerfile is baked into the image when the image is built, whilst a health check placed within a compose file is part of the deployment and overrides whatever health check the image defines.

If possible, it is always preferable to create health checks within a Dockerfile when building a container image, mainly because this represents a separation of concerns and also respects the containerization principle of software running within Docker.
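
As a minimal sketch, a health check baked into the image at build time could look as follows within a Dockerfile, where the port and endpoint are hypothetical and curl is assumed to be installed in the image:

HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD curl --fail http://localhost:8080/ || exit 1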

Getting Docker UTF-8 Support on Debian

Some software requires the console to be set to UTF-8, in particular software that deals with the Linux command line, such as Jenkins. The debian and debian-slim images are configured with a POSIX locale by default, such that the locale has to be changed to UTF-8 during the build process of the image.

The following snippet should be inserted into a Dockerfile that inherits from debian or debian-slim images in order to set the locale to UTF-8:

# UTF-8 support
RUN apt-get --assume-yes update && \
    apt-get --assume-yes install coreutils locales && \
    echo "en_US.UTF-8 UTF-8" | tee -a /etc/locale.gen && \
    locale-gen

# set environment variables
ENV LC_ALL=en_US.UTF-8
ENV LANG=en_US.UTF-8
ENV LANGUAGE=en_US.UTF-8
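
Assuming IMAGE is the tag of the freshly built image, the resulting locale can be verified by running the locale command within a throwaway container:

docker run --rm IMAGE locale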

Docker Resource Consumption Accounting using Linux Control Groups

Docker implements special support for c-groups (Linux control groups) in order to allow controlling the resource usage of Docker itself. In order to enable c-groups, edit or create /etc/docker/daemon.json and add the following contents:

{
    "exec-opts": ["native.cgroupdriver=systemd"],
    "cgroup-parent": "docker_limits.slice"
}

The configuration will make Docker use the systemd c-group driver and place all containers under the docker_limits.slice control group, such that resource limits can be enforced collectively for everything running under Docker.

In turn, the docker_limits.slice unit file is placed at /etc/systemd/system/docker_limits.slice and contains the following:

[Unit]
Description=Slice that limits Docker resources
Before=slices.target

[Slice]
CPUAccounting=true
CPUQuota=90%
MemoryAccounting=true
MemoryHigh=2G
MemoryMax=2.5G

that enables both CPU and RAM accounting, sets the maximum CPU usage to $90\%$, a soft memory limit of $2GiB$ (MemoryHigh) and the maximum memory consumption to $2.5GiB$ (MemoryMax).

Lastly, in order to check the RAM usage of Docker, the systemd-cgtop tool can be used that displays the resource consumption for c-groups.
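
After creating both files, the configuration can be applied by reloading systemd and restarting Docker (note that restarting Docker will restart the running containers), after which systemd-cgtop should show the slice:

systemctl daemon-reload
systemctl restart docker
systemd-cgtop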

Docker Services "Just Not Starting" in a Docker Swarm

Docker on its own performs no accounting of the resources consumed by services running within a Docker swarm and the only distribution strategy for services is an equal "spread" of services. Depending on which node is up and at what time, the distribution of services does not even end up equal across all nodes, such that fairer end-user service distribution solutions make sense in order to keep a balance of services across a set of nodes.

However, even with equally distributed services, Docker does not and cannot know what amount of CPU or RAM a service might require at runtime, such that a runtime solution that shifts services around the swarm would make more sense. One way to check the CPU consumption is to enumerate all the services and compute the total CPU usage that they collectively generate and then repeat the same procedure for RAM and/or other resources that the services might consume.

Without accounting for resource consumption, it often happens that the Docker managers of a swarm place services on the same node within the swarm, such that the node ends up overloaded and without the ability to answer requests. This section explores possibilities to mitigate such denial-of-service issues that stem from the inability to predict the amount of resource usage ahead of time, in order to ensure that services placed on a node do not end up slowing the node down due to their high resource consumption patterns.

Pinning

Similar to process pinning in multitasking systems, one obvious solution is to pin the heavy services to different nodes in order to ensure that they do not all run together. This works by changing the service constraints to pin the services to different nodes.

Here is a snippet from a Docker compose service:

    deploy:
      labels:
        - shepherd.enable=true
        - shepherd.auth.config=docker
      replicas: 1
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.hostname == docker2

where the node.hostname == docker2 constraint makes sure that the service will run on the node with the hostname docker2.

Although this is a fine solution, it will not work in terms of load-balancing and adaptability because, when the node docker2 becomes unavailable, the Docker managers would simply not know where to place the service. Furthermore, manually pinning services to nodes adds a level of locality that is unbecoming of a cluster - in other words, if all services are pinned, why even bother running a cluster instead of just running the software on the nodes directly?

By Specification

Fortunately, Docker does perform the minimal level of accounting necessary in order to be aware of how many resources the node has such that working by specification, which is the best option, is very much possible. Here is an example excerpt out of a Docker compose service:

    deploy:
      labels:
        - shepherd.enable=true
        - shepherd.auth.config=docker
      replicas: 1
      placement:
        max_replicas_per_node: 1
#        constraints:
#          - node.hostname == docker2
      resources:
        reservations:
          cpus: '1'
          memory: 1G

Now, instead of pinning the service to the node with the hostname docker2, the service is defined (or specified) to require a full core (cpus: '1') and $1GiB$ of RAM. When the service is deployed to the swarm, each node that is a potential candidate for deployment will cross-check the requirements against the resources available and, if the required amount of CPU and RAM cannot be met, the node will reject the service. This process carries on until either a node accepts the service or the service enters a pending state, which can be observed with docker service ps and hints that no node matching the deployment requirements is available.

It is not even required to provide a specification for all services; adding the requirements for the services that seem to generate heavy load should be sufficient.
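
As a quick check, a service that cannot be placed anywhere can be inspected as follows, where SERVICE is the name of the service; the error column of the output typically hints at the unsatisfied requirements:

docker service ps --no-trunc SERVICE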

Automatic Configuration Reload for Docker Software

Typically, *nix daemons are not meant to restart or reload themselves, especially as a consequence of a changed configuration, which means that software running within a Docker container will require the container to be restarted in order for the daemon to reload its configuration. It is, however, possible to implement a generic solution that should work across the board for any sort of software running within a container, based on filesystem primitives such as inotify.

The script is fairly simple and consists of just one command watching a directory and then raising an alarm when files are changed within that directory:

#!/usr/bin/env bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2024 - License: MIT            ##
###########################################################################
# This script can be used to make a daemon reload its configuration       #
# whenever a change occurs within a defined directory, presumably the     #
# same directory where the configuration is stored in the first place.    #
#                                                                         #
# The script requires the "inotify-tools" package to be installed or      #
# whatever other package provides the "inotifywait" command line tool.    #
# Next, the script must be modified to make the necessary changes in the  #
# "CONFIGURATION" section where the path to the directory to be watched   #
# is specified and to also define a command that should be used to reload #
# the daemon. Note that whatever the command contains, must also be       #
# installed for the script to work.                                       #
#                                                                         #
# The script has to be run permanently for the entire duration that the   #
# process that it is monitoring is running. This can be accomplished by   #
# starting the script using "supervisord" or any other tool that can run  #
# daemons, including bash scripts.                                        #
###########################################################################
 
###########################################################################
#                             CONFIGURATION                               #
###########################################################################
 
MONITOR_DIRECTORY=/data
# single quotes ensure that the command is evaluated when the alarm fires
# rather than when this variable is assigned
RELOAD_COMMAND='kill -s HUP $(pidof freeradius)'
 
###########################################################################
#                               INTERNALS                                 #
###########################################################################
 
# alarm(2)
function alarm {
    sleep $1
    eval $RELOAD_COMMAND
}
 
ALARM_PID=0
trap '{ test $ALARM_PID = 0 || kill -9 $ALARM_PID; }' KILL QUIT TERM EXIT INT HUP
 
# process substitution is used instead of a pipe such that the loop does
# not run in a subshell and ALARM_PID stays visible to the exit trap
while IFS=$'\n' read -r LINE; do
    if [ -d /proc/"$ALARM_PID" ]; then
        kill -9 $ALARM_PID &>/dev/null || true
    fi
    alarm "5" &
    ALARM_PID=$!
done < <(inotifywait -q -m "$MONITOR_DIRECTORY" -r \
    -e "modify" -e "create" -e "delete")

When the alarm fires, the script executes a user-defined command that is supposed to make the daemon reload its configuration. In this example the command is kill -s HUP $(pidof freeradius) and is meant to signal FreeRADIUS to reload its configuration by delivering a HUP signal. Both the directory to be watched and the reload command can be modified and adjusted to match whatever other daemon must be monitored for configuration changes.

Enumerate Services that Have not Been Replicated Completely Within a Docker Swarm

The following script can be used in order to list the services in a Docker swarm that have not fully replicated across the swarm. The script will output just the names of the services that have not been fully replicated. In order to use the script, save the text to a file and make the file executable.

#!/usr/bin/env bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2024 - License: MIT            ##
###########################################################################
# This script is meant to enumerate Docker swarm service names that have  #
# not yet replicated across the swarm. The script compares the number of  #
# replicas that have been distributed across the swarm with the number of #
# total expected replicas and prints the service name in case there is a  #
# mismatch between the two.                                               #
###########################################################################
 
for DATA in \
    `docker service ls --format="{{.Name}},{{.Replicas}}" | \
         perl -pe 's/\(.+?\)//g'`; do
    NAME=$(printf $DATA | awk -F',' '{ print $1 }')
    RATIO=$(printf $DATA | awk -F',' '{ print $2 }')
 
    A=$(printf $RATIO | awk -F'/' '{ print $1 }')
    B=$(printf $RATIO | awk -F'/' '{ print $2 }')
 
    # If the number of replicas is equal to the number of expected
    # replicas then assume that the service has been already properly
    # distributed across the swarm.
    if [ "$A" = "$B" ]; then
        continue
    fi
 
    echo "$NAME"
done
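
As a usage sketch, assuming the script was saved under the hypothetical name swarm-incomplete-services.sh, its output can be fed back into docker service update in order to force the incompletely replicated services to be rescheduled (the -r flag to GNU xargs avoids running the command when the output is empty):

./swarm-incomplete-services.sh | \
    xargs -r -n1 docker service update --detach=false --force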

Computing the Total Amount of CPU Usage For All Running Containers

The following command lists the containers running on a Docker node and sums up the CPU usage across all of them.

docker stats --no-stream | tail -n +2 | awk '{ s+=$3 } END { printf "%.0f\n", s }'

Ensuring the Uniform Usage of CPU Resources in a Docker Swarm

One of the problems with Docker swarm orchestration is that Docker does not perform any real-time accounting of resource consumption for the services that run in the Docker swarm. Hence, once services are distributed across the swarm, it might just happen that some services start to consume resources on a Docker node unevenly compared to other nodes in the swarm. For that purpose, it makes sense to devise a script that could handle the distribution of services in order to ensure that all nodes in a Docker swarm are used evenly.

One-Shot

The following script can be run every minute on every node of a Docker swarm in order to terminate containers when the sum total of the CPU usage of all services on the node exceeds a defined threshold. That is, in case the total CPU usage of all Docker services on a node exceeds MAXIMUM_CPU_USAGE, the script will pick the most CPU-consuming container and terminate that container.

#!/usr/bin/env bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2024 - License: MIT            ##
###########################################################################
# This script is meant to run on every node of a Docker swarm in order to #
# terminate swarm services that consume too much CPU over a certain limit #
# that can be specified by modifying the "MAXIMUM_CPU_USAGE" variable.    #
#                                                                         #
# Ideally, this script will run with crontab every minute in order to     #
# check whether Docker consumes more CPU than the specified limit. On     #
# Debian systems this can be accomplished by creating /etc/cron.minutely  #
# and then referencing the directory from /etc/crontab using the line:    #
#                                                                         #
# * *     * * *   root    cd / && run-parts --report /etc/cron.minutely   #
#                                                                         #
# Note that after the line is added, cron must be made to reload the      #
# crontab, either by restarting cron or by delivering a HUP signal.       #
###########################################################################
 
MAXIMUM_CPU_USAGE=70
 
[ $(docker stats --no-stream | tail -n +2 | awk '{s+=$3} END {printf "%.0f\n", s}') -gt $MAXIMUM_CPU_USAGE ] && \
    docker stats --no-stream | tail -n +2 | awk '{ print $1,$3 }' | sed 's/%//g' | sort -k2,2n | tail -n 1 | awk '{ print $1 }' | xargs docker stop

It is assumed that, after the container is terminated, the node manager will redistribute the service in the swarm and, hopefully, the service will end up on a node with a smaller workload.

This method does not require knowledge of other nodes.

Sampling

This variation is somewhat better because, if the one-shot script just happens to be scheduled concomitantly with the start of a CPU-intensive service, the one-shot script will consider that resources are being misused and will more than likely stop the CPU-intensive service before it boots up completely. Naturally, it is assumed that after a service has started completely the software will stop being CPU-intensive and, as is typical of most software, will end up in an idle state.

With that being said, the following script follows the one-shot termination script from the previous section but additionally samples the CPU usage of all services over time in order to make sure that the CPU threshold has been exceeded over a period of time before the script starts terminating containers.

#!/usr/bin/env bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2024 - License: MIT            ##
###########################################################################
# This script is meant to measure and detect CPU overload on a Docker     #
# swarm node and then begin to stop containers with the hopes that the    #
# containers will be restarted and distributed to different nodes in the  #
# Docker swarm.                                                           #
#                                                                         #
# This script should be called periodically at a time interval t and each #
# time the script is called a total service CPU measurement m is taken    #
# and deposited into a file under /dev/shm/ (to not burn flash memory).   #
# That being said, the script will begin stopping services iff.:          #
#                                                                         #
#   ( m(t_{1}) + ... + m(t_{SAMPLE_TICKS}) ) / SAMPLE_TICKS > THRESHOLD   #
#                                                                         #
# where both SAMPLE_TICKS and THRESHOLD can be changed by the user.       #
#                                                                         #
# It is up to the user to find a way to run this script periodically, and #
# one possibility is to run this script every minute using cron. Note     #
# that the script is meant to collect SAMPLE_TICKS samples, such that     #
# running this script infrequently (say, hourly) will make this script    #
# have little effect in sampling the actual CPU consumption of services.  #
###########################################################################
 
###########################################################################
#                             CONFIGURATION                               #
###########################################################################
 
# the sum total CPU consumption that all services on a node must exceed in
# order to begin stopping containers and hopefully migrating services
THRESHOLD=70
 
# the amount of samples to collect over time
SAMPLE_TICKS=5
 
###########################################################################
#                               INTERNALS                                 #
###########################################################################
 
# Acquire a lock.
LOCK_FILE='/var/lock/docker-ooc-killer'
if mkdir $LOCK_FILE >/dev/null 2>&1; then
    trap '{ rm -rf $LOCK_FILE; }' KILL QUIT TERM EXIT INT HUP
else
    exit 0
fi
 
if [[ ! -f /dev/shm/docker-ooc-killer ]] || \
   [[ $(wc -l /dev/shm/docker-ooc-killer | awk '{ print $1 }') -lt $SAMPLE_TICKS ]]; then
   docker stats --no-stream | \
       tail -n +2 | \
       awk '{ s+=$3 } END { printf "%.0f\n", s }' >> /dev/shm/docker-ooc-killer
   exit 0
fi
 
# compute the total average usage across the last sample minutes
AVERAGE_USAGE=$(($(cat /dev/shm/docker-ooc-killer | awk '{ s+=$1 } END { printf "%.0f\n", s }')/$SAMPLE_TICKS))
if [ $AVERAGE_USAGE -gt $THRESHOLD ]; then
    # terminate the most CPU-intensive process if CPU threshold is exceeded
    #echo "Threshold exceeded. Terminating processes."
    docker stats --no-stream | \
        tail -n +2 | \
        awk '{ print $1, $3 }' | \
        sed 's/%//g' | \
        sort -k2,2n | \
        tail -n 1 | \
        awk '{ print $1 }' | \
        xargs docker stop
fi
 
rm -rf /dev/shm/docker-ooc-killer

This method does not require knowledge of other nodes.