One of the advantages of Docker swarm is that a service can be distributed across a cluster of machines (a swarm, in Docker terminology) such that every machine runs a copy of the same application. Given this architecture, the best-paired protocol is the Hypertext Transfer Protocol (HTTP) due to it being stateless: connections do not have to be maintained persistently, which makes it easy to devise some shared storage with a low impact on problems of concurrency.
In order to achieve high availability and load balancing, the following guide shows how to build what would be called, in web-development terms, a L.A.M.P. stack in a container, together with a load balancer (the choice here being between major players such as HAProxy and varnish, with varnish chosen), as well as the setup of and interaction with Docker swarm.
There are several expected results, namely: high availability, with the web service surviving the loss of individual machines, and load balancing, with requests distributed across all the machines in the swarm.
The composition of a LAMP container depends greatly on the needs of the web services that have to be served, such that the final container will vary wildly from application to application. More than likely, if this guide is being followed, an architecture already exists and the reader wishes to switch to a containerized architecture, in which case the following tips should be very helpful in assisting the migration:
  * the /etc/apache2 directory should be backed up, preferably using something like the tape archiver (tar) in order to preserve symlinks and perhaps even POSIX ACLs,
  * apachectl -M should be run in order to create a list of the Apache modules currently in use, especially the crucial ones needed to serve the application, because the very same modules will have to be included within the container and enabled at build time.
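As a minimal sketch of the two steps above (the archive and file names are illustrative):

# back up the Apache configuration whilst preserving symlinks and POSIX ACLs
tar --acls -cpvf /root/apache2-config.tar -C /etc apache2
# record the list of currently loaded Apache modules for later reference
apachectl -M | tee /root/apache2-modules.txt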
As said, the requirements will differ, but here is a skeleton Dockerfile for a LAMP stack that installs and configures Apache2 and PHP-FPM and supervises both via supervisord:
FROM debian:stable-slim

# update package manager
RUN apt-get update -y && \
    apt-get upgrade -y && \
    apt-get dist-upgrade -y && \
    apt-get -y autoremove && \
    apt-get clean

# install apache
RUN apt-get install -y \
    apache2 \
    apache2-bin \
    apache2-data \
    apache2-dev \
    apache2-utils \
    libmaxminddb0 \
    libmaxminddb-dev \
    libapache2-mod-authnz-external \
    libapache2-mod-authnz-pam \
    libapache2-mod-php \
    libapache2-mod-svn \
    libapache2-mod-xsendfile \
    lynx

# install PHP modules
RUN apt-get install -y \
    php-bcmath \
    ... \
    php-sqlite3

# enable apache modules
RUN a2enmod \
    authz_host \
    ... \
    dav_lock || true

# enable PHP modules
RUN phpenmod \
    bcmath \
    ... \
    zlib

# install supervisor to track processes, along with build and fetch tools
RUN apt-get install -y \
    supervisor \
    subversion \
    git \
    curl \
    build-essential \
    texlive-full

# install maxmind
WORKDIR /tmp
RUN cd /tmp && \
    curl -LOs --output-dir /tmp https://github.com/maxmind/mod_maxminddb/releases/download/1.2.0/mod_maxminddb-1.2.0.tar.gz && \
    tar -xpvf /tmp/mod_maxminddb-1.2.0.tar.gz && \
    cd /tmp/mod_maxminddb-1.2.0 && \
    ./configure && \
    make install && \
    rm -rf /tmp/mod_maxminddb* && \
    a2enmod maxminddb

# install node-js
RUN curl -fsSL https://deb.nodesource.com/setup_current.x | bash - && \
    apt-get install -y nodejs && \
    node --version && npm --version

# install matrix-cli
ADD matrix-cli /opt/matrix-cli
RUN cd /opt/matrix-cli && \
    npm install && \
    npm link && \
    matrix-cli -h

# cleanup
RUN apt-get purge -y \
    subversion \
    git \
    curl \
    build-essential \
    apache2-dev \
    libmaxminddb-dev && \
    apt-get autoremove -y

# add filesystem requirements
ADD rootfs /

# open ports
EXPOSE 80

# start supervisord to monitor processes
ENTRYPOINT [ "supervisord", "-t", "-n", "-c", "/etc/supervisor.conf" ]
As can be observed, the Docker build file first installs Apache and then PHP, after which it proceeds to enable the required Apache and PHP modules. Finally, for the particular requirements of this deployment, the build file installs: supervisord to track processes, the MaxMind Apache module (mod_maxminddb), Node.js and the local matrix-cli package.
The container then uses supervisord as its entry point, referencing the configuration file supervisor.conf that starts Apache and PHP-FPM, as follows:
[supervisord]
user=root
nodaemon=true
pidfile=/run/supervisord.pid
logfile=/dev/stdout
logfile_maxbytes=0

[program:php-fpm]
command=/usr/sbin/php-fpm8.2 --nodaemonize --fpm-config /etc/php/8.2/fpm/php-fpm.conf
stdout_logfile=/dev/stdout
stderr_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile_maxbytes=0

[program:apache2]
command=/usr/sbin/apachectl -D FOREGROUND
environment=
    APACHE_RUN_USER=%(ENV_APACHE_RUN_USER)s,
    APACHE_RUN_GROUP=%(ENV_APACHE_RUN_GROUP)s
stdout_logfile=/dev/stdout
stderr_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile_maxbytes=0
where:

  * the stdout and stderr lines ensure that supervisord forwards any output produced by Apache or PHP to standard output, which is useful because doing so interacts nicely with Docker, and specifically with the docker logs subcommand that can be used to read the container log output,
  * nodaemon=true is set due to the ENTRYPOINT command running supervisord directly and because making the services log to standard output is desired.

The image is built by issuing:
docker build -t docker:5000/lamp:latest .
where:

  * docker:5000 is the hostname and port of the registry to publish the image to after the build,
  * lamp is the name of the image,
  * latest is the Docker version tag, here the standard latest tag depicting the latest image of the software being built.
After the image builds, the following command is issued to push the docker:5000/lamp:latest image to the Docker registry at docker:5000:
docker push docker:5000/lamp:latest
Finally, the image can be run as a container; in this instance a Docker compose file will be used instead of running the container on the command line, both due to its flexibility and because compose is the canonical way of deploying a service to the swarm:
version: '3.9'
services:
  lamp:
    image: docker:5000/lamp:latest
    ports:
      - 6001-6005:80
      - 7001-7005:5005
    volumes:
      - /mnt/data/sites:/var/www
      - /mnt/lamp/httpd/sites-enabled:/etc/apache2/sites-enabled
      - /mnt/lamp/httpd/sites-available:/etc/apache2/sites-available
      - /mnt/lamp/httpd/conf-enabled:/etc/apache2/conf-enabled
      - /mnt/lamp/httpd/conf-available:/etc/apache2/conf-available
      - /mnt/lamp/httpd/logs:/var/log/apache2
    environment:
      - APACHE_RUN_USER=www-data
      - APACHE_RUN_GROUP=www-data
      - APACHE_LOG_DIR=/dev/stdout
      - APACHE_LOCK_DIR=/var/lock/apache2
      - APACHE_PID_FILE=/run/apache2/apache2.pid
    deploy:
      replicas: 5
      placement:
        max_replicas_per_node: 1
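Assuming the compose file above is saved as docker-compose.yml, the service can then be deployed to the swarm along the lines of (the stack name lamp is illustrative):

# deploy the compose file as a stack onto the swarm
docker stack deploy --compose-file docker-compose.yml lamp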
There are a few things here to notice:
  * /mnt/data/sites is the filesystem location on the host where the website files are stored; it is mapped into the container at the standard Debian location for websites, namely /var/www,
  * /mnt/lamp/httpd/ is essentially the semantic equivalent of the standard Debian /etc/apache2 directory, and sub-directories such as sites-enabled and conf-available should seem familiar. At this point, the /etc/apache2 archive created with the tape archiver is extracted and the directory structure is recreated in /mnt/lamp/httpd. Note that the Debian way of doing things is that folders ending in *-enabled contain symlinks to files inside the folders ending in *-available, such that it is important that the symlinks are relative and not absolute. Otherwise, in case the symlinks are absolute, the symlinks inside the guest will contain paths that might not be valid (in this case, the directory structure is mapped 1-to-1, such that it should not be a problem, but had the configuration been moved, any absolute symlinks would have had to be recreated, as shown in the sketch after this list),
  * the environment section carries over the variables that would normally be defined in the /etc/apache2/envvars file (which is also merged with /etc/defaults/apache2), such that neither file has to be, even symbolically, carried over into the container, thereby not exposing envvars and the default configuration file,
  * the deploy sub-section makes all the nodes in the swarm (5 nodes on 5 computers) run an instance of this LAMP container via replicas: 5, while also ensuring that every single node runs exactly one instance of the LAMP stack via max_replicas_per_node: 1,
  * the ports section maps host ports 6001 to 6005 to the container TCP port 80. The main comment here is that, in order to keep things portable, the LAMP container will not even bother serving SSL requests; instead, an SSL terminator such as hitch or caddy will be used to decapsulate and then forward the plaintext request to (varnish and then) this LAMP container. Of course, this depends on the application, with hitch being more geared towards serving websites and caddy being geared towards being a reverse-proxy designed to scale up to many backend services.
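As a minimal sketch of recreating an absolute symlink as a relative one (the site name example.conf is illustrative):

# replace an absolute symlink inside sites-enabled with a relative one
cd /mnt/lamp/httpd/sites-enabled
ln -sfn ../sites-available/example.conf example.conf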
In this instance, caddy has been used as the SSL terminator of choice, because the existing infrastructure did not warrant introducing yet another piece of technology into the architecture. caddy is a reverse-proxy written in Go that is actively maintained and, out of convenience, integrates modules that solve the issue of having to run certificate renewal mechanisms such as Let's Encrypt individually, as well as integrating seamlessly with some popular DNS registrars within the same package via addons.
As an example, here is a caddy configuration:
*.site.tld, site.tld {
    tls {
        dns cloudflare aaaaaaaaaaaaaaaaaaaaaaaaaaa
        resolvers 8.8.8.8
        propagation_timeout 5m
    }
    handle {
        reverse_proxy docker:6000 {
            header_up X-Forwarded-For {http.request.header.CF-Connecting-IP}
        }
    }
}
This configuration, albeit small, accomplishes the following:

  * site.tld, as well as any sub-domain of site.tld, will automatically have SSL certificates generated via ZeroSSL and/or Let's Encrypt (whichever responds first) by DNS verification using the Cloudflare DNS service,
  * requests are reverse-proxied to the server at the IP address that the docker hostname resolves to, on port 6000 (this is where varnish will be listening), and the X-Forwarded-For HTTP header is populated with the real IP address as reported by the Cloudflare proxy via their internal CF-Connecting-IP header (this is necessary for tracking visitors and establishing access controls).

Maybe surprisingly, this is all that is needed from caddy for the purposes listed above.
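Assuming the configuration is saved at the default /etc/caddy/Caddyfile location, it can be checked and then run as a quick sanity test:

# validate the Caddyfile without starting the server
caddy validate --config /etc/caddy/Caddyfile
# run caddy in the foreground with the given configuration
caddy run --config /etc/caddy/Caddyfile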
Varnish Cache, or varnish for short, is usually remembered as an HTTP cache and not really a load balancer like Squid or HAProxy, but for this purpose the caching is not needed at all and only the load-balancing facilities are used to distribute requests across the cluster of LAMP nodes created above.
Either way, short of starting a holy war about which balancer is better, experienced people will notice that the software does not matter but rather the underlying concept and architecture that one wishes to implement. In this case, one of the requirements is that whatever does the load balancing should emit probes to the backend servers in order to determine which backends are ready to serve requests, which is required for high availability. As to the load balancing itself, there are a few algorithms that are pretty standard, some of which might be interesting to this application in particular. Here are a few mentions:

  * round-robin, which cycles through the backends in order,
  * random, which picks a backend at random, optionally weighted,
  * least-connections, which picks the backend with the fewest active connections,
  * hash and shard, which map some property of the request, typically the URL, onto a backend via a hash, such that the same URL consistently lands on the same backend.
For this usage scenario, the last two algorithms behave the same, and it was chosen to hash the URL of each request and split the traffic between the different backends based on the hash. Here is a configuration file for varnish that does just that:
vcl 4.1;

import directors;

backend docker1 { .host = "docker1"; .port = "6001"; }
backend docker2 { .host = "docker2"; .port = "6002"; }
backend docker3 { .host = "docker3"; .port = "6003"; }
backend docker4 { .host = "docker4"; .port = "6004"; }
backend docker5 { .host = "docker5"; .port = "6005"; }

sub vcl_init {
    # create a hash director and add all LAMP backends with equal weight
    new vdir = directors.hash();
    vdir.add_backend(docker1, 1.0);
    vdir.add_backend(docker2, 1.0);
    vdir.add_backend(docker3, 1.0);
    vdir.add_backend(docker4, 1.0);
    vdir.add_backend(docker5, 1.0);
}

sub vcl_recv {
    # strip the volatile key query parameter and pick a backend by URL hash
    set req.backend_hint = vdir.backend(regsub(req.url, "&key=[A-Za-z0-9]+", ""));
}
The varnish configuration file configures five backends with hostnames docker1 through docker5, all of them corresponding to the LAMP Docker container distributed on the swarm, and all of them listening on TCP ports 6001 through 6005. The hash director (load-balancing algorithm) is used, and all backends are added to the load-balancing director in vcl_init when varnish starts up. When a request arrives and gets forwarded to varnish, it is processed inside the vcl_recv subroutine, where the request URL is first stripped of the volatile key query parameter (so that it does not perturb the hash), after which the resulting URL is hashed and a backend is chosen based on the hash.
The corresponding Docker compose file for running varnish in a Docker container should be pretty straightforward:
version: '3.9'
services:
  lamp-varnish:
    image: varnish:latest
    user: root:root
    ports:
      - 6000:80
    volumes:
      - /mnt/lamp/varnish/:/etc/varnish/
      - type: tmpfs
        target: /var/lib/varnish/varnishd:exec
    deploy:
      replicas: 1
      placement:
        max_replicas_per_node: 1
where /mnt/lamp/varnish/default.vcl should be the varnish configuration file mentioned previously. The container listens on host port 6000 and routes the traffic to varnish listening on port 80 inside the container, such that all HTTP traffic should be directed to the container on port 6000.
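As a quick check that requests are indeed being spread across the backends, varnish can be observed opening backend connections while requests are issued (the container name lamp-varnish and the URL paths are illustrative):

# watch which backend each request is handed to
docker exec -it lamp-varnish varnishlog -g request -i BackendOpen
# in another terminal, issue requests with different URL paths
curl -s http://docker:6000/page1 >/dev/null
curl -s http://docker:6000/page2 >/dev/null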
While all of the former is fine and dandy, nothing here factors in the state of a backend when selecting the backend to send a request to, except for the trivial case when a backend is completely unresponsive. It would be more interesting to find a stronger algorithm that helps even out the traffic across the cluster.
For a more comprehensive balancing algorithm that factors in the state of the various backends, the HAProxy software package can be used. HAProxy is a powerful load balancer with an extensive set of balancing algorithms; it can even act at layer 4 by load-balancing raw TCP connections, and it can query agents running on the backend servers for information that is then factored into the distribution decision between the backends.
The following schematic should provide an overview of what is to be accomplished:
Caddy is going to forward all requests to HAProxy which, in turn, will forward them to the LAMP containers behind it. The ports prefixed with a 6, that is 6001 to 6005 (not depicted), will be used to serve content, whereas the ports prefixed with a 7 will answer with a percentage indicating to HAProxy how busy the server is.
For the purpose at hand, the following is a sample configuration for HAProxy:
frontend www
    mode http
    bind :6000
    default_backend lamp

backend lamp
    mode http
    balance leastconn
    server docker1 docker1:6001 check weight 100 agent-check agent-port 5005 agent-inter 5s
    server docker2 docker2:6002 check weight 100 agent-check agent-port 5005 agent-inter 5s
    server docker3 docker3:6003 check weight 100 agent-check agent-port 5005 agent-inter 5s
    server docker4 docker4:6004 check weight 100 agent-check agent-port 5005 agent-inter 5s
    server docker5 docker5:6005 check weight 100 agent-check agent-port 5005 agent-inter 5s
that does the following:

  * binds to port 6000 and listens for connections to load-balance,
  * defines the backend servers docker1 through docker5, listening on their corresponding content ports 6001 through 6005, with agent checks configured on port 5005 (note that this corresponds to the Docker compose file for the LAMP container, which maps host ports 7001 through 7005 onto container port 5005).
The remainder of the configuration, with the servers configured with agent-check agent-port 5005 agent-inter 5s, makes HAProxy attempt to establish a TCP connection to the server hostname on port 5005 every 5s, where it expects a service to send back a percentage reflecting how free the server is (with 100% representing a server that is completely idle and 0% representing a server that is fully busy). Obviously, a program will have to be written that listens on port 5005 inside the LAMP container and sends the percentage whenever a TCP connection is established to that port. In order to accomplish that, the LAMP supervisord configuration file supervisor.conf is modified to include an extra program that will provide the CPU utilization to connecting clients. Here is the full file again with the modifications made:
[supervisord]
user=root
nodaemon=true
pidfile=/run/supervisord.pid
logfile=/dev/stdout
logfile_maxbytes=0

[program:php-fpm]
command=/usr/sbin/php-fpm8.2 --nodaemonize --fpm-config /etc/php/8.2/fpm/php-fpm.conf
stdout_logfile=/dev/stdout
stderr_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile_maxbytes=0

[program:apache2]
command=/usr/sbin/apachectl -D FOREGROUND
environment=
    APACHE_RUN_USER=%(ENV_APACHE_RUN_USER)s,
    APACHE_RUN_GROUP=%(ENV_APACHE_RUN_GROUP)s
stdout_logfile=/dev/stdout
stderr_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile_maxbytes=0

[program:usage]
command=/usr/bin/socat TCP4-LISTEN:5005,fork EXEC:"/usr/bin/bash -c /usr/local/bin/usage"
stdout_logfile=/dev/stdout
stderr_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile_maxbytes=0
with the new main attraction being the program usage
that runs the command:
/usr/bin/socat TCP4-LISTEN:5005,fork EXEC:"/usr/bin/bash -c /usr/local/bin/usage"
that will execute the /usr/local/bin/usage
program that returns the CPU utilization of the container:
#!/usr/bin/env bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2025 - License: MIT           ##
###########################################################################

# split command output on newlines
export IFS=$'\n'

# sum the per-process CPU usage, normalize by the number of cores and
# report the percentage of free CPU
PROCS=$(ps --no-headers -exo "%cpu:1") && \
    printf "%.0f%%\n" $(bc <<< "scale=(1.1);100 - ((100.0 * $(echo "$PROCS" | paste -s -d+ | bc) / $(echo $PROCS | wc -l)) / $(nproc --all))")
For kicks and giggles, after the service is deployed, the functionality can be tested using netcat:

nc IP PORT

where:

  * IP is the IP address of a machine that is running a copy of the LAMP container,
  * PORT is the port that Docker maps from the outside to the 5005 port that socat listens on.

Upon connecting, the reply should be a single line containing a percentage, matching the output format of the usage script.
Note that there are better ways to retrieve the CPU utilization on Linux, but this script is designed to sum the CPU utilization over all PIDs found within the container in order to determine the CPU utilization of the container itself: due to the containerization, a command such as uptime would return the load average of the host and not of the container. In other words, container separation is a thin layer of virtualization that does not separate the guest and the host completely, such that most commands on Linux that inter-operate with the kernel will just return the values of the machine running the container rather than values tailored specifically to the container (trivially, because the kernel is a shared resource between all containers and the host).
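For illustration, on a cgroup v2 host a container can instead read its own accumulated CPU time from the cgroup filesystem; a minimal sketch, assuming /sys/fs/cgroup/cpu.stat is available inside the container:

#!/usr/bin/env bash
# sample the container's cumulative CPU time (microseconds) twice, one
# second apart, and derive a utilization percentage from the difference
U1=$(awk '/^usage_usec/ { print $2 }' /sys/fs/cgroup/cpu.stat)
sleep 1
U2=$(awk '/^usage_usec/ { print $2 }' /sys/fs/cgroup/cpu.stat)
echo "$(( (U2 - U1) / (10000 * $(nproc --all)) ))%"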
There are some optimizations that can be done for the former layout that are summarized here.
One of the fears when running distributed tasks is that it is difficult (as in, computationally) to determine the actual resource usage at runtime, such that, given task distribution and migration, it might just happen that many resource-heavy tasks end up on one single machine and eventually consume all the resources until the machine freezes. The user enters a sort of "competition with themselves" because the machines running containers become the equivalent of shared hosting, and the dread is that a container "more crucial than the rest" ends up scheduled on a machine that is already overloaded. A solution found in the past has been to use a script that just terminates processes once the CPU consumption goes over what the machine can handle, a sort of "OOC"-killer, or out-of-CPU killer, for CPU instead of RAM; this seems to work very well given that, once killed, the Docker managers will just reschedule the container within the swarm to another machine (but this time, with an observable load).
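As a rough sketch of such an out-of-CPU killer (the polling interval and the threshold of one load unit per core are assumptions):

#!/usr/bin/env bash
# when the one-minute load average exceeds the number of cores, kill the
# most CPU-hungry process; the swarm managers will reschedule its container
THRESHOLD=$(nproc --all)
while sleep 60; do
    LOAD=$(awk '{ print int($1) }' /proc/loadavg)
    if [ "$LOAD" -gt "$THRESHOLD" ]; then
        kill -9 "$(ps --no-headers -eo pid --sort=-%cpu | head -n 1)"
    fi
done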
With that being said, the collector that measures the CPU consumption within a container can be simplified such that it just reflects the overall load of the node that the container resides on, so that the HAProxy load balancer schedules HTTP requests to nodes with smaller processing loads. To that end, the usage script within the LAMP container is simplified to:
#!/usr/bin/env bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2025 - License: MIT           ##
###########################################################################

# report the CPU idle percentage as sampled by vmstat
printf "%d%%\n" $(vmstat 1 2 | awk 'END { print $15 }')
which uses vmstat to poll the CPU idle time (the arithmetic is not needed anymore because the previous script was computing CPU usage per process, whereas vmstat directly reports CPU idle time).
With the script replaced, the same supervisord configuration file supervisor.conf can be used, and it will now report the CPU idle time on the configured port on every node. HAProxy connects to the port as usual, retrieves the idle time and adjusts the server weights depending on what the script reports.
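The effect on the server weights can be observed through the HAProxy runtime API, assuming a stats socket has been enabled in haproxy.cfg (the socket path is illustrative):

# requires something like the following in the global section of haproxy.cfg:
#   stats socket /run/haproxy/admin.sock mode 660 level admin
echo "show servers state lamp" | socat stdio /run/haproxy/admin.sock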
The stack "caddy", "HAProxy" and "Apache2" fully supports HTTP/2 throughout the pipeline back and forth. Obviously, since caddy will be the one that performs the SSL decapsulation / termination, the "h2" protocol cannot be used but rather "h2c", namely HTTP/2 over cleartext, that will be used behind caddy for the rest of the high availability and load-balancing stack. Similarly, caddy can still serve upstream requests using "h2" or "h3".
Some modifications must be made to caddy itself because caddy does not enable h2c by default due to security concerns. However, depending on the architecture, such as a closed network, those concerns can be irrelevant, and thus the following configuration enables h2c:
{
    servers :80 {
        protocols h1 h2 h3 h2c
    }
    servers :443 {
        protocols h1 h2 h3 h2c
    }
}

*.site.tld, site.tld {
    tls {
        dns cloudflare aaaaaaaaaaaaaaaaaaaaaaaaaaa
        resolvers 8.8.8.8
        propagation_timeout 5m
    }
    handle {
        reverse_proxy h2c://docker:6000 {
            header_up X-Forwarded-For {http.request.header.CF-Connecting-IP}
        }
    }
}
Note the addition of the global server options that enable "h2c", as well as the traffic now being passed via h2c to docker:6000 through the directive reverse_proxy h2c://docker:6000.
Next, the HAProxy configuration file must be modified in order to add support for h2c. Here is the whole haproxy.cfg file again, with the changes made:
frontend www
    mode http
    bind :6000 proto h2
    default_backend lamp

backend lamp
    mode http
    balance leastconn
    server docker1 docker1:6001 check weight 100 agent-check agent-port 5005 agent-inter 5s proto h2
    server docker2 docker2:6002 check weight 100 agent-check agent-port 5005 agent-inter 5s proto h2
    server docker3 docker3:6003 check weight 100 agent-check agent-port 5005 agent-inter 5s proto h2
    server docker4 docker4:6004 check weight 100 agent-check agent-port 5005 agent-inter 5s proto h2
    server docker5 docker5:6005 check weight 100 agent-check agent-port 5005 agent-inter 5s proto h2
The changes consist mainly in the addition of proto h2 to all the servers, as well as to the frontend stanza that defines the listening port of HAProxy itself.
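Before reloading HAProxy, the modified configuration can be checked for syntax errors:

# validate the configuration file without starting the daemon
haproxy -c -f /etc/haproxy/haproxy.cfg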
Finally, the http2 Apache module has to be enabled, which requires a rebuild of the Docker container in order to include the module in the list of modules enabled by default. Here is the relevant snippet of the Dockerfile used to build the LAMP container:
# enable apache modules
RUN a2enmod \
    http2 \
    authz_host \
    ... \
    dav_lock || true
Additionally, an Apache configuration file must be added somewhere within conf-enabled, perhaps named protocols.conf, containing the following directive:
# h2 - ssl
# h2c - cleartext
Protocols h2c http/1.1
that enables "h2c" as a protocol.
When dealing with stateless protocols such as HTTP, distributing the LAMP container across anything that can execute instructions, by running concurrent replicas of the same container, is highly beneficial: in the event that a replica ends up on a hosed machine, it does not matter too much, because other machines run replicas as well, and the load balancer will be able to tell and will just end up scheduling HTTP requests to the machines that are still working well. Conversely, had one run just a single LAMP container, it might just have ended up somewhere on the network on a server running other programs that compete with the container for resources. In fact, one could say that stateless services should always be turned into replicas (clones of each other), especially if they are on-demand services and if there exists a solution ensuring coherent concurrent access to the storage that has to be shared amongst the replicas. Compared to stateful services, HTTP services are by default idle, such that running multiple instances does not multiply resource usage, which is always one of the concerns when aiming for high availability and failover.