One of the problems with running software within isolated containers, particularly given the ease with which programs can be deployed, is the difficulty of tracking the individual errors that take place within a container and might render the running program defunct. To counter that, Docker allows a healthcheck to be implemented that tests whether the program within the container is running properly, but frequently the healthcheck is not rigorous enough: in practice it can only report whether a service port is open or whether some part of the service is functional, without detecting a partial failure of the service.
This section uses Graylog, Elasticsearch and MongoDB to centralize log files from all containers running within a swarm, such that the logs can be inspected, or grok patterns built, in order to take action upon the errors reported by all containers.
Graylog is software meant to centralize logs from various sources via different sinks. One such sink is the GELF sink, which will be used to pass the logs from Docker containers to Graylog. Furthermore, compared to Logstash, Graylog has a built-in interface along with various embedded functionality, such as triggering actions depending on log file matches, all of which can be configured with just a browser, which makes Graylog preferable to, and more self-contained than, Logstash. Setting up Graylog varies depending on the desired environment, with distributions typically packaging all the necessary requirements and the Graylog developers themselves providing pre-made binary packages for various distributions.
Since this guide focuses on monitoring the containers within a Docker swarm, Graylog will be made to run within the swarm itself, in order to keep everything together without too many external dependencies.
The following Graylog compose file is tailored for a single monolithic build without data nodes and should be suitable for most small to medium swarms.
```yaml
version: '3.8'
services:
  graylog:
    image: graylog/graylog:6.1.2
    user: root
    ports:
      - "5044:5044/tcp"   # Beats
      - "5140:5140/udp"   # Syslog
      - "5140:5140/tcp"   # Syslog
      - "5555:5555/tcp"   # RAW TCP
      - "5555:5555/udp"   # RAW UDP
      - "9000:9000/tcp"   # Server API
      - "12201:12201/tcp" # GELF TCP
      - "12201:12201/udp" # GELF UDP
      #- "10000:10000/tcp" # Custom TCP port
      #- "10000:10000/udp" # Custom UDP port
      - "13301:13301/tcp" # Forwarder data
      - "13302:13302/tcp" # Forwarder config
    volumes:
      - /mnt/docker/data/graylog/data:/usr/share/graylog/data/data
      - /mnt/docker/data/graylog/journal:/usr/share/graylog/data/journal
    environment:
      GRAYLOG_PASSWORD_SECRET: "..."
      GRAYLOG_ROOT_PASSWORD_SHA2: "..."
      GRAYLOG_NODE_ID_FILE: "/usr/share/graylog/data/data/node-id"
      GRAYLOG_HTTP_BIND_ADDRESS: "0.0.0.0:9000"
      GRAYLOG_HTTP_EXTERNAL_URI: "http://localhost:9000/"
      GRAYLOG_MONGODB_URI: "mongodb://...:...@docker.tld/graylog"
      GRAYLOG_ELASTICSEARCH_HOSTS: "http://docker.tld:9200"
      GRAYLOG_ELASTICSEARCH_INDEX_PREFIX: "graylog"
    deploy:
      replicas: 1
      placement:
        max_replicas_per_node: 1
```
The following changes have to be made:

- `/mnt/docker/data/graylog` must be adjusted to point to some directory where Docker and Graylog can store files, along with the subdirectories `data` and `journal`.
- The `docker.tld` hostname corresponds to the FQDN of the Docker swarm (for instance a load-balanced IP or the IP of any of the swarm nodes).
- The file referenced by `GRAYLOG_NODE_ID_FILE` has to be created by the user, for instance by creating a file at `/mnt/docker/data/graylog/data/node-id` containing any string.
- `GRAYLOG_MONGODB_URI` must be set to the URL of a MongoDB database with an optional username and password.
- `GRAYLOG_PASSWORD_SECRET` can be generated using the command line tool `pwgen`, for instance by running `pwgen -N 1 -s 96` and then pasting the result within the quotes.
- `GRAYLOG_ROOT_PASSWORD_SHA2` can be generated using the command line tool `sha256sum` by issuing the command `echo -n "Enter Password: " && head -1 </dev/stdin | tr -d '\n' | sha256sum | cut -d" " -f1`.
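As a sanity check, the hash produced by the `sha256sum` pipeline above can be verified with a few lines of Python; the password `admin` below is only a placeholder for illustration.

```python
import hashlib

# Compute the SHA-256 digest of the root password, equivalent to:
#   echo -n "admin" | sha256sum | cut -d" " -f1
# "admin" is a placeholder; substitute the real admin password.
password = "admin"
digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
print(digest)
# → 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
```

The resulting hexadecimal string is what goes between the quotes of `GRAYLOG_ROOT_PASSWORD_SHA2`.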
MongoDB is a NoSQL database system that Graylog requires in order to run. Unfortunately, since version 5, MongoDB requires the CPU of the platform it runs on to support the AVX instruction set, which excludes many otherwise capable machines, such that a MongoDB build compiled without the AVX requirement is preferable. Several such images are available, but for this tutorial the `l33tlamer/mongodb-without-avx` image was chosen at the latest version.
The following example compose file can be used with Docker to launch a MongoDB variant with the AVX requirement compiled out.
```yaml
version: '3.9'
services:
  mongo:
    image: l33tlamer/mongodb-without-avx:6.2.1
    healthcheck:
      test: echo 'db.stats().ok' | mongo localhost:27017/test --quiet
      interval: 10s
      timeout: 10s
      retries: 5
    user: root
    volumes:
      - /mnt/docker/data/mongo/db:/data/db
      - /mnt/docker/data/mongo/configdb:/data/configdb
      - /mnt/docker/data/mongo/init:/docker-entrypoint-initdb.d/:ro
    ports:
      - 27017:27017
    environment:
      - MONGO_INITDB_ROOT_USERNAME=myuser
      - MONGO_INITDB_ROOT_PASSWORD=mypassword
    deploy:
      replicas: 1
      placement:
        max_replicas_per_node: 1
```
Note that the path `/mnt/docker/data/mongo/` must be adjusted to point to some storage space where Docker can save the MongoDB files, and that `MONGO_INITDB_ROOT_USERNAME` and `MONGO_INITDB_ROOT_PASSWORD` must be adjusted in order to set the root username and password.
In order to create a database to be used with Graylog, the instructions on the MongoDB FUSS page can be followed. MongoDB will then be accessed from Graylog by editing the Graylog compose file and setting `GRAYLOG_MONGODB_URI` to a URL corresponding to the username, password and database that were created.
```
mongodb://user:pass@docker.tld/graylog
          ^    ^    ^          ^
          |    |    |          |
   username    |    hostname   database
            password
```
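The same decomposition can be checked programmatically; the sketch below pulls the components out of the connection URI with Python's standard library (the values `user`, `pass` and `docker.tld` are the placeholders from the diagram).

```python
from urllib.parse import urlparse

# Decompose a Graylog MongoDB connection URI into its components.
# "user", "pass" and "docker.tld" are placeholder values.
uri = "mongodb://user:pass@docker.tld/graylog"
parts = urlparse(uri)

print(parts.username)          # → user
print(parts.password)          # → pass
print(parts.hostname)          # → docker.tld
print(parts.path.lstrip("/"))  # → graylog (the database name)
```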
After running Graylog, a GELF input has to be set up by following the menu System → Inputs and adding a new GELF UDP input. Choosing UDP over TCP trades reliability for throughput: UDP carries the log payload with minimal protocol overhead, at the expense of delivery guarantees. In this setup it is more important to ship log data quickly than to ensure that every single log entry produced is received.
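To illustrate what such an input receives, the sketch below hand-crafts a minimal GELF 1.1 message (a zlib-compressed JSON document) and ships it over UDP; the address `127.0.0.1` and port `12201` are assumptions matching the input above and must be replaced with the actual Graylog endpoint.

```python
import json
import socket
import zlib

# A minimal GELF 1.1 message: "version", "host" and "short_message"
# are the required fields; "level" is a syslog severity (3 = error).
message = {
    "version": "1.1",
    "host": "example.org",
    "short_message": "container healthcheck failed",
    "level": 3,
}

# GELF over UDP accepts zlib-compressed JSON payloads.
payload = zlib.compress(json.dumps(message).encode("utf-8"))

# Fire-and-forget datagram; replace 127.0.0.1 with the Graylog host.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, ("127.0.0.1", 12201))
sock.close()
```

This is exactly the kind of datagram the Docker GELF logging driver emits on behalf of each container, so nothing beyond the driver configuration is needed in practice.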
The next part involves editing the Docker compose files in order to set logging to use GELF and pass the log output to the Graylog instance. The following is an example of how to accomplish this:
```yaml
logging:
  driver: gelf
  options:
    gelf-address: "udp://docker.tld:12201"
    tag: ...
```
where:

- `docker.tld` is the hostname running Graylog,
- `12201` is the GELF input port.
The `tag`, here filled in with dots, can be pretty much anything and helps identify the source on Graylog, where logs can be observed and acted upon depending on various properties and pattern matching.
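As a rough illustration of the kind of pattern matching Graylog performs, the sketch below uses a Python regular expression to pick the tag and severity out of a log line, much as a grok pattern would before triggering an action; the tag `myapp` and the log line format are hypothetical.

```python
import re

# Hypothetical log line as it might arrive from a tagged container.
line = "myapp: ERROR connection to database lost"

# Named groups play the role of grok fields: extract the tag,
# the severity and the remainder of the message.
pattern = re.compile(r"^(?P<tag>\w+): (?P<level>[A-Z]+) (?P<msg>.*)$")
match = pattern.match(line)

print(match.group("tag"))    # → myapp
print(match.group("level"))  # → ERROR
```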
Managing large clusters is difficult without a way to centrally audit the state of the individual machines and the software running on them. Centralizing logs with Graylog is not a complete solution, because Graylog does not collect performance statistics about the swarm nodes or the containers themselves. However, Graylog makes it possible to check on the consistency of the programs running within each container, which fits the Docker paradigm of shifting the focus from the technology to the data, given that it is the programs themselves that generate the data in this case.