Table of Contents

About

At the time of writing, Docker does not properly support hardware acceleration, even when the container is passed the direct rendering device via a volume mount, as documented. The problem is actually the "Secure Computing" (seccomp) feature of recent Linux kernels, an ineffective and damaging fake "security measure" that consists of little more than a glorified blacklist of system calls, much like other solutions such as apparmor and the rest of the security circus.

Intuitively, what happens is that the user passes the direct rendering device(s) to the container, as in:

-v /dev/dri/renderD128:/dev/dri/renderD128

or, using the Docker compose format:

volumes:
  - /dev/dri/renderD128:/dev/dri/renderD128

in order to make the device available to the software within the container. This should be sufficient, and most software, such as PVRs, Plex or Jellyfin, does pick up the device and even recognizes its make and model.

However, for some mysterious reason, Plex and Jellyfin seem unable to use the device. The usual suspect is the Linux permission system and, as far as the LinuxServer Plex container is concerned, there is a suggestion to add a script within the container that changes the permissions on the direct rendering devices such that the user within the container (abc) is able to access the device nodes under /dev/dri. Unfortunately, whilst the fix is sensible, it does not solve the problem and something else seems to be at play.
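For reference, such a permission-fix script might look as follows. This is a sketch rather than the exact script from the suggestion: the file name /etc/cont-init.d/10-dri-permissions is hypothetical, and abc is the user that LinuxServer images run their services as.

```shell
#!/bin/sh
# hypothetical /etc/cont-init.d/10-dri-permissions: hand the DRI device
# nodes to the in-container user "abc"; a no-op when /dev/dri is absent
for node in /dev/dri/*; do
    [ -e "$node" ] || continue
    chown abc:abc "$node" 2>/dev/null || true
    chmod 660 "$node" 2>/dev/null || true
done
```

LinuxServer images execute the scripts under /etc/cont-init.d at container start, which is why the compose file later in this article mounts a host directory onto that path.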

To debug this, one can shell into the container and then just attempt to access the direct rendering device, as per:

cat /dev/dri/renderD128

which is a sensible first test, given that the device has to be accessible not only for reading but also for writing, and reading it with cat is enough to surface the error.

Accessing /dev/dri/renderD128 with cat as per the above fails with the error message "Operation not permitted", which is different from "Permission denied", the usual indicator that file or, in this case, device permissions are broken.
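The distinction matters: "Permission denied" (EACCES) points at ownership or mode bits, whereas "Operation not permitted" (EPERM) hints at a kernel-level restriction. One quick check from inside the container is whether a seccomp filter is active at all:

```shell
# the Seccomp field is 0 when disabled, 1 in strict mode and 2 when a
# filter (such as Docker's default profile) is being applied
grep '^Seccomp' /proc/self/status
```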

It takes a while of looking around to find the culprit, but it turns out that "Secure Computing" (seccomp), a kernel feature that cannot be turned off at runtime via kernel parameters, is directly responsible for prohibiting access to devices from within the container. Note that this does not apply to Docker containers started manually, given that those can always be started with the --privileged flag, which allows access to everything. Apparently, it is the combination of Docker swarm and seccomp that prevents access to the DRI device. Furthermore, the documented workaround is to supply a modified seccomp profile to Docker itself in order to allow access to devices (in particular, the operations covered by the CAP_SYS_RAWIO Linux capability). However, yet again, this seems to fail for Docker swarms, where the modified profile does not seem to take effect and the same "Operation not permitted" error is observed.

To repeat for clarity, none of the issues above apply when containers are started via "docker run", such that this is a non-issue from the perspective of running containers; rather, it is an issue of losing Docker swarm orchestration for containers that need to access the hardware directly.
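For instance, when starting the container by hand, the seccomp restrictions can be relaxed for just that container. This is a sketch of the manual workaround rather than a swarm solution, using the same image and device as the rest of this article:

```shell
# manual "docker run" with seccomp disabled for this container only;
# unlike with a swarm service, the security option does take effect here
docker run --rm \
  --security-opt seccomp=unconfined \
  --device /dev/dri/renderD128:/dev/dri/renderD128 \
  lscr.io/linuxserver/plex:latest
```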

Solutions

Here is a list of solutions, none of them ideal, ranging from zero to partial compatibility with Docker swarm.

Docker-in-Docker

The docker-in-docker solution is to create a surrogate Docker service that, in turn, simply executes a docker run command and starts the "real" container that must access the hardware, such that starting the container via the CLI allows passing devices via the --device parameter.

This is not ideal because the inner container is launched outside the swarm, so its ports are not propagated via the Docker mesh network, which eliminates some of the benefits of swarm orchestration.

Here is what a Docker swarm compose file looks like that starts a Plex Media Server inner container as a child:

version: "3.8"

services:
  plex:
    image: docker:stable
    entrypoint: [sh, -c]
    environment:
      TASK_NAME: '{{.Task.Name}}'
    command: >-
      'exec docker run
      --interactive
      --user 0:0
      -e PUID=1000
      -e PGID=1000
      -p 32400:32400
      -v /mnt/docker/data/plex/data:/config
      -v /mnt/docker/data/plex/init:/etc/cont-init.d:ro
      -v /mnt/docker/data/plex/transcode:/transcode
      -v /mnt/docker/data/plex/repair:/repair
      -v /mnt/storage/Movies:/movies
      -v /mnt/storage/TV:/tv
      --device=/dev/dri/renderD128:/dev/dri/renderD128
      --label com.docker.stack.namespace=$$(docker container inspect --format "{{index .Config.Labels \"com.docker.stack.namespace\"}}" $${TASK_NAME})
      --volumes-from=$${TASK_NAME}
      --rm
      lscr.io/linuxserver/plex:latest'
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

This solution is repurposed from a StackOverflow question by someone attempting to achieve more or less the same effect. Aside from losing the mesh networking benefits, there are additional concerns, namely regarding the propagation of Linux signals through the container stack. As far as Plex Media Server is concerned, pressing Ctrl+C, that is, delivering a SIGINT signal to the server, does not make the server terminate. This leads to containers being left in a zombie state, requiring the user to terminate the inner container manually and thereby eliminating the benefits of automation.

Load Balancing and Failover

Whilst Docker swarm provides load balancing and failover implicitly via its architecture, in this case it might be preferable to just pin the container to a single machine within the swarm and to erect an HTTP proxy as the single point of entry, such that the correct machine is reached when the Plex server is accessed via its hostname.

This solution is preferable and fits typical setups where an HTTP load balancer such as caddy or traefik already exists, along with a whole slew of DNS names corresponding to the various Docker containers running within the swarm. In other words, it is a good solution for homelabs and PVR setups.

The only problem with this solution is the complete loss of Docker swarm orchestration such that the container would have to be started otherwise via the operating system. To that end, here is a SystemD service file that will start and maintain a container with Plex Media Server:

[Unit]
Description=Plex Media Server Docker Container
After=docker.service
Requires=docker.service

[Service]
Restart=always
ExecStartPre=/usr/bin/docker pull lscr.io/linuxserver/plex:latest
ExecStart=/usr/bin/docker run --name=plex \
  --rm \
  --interactive \
  --user 0:0 \
  --privileged \
  -e PUID=1000 \
  -e PGID=1000 \
  -e TZ=Etc/UTC \
  -e VERSION=docker \
  --device=/dev/dri/renderD128:/dev/dri/renderD128 \
  -p 32400:32400 \
  -v /mnt/docker/data/plex/data:/config \
  -v /mnt/docker/data/plex/transcode:/transcode \
  -v /mnt/docker/data/plex/repair:/repair \
  -v /mnt/storage/Movies:/movies \
  -v /mnt/storage/TV:/tv \
  lscr.io/linuxserver/plex:latest

[Install]
WantedBy=multi-user.target

The file should be adjusted to change the volume paths and to make any other necessary changes, then copied to /etc/systemd/system/plex.service; SystemD should then be refreshed with systemctl daemon-reload, the service enabled via systemctl enable plex.service and finally started with systemctl start plex.
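Spelled out as commands, and assuming the unit file above was saved as plex.service in the current directory, the installation amounts to:

```shell
# copy the unit into place, register it with SystemD and start it
cp plex.service /etc/systemd/system/plex.service
systemctl daemon-reload
systemctl enable plex.service
systemctl start plex.service
```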

Finally, caddy should be set up to proxy requests to the Plex container by probing the Docker swarm members. This provides some failover in case Plex and the service file are moved to a different machine within the Docker cluster. The configuration is rather simple:

plex.DOMAIN.duckdns.org {
        tls mail@mail.com

        reverse_proxy docker1:32400 docker2:32400 {
                lb_policy first
                lb_try_duration 5s
                lb_try_interval 250ms

                fail_duration 10s
                max_fails 1
                unhealthy_status 5xx
                unhealthy_latency 5s
                unhealthy_request_count 1

                trusted_proxies 192.168.1.0/24

                header_up Host {host}
                header_up X-Real-IP {remote}
                header_up X-Forwarded-Host {hostport}
                header_up X-Forwarded-For {remote}
                header_up X-Forwarded-Proto {scheme}
        }
}

with the main highlight being the line reverse_proxy docker1:32400 docker2:32400, which attempts to connect to the machines docker1 and docker2 on the local network in order to reach the Plex daemon. The other parameters, starting with lb_policy, configure failover and health checking of the services on docker1 and docker2 by setting the load-balancing policy and then attempting connections. All the rest is the usual recommended setup for reverse-proxying services.

With this setup it is even possible to run the SystemD service file on all nodes within a Docker swarm, thereby obtaining some sort of spread of cost across all machines running Docker swarm. In doing so, the database must be duplicated and each Plex Media Server instance must run on top of its own copy of the database in order not to corrupt files due to concurrent accesses. Since that is a separate project, please see the Docker Plex load balancing page, which describes a setup capable of failover and load balancing between two Plex instances.

Conclusions

The solutions shown should be sufficient until Docker is patched such that seccomp can be bypassed and direct rendering devices accessed from swarm services. Current development indicates that patches to that end are in the works.