Table of Contents

About

Assuming that a virtual private network (VPN) has to be established between two clients, A and B, it is conventional to have one of the clients, A, connect to the other client, B, which acts as a server. Typical for this scenario is that the server B will always be ready to accept connections, given that a server assumes stable network conditions.

However, imagine that both A and B are road-runner clients, in the sense that the networks that A and B participate in are unstable or unknown; in that case, far more care has to be taken for the connection between A and B to be stable.

Back and Forth Topology

As A and B both move through networks, it is far better to have A connect to B and also B connect to A, given that either A or B might be firewalled in the incoming direction, as is typical for commercial ISPs.

This page will document establishing OpenVPN connections between two machines, back and forth, where machine A connects to B and machine B connects to A, such that both machines act as both client and server.

On top of that, since there will be two established routes between the machines, a load balancer is added on top of the established networks in order to switch automatically between the connections or to aggregate them both for better throughput.

Setting up Interfaces

Using ifupdown for its simplicity in configuring interfaces, the /etc/network/interfaces.d directory is populated with three new interface files, bond0, ovn0 and ovn1, with the contents as follows for each.

Note that one interesting detail here is that the hardware address (MAC) is set for each virtual interface. Even though this is not necessary, it is done in order to keep the networks predictable and stable through various reboots and restarts by pinning the virtual interfaces to fixed hardware addresses.
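Suitable random hardware addresses for pinning can be produced by hand; the helper below is a sketch (not part of the original setup) that generates a locally administered unicast MAC address, which is the kind that should be used for virtual interfaces:

```shell
#!/bin/sh
# Hypothetical helper: generate a random MAC address suitable for pinning
# virtual interfaces. The first octet has the locally administered bit set
# and the multicast bit cleared, yielding a valid unicast address.
gen_mac() {
    # Read six random bytes as hexadecimal words into $1..$6.
    set -- $(od -An -N6 -tx1 /dev/urandom)
    # Force the locally administered bit (0x02), clear the multicast bit (0x01).
    first=$(( (0x$1 & 0xfc) | 0x02 ))
    printf '%02x:%s:%s:%s:%s:%s\n' "$first" "$2" "$3" "$4" "$5" "$6"
}

gen_mac
```

The generated address can then be placed on the hwaddress ether line of the corresponding interface stanza.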

bond0

The bonding interface is brought up but without adding the interfaces ovn0 and ovn1 as slaves. The slave interfaces will be added later through OpenVPN and on demand as the connections are established or broken.

auto bond0
allow-hotplug bond0
iface bond0 inet static
    address 10.0.0.1
    netmask 255.255.255.0
    metric 1
    bond-mode balance-alb
    hwaddress ether 16:b9:3f:58:cb:4c
    mtu 1500

The address 10.0.0.1 will be different on each machine, such that one machine might be set to 10.0.0.1 and the other to 10.0.0.2.
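For instance, the corresponding stanza on the second machine would differ only in the address line (a sketch following the addressing from the example; the hardware address shown is just a placeholder and should be unique per machine):

```
auto bond0
allow-hotplug bond0
iface bond0 inet static
    address 10.0.0.2
    netmask 255.255.255.0
    metric 1
    bond-mode balance-alb
    hwaddress ether 26:c1:4a:69:dc:5d
    mtu 1500
```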

ovn0

auto ovn0
allow-hotplug ovn0
iface ovn0 inet manual
    hwaddress ether 72:2d:a8:a3:a2:01

ovn1

auto ovn1
allow-hotplug ovn1
iface ovn1 inet manual
    hwaddress ether b2:0f:a3:5b:84:04

Load Balancing and Failover

Looking at the sketch in the topology section, the setup is more interesting when each side of the client-server pairs connects through a different route, or a different ISP, in which case it would make sense to aggregate the links between the machines in order to additionally boost performance rather than just attain failover.

If the ISP is the same for both machine A and machine B, then it makes more sense to use failover, given that the network speed should be about the same as a single connection even if aggregation were used, though perhaps with slightly better throughput between the machines.

Nevertheless, the silver bullet seems to be balance-alb, which is well supported natively in Linux, does not require a switch with LACP-capable ports, and provides both failover between interfaces and aggregation when both interfaces are up.

Setting up OpenVPN

OpenVPN has a few quirks that do not allow for a minimal setup; namely, for the purpose of maintaining a VPN connection between two machines, an entire Public Key Infrastructure (PKI) is not needed at all and is more of a bother than an advantage. However, as it so happens, OpenVPN supports peer-to-peer connections between exactly two participants only through a tunneling (TUN) interface that works at network layer 3, whereas interface bonding works at layer 2, such that a peer-to-peer connection cannot be used alongside link aggregation and failover.

So, regrettably, TAP interfaces have to be used, which, in turn, need a PKI, with certificates and certificate authorities set up individually for both machines. This painstaking task of generating certificates, a Diffie-Hellman key and a TLS authentication key is not documented here because it tends to vary between distributions as well as change from time to time. Typically, a wrapper that is part of the easy-rsa package is used, along with a convenience script that can generate all the necessary certificates.

Irrespective, when both OpenVPN connections are established and tested, OpenVPN must carry out the additional task of enslaving the interfaces ovn0 and ovn1 to the bonding interface bond0. On modern Linux distributions, the ifenslave binary is obsolete and sysfs is used to change bonding parameters, including adding slave interfaces to a master bonding interface. To that end, the OpenVPN configurations will contain the following additional directives:

script-security 2
route-up /etc/openvpn/scripts/ovn-to-bond.sh
route-pre-down /etc/openvpn/scripts/ovn-remove-bond.sh

such that the OpenVPN instances will execute /etc/openvpn/scripts/ovn-to-bond.sh once the connection and routes are established, and /etc/openvpn/scripts/ovn-remove-bond.sh when the connection is severed.
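These directives are appended to an otherwise ordinary TAP-based OpenVPN configuration; for orientation, a minimal sketch of what the server side for one of the links might look like follows (all paths, file names and the port are assumptions, not taken from the actual setup):

```
# Hypothetical server-side configuration for the ovn0 link.
dev ovn0
dev-type tap            # layer 2, so the interface can be enslaved to a bond
proto udp
port 1194
ca   /etc/openvpn/pki/ca.crt
cert /etc/openvpn/pki/server.crt
key  /etc/openvpn/pki/server.key
dh   /etc/openvpn/pki/dh.pem
keepalive 10 60
persist-key

script-security 2
route-up /etc/openvpn/scripts/ovn-to-bond.sh
route-pre-down /etc/openvpn/scripts/ovn-remove-bond.sh
```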

ovn-to-bond.sh

The purpose of this script is to be run by OpenVPN in order to add the connecting interface to the bonding interface defined by BONDING_INTERFACE.

#!/bin/sh
# Run by OpenVPN via the route-up hook; OpenVPN exports $dev as the name
# of the tunnel interface being brought up (ovn0 or ovn1).
BONDING_INTERFACE=bond0

# Give OpenVPN some time to settle the connection and routes.
sleep 10

# An interface must be down before it can be enslaved to a bond.
/usr/sbin/ifconfig "$dev" down
echo "+$dev" > "/sys/class/net/$BONDING_INTERFACE/bonding/slaves"
/usr/sbin/ifconfig "$dev" up

ovn-remove-bond.sh

When the connection is torn down, this script will be responsible for removing the connecting interface from the bonding interface.

#!/bin/sh
# Run by OpenVPN via the route-pre-down hook; $dev is the tunnel interface
# being torn down (ovn0 or ovn1).
BONDING_INTERFACE=bond0

# Release the interface from the bond.
echo "-$dev" > "/sys/class/net/$BONDING_INTERFACE/bonding/slaves"

Testing

Given that both machines have a bond0 interface configured, following the example, with 10.0.0.1 and 10.0.0.2 respectively, both machines should check that they can contact each other over the bonding interface.

The next thing to test is failover: either OpenVPN interface should be brought down and the bonding interface monitored to check that the interface was removed from the bond, that the other interface is still there and that the machines can still contact each other over a single interface. To that end, the bonding interface can be monitored by polling the /proc subsystem:

cat /proc/net/bonding/bond0

for details about the bond. Within the same test, the "failed" interface should be brought up again, simply by starting the OpenVPN daemon, and the bonding interface polled again to see the interface being added back to the enslaved interfaces.
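The polling can also be automated; the helper below is a sketch (not part of the original setup) that extracts the names of the enslaved interfaces from a bond status file, which makes the failover check scriptable:

```shell
#!/bin/sh
# Hypothetical helper: print the names of the interfaces currently enslaved
# to a bond by parsing a bond status file such as /proc/net/bonding/bond0,
# which lists each slave on a "Slave Interface: <name>" line.
bond_slaves() {
    awk -F': ' '/^Slave Interface:/ { print $2 }' "$1"
}

# Example usage against the live bond:
#   bond_slaves /proc/net/bonding/bond0
```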

If all goes well, a load-balancing and dynamically link-aggregated bond will have been established between the two road-runner machines, with a topology that should be more resilient in case either of the connections fails.

Monitoring with Monit

Using monit, the connectivity of both the ovn0 and ovn1 interfaces can be watched and OpenVPN restarted in case the interfaces fail.

The following files can be added to monit in order to ensure that the interfaces are up. The monit files assume that one of the OpenVPN services is named openvpn@ovn0 and the other openvpn@ovn1. Similarly, both files have to be added to both machines, where one machine will be configured to check the address 10.0.0.2 and the other 10.0.0.1.

ovn0

###########################################################################
##  Copyright (C) Wizardry and Steamworks 2023 - License: GNU GPLv3      ##
###########################################################################

check process ovn0-pid with pidfile /run/openvpn/ovn0.pid
    start program = "/usr/bin/systemctl start openvpn@ovn0"
    stop program = "/usr/bin/systemctl stop openvpn@ovn0"

check host ovn0 with address 10.0.0.2
    start program = "/usr/bin/systemctl start openvpn@ovn0"
    stop program = "/usr/bin/systemctl stop openvpn@ovn0"
    if failed
        icmp type echo count 5 with timeout 15 seconds
    then restart

ovn1

###########################################################################
##  Copyright (C) Wizardry and Steamworks 2023 - License: GNU GPLv3      ##
###########################################################################

check process ovn1-pid with pidfile /run/openvpn/ovn1.pid
    start program = "/usr/bin/systemctl start openvpn@ovn1"
    stop program = "/usr/bin/systemctl stop openvpn@ovn1"

check host ovn1 with address 10.0.0.2
    start program = "/usr/bin/systemctl start openvpn@ovn1"
    stop program = "/usr/bin/systemctl stop openvpn@ovn1"
    if failed
        icmp type echo count 5 with timeout 15 seconds
    then restart