Description

The following article describes a setup for route-based load balancing across different Internet providers on disjoint networks using advanced Linux routing. The purpose is to distribute outbound connections over multiple uplinks and to tolerate uplink failures, such that new outbound connections are seamlessly routed over the remaining active uplinks.

For that purpose, Linux advanced routing is used, notably the ability to route and distribute packets out of different interfaces. The downside of the presented approach, contrasted for instance with BGP routing, is that active connections over links that fail will fail as well and will need to be re-established. In essence, this affects stateful protocols, where established connections have to be maintained, but will affect stateless protocols (such as HTTP) somewhat less.

The approach differs from the multiple connection load-balancing and routing article in that marking packets with iptables is not mandatory, although, given well-defined advanced routing tables, it would still allow sending site-specific marked packets through specific ISPs.

Informally, this article will allow you to distribute your Internet connection over multiple ISPs, with the added bonus that once one of your ISPs goes down, new routes will be established through the other ISPs that are still active. However, things like games, Skype and similar will still suffer an interruption if the connection was established over an ISP that failed, but you will be able to reconnect, and you will reconnect through an ISP that is not down. Things like web browsing will work well because there are no permanently established connections, so the switch from a failing ISP to another will appear seamless. You also get the added bonus that once an ISP recovers, it will be added back to the pool of active ISPs. BGP does better than this, since even stateful connections (games and such) fail over, but it requires your ISPs to cooperate, and they most likely will not.

Assumptions

Topology

Configuring the Interfaces

On Debian, the interfaces can be added to /etc/network/interfaces. As per the example topology of this article, one would need to add the following lines to /etc/network/interfaces:

# Internet - ISP1
auto eth1
iface eth1 inet dhcp

# Internet - ISP2
auto eth2
iface eth2 inet dhcp

where eth1 corresponds to the first ISP and eth2 corresponds to the second ISP.

If there are more than two ISPs through which balancing outbound connections is desired, then those have to be added as well.

Configuring the DHCP Client

If the Linux server is Debian-based and uses dhclient to retrieve DHCP information, then /etc/dhcp/dhclient.conf will have to be amended to remove the routers parameter from the request statement.
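As a minimal sketch of that change, assuming the request statement lists routers followed by a comma, the option could be filtered out with sed. The function name strip_routers_option is hypothetical, and the sample statement is illustrative rather than copied from any particular release.

```shell
#!/bin/sh
# Hypothetical helper: reads a dhclient.conf on stdin and writes it to
# stdout with the "routers" option removed from the request statement.
# Assumes "routers," is preceded by whitespace and followed by a comma.
strip_routers_option() {
    sed -E 's/([[:space:]])routers,[[:space:]]*/\1/'
}

# Illustrative request statement, not a verbatim Debian default.
printf 'request subnet-mask, broadcast-address, routers, domain-name;\n' |
    strip_routers_option
```

To apply this to the real file, one could run the function over /etc/dhcp/dhclient.conf, write the result to a temporary file and review the difference before moving it into place.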

However, that is not enough and Debian might still add the default route for each interface. To make dhclient ignore the gateway for each queried interface, a file can be created at /etc/dhcp/dhclient-enter-hooks.d/no-default-route with the following contents:

# Do not add default routes.
case $reason in
  BOUND|RENEW|REBIND|REBOOT)
    unset new_routers
    ;;
esac

which should make dhclient avoid setting up default routes for queried interfaces.

For testing, one can issue:

dhclient eth1

and then check whether there is any default route with:

ip route show default

Defining Advanced Routing Tables

On Debian, advanced routing tables can be defined in /etc/iproute2/rt_tables line-by-line using a number (that can be referenced using iptables marking, but this article does not) and a descriptive name for the table.

A table will have to be created for each ISP. For instance, for two ISPs, the following lines will be added to /etc/iproute2/rt_tables:

501	isp1
502	isp2

These are just definitions and do nothing on their own for now; they are labels, if you will, for what will follow.
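The two entries could also be added from a script. The following is a minimal sketch, assuming a hypothetical helper named ensure_rt_table that appends an entry only when no table of that name is defined yet; it is demonstrated against a scratch file rather than the live /etc/iproute2/rt_tables.

```shell
#!/bin/sh
# Hypothetical helper: append "ID<TAB>NAME" to an rt_tables style file
# unless a table with that name is already defined there.
ensure_rt_table() {
    file=$1 id=$2 name=$3
    # Look for an existing, uncommented line whose second field is NAME.
    if ! awk -v n="$name" '$1 !~ /^#/ && $2 == n { found = 1 } END { exit !found }' "$file"; then
        printf '%s\t%s\n' "$id" "$name" >> "$file"
    fi
}

# Demonstration against a scratch copy rather than the live file.
RT_TABLES=$(mktemp)
ensure_rt_table "$RT_TABLES" 501 isp1
ensure_rt_table "$RT_TABLES" 502 isp2
ensure_rt_table "$RT_TABLES" 501 isp1   # second call is a no-op
cat "$RT_TABLES"
```

Checking for the name first keeps the file free of duplicate entries when the setup is run more than once.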

Using a Script to Automatically Add or Remove Interfaces

When interfaces go up or down, a script is used to add and remove routes and to add multiple interfaces to the default route in order to achieve the load-balancing effect. Once an interface that is also an ISP interface (configurable in the script) goes up, routes are added and then, if the interface is not part of the default route pool, the default route pool is rebuilt to include it. In case an interface goes down, the corresponding routes are removed but the default route pool is not recreated, in order to leave a marker for the kernel indicating that the interface is dead (it will appear as dead when issuing ip route show default).

The script has to be placed both at /etc/network/if-up.d/uplink-load-balancing (and will be executed when interfaces go up) and at /etc/network/if-down.d/uplink-load-balancing (and will be executed when interfaces go down).

One way to add the script in both locations without copying or creating the file twice is to create a hard link between /etc/network/if-up.d/uplink-load-balancing and /etc/network/if-down.d/uplink-load-balancing. This has the added bonus that changes to either one of them will be reflected in the other (in case the configuration must be changed).
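A minimal sketch of the hard-link approach follows, demonstrated in a scratch directory so that it can be tried without touching /etc/network. The helper name link_hook is hypothetical; on a real system the two arguments would be the if-up.d and if-down.d paths named above.

```shell
#!/bin/sh
# Hypothetical helper: make DST a hard link to SRC so that both names
# refer to the same file. Both paths must be on the same filesystem.
link_hook() {
    ln -f "$1" "$2"
}

# Demonstration in a scratch directory standing in for /etc/network.
HOOK_DIR=$(mktemp -d)
mkdir "$HOOK_DIR/if-up.d" "$HOOK_DIR/if-down.d"
printf '#!/bin/bash\n' > "$HOOK_DIR/if-up.d/uplink-load-balancing"
link_hook "$HOOK_DIR/if-up.d/uplink-load-balancing" \
          "$HOOK_DIR/if-down.d/uplink-load-balancing"
```

Note that some editors save by writing a new file and renaming it over the old one, which silently breaks the link; after editing, it is worth checking that both names still refer to the same file.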

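The script uses arrays where:

- IFACE_TABLES holds the routing tables defined in /etc/iproute2/rt_tables,
- IFACE_TABLES_PRIO holds the priority for each table,
- IFACE_DEVICE holds the interfaces through which traffic should be load-balanced,
- IFACE_IPS holds the IP addresses of the interfaces,
- IFACE_NMS holds the netmasks of the interfaces,
- IFACE_GWS holds the gateways of the interfaces,
- IFACE_WEI holds the weights of the interfaces,
- IFACE_MARK holds the marks (in hexadecimal) for iptables marking,

and each entry in one array corresponds to the entry at the same position in all the other arrays.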

Concerning interface weighting, the script as configured gives eth1 twice the weight of eth2: the first entry in IFACE_DEVICE is eth1 and the first entry in IFACE_WEI is 2, whereas the second entries are eth2 and 1.
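To illustrate how the weights end up in the default route, here is a simplified sketch of the command the script assembles. It is not the script itself: the helper name build_default_route_command is hypothetical, the values are passed as device, gateway and weight triples instead of being read from the arrays, and the command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) a multipath default route
# command. Arguments are triples of: device gateway weight.
build_default_route_command() {
    cmd="ip route replace default proto static"
    while [ "$#" -ge 3 ]; do
        cmd="$cmd nexthop via $2 dev $1 weight $3"
        shift 3
    done
    printf '%s\n' "$cmd"
}

# Values taken from the configuration shown in this article.
build_default_route_command eth1 23.206.132.1 2 eth2 95.173.136.1 1
```

With weights 2 and 1, roughly two out of three new routes would be expected to go out of eth1.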

Finally, the script can be marked executable by issuing:

chmod +x /etc/network/if-up.d/uplink-load-balancing
chmod +x /etc/network/if-down.d/uplink-load-balancing

and the setup is ready to go.

#!/bin/bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2017 - License: GNU GPLv3      ##
##  Please see: http://www.gnu.org/licenses/gpl.html for legal details,  ##
##  rights of fair usage, the disclaimer and warranty conditions.        ##
###########################################################################
 
###########################################################################
##                            CONFIGURATION                              ##
###########################################################################
 
###########################################################################
# This configuration section defines a few correlated arrays:             #
# - IFACE_TABLES                                                          #
# - IFACE_TABLES_PRIO                                                     #
# - IFACE_DEVICE                                                          #
# - IFACE_IPS                                                             #
# - IFACE_NMS                                                             #
# - IFACE_GWS                                                             #
# - IFACE_WEI                                                             #
# - IFACE_MARK                                                            #
# where each entry in one array corresponds to an entry in all the other  #
# arrays. In other words, to add an additional interface to this          #
# configuration, you would have to add a corresponding entry to all the   #
# other arrays.                                                           #
###########################################################################
 
# These are tables that must be defined in /etc/iproute2/rt_tables before
# running the script.
IFACE_TABLES=( isp1 isp2 )
# Priority for each table.
IFACE_TABLES_PRIO=( 250 251 )
# These are the interfaces through which traffic should be load-balanced.
IFACE_DEVICE=( eth1 eth2 )
# These are the IP addresses of the interfaces.
IFACE_IPS=( 23.206.132.14 95.173.136.70 )
# The corresponding netmask of both interfaces.
IFACE_NMS=( /24 /24 )
# The gateways of the interfaces.
IFACE_GWS=( 23.206.132.1 95.173.136.1 )
# Weights corresponding to the defined interfaces.
IFACE_WEI=( 2 1 )
# Interface mark for iptables marking (must be in hexadecimal).
IFACE_MARK=( 0x97 0x98 )
# General route options that will be added to routing definitions. It is
# recommended to at least specify "proto static" due to "proto static"
# being a hint to the operating system that these routes are voluntary and
# that other executing scripts should not consider them discardable.
ROUTE_OPTIONS="proto static initcwnd 70 initrwnd 70 quickack 1"
 
###########################################################################
##                              INTERNALS                                ##
###########################################################################
 
# Search the interface that just went up or down and then add the
# necessary rules and routes in order to send packets back out the same
# interface.
for i in ${!IFACE_DEVICE[*]}; do
    # Search for the device in the list of interfaces.
    if [ "$IFACE" = "${IFACE_DEVICE[$i]}" ]; then
        case "$MODE" in
            start)
                ip rule add from ${IFACE_IPS[$i]}${IFACE_NMS[$i]} table ${IFACE_TABLES[$i]} priority ${IFACE_TABLES_PRIO[$i]}
                ip route replace default $ROUTE_OPTIONS via ${IFACE_GWS[$i]} dev ${IFACE_DEVICE[$i]} table ${IFACE_TABLES[$i]}
                ip route append prohibit default table ${IFACE_TABLES[$i]} metric 1 proto static
                while [ `ip rule show | grep "fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}" | wc -l` != 0 ]; do
                    ip rule del fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}
                done
                ip rule add fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}
                # The default route will not be replaced because other
                # interfaces may also be active, in which case multi-path
                # routing would be chosen.
                # ip route replace default $ROUTE_OPTIONS via ${IFACE_GWS[$i]}
                ;;
            stop)
                while [ `ip rule show | grep "fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}" | wc -l` != 0 ]; do
                    ip rule del fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}
                done
                ip route del default $ROUTE_OPTIONS via ${IFACE_GWS[$i]} dev ${IFACE_DEVICE[$i]} table ${IFACE_TABLES[$i]}
                ip route del prohibit default table ${IFACE_TABLES[$i]} metric 1 proto static
                ip rule del from ${IFACE_IPS[$i]}${IFACE_NMS[$i]} table ${IFACE_TABLES[$i]} priority ${IFACE_TABLES_PRIO[$i]}
                # Delete the default route if it is set to the interface
                # that went down.
                ip route del default $ROUTE_OPTIONS via ${IFACE_GWS[$i]}
                ;;
        esac
        # Only one interface is reported by the if-up and if-down script hooks.
        break
    fi
done
 
# The default load-balancing route has to be re-initialized by adding new
# interfaces when they are brought up without removing when they are dead.
if [ "$MODE" = "start" ]; then
    ACTIVE_INTERFACES=`ip route show default | grep nexthop | wc -l`
    case "$ACTIVE_INTERFACES" in
        ${#IFACE_DEVICE[@]})
            # All defined interfaces have been added.
            ;;
        *)
            # Set up a standard default route statement.
            REPLACE_DEFAULT_ROUTE_COMMAND="ip route replace default $ROUTE_OPTIONS"
 
            # Add all active interfaces as nexthop to the command.
            for i in ${!IFACE_DEVICE[*]}; do
                # Retrieve the IP address of the interface - this filters
                # out dead interfaces implicitly.
                INTERFACE_ADDRESS=`ifconfig ${IFACE_DEVICE[$i]} |  grep 'inet' | egrep -o "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" | head -n 1`
                if [ ! -z "$INTERFACE_ADDRESS" ]; then
                    REPLACE_DEFAULT_ROUTE_COMMAND="$REPLACE_DEFAULT_ROUTE_COMMAND nexthop via ${IFACE_GWS[$i]} dev ${IFACE_DEVICE[$i]} weight ${IFACE_WEI[$i]}"
                fi
            done
            # Execute the command to replace the default route.
            eval $REPLACE_DEFAULT_ROUTE_COMMAND
            ;;
    esac
fi
 
# Flush the cache
ip route flush cache

Testing

In order to test the configuration, one would take all interfaces down manually using ifdown, in the discussed case, by issuing:

ifdown eth1
ifdown eth2

and then ensuring that there are no left-overs by checking the routing rules:

ip rule show

The script creates rules similar to the following; after taking the interfaces down, none of them should remain:

32761:	from 23.206.132.14 lookup isp1 
32762:	from 95.173.136.70 lookup isp2 

Next one would check that there is no default route set by issuing the following command:

ip route show default

Note that the article assumes that there exists a third interface for the LAN - in case the server is connected only to the ISPs to load-balance over, then an extra precaution has to be taken to not get disconnected by mistake when setting up (for instance, if you connect to the server remotely over SSH through one of the ISPs, you might end up being disconnected once the default route is removed).

If connectivity to the server is ensured, the following command can be used to remove the default route:

ip route del default

Finally, with all the configurations made and in-place, each interface can be brought up one-by-one. For instance, in the scope of the article:

ifup eth2

that will bring up the eth2 interface. Once an interface goes up, the recommended procedure is to check that the corresponding advanced routing rules are created and that a default path is created. Running the command triggers the script, which should add the rules and routes for the isp2 table and install a default route through eth2.

After that, other interfaces are brought up, for instance by issuing:

ifup eth1

that will, assuming that eth2 has previously been brought up, add the corresponding rules and routes for the isp1 table, and the default route will be changed to a load-balanced route:

default  proto static  initcwnd 70 initrwnd 70 quickack 1
	nexthop via 23.206.132.1  dev eth1 weight 2
	nexthop via 95.173.136.1  dev eth2 weight 1

Whenever either eth1 or eth2 goes down, one branch will be marked dead and the kernel will route through the active interface. In case both interfaces go down, the server will have no default route. In case eth1 goes down, eth1 will be marked dead and any new connections will be established through eth2. At a later time, when eth1 recovers, the script will rebuild the load-balanced default route and new connections will be scheduled either through eth1 or eth2 depending on the weights specified in the IFACE_WEI configuration key of the script.
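To check how many branches are still alive, the output of ip route show default can be filtered. The following is a minimal sketch, assuming that dead branches carry the word dead on their nexthop line; the helper name count_live_nexthops is hypothetical and the sample input is illustrative, not captured from a live system.

```shell
#!/bin/sh
# Hypothetical helper: read "ip route show default" output on stdin and
# print the number of nexthop lines that are not flagged as dead.
count_live_nexthops() {
    awk '/nexthop/ && !/ dead/ { n++ } END { print n + 0 }'
}

# Illustrative input, assuming the kernel has flagged eth1 as dead.
count_live_nexthops <<'EOF'
default  proto static  initcwnd 70 initrwnd 70 quickack 1
	nexthop via 23.206.132.1  dev eth1 weight 2 dead
	nexthop via 95.173.136.1  dev eth2 weight 1
EOF
```

On a live system the function would be fed directly: ip route show default | count_live_nexthops. A default route with a single gateway is printed without the nexthop keyword, so the sketch only applies to the load-balanced form.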

Automatically Querying Gateways for Availability

Whilst the previous scripts are responsible for updating the routes once an interface toggles status, an additional script is needed to query the gateway to ensure there is connectivity.

Querying the gateway could be performed periodically by polling with a tool such as ping. The script below is meant to run every minute as part of a crontab in order to check the status of the interfaces and bring them up or take them down depending on whether their gateways are responding.

#!/bin/bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2017 - License: GNU GPLv3      ##
##  Please see: http://www.gnu.org/licenses/gpl.html for legal details,  ##
##  rights of fair usage, the disclaimer and warranty conditions.        ##
###########################################################################
 
###########################################################################
##                            CONFIGURATION                              ##
###########################################################################
 
###########################################################################
# This configuration section defines a few correlated arrays:             #
# - IFACE_DEVICE                                                          #
# - IFACE_GWS                                                             #
# where each entry in one array corresponds to an entry in all the other  #
# arrays. In other words, to add an additional interface to this          #
# configuration, you would have to add a corresponding entry to all the   #
# other arrays.                                                           #
###########################################################################
 
# These are the interfaces through which traffic should be load-balanced.
IFACE_DEVICE=( eth1 eth2 )
 
# The gateways of the interfaces.
IFACE_GWS=( 23.206.132.1 95.173.136.1 )
 
###########################################################################
##                              INTERNALS                                ##
###########################################################################
 
for i in ${!IFACE_DEVICE[*]}; do
    # Bring the interface up if it is down.
    INTERFACE_STATUS=`ifconfig ${IFACE_DEVICE[$i]} | grep UP`
    if [ -z "$INTERFACE_STATUS" ]; then
        ifconfig ${IFACE_DEVICE[$i]} up
    fi
    # Check whether the gateway is responding.
    ping -r -n -c 1 -W 5 \
        -I ${IFACE_DEVICE[$i]} \
        -q ${IFACE_GWS[$i]} &>/dev/null
    case "$?" in
        0)
            # If the gateway is responding and the interface is configured
            # there is nothing to do so continue with all other interfaces
            ifquery --state ${IFACE_DEVICE[$i]} &>/dev/null
            if [ "$?" = 0 ]; then
                continue;
            fi
            # If the gateway is responsive but the interface is not
            # configured, then bring the interface up.
            ifup ${IFACE_DEVICE[$i]} &>/dev/null
            ;;
        *)
            # Otherwise, take the interface down.
            ifdown ${IFACE_DEVICE[$i]} &>/dev/null
            ifconfig ${IFACE_DEVICE[$i]} down
            ;;
    esac
done
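Since the script above is meant to run every minute from cron, a matching entry could be generated as sketched below. The install path /usr/local/sbin/uplink-gateway-check and the helper name cron_entry are assumptions made for illustration; the article does not prescribe where the script is stored.

```shell
#!/bin/sh
# Hypothetical install location of the polling script; adjust as needed.
CHECK_SCRIPT="${CHECK_SCRIPT:-/usr/local/sbin/uplink-gateway-check}"

# Print an /etc/cron.d style entry (which includes the user field) that
# runs the given script every minute as root.
cron_entry() {
    printf '* * * * * root %s\n' "$1"
}

cron_entry "$CHECK_SCRIPT"
```

The printed line could then be saved as a file under /etc/cron.d, or adapted for the crontab of the root user by dropping the user field.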

UPnP/NatPMP Issues

Since route selection is performed at the kernel level, a daemon meant to provide UPnP to an internal network will not work reliably: UPnP will only select a single WAN interface, such that client reconnects, or related connections to other IPs belonging to the same service, may be balanced out of a different interface, thereby making the inserted NAT firewall rule useless.

In cases where UPnP is necessary, a systemd unit file can be created in order to start linux-igd on all WAN interfaces thereby providing port mappings regardless of outbound route balancing.