The following article describes a setup for route-based load balancing across different Internet providers on disjoint networks using advanced Linux routing. The purpose is to distribute outbound connections over multiple uplinks and to allow uplinks to fail such that new outbound connections will seamlessly be routed over the remaining active uplinks.
For that purpose, Linux advanced routing is used, notably the ability to route and distribute packets out of different interfaces. The downside of the presented approach, contrasted for instance with BGP routing, is that active connections over links that fail will fail as well and will need to be rebuilt - in essence, this affects stateful protocols where established connections have to be maintained, and affects stateless protocols (such as HTTP) somewhat less.
The approach differs from the multiple connection load-balancing and routing article in that marking packets with iptables is not mandatory - although, given well-defined advanced routing tables, it would still allow sending site-specific marked packets through specific ISPs.
Informally, this article will allow you to distribute your Internet connection over multiple ISPs, with the added bonus that once one of your ISPs goes down, new routes will be established through the other ISPs that are still active. However, things like games, Skype and similar will still suffer an interruption if the connection was established over the ISP that failed, but you will be able to reconnect and the new connection will go through an ISP that is not down. Web browsing fares well because there are no permanently established connections, so it will look like a seamless transition from the failing ISP to the other. You also get the added bonus that once an ISP recovers, it will be added back to the pool of active ISPs. BGP is the better option here, since even stateful connections (games and such) fail over, but it requires your ISPs to cooperate, which they most likely will not.
dhclient is used. Some other DHCP client can be used provided that it can be configured to not retrieve routes, because the routes will be managed automatically by the script provided in the final section.
On Debian, the interfaces can be added to /etc/network/interfaces. As per the example topology of this article, one would need to add the following lines to /etc/network/interfaces:
# Internet - ISP1
auto eth1
iface eth1 inet dhcp

# Internet - ISP2
auto eth2
iface eth2 inet dhcp
where eth1 corresponds to the first ISP and eth2 corresponds to the second ISP.
If there are more than two ISPs through which balancing outbound connections is desired, then those have to be added as well.
If the Linux server is Debian-based and uses dhclient to retrieve DHCP information, then /etc/dhcp/dhclient.conf will have to be amended to remove the routers parameter from the request statement.
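As a rough sketch of that amendment, the helper below drops routers from a request statement read on standard input. The function name strip_routers is made up for illustration, and the sample request line is only an assumption about what the file contains; the actual statement in /etc/dhcp/dhclient.conf should be checked before editing it.

```shell
# Hypothetical helper: remove the "routers" option from dhclient.conf text
# read on standard input, leaving every other option untouched.
# Only the form "routers," (followed by a comma) is handled by this sketch.
strip_routers() {
    sed -E 's/(^|[[:space:]])routers,[[:space:]]*/\1/'
}

# Example with an assumed request statement:
printf '%s\n' 'request subnet-mask, broadcast-address, routers, domain-name;' | strip_routers
```

The helper only prints the result, so the output can be reviewed before it replaces the real configuration file.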
However, that is not enough and Debian might still add the default route for each interface. To make dhclient ignore the gateway for each queried interface, a file can be created at /etc/dhcp/dhclient-enter-hooks.d/no-default-route with the following contents:
# Do not add default routes.
case $reason in
    BOUND|RENEW|REBIND|REBOOT)
        unset new_routers
        ;;
esac
which should make dhclient avoid setting up default routes for queried interfaces.
For testing, one can issue:
dhclient eth1
and then check whether there is any default route with:
ip route show default
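The check can also be scripted. The following is a minimal sketch, assuming that any line starting with the word default in the command output counts as a default route; the function name has_default_route is hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) when the routing output
# passed as $1 lists at least one default route.
has_default_route() {
    printf '%s\n' "$1" | grep -q '^default'
}

# Usage on a live system (requires iproute2):
#   has_default_route "$(ip route show default)" && echo "default route still present"
```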
On Debian, advanced routing tables can be defined in /etc/iproute2/rt_tables, one per line, using a number (which can be referenced when marking packets with iptables, although this article does not do so) and a descriptive name for the table.
A table for each ISP will have to be created. For instance, for two ISPs, the following lines will be added to /etc/iproute2/rt_tables:
501 isp1
502 isp2
These are just definitions and do nothing on their own; they merely serve as labels for what follows.
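Since the script further down expects these tables to exist, a quick check can save some debugging. The sketch below assumes the usual "number name" layout of the file with # starting a comment line; the function name table_defined is hypothetical.

```shell
# Hypothetical helper: succeeds when table name $2 is defined on a
# non-comment line of the rt_tables-style file $1.
table_defined() {
    awk -v name="$2" '$1 !~ /^#/ && $2 == name { found = 1 } END { exit !found }' "$1"
}

# Usage for the two tables of this article:
#   table_defined /etc/iproute2/rt_tables isp1 || echo "missing table: isp1"
#   table_defined /etc/iproute2/rt_tables isp2 || echo "missing table: isp2"
```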
When interfaces go up or down, a script is used to add and remove routes and to add multiple interfaces to the default route in order to achieve the load-balancing effect. Once an interface goes up that is also an ISP interface (configurable in the script), routes are added and then, if the interface is not part of the default route pool, the default route pool is rebuilt to include it. In case an interface goes down, the corresponding routes are removed but the default route pool is not recreated, in order to leave a marker for the kernel indicating that the interface is dead (it will appear as dead when issuing ip route show default).
The script has to be placed both at /etc/network/if-up.d/uplink-load-balancing (where it will be executed when interfaces go up) and at /etc/network/if-down.d/uplink-load-balancing (where it will be executed when interfaces go down).
One way to add the script in both locations that avoids the trivial approach of copying or creating the file twice is to create a hard link between /etc/network/if-up.d/uplink-load-balancing and /etc/network/if-down.d/uplink-load-balancing, with the added bonus that changes to either one of them will be reflected in the other (in case the configuration must be changed).
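A minimal sketch of the hard-link approach follows; the helper name link_hook is hypothetical and simply wraps ln. Both paths must reside on the same filesystem for a hard link to work.

```shell
# Hypothetical helper: hard-link an existing hook script ($1) to a second
# path ($2) so that both names refer to the same file.
link_hook() {
    ln -f "$1" "$2"
}

# Usage for the two hook directories named in the article (run as root):
#   link_hook /etc/network/if-up.d/uplink-load-balancing \
#             /etc/network/if-down.d/uplink-load-balancing
```

One thing to keep in mind is that some editors and tools save by replacing the file rather than writing into it, which silently breaks the link; after editing, it is worth checking that both paths still show the same content.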
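The script uses arrays where the routing tables defined in /etc/iproute2/rt_tables and the settings of the corresponding interfaces have to be defined one after the other in IFACE_TABLES, IFACE_DEVICE, IFACE_IPS, IFACE_GWS and IFACE_WEI, such that entries at the same position in each array belong to the same ISP.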
Concerning interface weighting, the script as configured will schedule routes over eth1 twice as often as over eth2, because the first entry in IFACE_DEVICE is eth1 and the first entry in IFACE_WEI is 2, whereas the second entry in IFACE_WEI, corresponding to the eth2 device, is 1.
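To illustrate how the weights end up in the default route, the sketch below assembles a multipath route command from gateway, device and weight triples and prints it without executing anything. The function name build_multipath_cmd and the colon-separated argument format are assumptions of this example, not something the script itself uses.

```shell
# Hypothetical helper: print (without executing) a multipath default route
# command built from "gateway:device:weight" arguments.
build_multipath_cmd() {
    cmd="ip route replace default proto static"
    for hop in "$@"; do
        gw=${hop%%:*}        # text before the first colon
        rest=${hop#*:}       # text after the first colon
        dev=${rest%%:*}
        weight=${rest#*:}
        cmd="$cmd nexthop via $gw dev $dev weight $weight"
    done
    printf '%s\n' "$cmd"
}

# Values taken from the example configuration of this article:
build_multipath_cmd 23.206.132.1:eth1:2 95.173.136.1:eth2:1
```

With weights 2 and 1, the kernel is expected to pick eth1 for roughly two out of every three new flows, although the exact distribution depends on how the kernel hashes connections.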
Finally, the script can be marked executable by issuing:
chmod +x /etc/network/if-up.d/uplink-load-balancing
chmod +x /etc/network/if-down.d/uplink-load-balancing
and the setup is ready to go.
#!/bin/bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2017 - License: GNU GPLv3      ##
##  Please see: http://www.gnu.org/licenses/gpl.html for legal details,  ##
##  rights of fair usage, the disclaimer and warranty conditions.        ##
###########################################################################

###########################################################################
##                            CONFIGURATION                              ##
###########################################################################

###########################################################################
# This configuration section defines a few correlated arrays:
#   - IFACE_TABLES
#   - IFACE_TABLES_PRIO
#   - IFACE_DEVICE
#   - IFACE_IPS
#   - IFACE_NMS
#   - IFACE_GWS
#   - IFACE_WEI
#   - IFACE_MARK
# where each entry in one array corresponds to an entry in all the other
# arrays. In other words, to add an additional interface to this
# configuration, you would have to add a corresponding entry to all the
# other arrays.
###########################################################################

# These are tables that must be defined in /etc/iproute2/rt_tables before
# running the script.
IFACE_TABLES=( isp1 isp2 )
# Priority for each table.
IFACE_TABLES_PRIO=( 250 251 )
# These are the interfaces through which traffic should be load-balanced.
IFACE_DEVICE=( eth1 eth2 )
# These are the IP addresses of the interfaces.
IFACE_IPS=( 23.206.132.14 95.173.136.70 )
# The corresponding netmask of both interfaces.
IFACE_NMS=( /24 /24 )
# The gateways of the interfaces.
IFACE_GWS=( 23.206.132.1 95.173.136.1 )
# Weights corresponding to the defined interfaces.
IFACE_WEI=( 2 1 )
# Interface mark for iptables marking (must be in hexadecimal).
IFACE_MARK=( 0x97 0x98 )
# General route options that will be added to routing definitions. It is
# recommended to at least specify "proto static" due to "proto static"
# being a hint to the operating system that these routes are voluntary and
# that other executing scripts should not consider them discardable.
ROUTE_OPTIONS="proto static initcwnd 70 initrwnd 70 quickack 1"

###########################################################################
##                              INTERNALS                                ##
###########################################################################

# Search the interface that just went up or down and then add the
# necessary rules and routes in order to send packets back out the same
# interface. IFACE and MODE are provided by the if-up / if-down hooks.
for i in ${!IFACE_DEVICE[*]}; do
    # Search for the device in the list of interfaces.
    if [ "$IFACE" = "${IFACE_DEVICE[$i]}" ]; then
        case "$MODE" in
            start)
                ip rule add from ${IFACE_IPS[$i]}${IFACE_NMS[$i]} table ${IFACE_TABLES[$i]} priority ${IFACE_TABLES_PRIO[$i]}
                ip route replace default $ROUTE_OPTIONS via ${IFACE_GWS[$i]} dev ${IFACE_DEVICE[$i]} table ${IFACE_TABLES[$i]}
                ip route append prohibit default table ${IFACE_TABLES[$i]} metric 1 proto static
                # Remove any stale fwmark rules before adding a fresh one.
                while [ `ip rule show | grep "fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}" | wc -l` != 0 ]; do
                    ip rule del fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}
                done
                ip rule add fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}
                # The default route will not be replaced because other
                # interfaces may also be active, in which case multi-path
                # routing would be chosen.
                # ip route replace default $ROUTE_OPTIONS via ${IFACE_GWS[$i]}
                ;;
            stop)
                while [ `ip rule show | grep "fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}" | wc -l` != 0 ]; do
                    ip rule del fwmark ${IFACE_MARK[$i]} lookup ${IFACE_TABLES[$i]}
                done
                ip route del default $ROUTE_OPTIONS via ${IFACE_GWS[$i]} dev ${IFACE_DEVICE[$i]} table ${IFACE_TABLES[$i]}
                ip route del prohibit default table ${IFACE_TABLES[$i]} metric 1 proto static
                ip rule del from ${IFACE_IPS[$i]}${IFACE_NMS[$i]} table ${IFACE_TABLES[$i]} priority ${IFACE_TABLES_PRIO[$i]}
                # Delete the default route if it is set to the interface
                # that went down.
                ip route del default $ROUTE_OPTIONS via ${IFACE_GWS[$i]}
                ;;
        esac
        # Only one interface is reported by the if-up and if-down script hooks.
        break
    fi
done

# The default load-balancing route has to be re-initialized by adding new
# interfaces when they are brought up without removing when they are dead.
if [ "$MODE" = "start" ]; then
    ACTIVE_INTERFACES=`ip route show default | grep nexthop | wc -l`
    case "$ACTIVE_INTERFACES" in
        ${#IFACE_DEVICE[@]})
            # All defined interfaces have been added.
            ;;
        *)
            # Set up a standard default route statement.
            REPLACE_DEFAULT_ROUTE_COMMAND="ip route replace default $ROUTE_OPTIONS"
            # Add all active interfaces as nexthop to the command.
            for i in ${!IFACE_DEVICE[*]}; do
                # Retrieve the IP address of the interface - this filters
                # out dead interfaces implicitly.
                INTERFACE_ADDRESS=`ifconfig ${IFACE_DEVICE[$i]} | grep 'inet' | egrep -o "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" | head -n 1`
                if [ ! -z "$INTERFACE_ADDRESS" ]; then
                    REPLACE_DEFAULT_ROUTE_COMMAND="$REPLACE_DEFAULT_ROUTE_COMMAND nexthop via ${IFACE_GWS[$i]} dev ${IFACE_DEVICE[$i]} weight ${IFACE_WEI[$i]}"
                fi
            done
            # Execute the command to replace the default route.
            eval $REPLACE_DEFAULT_ROUTE_COMMAND
            ;;
    esac
fi

# Flush the cache
ip route flush cache
In order to test the configuration, one would take all interfaces down manually using ifdown, in the discussed case, by issuing:
ifdown eth1
ifdown eth2
and then ensuring that there are no left-overs by checking the routing rules:
ip rule show
Rules created by the script look similar to the following, and none of them should remain once the interfaces are down:
32761: from 23.206.132.14 lookup isp1
32762: from 95.173.136.70 lookup isp2
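Checking for left-overs can be scripted as well. The sketch below filters the text of ip rule show for rules that still point at the ISP tables; the function name leftover_rules is hypothetical and the sample rule format is taken from the listing above.

```shell
# Hypothetical helper: print the rules from "ip rule show" output ($1)
# that still reference any of the given table names ($2 and onwards).
leftover_rules() {
    rules=$1
    shift
    for table in "$@"; do
        # Match the table name at the end of a rule or followed by a space.
        printf '%s\n' "$rules" | grep -E "lookup $table( |\$)" || true
    done
}

# Usage on a live system (requires iproute2):
#   leftover_rules "$(ip rule show)" isp1 isp2
```

An empty result means that no rule refers to the listed tables any more.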
Next one would check that there is no default route set by issuing the following command:
ip route show default
Note that the article assumes that there exists a third interface for the LAN - in case the server is connected only to the ISPs to load-balance over, then an extra precaution has to be taken to not get disconnected by mistake when setting up (for instance, if you connect to the server remotely over SSH through one of the ISPs, you might end up being disconnected once the default route is removed).
If connectivity to the server is ensured, the following command can be used to remove the default route:
ip route del default
Finally, with all the configurations made and in-place, each interface can be brought up one-by-one. For instance, in the scope of the article:
ifup eth2
that will bring up the eth2 interface. Once an interface goes up, the recommended procedure is to check that the corresponding advanced routing rules and a default route have been created. Bringing up eth2 triggers the script, which should create the following:
- a default route default via 95.173.136.1 dev eth2 plus any options specified in ROUTE_OPTIONS (verifiable by issuing ip route show default),
- a rule from 95.173.136.70 lookup isp2 for isp2 (verifiable by issuing ip rule show),
- a route default via 95.173.136.1 dev eth2 plus any options specified in the ROUTE_OPTIONS configuration section of the script in the isp2 table (verifiable by issuing ip route show table isp2).
After that, other interfaces are brought up, for instance by issuing:
ifup eth1
that will have the following effect (assuming that eth2 has previously been brought up):
- a rule from 23.206.132.14 lookup isp1 is created for isp1 (verifiable by issuing ip rule show),
- a route default via 23.206.132.1 dev eth1 plus any options specified in the ROUTE_OPTIONS configuration section of the script is added to the isp1 table (verifiable by issuing ip route show table isp1),
and the default route will be changed to a load-balanced route:
default proto static initcwnd 70 initrwnd 70 quickack 1
    nexthop via 23.206.132.1 dev eth1 weight 2
    nexthop via 95.173.136.1 dev eth2 weight 1
Whenever either eth1 or eth2 goes down, its branch will be marked dead and the kernel will route through the remaining active interface. In case both interfaces go down, the server will have no default route. For example, if eth1 goes down, eth1 will be marked dead and any new connections will be established through eth2. At a later time, when eth1 recovers, the script will rebuild the load-balanced default route and new connections will be scheduled through either eth1 or eth2 depending on the weights specified in the IFACE_WEI configuration key of the script.
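The dead marker can be picked up by a script, for example for monitoring. The following sketch extracts the device names of nexthops flagged dead from the text of ip route show default; the function name dead_devices is hypothetical, and the exact wording of the kernel output (here assumed to contain the word dead on the nexthop line) may differ between iproute2 versions.

```shell
# Hypothetical helper: print the device of every nexthop flagged "dead"
# in the "ip route show default" output passed as $1.
dead_devices() {
    printf '%s\n' "$1" | awk '/nexthop/ && /(^| )dead( |$)/ {
        # The device name is the field following the "dev" keyword.
        for (i = 1; i < NF; i++) if ($i == "dev") print $(i + 1)
    }'
}

# Usage on a live system (requires iproute2):
#   dead_devices "$(ip route show default)"
```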
Whilst the previous scripts are responsible for updating the routes once an interface toggles status, an additional script is needed to query the gateway to ensure there is connectivity.
Querying the gateway could be performed periodically by polling with a tool such as ping. The script below is meant to run every minute as part of a crontab in order to check the status of the interfaces and to bring them up, or take them down, depending on whether their gateways are responding.
#!/bin/bash
###########################################################################
##  Copyright (C) Wizardry and Steamworks 2017 - License: GNU GPLv3      ##
##  Please see: http://www.gnu.org/licenses/gpl.html for legal details,  ##
##  rights of fair usage, the disclaimer and warranty conditions.        ##
###########################################################################

###########################################################################
##                            CONFIGURATION                              ##
###########################################################################

###########################################################################
# This configuration section defines a few correlated arrays:
#   - IFACE_DEVICE
#   - IFACE_GWS
# where each entry in one array corresponds to an entry in all the other
# arrays. In other words, to add an additional interface to this
# configuration, you would have to add a corresponding entry to all the
# other arrays.
###########################################################################

# These are the interfaces through which traffic should be load-balanced.
IFACE_DEVICE=( eth1 eth2 )
# The gateways of the interfaces.
IFACE_GWS=( 23.206.132.1 95.173.136.1 )

###########################################################################
##                              INTERNALS                                ##
###########################################################################

for i in ${!IFACE_DEVICE[*]}; do
    # Bring the interface up if it is down.
    INTERFACE_STATUS=`ifconfig ${IFACE_DEVICE[$i]} | grep UP`
    if [ -z "$INTERFACE_STATUS" ]; then
        ifconfig ${IFACE_DEVICE[$i]} up
    fi
    # Check whether the gateway is responding.
    ping -r -n -c 1 -W 5 \
        -I ${IFACE_DEVICE[$i]} \
        -q ${IFACE_GWS[$i]} &>/dev/null
    case "$?" in
        0)
            # If the gateway is responding and the interface is configured
            # there is nothing to do so continue with all other interfaces
            ifquery --state ${IFACE_DEVICE[$i]} &>/dev/null
            if [ "$?" = 0 ]; then
                continue
            fi
            # If the gateway is responsive but the interface is not
            # configured, then bring the interface up.
            ifup ${IFACE_DEVICE[$i]} &>/dev/null
            ;;
        *)
            # Otherwise, take the interface down.
            ifdown ${IFACE_DEVICE[$i]} &>/dev/null
            ifconfig ${IFACE_DEVICE[$i]} down
            ;;
    esac
done
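The decision taken by the polling script for each interface boils down to two exit statuses. The sketch below restates that logic as a pure function so that it can be reasoned about without touching any interface; the function name decide_action and the printed action labels are made up for illustration.

```shell
# Hypothetical helper: map the exit status of ping ($1) and of
# "ifquery --state" ($2) to the action the polling script takes.
decide_action() {
    if [ "$1" -ne 0 ]; then
        echo ifdown    # gateway not responding: take the interface down
    elif [ "$2" -ne 0 ]; then
        echo ifup      # gateway responding but interface not configured
    else
        echo none      # gateway responding and interface configured
    fi
}

decide_action 0 1
```

A gateway that fails to answer always leads to ifdown, regardless of whether the interface is currently configured.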
Since the route selection is performed at the kernel level, a daemon meant to provide UPnP to an internal network will not work reliably: UPnP will only select a single WAN interface, such that client reconnects, or related connections to other IPs belonging to the same service, may be balanced out of a different interface, thereby making the inserted NAT firewall rule useless.
In cases where UPnP is necessary, a systemd unit file can be created in order to start linux-igd on all WAN interfaces, thereby providing port mappings regardless of outbound route balancing.