When looking at packet marking, it seems desirable to have a way to mark, or rather track, packets across the entire path to the final gateway, in order to be able to perform routing decisions without having to import the routes to all machines behind the gateway.
Unfortunately, packet marks are local to the machine and are not part of any packet field, so packets can only be marked and tracked on the same machine and within the same IP stack; the mark does not carry over to upstream machines once the packet leaves the packet filter.
For this task, the ToS field seems the most suitable place to add information so that the packet can be treated conditionally on an upstream machine. After all, the ToS field was specifically conceived to route packets conditionally depending on the type of service that they require.
The ToS field has since been redefined as the 8-bit Differentiated Services (DS) field, which consists of a 6-bit Differentiated Services Code Point (DSCP) field and a 2-bit ECN field.
It is possible to mark packets with DSCP simply by using iptables, very similarly to marking. For example, the following command:
iptables -t mangle -A OUTPUT -p icmp -j DSCP --set-dscp 0x7
will mangle all outbound ICMP packets such that they leave the machine with a DSCP value of 0x7. For example, using ping to send ICMP requests whilst monitoring traffic on the gateway will reveal output along the lines of:
00:80:43.414253 IP (tos 0x1c, ttl 64, id 12090, offset 0, flags [DF], proto ICMP (1), length 84) a > b : ICMP echo request, id 4919, seq 7, length 64
with the ToS field set to 0x1c.
Notice that the DSCP was set to the number 0x7, which, represented as a binary number with 6 digits, is:
000111
but because the entire ToS field is 8 bits long, the complete number includes the last two 0s for the ECN field:
00011100
Converting 00011100 to hexadecimal yields 0x1c, the value observable with the tcpdump packet sniffer.
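The conversion from a DSCP value to the ToS byte amounts to a left shift by two bits. A minimal sketch in shell arithmetic, where dscp_to_tos is a hypothetical helper name (not part of iptables or tcpdump) and both ECN bits are assumed to be zero:

```shell
#!/bin/sh
# Hypothetical helper: convert a DSCP value to the ToS byte shown by tcpdump.
# Assumes both ECN bits are 0.
dscp_to_tos() {
    # DSCP occupies the upper 6 bits, so shift it left past the 2 ECN bits.
    printf '0x%02x\n' "$(( ($1 & 0x3f) << 2 ))"
}

dscp_to_tos 0x7    # prints 0x1c
```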
Naturally, on the upstream machine it is possible to match against the DSCP value and then perform any operation upon the packet. iptables provides a DSCP match that can be used to select such packets. For example, the following line will match the ICMP packets with a DSCP value of 0x7 set by the previous iptables command, as they enter the packet filter on an upstream machine:
iptables -t mangle -A PREROUTING -p icmp -m dscp --dscp 0x7 -j DROP
One use could be to mark, match and route packets originating from sources that are difficult to match against. For example, rtorrent is a torrent client that does not allow specifying the source ports, so there is little left to match against if one wishes to redirect torrent traffic conditionally.
Imagine a setup where rtorrent is running on a machine "A" and routing through the upstream gateway on machine "B". In turn, machine "B" has a direct uplink to the Internet, but also maintains a connection to the Internet via a VPN.
The goal is to route all rtorrent traffic through the VPN connection after the upstream gateway, instead of taking the direct path to the Internet. This scenario is fairly realistic given that some ISPs apply policies to torrent traffic such that a VPN must be used.
The first step is to create a group on the machine running rtorrent (machine "A" in the example) and ensure that the rtorrent executable runs under the newly created group. The iptables owner match can then be used, still on machine "A", to match the packets originating from the group running rtorrent and set a DSCP value on all such packets leaving the machine:
iptables -t mangle -A OUTPUT -m owner --gid-owner RTORRENT_GROUP_ID -j DSCP --set-dscp 0x7
where:
  * RTORRENT_GROUP_ID is the ID of the group that the rtorrent client is running under on the client machine.

Then, on the upstream machine (machine "B" in the example), the packets are matched by DSCP value (0x7) and marked locally, still on machine "B", with the firewall mark 0x3 for routing:
iptables -t mangle -A PREROUTING -m dscp --dscp 0x7 -j MARK --set-mark 0x3
Next, it is assumed that the mark 0x3 corresponds to a routing table previously created with iproute2 that will route the packet through the VPN. For example, a rule could be established based on the firewall mark:
ip rule add fwmark FIREWALL_MARK table VPN_TABLE
where:
  * FIREWALL_MARK is the firewall mark, following the example, 0x3,
  * VPN_TABLE is a descriptive name for the lookup table (must be added to /etc/iproute2/rt_tables),

and then the table is populated with a route through the VPN:
ip route add default via VPN_GATEWAY table VPN_TABLE
where:
  * VPN_GATEWAY is the IP address of the upstream VPN gateway,
  * VPN_TABLE is a descriptive name for the lookup table (must be added to /etc/iproute2/rt_tables).

It is worth remembering that, compared to firewall marking, the advantage of DSCP marking is that it is preserved across all hops, provided that no packet mangling along the way rewrites the DSCP value.
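The whole setup on both machines can be summarised in a small dry-run sketch that only prints the commands for review instead of executing them. The group ID, table name and gateway address below are illustrative assumptions, as are the function names. The sketch places the rules in the mangle table, since the DSCP target alters packets and a mark has to be set in the mangle table in order to affect routing:

```shell
#!/bin/sh
# Sketch only: prints the commands for review instead of running them.
# All values below are illustrative assumptions.
RTORRENT_GROUP_ID=1001   # hypothetical group ID on machine "A"
DSCP_VALUE=0x7           # DSCP value used throughout the example
FIREWALL_MARK=0x3        # local firewall mark on machine "B"
VPN_TABLE=vpn            # hypothetical name listed in /etc/iproute2/rt_tables
VPN_GATEWAY=10.8.0.1     # hypothetical VPN gateway address

# Machine "A": tag everything sent by the rtorrent group with the DSCP value.
machine_a_rules() {
    echo "iptables -t mangle -A OUTPUT -m owner --gid-owner $RTORRENT_GROUP_ID -j DSCP --set-dscp $DSCP_VALUE"
}

# Machine "B": translate the DSCP value into a local mark and route on it.
machine_b_rules() {
    echo "iptables -t mangle -A PREROUTING -m dscp --dscp $DSCP_VALUE -j MARK --set-mark $FIREWALL_MARK"
    echo "ip rule add fwmark $FIREWALL_MARK table $VPN_TABLE"
    echo "ip route add default via $VPN_GATEWAY table $VPN_TABLE"
}

machine_a_rules
machine_b_rules
```

Printing rather than executing keeps the sketch safe to run without root privileges; the printed lines would still have to be reviewed and run on the respective machine.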
Using tcpdump it is possible to observe a particular DSCP value by using a carefully crafted IP field matcher:
tcpdump -vvv -i eth0 '(ip and (ip[1] & 0xfc) >> 2 == 20)'
Looking closer at the following tcpdump formula:
(ip and (ip[1] & 0xfc) >> 2 == 20)
it can be unpacked step by step, starting from the innermost expression:
  * ip[1] is the second byte of the IP header, thus the ToS / DSCP field (the first byte holds the version number and header length),
  * the byte is masked with 0xfc (binary 11111100) and then,
  * shifted right by 2 bits (leaving 00xxxxxx),
  * 00xxxxxx is then compared to the decimal number 20.
Taking the previous example, where the ToS value 0x1c was observed with tcpdump and the DSCP to match was 0x7, the numeric values can be substituted into the formula and then reduced further to the equality:
(0x1c & 0xfc) >> 2 = 0x1c >> 2 = 0x7
which checks out.
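The same reduction can be checked with shell arithmetic. In this minimal sketch, tos_to_dscp is a hypothetical helper name that mirrors the mask-and-shift of the tcpdump expression:

```shell
#!/bin/sh
# Hypothetical helper: extract the DSCP value from a ToS byte,
# mirroring the tcpdump expression (ip[1] & 0xfc) >> 2.
tos_to_dscp() {
    # Mask off the 2 ECN bits, then shift the 6 DSCP bits down.
    echo "$(( ($1 & 0xfc) >> 2 ))"
}

tos_to_dscp 0x1c    # prints 7
```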
Thus, the following tcpdump filter:
<code bash>
tcpdump -vvv -i eth0 '(ip and (ip[1] & 0xfc) >> 2 == 7)'
</code>
can be used to match packets with a DSCP value of 0x7.
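Since only the final number changes between DSCP values, the filter string can be generated. A minimal sketch, where dscp_filter is a hypothetical helper name and the value may be given in decimal or hexadecimal:

```shell
#!/bin/sh
# Hypothetical helper: build the tcpdump filter for a given DSCP value.
dscp_filter() {
    # $(( ... )) normalises hexadecimal input such as 0x7 to decimal.
    printf '(ip and (ip[1] & 0xfc) >> 2 == %d)\n' "$(( $1 ))"
}

dscp_filter 0x7    # prints (ip and (ip[1] & 0xfc) >> 2 == 7)
```

The result would be passed to tcpdump as a single quoted argument, for instance tcpdump -vvv -i eth0 "$(dscp_filter 0x7)".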