--- fuss:networking [2016/04/20 15:43] – [Determining Open Outbound Ports] office
+++ fuss:networking [2020/05/16 22:55] – [Determine ISP Address Blocks] office
@@ Line 1: / Line 1: @@
+====== Conditionally Routing Packets ======
+<ditaa>
+              +--------+
+              | Server |
+              +--------+
+    if:eth1/br0   ^                          +-----------------+
+   ip:192.168.0.5 |                          | Internet        |
+                  |       +---------+  eth0  |                 |
+                  +------>|   Us    +<------->                 |
+                          +---------+        |                 |
+                  if:tap0      ^             |    +--------+   |
+                gw:29.145.62.1 |             |    | Client |   |
+                  port:9999    +------------>+    +--------+   |
+                                             |                 |
+                                             +-----------------+
+</ditaa>
+First, add the table to ''/etc/iproute2/rt_tables'':
+<code>
+     output
+</code>
+Set the default route of the table ''output'' to go out through ''tap0'' and make a rule such that all packets with mark ''501'' will use that route in the table ''output'':
+<code bash>
+ip route add default via 29.145.62.1 dev tap0 table output
+ip rule add fwmark 501 lookup output
+ip route flush cache
+</code>
+Mark all outgoing packets from port ''9999'' with mark ''501'' and NAT them to the local ''IP'':
+<code bash>
+iptables -t mangle -A PREROUTING -s 192.168.0.5 -p tcp --sport 9999 -j MARK --set-mark 501
+iptables -t nat -A PREROUTING -i tap0 -p tcp --dport 9999 -j DNAT --to 192.168.0.5
+# This is not needed if you masquerade:
+iptables -t nat -A POSTROUTING -o tap0 -j SNAT --to 29.145.62.1
+</code>
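The routing steps above can be collected into a small dry-run helper. This is only a sketch: ''policy_route_cmds'' is a hypothetical name, not part of iproute2, and it merely prints the ''ip'' commands for the mark, gateway, device and table passed as arguments so that they can be reviewed before being run as root.

```bash
# Hypothetical helper: print (do not run) the policy-routing commands.
# $1: firewall mark, $2: gateway address, $3: outgoing device, $4: table name
policy_route_cmds() {
    echo "ip route add default via $2 dev $3 table $4"
    echo "ip rule add fwmark $1 lookup $4"
    echo "ip route flush cache"
}

# Values taken from the example above.
policy_route_cmds 501 29.145.62.1 tap0 output
```

Once the printed commands look right, the output can be piped to ''sh'' as root.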
+====== Enable TSO ======
+TCP Segmentation Offload (TSO) is meant for high-bandwidth networks and reduces the CPU workload by queueing up large buffers and letting the network card split them into packets.
+===== Linux =====
+TSO can be enabled for a network card using:
+<code bash>
+ethtool -K eth0 tso on
+</code>
+and on Debian it can be enabled by editing ''/etc/network/interfaces'':
+<code>
+# The primary network interface
+allow-hotplug eth0
+iface eth0 inet dhcp
+        up sleep 5; ethtool -K eth0 tso on
+</code>
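Whether the setting took effect can be checked with ''ethtool -k'', which lists the offload features of an interface. The following is a minimal sketch, assuming the output contains a ''tcp-segmentation-offload:'' line; ''tso_state'' is a hypothetical helper name.

```bash
# Hypothetical helper: read `ethtool -k <interface>` output on stdin and
# print the state ("on" or "off") of TCP segmentation offload.
tso_state() {
    awk '/^tcp-segmentation-offload:/ { print $2; exit }'
}

# Usage (not run here, needs a real interface):
#   ethtool -k eth0 | tso_state
```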
+===== Windows =====
+  * Go to ''My Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters''.
+  * Create a ''DWORD'' named ''DisableLargeSendOffload''.
+  * Set the value to 0.
+  * Reboot.
+====== Tuning Initial Congestion Window Size ======
+In [[http://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/|simple terms]], this reduces latency by allowing more packets to be sent at the start of a TCP connection, before the first acknowledgement arrives. The initial congestion window size can be tuned when the interface goes up by creating an executable file in ''/etc/network/if-up.d/'' named ''iniconwin'' containing the following:
+<code bash>
+#!/bin/sh -e
+##########################################################
+##   (C) Wizardry and Steamworks 2014, license: GPLv3   ##
+##########################################################
+# Do not bother to do anything if the interface does not
+# correspond to the interface for the default route.
+if [ "$IFACE" != eth0 ]; then
+	exit 0
+fi
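+# Strip any existing initcwnd / initrwnd values from the default route
+# (-E enables extended regular expressions so that "+" means "one or more").
+ip route change $(ip route show | grep '^default' | sed -E 's/initcwnd [0-9]+//; s/initrwnd [0-9]+//') initcwnd 12 initrwnd 12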
+</code>
+The script assumes that the default route goes out through ''eth0''; if it does not, change ''eth0'' in the script to the interface that carries the default route.
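The route rewriting performed by the script can be tried out on a plain string before touching the routing table. This is a sketch only: ''with_initial_windows'' is a hypothetical helper, and the sample route line is made up for illustration.

```bash
# Hypothetical helper: take a route line as printed by `ip route show` and a
# window size, drop any existing initcwnd/initrwnd values and append new ones.
with_initial_windows() {
    printf '%s\n' "$1" |
        sed -E 's/ *init[cr]wnd [0-9]+//g; s/ +$//' |
        sed "s/\$/ initcwnd $2 initrwnd $2/"
}

# Sample route line (illustrative values only).
with_initial_windows "default via 192.168.1.1 dev eth0 initcwnd 10" 12
# prints: default via 192.168.1.1 dev eth0 initcwnd 12 initrwnd 12
```

The printed line is what would be handed to ''ip route change''.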
+====== Set Type of Service for Traffic Shaping ======
+Assuming that you have ''wondershaper'' installed and configured, the following ''TOS'' rules using ''iptables'' will help you prioritize traffic:
+<code bash>
+## ToS
+for table in OUTPUT PREROUTING; do
+	# HTTP / HTTPS
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --dport 80 -j TOS --set-tos Maximize-Throughput
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --sport 80 -j TOS --set-tos Maximize-Throughput
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --dport 443 -j TOS --set-tos Maximize-Throughput
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --sport 443 -j TOS --set-tos Maximize-Throughput
+	# DNS
+	iptables -t mangle -A $table -p udp -m state --state NEW,ESTABLISHED,RELATED --dport 53 -j TOS --set-tos Minimize-Delay
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --sport 53 -j TOS --set-tos Minimize-Delay
+	# SSH
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --dport 22 -j TOS --set-tos Minimize-Delay
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --sport 22 -j TOS --set-tos Minimize-Delay
+	# Samba
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --dport 137 -j TOS --set-tos Maximize-Throughput
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --sport 138 -j TOS --set-tos Maximize-Throughput
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --dport 139 -j TOS --set-tos Maximize-Throughput
+	iptables -t mangle -A $table -p tcp -m state --state NEW,ESTABLISHED,RELATED --sport 445 -j TOS --set-tos Maximize-Throughput
+done
+</code>
+====== Get Available Congestion Control Algorithms ======
+<code bash>
+sysctl net.ipv4.tcp_available_congestion_control
+</code>
+====== Calculate Transmit Queue Length ======
+The following formula can be used to calculate the ''txqueue'' size using the ''BDP'' rule:
+\begin{eqnarray*}
+TXQ = \frac{v_{d} * t * 0.125}{MTU}
+\end{eqnarray*}
+where:
+  * $v_{d}$ is the downlink speed in bits per second (from the gateway).
+  * $t$ is the delay in seconds (measured to the gateway using ''ping'').
+  * $MTU$ is the packet size in bytes (usually ''1500'').
+The result can then be set under Linux with:
+<code bash>
+ifconfig <interface> txqueuelen <value>
+</code>
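As a worked example of the formula, the sketch below computes the queue length with integer shell arithmetic. The helper name ''txqueuelen_for'', the choice of milliseconds for the delay (as reported by ''ping'') and rounding up to whole packets are assumptions made for illustration.

```bash
# Hypothetical helper implementing TXQ = v_d * t * 0.125 / MTU.
# $1: downlink speed in bit/s, $2: delay in milliseconds, $3: MTU in bytes
txqueuelen_for() {
    # bit/s * ms / 1000 (ms per s) / 8 (bits per byte) = bytes in flight
    bdp_bytes=$(( $1 * $2 / 8000 ))
    # Divide by the MTU, rounding up to a whole number of packets.
    echo $(( (bdp_bytes + $3 - 1) / $3 ))
}

# 100 Mbit/s downlink, 20 ms delay, MTU of 1500 bytes.
txqueuelen_for 100000000 20 1500   # prints 167
```

On systems without ''ifconfig'', the result can be applied with ''ip link set dev <interface> txqueuelen <value>''.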
+====== Calculate Address Range from IP and Netmask ======
+Convert the ''IP'' and the ''netmask'' to a binary representation. For example, for the ''IP'' address ''192.168.1.101'' we obtain ''11000000 10101000 00000001 01100101'' and for the ''netmask'' ''255.255.255.224'' we obtain ''11111111 11111111 11111111 11100000'' (the ''netmask'' must be a sequential series of ''1''s without any ''0'' gaps between the ''1''s).
+In order to obtain the first address, take the binary representation of the ''IP'' and ''AND'' it with the ''netmask'':
+<code>
+11000000 10101000 00000001 01100101 (IP)
+11111111 11111111 11111111 11100000 (Netmask)
+----------------------------------- AND
+11000000 10101000 00000001 01100000 = 192.168.1.96 (first network address)
+</code>
+Then take the ''netmask'' and invert the bits (''NOT''), which gives the host mask, that is, the offset of the last address in the range:
+<code>
+11111111 11111111 11111111 11100000 (Netmask)
+----------------------------------- NOT
+00000000 00000000 00000000 00011111 = 31 (offset of the last address)
+</code>
+Finally, the range for the ''IP'' address ''192.168.1.101'' with subnet mask ''255.255.255.224'' starts at ''192.168.1.96'' and ends ''31'' addresses later, for a total of ''32'' addresses: ''192.168.1.96-192.168.1.127''.
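The same calculation can be sketched in shell arithmetic. The function names below are hypothetical helpers introduced for illustration, and the code assumes a shell with 64-bit integer arithmetic (as in current ''bash'' and ''dash'').

```bash
# Pack a dotted-quad address into a single integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address on "." into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Unpack an integer back into dotted-quad notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# $1: IP address, $2: netmask; prints "first-last".
address_range() {
    addr=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    first=$(( addr & mask ))               # IP AND netmask
    hostmask=$(( mask ^ 4294967295 ))      # NOT netmask, limited to 32 bits
    last=$(( first | hostmask ))           # first address plus the offset
    echo "$(int_to_ip "$first")-$(int_to_ip "$last")"
}

address_range 192.168.1.101 255.255.255.224   # prints 192.168.1.96-192.168.1.127
```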
+====== Private Networks ======
+^ CIDR              ^ Range                          ^ Addresses    ^ Description ^
+| ''10.0.0.0/8''     | ''10.0.0.0–10.255.255.255''     | $16777216$ | For private networks as described in [[http://tools.ietf.org/html/rfc1918|RFC1918]]. |
+| ''100.64.0.0/10''  | ''100.64.0.0–100.127.255.255''  | $4194304$  | ISP NAT [[http://tools.ietf.org/html/rfc6598|RFC6598]]. |
+| ''172.16.0.0/12''  | ''172.16.0.0–172.31.255.255''   | $1048576$  | For private networks as described in [[http://tools.ietf.org/html/rfc1918|RFC1918]]. |
+| ''192.0.0.0/29''   | ''192.0.0.0–192.0.0.7''         | $8$        | ''DS-Lite'' transition mechanism as specified by [[http://tools.ietf.org/html/rfc6333|RFC6333]]. |
+| ''192.168.0.0/16'' | ''192.168.0.0–192.168.255.255'' | $65536$    | For private networks as described in [[http://tools.ietf.org/html/rfc1918|RFC1918]]. |
+| ''198.18.0.0/15''  | ''198.18.0.0–198.19.255.255''   | $131072$   | Inter-network communications between two separate subnets as specified in [[http://tools.ietf.org/html/rfc2544|RFC2544]]. |
+| ''fc00::/7''       | ''fc00::–fdff:ffff:ffff:ffff:ffff:ffff:ffff:ffff'' | $2^{121}$ | Unique local address. |
+====== Adjusting Ring Parameters ======
+On Linux you can get the ring parameters with ''ethtool''. For example, for the ''eth0'' interface:
+<code bash>
+ethtool -g eth0
+</code>
+which lists the pre-set maximums and the current settings:
+<code>
+Ring parameters for eth0:
+Pre-set maximums:
+RX:		1024
+RX Mini:	255
+RX Jumbo:	255
+TX:		1024
+Current hardware settings:
+RX:		512
+RX Mini:	0
+RX Jumbo:	128
+TX:		512
+</code>
+The current settings may be lower than the pre-set maximums, in which case they can be raised using ''ethtool'':
+<code bash>
+ethtool -G eth0 rx 1024 rx-mini 255 rx-jumbo 255 tx 1024
+</code>
+This can be made permanent on distributions such as Debian by editing ''/etc/network/interfaces'':
+<code>
+allow-hotplug eth0
+iface eth0 inet static
+        up sleep 5; /sbin/ethtool -G eth0 rx 1024 rx-mini 255 rx-jumbo 255 tx 1024
+</code>
+and adding the ''up'' directive which applies the setting on boot.
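Rather than copying the maximums by hand, the ''ethtool -G'' command can be derived from the ''ethtool -g'' output. This is a sketch that assumes the output layout shown in the sample above; ''ring_max_cmd'' is a hypothetical helper, and it only prints the command instead of running it.

```bash
# Hypothetical helper: read `ethtool -g <interface>` output on stdin and print
# the ethtool -G command that raises every ring to its pre-set maximum.
# $1: interface name
ring_max_cmd() {
    awk -v dev="$1" '
        /^Pre-set maximums:/          { in_max = 1; next }
        /^Current hardware settings:/ { in_max = 0 }
        in_max && /^RX:/       { rx = $2 }
        in_max && /^RX Mini:/  { mini = $3 }
        in_max && /^RX Jumbo:/ { jumbo = $3 }
        in_max && /^TX:/       { tx = $2 }
        END {
            printf "ethtool -G %s rx %s rx-mini %s rx-jumbo %s tx %s\n",
                   dev, rx, mini, jumbo, tx
        }
    '
}

# Usage (not run here, needs a real interface):
#   ethtool -g eth0 | ring_max_cmd eth0
```

Drivers that do not support a given ring may report a non-numeric value for it, in which case that parameter should be dropped from the printed command.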
+====== Port-Test without Tools ======
+The following command can be used to connect to any host and port by using ''/dev/tcp'':
+<code bash>
+exec 7<>/dev/tcp/www.bing.com/80; cat <&7 & cat >&7; exec 7>&-
+</code>
+where:
+  * ''www.bing.com'' is the hostname to connect to
+  * ''80'' is the destination port
+The command uses ''exec'' to open the connection for reading and writing on file descriptor ''7'' (any free descriptor number will do). The first ''cat'' copies everything received on descriptor ''7'' to ''STDOUT'' and is sent into the background (which causes its PID to be displayed), while the second ''cat'' copies ''STDIN'' to the same descriptor. Finally, when the second ''cat'' terminates (on end of input or when the connection is closed), the file descriptor is closed with ''exec''.
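For a non-interactive check, the same ''/dev/tcp'' redirect can be wrapped in a function that only reports whether the connection succeeded. This is a sketch under stated assumptions: ''port_open'' is a hypothetical helper, ''bash'' must be installed (''/dev/tcp'' is a ''bash'' feature), and ''timeout'' from GNU coreutils is used to bound the wait.

```bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# $1: host, $2: port, $3: optional timeout in seconds (default 3)
port_open() {
    timeout "${3:-3}" bash -c 'exec 7<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage (not run here, needs network access):
#   port_open www.bing.com 80 && echo open || echo closed
```

The descriptor is closed automatically when the child ''bash'' process exits, so no explicit clean-up is needed.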
+====== Block QUIC ======
+QUIC is a protocol that uses UDP instead of TCP to serve content, working on ports 80 and 443 and used widely by Google, YouTube, etc. Unfortunately, because QUIC traffic travels over UDP, it bypasses an HTTP proxy entirely and reveals the connecting address. In order to disable QUIC you can add the following rules to your firewall:
+<code bash>
+iptables -A FORWARD -i br0 -p udp -m udp --dport 80 -j REJECT --reject-with icmp-port-unreachable
+iptables -A FORWARD -i br0 -p udp -m udp --dport 443 -j REJECT --reject-with icmp-port-unreachable
+iptables -A FORWARD -s 192.168.1.0/24 ! -d 192.168.1.1 -p tcp -m tcp --dport 80 -m state --state RELATED,ESTABLISHED -j DROP
+iptables -A FORWARD -s 192.168.1.0/24 ! -d 192.168.1.1 -p tcp -m tcp --dport 443 -m state --state RELATED,ESTABLISHED -j DROP
+</code>
+where:
+  * ''192.168.1.0/24'' is the network subnet.
+  * ''192.168.1.1'' is the gateway.
+Additionally, you can have squid block alternate protocols by adding the following lines:
+<code>
+# Disable alternate protocols
+request_header_access Alternate-Protocol deny all
+reply_header_access Alternate-Protocol deny all
+</code>
+to the squid configuration file.
+====== Disable ICP ======
+squid will broadcast ICP requests; in order to disable them, edit the squid configuration file and add:
+<code>
+# disable ICP
+icp_port 0
+icp_access deny all
+# plug ICP leaks
+reply_header_access X-Cache-Lookup deny !localnets
+reply_header_access X-Squid-Error deny !localnets
+reply_header_access X-Cache deny !localnets
+</code>
+where ''localnets'' is an ACL defined in your configuration file that should point to the local network.
+====== Determining Open Outbound Ports ======
+A trick to determine which outbound ports are open is to have ''nmap'' connect to [[http://portquiz.net|portquiz]], which listens on all TCP ports, across a port range:
+<code bash>
+nmap portquiz.net -p 1024-65535 -Pn --reason
+</code>
+where:
+  * ''-p 1024-65535'' is the port range between ''1024'' and ''65535''
+  * ''-Pn'' tells ''nmap'' to skip host discovery (no ping) and just to connect
+  * ''--reason'' will make ''nmap'' explain why each port was reported in its state (for instance, why it was considered closed)
+====== Determine ISP Address Blocks ======
+Either starting from a hostname, for instance ''tb1060.lon.100tb.com'' by issuing:
+<code bash>
+nslookup tb1060.lon.100tb.com
+</code>
+to determine the IP address, or from the IP address itself (in this case, ''146.185.28.59''), [[http://www.radb.net|RADb]] can be used to determine an ISP's address block.
+First, look up the IP itself to determine which ISP it belongs to:
+<code bash>
+whois 146.185.28.59
+</code>
+Then, look up the Autonomous System (AS) number (an ISP identifier code, if you will) of that ISP:
+<code bash>
+whois -h whois.radb.net 146.185.28.59 | grep ^origin
+</code>
+which should output:
+<code>
+origin:          AS29302
+</code>
+There may be more AS numbers for small internet providers that are, in turn, customers of a larger network.
+To make sure that the IP you are after is part of the AS, look up the AS itself:
+<code bash>
+whois AS29302
+</code>
+and make sure that the ISP is listed.
+The final step is to get all known routes for the AS:
+<code bash>
+whois -h whois.radb.net -- -i origin -T route AS29302 | grep ^route | awk '{ print $2 }'
+</code>
+which should output all IPv4 address blocks allocated to that ISP line-by-line (easy to automate):
+<code bash>
+.185.16.0/20
+</code>
+IPv6 can also be queried in the same way:
+<code bash>
+whois -h whois.radb.net -- -i origin -T route6 AS29302 | grep ^route | awk '{ print $2 }'
+</code>
+and will yield similar results:
+<code bash>
+a01:5a80::/32
+</code>
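Since the output is line-based, both queries can be post-processed by one small filter. The sketch below assumes RADb-style ''route:'' and ''route6:'' attribute lines; ''extract_routes'' is a hypothetical helper, and the test data in the usage example uses documentation prefixes rather than real allocations.

```bash
# Hypothetical helper: read whois output on stdin, keep only the "route:" and
# "route6:" attribute lines, and print each distinct prefix once.
extract_routes() {
    awk '/^route6?:/ { print $2 }' | sort -u
}

# Usage (not run here, needs network access):
#   whois -h whois.radb.net -- -i origin -T route  AS29302 | extract_routes
#   whois -h whois.radb.net -- -i origin -T route6 AS29302 | extract_routes

# Illustrative input only (documentation prefixes, private AS number).
printf 'route:      192.0.2.0/24\norigin:     AS64500\nroute:      192.0.2.0/24\nroute6:     2001:db8::/32\n' | extract_routes
```

Removing duplicates matters because the same prefix is often registered more than once for an AS.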
+====== Solving Issues with PXE Servers not Working with Network Bridges with Spanning Tree Protocol Enabled ======
+A typical scenario of a non-working PXE server is a PXE server that has been set up on a Linux server running virtual machines that automatically join an STP-enabled network bridge once the virtual machine boots.
+The phenomenon is due to STP itself that runs through various stages (''Blocking'', ''Listening'', ''Learning'') before reaching the ''Forwarding'' state. When the virtual machine adds its interface to the STP-enabled bridge, the bridge switches to the ''Learning'' state, where, by default, the bridge spends at least 10 seconds (on Linux). For 10 seconds, the STP-enabled networking bridge will listen to packets and learn the new topology introduced by the addition of the interface. libvirt virtual machines run SeaBIOS as the default BIOS and, at version ''1.12'', the PXE boot code does not wait sufficiently for the bridge to switch to the ''Forwarding'' state and the network interface will not even be configured.
+Cisco switches have a (nasty) hack named ''portfast'' that, when enabled on a port, will skip over the ''Listening'' and ''Learning'' stages and move the port directly into the ''Forwarding'' state. Since packets are then forwarded immediately, an equivalent setting would resolve the issues with libvirt virtual machines.
+In order to resolve the issue, STP can be turned off for the entire bridge:
+<code bash>
+brctl stp br0 off
+</code>
+but that means losing the extra benefits of having the STP protocol.
+Instead, and even better than Cisco ''portfast'', the forwarding delay can be lowered sufficiently for the SeaBIOS PXE boot code to obtain an IP address via DHCP:
+<code bash>
+brctl setfd br0 2
+</code>
+where:
+  * ''2'' is the number of seconds to spend in the ''Learning'' state (default ''10'' seconds).
+On Debian, in case the bridge is configured via ''/etc/network/interfaces'' the following changes can be made to the bridge in order to make the forwarding delay permanent:
+<code>
+auto br0
+iface br0 inet static
+...
+        # Enable STP
+        bridge_stp on
+        # Fix PXE with STP
+        bridge_fd 2
+...
+</code>