HFSC - linux traffic shaping's best kept secret
#!/bin/bash
#
# As the "bufferbloat" folks have recently re-discovered and/or more widely
# publicized, congestion avoidance algorithms (such as those found in TCP) do
# a great job of allowing network endpoints to negotiate transfer rates that
# maximize a link's bandwidth usage without unduly penalizing any particular
# stream. This allows bulk transfer streams to use the maximum available
# bandwidth without affecting the latency of non-bulk (e.g. interactive)
# streams.
#
# In other words, TCP lets you have your cake and eat it too -- both fast
# downloads and low latency all at the same time.
#
# However, this only works if TCP's aforementioned congestion avoidance
# algorithms actually kick in. The most reliable method of signaling
# congestion is to drop packets. (There are other ways, such as ECN, but
# unfortunately they're still not in wide use.)
#
# Dropping packets to make the network work better is kinda counter-intuitive.
# But that's how TCP works. And if you take advantage of that, you can make
# TCP work great.
#
# Dropping packets gets TCP's attention, and fast. The sending endpoint
# throttles back to avoid further network congestion. In other words, your
# fast download slows down. Then, as long as there's no further congestion,
# the sending endpoint gradually increases the transfer rate. Then the cycle
# repeats. It can get a lot more complex than that simple explanation, but the
# main point is: dropping packets when there's congestion is good.
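#
# A toy sketch of that cycle, not part of the shaping setup below: the
# function is only defined here, never called. It assumes a simplified
# additive-increase/multiplicative-decrease model (real TCP stacks are far
# more involved), and the name aimd_sketch and its arguments are made up
# for illustration.
aimd_sketch() {
    # usage: aimd_sketch START_RATE LINK_CAPACITY ROUNDS [STEP]
    local rate=$1 capacity=$2 rounds=$3 step=${4:-10}
    local i
    for (( i = 0; i < rounds; i++ )); do
        if (( rate > capacity )); then
            rate=$(( rate / 2 ))      # congestion (a drop): back off sharply
        else
            rate=$(( rate + step ))   # no congestion: creep back up
        fi
        echo "$rate"
    done
}
# e.g. "aimd_sketch 90 100 4" prints 100, 110, 55, 65 (one per line)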
#
# Traffic shaping is all about slowing down and/or dropping (or ECN marking)
# packets. The thing is, it's much better for latency to simply drop packets
# than it is to slow them down. Linux has a couple of traffic shapers that
# aren't afraid to drop packets. One of the most well-known is TBF, the Token
# Bucket Filter. Normally it slows down packets to a specific rate. But it
# also accepts a "limit" option to specify the maximum number of bytes that
# may sit in its queue. When the limit is exceeded, packets are dropped.
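#
# For illustration only, a TBF setup along those lines might look like the
# function below. It is defined but never called by this script. The burst
# and limit values are placeholder assumptions, not tuned recommendations,
# and both the function name and the TC override variable are made up.
tbf_sketch() {
    # usage: tbf_sketch INTERFACE RATE
    local tc=${TC:-/sbin/tc}
    # "limit" caps the queue in bytes; a small value means tail drops
    # start early instead of packets piling up behind the shaper.
    "$tc" qdisc add dev "$1" root tbf rate "$2" burst 1540 limit 3000
}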
#
# TBF's simple "tail-drop" algorithm is actually one of the worst kinds of
# "active queue management" (AQM) that you can do. Even so, it can make
# a huge difference. Applying TBF alone (with a short enough limit) can make a
# maddeningly high-latency link usable again in short order.
#
# TBF's big disadvantage is that it's a "classless" shaper. That means you
# can't prioritize one TCP stream over another. That's where HTB, the
# Hierarchical Token Bucket, comes in. HTB uses the same general algorithm as
# TBF while also allowing you to filter specific traffic to prioritized queues.
#
# But HTB has a big weakness: it doesn't have a good, easy way of specifying a
# queue limit like TBF does. That means, compared to TBF, HTB is much more
# inclined to slow packets than to drop them. That hurts latency badly.
#
# So now we come to Linux traffic shaping's best kept secret: the HFSC shaper.
# HFSC stands for Hierarchical Fair Service Curve. The Linux implementation is
# a complex beast, enough so to have a nine-part question about it on serverfault
# ( http://serverfault.com/questions/105014/does-anyone-really-understand-how-hfsc-scheduling-in-linux-bsd-works ).
#
# Nonetheless, HFSC can be understood in a simplified way as HTB with limits.
# HFSC allows you to classify traffic (like HTB, unlike TBF), but it also has
# no fear of dropping packets (unlike HTB, like TBF).
#
# HFSC does a great job of keeping latency low. With it, it's possible to fully
# saturate a link while maintaining perfect non-bulk session interactivity.
# It is the holy grail of traffic shaping, and it's in the stock kernel.
#
# To get the best results, HFSC should be combined with SFQ (Stochastic
# Fairness Queueing) and optionally an ingress filter. If all three are used,
# it's possible to maintain low-latency interactive sessions even without any
# traffic prioritization. Adding prioritization on top then maximizes
# interactivity.
#
# Here's how it's done:

# set this to your internet-facing network interface:
WAN_INTERFACE=eth0

# set this to your local network interface:
LAN_INTERFACE=eth1

# how fast is your downlink?
MAX_DOWNRATE=3072kbit

# how close should we get to max down? e.g. 90%
USE_DOWNPERCENT=0.90

# how fast is your uplink?
MAX_UPRATE=384kbit

# how close should we get to max up? e.g. 80%
USE_UPPERCENT=0.80

# what port do you want to prioritize? e.g. for ssh, use 22
INTERACTIVE_PORT=22

## now for the magic

# remove any existing qdiscs
/sbin/tc qdisc del dev $WAN_INTERFACE root 2> /dev/null
/sbin/tc qdisc del dev $WAN_INTERFACE ingress 2> /dev/null
/sbin/tc qdisc del dev $LAN_INTERFACE root 2> /dev/null
/sbin/tc qdisc del dev $LAN_INTERFACE ingress 2> /dev/null

# computations: split each rate into its number and its unit suffix, then
# derive the scaled ("near max") and halved rates used below
MAX_UPNUM=$(echo "$MAX_UPRATE" | sed 's/[^0-9]//g')
MAX_UPBASE=$(echo "$MAX_UPRATE" | sed 's/[0-9]//g')
MAX_DOWNNUM=$(echo "$MAX_DOWNRATE" | sed 's/[^0-9]//g')
MAX_DOWNBASE=$(echo "$MAX_DOWNRATE" | sed 's/[0-9]//g')
NEAR_MAX_UPNUM=$(echo "$MAX_UPNUM * $USE_UPPERCENT" | bc | xargs printf "%.0f")
NEAR_MAX_UPRATE="${NEAR_MAX_UPNUM}${MAX_UPBASE}"
NEAR_MAX_DOWNNUM=$(echo "$MAX_DOWNNUM * $USE_DOWNPERCENT" | bc | xargs printf "%.0f")
NEAR_MAX_DOWNRATE="${NEAR_MAX_DOWNNUM}${MAX_DOWNBASE}"
HALF_MAXUPNUM=$(( MAX_UPNUM / 2 ))
HALF_MAXUP="${HALF_MAXUPNUM}${MAX_UPBASE}"
HALF_MAXDOWNNUM=$(( MAX_DOWNNUM / 2 ))
HALF_MAXDOWN="${HALF_MAXDOWNNUM}${MAX_DOWNBASE}"
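# The scaling steps above could also be wrapped in a small helper. This is
# only a sketch: scale_rate is a made-up name, it is not used by the rest of
# this script, and it swaps the bc/xargs pipeline for awk.
scale_rate() {
    # usage: scale_rate RATE FRACTION, e.g. scale_rate 384kbit 0.80
    local num=${1//[^0-9]/}   # digits only, e.g. 384
    local unit=${1//[0-9]/}   # everything else, e.g. kbit
    LC_ALL=C awk -v n="$num" -v f="$2" -v u="$unit" \
        'BEGIN { printf "%.0f%s\n", n * f, u }'
}
# e.g. "scale_rate 384kbit 0.80" prints 307kbit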

# install HFSC under WAN to limit upload
/sbin/tc qdisc add dev $WAN_INTERFACE root handle 1: hfsc default 11
/sbin/tc class add dev $WAN_INTERFACE parent 1: classid 1:1 hfsc sc rate $NEAR_MAX_UPRATE ul rate $NEAR_MAX_UPRATE
/sbin/tc class add dev $WAN_INTERFACE parent 1:1 classid 1:10 hfsc sc umax 1540 dmax 5ms rate $HALF_MAXUP ul rate $NEAR_MAX_UPRATE
/sbin/tc class add dev $WAN_INTERFACE parent 1:1 classid 1:11 hfsc sc umax 1540 dmax 5ms rate $HALF_MAXUP ul rate $HALF_MAXUP
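# A note on "umax 1540 dmax 5ms": as far as tc's hfsc syntax goes, this
# shorthand asks that a burst of up to umax bytes be served within dmax.
# When umax/dmax works out steeper than "rate", the curve starts with that
# steeper slope before settling to "rate". The helper below (a made-up name,
# defined but never called here) just does that arithmetic.
burst_slope_kbit() {
    # usage: burst_slope_kbit UMAX_BYTES DMAX_MS
    # bytes * 8 = bits; bits per millisecond equals kbit per second
    echo $(( $1 * 8 / $2 ))
}
# e.g. "burst_slope_kbit 1540 5" prints 2464, i.e. an initial slope of
# about 2464kbit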

# prioritize interactive ports
/sbin/tc filter add dev $WAN_INTERFACE protocol ip parent 1:0 prio 1 u32 match ip sport $INTERACTIVE_PORT 0xffff flowid 1:10
/sbin/tc filter add dev $WAN_INTERFACE protocol ip parent 1:0 prio 1 u32 match ip dport $INTERACTIVE_PORT 0xffff flowid 1:10

# add SFQ
/sbin/tc qdisc add dev $WAN_INTERFACE parent 1:10 handle 30: sfq perturb 10
/sbin/tc qdisc add dev $WAN_INTERFACE parent 1:11 handle 40: sfq perturb 10

# install ingress filter to limit download to 97% max
MAX_DOWNRATE_INGRESSNUM=$(echo "$MAX_DOWNNUM * 0.97" | bc | xargs printf "%.0f")
MAX_DOWNRATE_INGRESS="${MAX_DOWNRATE_INGRESSNUM}${MAX_DOWNBASE}"
/sbin/tc qdisc add dev $WAN_INTERFACE handle ffff: ingress
/sbin/tc filter add dev $WAN_INTERFACE parent ffff: protocol ip prio 1 u32 match ip sport $INTERACTIVE_PORT 0xffff flowid :1
/sbin/tc filter add dev $WAN_INTERFACE parent ffff: protocol ip prio 1 u32 match ip dport $INTERACTIVE_PORT 0xffff flowid :1
/sbin/tc filter add dev $WAN_INTERFACE parent ffff: protocol ip prio 50 u32 match ip src 0.0.0.0/0 police rate $MAX_DOWNRATE_INGRESS burst 20k drop flowid :2

# install HFSC under LAN to limit download
/sbin/tc qdisc add dev $LAN_INTERFACE root handle 1: hfsc default 11
/sbin/tc class add dev $LAN_INTERFACE parent 1: classid 1:1 hfsc sc rate 1000mbit ul rate 1000mbit
/sbin/tc class add dev $LAN_INTERFACE parent 1:1 classid 1:10 hfsc sc umax 1540 dmax 5ms rate 900mbit ul rate 900mbit
/sbin/tc class add dev $LAN_INTERFACE parent 1:1 classid 1:11 hfsc sc umax 1540 dmax 5ms rate $HALF_MAXDOWN ul rate $NEAR_MAX_DOWNRATE

# prioritize interactive ports
/sbin/tc filter add dev $LAN_INTERFACE protocol ip parent 1:0 prio 1 u32 match ip sport $INTERACTIVE_PORT 0xffff flowid 1:10
/sbin/tc filter add dev $LAN_INTERFACE protocol ip parent 1:0 prio 1 u32 match ip dport $INTERACTIVE_PORT 0xffff flowid 1:10

# add SFQ
/sbin/tc qdisc add dev $LAN_INTERFACE parent 1:10 handle 30: sfq perturb 10
/sbin/tc qdisc add dev $LAN_INTERFACE parent 1:11 handle 40: sfq perturb 10
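# To check that the shaper is doing its job, the per-qdisc and per-class
# counters (bytes sent, drops, backlog) can be inspected. A sketch: the
# function name and the TC override variable are made up for illustration,
# and nothing here is called automatically.
show_shaping_stats() {
    # usage: show_shaping_stats INTERFACE
    local tc=${TC:-/sbin/tc}
    "$tc" -s qdisc show dev "$1"
    "$tc" -s class show dev "$1"
}
# e.g. run "show_shaping_stats eth0" by hand once the script has finished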