@tobyxdd
Created November 9, 2023 05:19

My response to the recent controversy about TCP Brutal

Last week, I released TCP Brutal, a TCP port of one of the congestion control algorithms available in Hysteria, my other project, a QUIC-based anti-censorship proxy. I’m surprised by its surge in popularity and the controversy it has generated: it has been featured on sites like Hacker News and Zhihu, and has even drawn criticism from David Reed, an early TCP/IP developer. While many points are perfectly valid, I believe many critics are overlooking very important context, which I intend to clarify in this post.

Brutal, both in its TCP and QUIC forms, was purposefully built for anti-censorship proxies to deal with China's unique (I hope) internet situation. China has only a handful of government-controlled nodes for connecting its Internet to the rest of the world, called "Internet cross-border security gateway" (数据跨境安全网关) in official government documents. Beyond the bizarre censorship that blocks virtually every foreign website and service (forcing software engineers to use proxies and VPNs for basic development activities, and preventing any sites from having local CDN nodes), these gateways suffer from extreme congestion. The limited bandwidth, disproportionate to the number of users (plus the extensive censorship algorithms), results in what many see as "intentional service degradation." Technically, China remains connected to the global internet, but the government definitely doesn’t have the motivation to improve the fanqiang (“wall-breaking”) experience for the average Zhou. At times you may be able to connect directly to GitHub and even download files from S3, but at speeds of 2-3 kb/s it’s practically useless.

The origins of Hysteria go back to 2015, a time before BBR/QUIC and the like. It started as a proxy project based on a reliable UDP protocol that I rewrote from scratch in Java, inspired by IBM Aspera. During testing, I discovered that packet loss rates on international connections stayed the same regardless of the sending speed (until you hit the ISP's traffic policing limits, of course). This was something of a breakthrough in Chinese proxy circles, as BBR wasn't a thing at the time, and the default choice was Cubic, which treats loss as a congestion signal and therefore slows down on these links, a major reason for low connection speeds. As a college student without networking expertise, I was thrilled by my project’s ability to stream YouTube in 1080p using just a random VPS in Japan, which was basically impossible with Cubic. Over the years, it has been completely rewritten in Go, and my custom reliable UDP protocol has been replaced by QUIC. I also adapted BBR in Hysteria, realizing that it was a much more elegant algorithm than whatever Brutal is. At one point, I considered dropping Brutal and only doing BBR for the Hysteria 2 update, but people in the community said that Brutal was still faster than BBR, so I stuck with it.
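To make the loss-versus-speed point concrete, here is a rough sketch, not Brutal's or Hysteria's actual code. It uses the classic Mathis et al. approximation for a Reno-style loss-based sender as a stand-in (Cubic's response function differs, but its throughput also falls as loss rises) and compares it with an idealized fixed-rate sender on a link whose loss rate does not depend on the sending rate. The function names, the MSS, the RTT and the loss rates are all assumptions chosen for illustration.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state throughput of a Reno-style loss-based sender.

    Mathis et al. approximation: (MSS / RTT) * C / sqrt(p), with C = sqrt(3/2).
    """
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8.0 / rtt_s) * c / math.sqrt(loss_rate)


def fixed_rate_goodput_bps(send_rate_bps: float, loss_rate: float) -> float:
    """Goodput of an idealized sender that never backs off.

    Assumes loss is independent of the sending rate and ignores
    retransmission overhead and policing limits.
    """
    return send_rate_bps * (1.0 - loss_rate)


if __name__ == "__main__":
    mss, rtt = 1460.0, 0.1  # hypothetical: 1460-byte segments, 100 ms RTT
    for p in (0.001, 0.01, 0.05):
        loss_based = mathis_throughput_bps(mss, rtt, p) / 1e6
        fixed = fixed_rate_goodput_bps(10e6, p) / 1e6
        print(f"loss={p:.3f}  loss-based ~{loss_based:.2f} Mbps  fixed 10 Mbps sender ~{fixed:.2f} Mbps")
```

In this model the loss-based sender's throughput scales with 1/sqrt(p), so quadrupling the loss rate halves it, while the fixed-rate sender loses only the dropped fraction.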

Today, with a better understanding of networking, I know that Brutal essentially breaks congestion fairness by taking bandwidth from other users. A fun fact is that Brutal originally had no name. It was only after I fully realized its aggressive nature that I named it, hoping that the name would signal its behavior to users. I will be updating the documentation for both Hysteria and TCP Brutal to ensure that users are fully aware of the implications of using it. However, I have no intention of removing Brutal from Hysteria or removing the TCP Brutal repository. I believe that given its purpose and the context of its use, Brutal has done more good than harm. It has allowed more people to use the global Internet at functional speeds. Brutal is not responsible for "destroying the internet." If anything, the global connectivity aspect of the internet in China had been destroyed long before Brutal existed. It merely shifts the rubble to make a path forward.

PS: For users in other countries who are concerned about whether this will affect their Internet experience - it's highly unlikely. Brutal is designed for a specific use case, and it's basically impossible to use outside of it, requiring application support and manual bandwidth settings. This is simply not going to happen outside of a small subset of anti-censorship proxy circles.

@tavimori

tavimori commented Nov 9, 2023

Censorship and congestion, although often co-existing, are two different things. It is unethical to rationalize unfair rate control algorithms by blaming censorship. In fact, for most users, the reason there is severe congestion may actually be that they are not paying enough.

@escape0707

escape0707 commented Nov 9, 2023

Those criticisms came with no understanding of the context and the status quo of China's severe Internet censorship.

Brutal is brutal while GFW is ruthless.

Blame the institution that forces people to use Brutal. Blame unethical usage and selfish users who use it while ignoring Brutal's design goal. And use a simple detection mechanism against unfair usage.

Blame the side that starts a nuclear war, but don't blame Einstein or Oppenheimer.

@dtaht

dtaht commented Nov 9, 2023

https://datatracker.ietf.org/doc/html/draft-mathis-iccrg-relentless-tcp-00 describes, well, a similar algorithm, and its constraints and side-effects. I was also on the Hacker News thread. If your algo was also delay sensitive, it would help. If the GFW had fq_codel derived algos in it, it might help also. BBR is less likely to harm others than Brutal is.

@xiaokangwang

TCP's congestion avoidance is designed around two assumptions: network neutrality and non-interactive traffic. Network neutrality means that traffic from all network users is treated equally, without discrimination. Non-interactive means that no end user is waiting in front of a screen for a transfer to finish, so additional wait time does not matter. Under these assumptions, TCP's congestion avoidance is elegant and makes sense. All connections back off at the same time, so all TCP connections using the same congestion avoidance algorithm reach similar speeds, which is fair. Since no one is waiting in front of a screen, it doesn't matter if a transfer takes 2 minutes instead of 1 minute to finish, as the coffee is still boiling hot. And if the connection is still too slow, one can ask the network operator to upgrade it. All of this makes sense as long as the assumptions hold.
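As a rough illustration of the fairness argument above, here is a minimal sketch of two flows running the same additive-increase, multiplicative-decrease rule over one shared link. It is a toy model, not any real TCP implementation: the function name, the capacity, the step sizes and the assumption that both flows see every congestion event at the same moment are all hypothetical.

```python
def aimd_two_flows(rate_a: float, rate_b: float, capacity: float = 100.0,
                   steps: int = 1000, increase: float = 1.0,
                   decrease: float = 0.5) -> tuple[float, float]:
    """Simulate two flows sharing one link under synchronized AIMD.

    Each step, if the combined rate exceeds capacity, both flows cut their
    rate multiplicatively; otherwise both grow additively.
    """
    for _ in range(steps):
        if rate_a + rate_b > capacity:
            # Congestion: both flows back off by the same factor,
            # which also shrinks the gap between them.
            rate_a *= decrease
            rate_b *= decrease
        else:
            # No congestion: both flows probe for more bandwidth equally,
            # which leaves the gap between them unchanged.
            rate_a += increase
            rate_b += increase
    return rate_a, rate_b


if __name__ == "__main__":
    a, b = aimd_two_flows(10.0, 80.0)  # very unequal starting rates
    print(f"flow A: {a:.2f}, flow B: {b:.2f}, gap: {abs(a - b):.6f}")
```

Because every back-off halves the gap between the two flows and the additive phase never widens it, the rates converge no matter how unequal they start. A flow that skips the back-off step would not take part in this convergence.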

In an environment where these assumptions no longer hold, it shouldn't be held up as the gold standard. In a network without network neutrality, traffic can be segmented into priority groups, and when capacity is insufficient, lower-priority packets are always dropped or delayed first. In such a system, congestion avoidance does not split capacity between all users equally or proportionally; it simply lets high-priority users receive the majority of the network capacity. This is why users describe insanely slow traffic: they are not on a neutral network but on a deprioritized tier of a non-neutral one. In that case, deprioritized users cannot reduce congestion by reducing their own usage, because prioritized traffic does not experience the same packet loss and keeps the network congested for them. TCP's congestion avoidance then fails to have its intended effect of avoiding congestion; it just makes the network unbearably slow for deprioritized users.
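The priority argument above can be sketched with a toy fluid model of strict-priority scheduling. This is an illustration only and makes no claim about how any real gateway is configured: the function name, the capacity and the offered loads are hypothetical numbers.

```python
def strict_priority(capacity: float, high_offered: float,
                    low_offered: float) -> tuple[float, float]:
    """Return (high_served, low_served) on a strict-priority link.

    High-priority traffic is served first; low-priority traffic only
    gets whatever capacity is left over.
    """
    high_served = min(high_offered, capacity)
    low_served = min(low_offered, capacity - high_served)
    return high_served, low_served


if __name__ == "__main__":
    capacity, high = 100.0, 95.0  # hypothetical: high priority fills 95% of the link
    for low in (50.0, 20.0, 10.0):
        high_served, low_served = strict_priority(capacity, high, low)
        loss = 1.0 - low_served / low
        print(f"low offered {low:5.1f} -> served {low_served:4.1f} "
              f"(loss {loss:.0%}), high served {high_served:.1f}")
```

In this model the low-priority group never gets more than the leftover capacity, however much it backs off, and the high-priority traffic is served in full throughout.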

One might say it would still be 'unfair' for some users in the same deprioritized group to run a more aggressive congestion avoidance algorithm than others. This is where the second assumption in the design of TCP's congestion avoidance fails to hold. Not all users are transferring data non-interactively and can tolerate things taking longer. Protocols like uTP give way to TCP traffic because they are designed for background transfer of big files. By the same logic, there can be protocols more aggressive than TCP's Cubic that help users maintain a usable speed on a network that is otherwise too slow for modern interactive web browsing. The end effect is that non-interactive traffic gives way to interactive traffic, to which users' time and attention are tied. After all, at a practically useless speed of 2-3 kb/s, the tea would be ice cold before a webpage is rendered, and the user would need to refresh it anyway because of timeouts in modern web applications.

Finally, one might say that those who want better network speed should pay their ISP more. Remember that in the original design of TCP, if the network was too slow, the user could ask the network operator to upgrade it. That is no longer true. Today, a user who wants a better connection has to pay to raise their network priority level. This does not increase overall network capacity; it simply makes the network worse for users who pay less. It has exactly the same effect as using a more aggressive congestion avoidance algorithm, except that the ISP gets paid more. It also gives the ISP no incentive to upgrade overall network capacity, since it profits from bad network performance for deprioritized users.

This post would be unfair to Chinese ISPs if it ended here. There is a reason why Chinese ISPs are not upgrading their links to the rest of the internet: transit costs. In addition to cable costs, for every Mbps of internet link capacity, ISPs must pay their upstream ISPs for internet connectivity. Despite having a vast number of users, no ISP from mainland China is given settlement-free peering access to the internet, so they need to pay more than ISPs in other, more privileged regions and thus have more incentive to avoid increasing overall network capacity. Thus, there is no real antagonist here.
