Bufferbloat is confusing. Questions are natural. I can’t tell you how much hair I’ve lost scratching my head about what I was seeing. I didn’t have much hair to begin with, and I have much less now.
This FAQ is organized in approximately increasing order of technical difficulty. The last FAQ’s answer applies to everyone.
- What is bufferbloat, anyway? What does it do to me?
See my definition. It wrecks latency (in other words, everything takes longer and the internet is slower).
- A 100 Gigabit network is always faster than a 1 megabit network, isn’t it? More bandwidth is always better! I want a faster network!
No, such a network can easily be much slower. Bandwidth is a measure of capacity, not a measure of how fast the network can respond. You can pick up the phone and send a message to Shanghai immediately, but dispatching a cargo ship full of Blu-ray discs will be amazingly slower than the telephone call, even though the bandwidth of the ship is billions and billions of times larger than that of the telephone line. So more bandwidth is better only if its latency (speed) meets your needs. More of what you don’t need is useless. Bufferbloat destroys the speed we really need.
- Why is latency as important as bandwidth? More bandwidth is faster, isn’t it?
Stuart Cheshire has said this better than I ever will, in “It’s the Latency, Stupid”. The details of the technology have changed since he wrote that, but everything he says is as true today as it was in 1996. More bandwidth does not mean faster. To quote Stuart: “In fact, if you were really in a hurry to get to London quickly, you’d take Concorde, which cruises around 1350 miles per hour. It only seats 100 passengers though, so it’s actually the smallest of the three. Size and speed are not the same thing.”
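To put rough numbers on the phone-versus-cargo-ship comparison, here is a minimal sketch. The message size, link rates, and latencies are made-up illustrative values, not measurements, and `transfer_time_s` is a hypothetical helper name.

```python
def transfer_time_s(size_bits, bandwidth_bps, latency_s):
    """Time until the last bit arrives: one-way latency plus serialization time."""
    return latency_s + size_bits / bandwidth_bps


# Assumed, illustrative numbers only.
short_message_bits = 1_000 * 8  # a 1 kB message

# A slow phone line: tiny bandwidth, small latency.
phone_line = transfer_time_s(short_message_bits, bandwidth_bps=56_000, latency_s=0.2)

# A cargo ship: enormous bandwidth, but a month of latency.
cargo_ship = transfer_time_s(
    short_message_bits, bandwidth_bps=1e12, latency_s=30 * 24 * 3600
)

print(f"phone line: {phone_line:.3f} s")
print(f"cargo ship: {cargo_ship:.0f} s")
```

For a short message, the latency term dominates completely, so the extra bandwidth of the ship buys nothing.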
- What’s not commonly realized?
I believe bufferbloat triggered the network neutrality debate, and bufferbloat, by destroying low latency, certainly has serious consequences in this area. And for technical geeks: buffers that add much more delay than the actual path latency destroy congestion avoidance in transport protocols, and bufferbloat occurs in operating systems, not just routers.
- What sort of operations may induce bufferbloat suffering by myself or others who share the network?
Examples include uploading videos to YouTube, emailing large messages (such as those with many images attached), backing up large files or file systems over many current broadband or 3G networks, downloading large files such as movies or ISOs, your kid downloading a Linux distro (or movie) via BitTorrent, and, unfortunately, even visiting certain kinds of web pages. On shared networks, such as 3G or 802.11 networks at conferences or hotels, even general traffic may congest links, inducing the severe problems you notice at certain locations and times. The buffers fill and cause delay, slowing down the network’s “feel”. The network is slow.
- What kind of applications suffer most?
Applications such as VoIP (voice over IP), multi-user games, and teleconferencing suffer most. While you are suffering from bufferbloat, your network is often ten or even a hundred times slower than it should be. Even web browsing can become very painful. And your experience will never be consistent, as you or others around you induce bufferbloat. “Daddy, the Internet is slow today” was a constant refrain in my household.
- Can I do anything personally to reduce my suffering from bufferbloat?
Yes, there may be steps you can take immediately to reduce your suffering. Knowledge is power; it is why I’m doing this blog. Ultimately, by understanding the problem and educating others (including, often, your ISPs or IT departments), we can make a big difference quickly. Other solutions will take more time, and ultimately voting with your pocketbook will be necessary to fully solve the bufferbloat problem. But see the last question below, about bozos.
- Has my (fill in your favorite ISP, hardware vendor, operating system vendor) deliberately done this?
I don’t think so. Bufferbloat costs them all too much in service calls and missed business opportunities for me to buy into a paranoid view of the world. The conspiracy theorists will have great fun trying to prove otherwise anyway, of course. See the last question below about bozos.
- Is this a TCP specific problem?
No. Suffering from bufferbloat can be induced by any transport protocol, including UDP, when links saturate, which they always will in many commonly encountered situations.
- Is bufferbloat new? What happens when you add bloated (semi-infinite) buffers to the network?
No. John Nagle’s cogent explanation of the phenomenon in RFC 970 dates from 1985. Some of the buffers now observed in the Internet are, to first approximation, infinite in size. Other buffers found are merely grossly excessive for the bandwidth of the connection. Besides high latency (slow internet behavior), they can cause partial or complete failure of applications, network services or networks themselves.
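What “grossly excessive for the bandwidth of the connection” means can be sketched with a little arithmetic: a full buffer adds a delay equal to its size divided by the link rate. The buffer size and link rates below are assumed values chosen for illustration, and `queue_delay_s` is a hypothetical helper name.

```python
def queue_delay_s(buffer_bytes, link_bps):
    """Worst-case delay added by a full FIFO buffer draining at link_bps."""
    return buffer_bytes * 8 / link_bps


# Assumed, illustrative values only.
buffer_bytes = 256 * 1024  # a 256 KiB buffer

slow_uplink = queue_delay_s(buffer_bytes, 1_000_000)    # 1 Mbit/s link
fast_link = queue_delay_s(buffer_bytes, 100_000_000)    # 100 Mbit/s link

print(f"1 Mbit/s:   {slow_uplink:.3f} s of added delay when full")
print(f"100 Mbit/s: {fast_link:.3f} s of added delay when full")
```

The same buffer that is harmless on a fast link adds seconds of delay on a slow one, which is why a fixed buffer size cannot be right for every bandwidth.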
- How common is bufferbloat? How can I tell if I’m suffering from bufferbloat?
Dismayingly common, as the ICSI Netalyzr results have shown. Other papers show problems elsewhere in the Internet. And since empty buffers can be hidden anywhere, just because you aren’t suffering from bufferbloat now doesn’t mean you won’t in the (even short term) future. You can test whether you are currently suffering from bufferbloat by running ICSI’s Netalyzr test or performing other tests as I’ve shown in this blog.
- Does bufferbloat only occur when there are two (or more) flows? Or can bufferbloat also occur with a single flow?
Demonstrating bufferbloat is easy to do with a single TCP connection (two one-line commands) on all operating systems other than Windows XP (which does not implement window scaling by default), so long as that machine can saturate a network path. You can use any other protocol to show bufferbloat; it just may be more involved. The Netalyzr test for bufferbloat is UDP-based, for example.
- What exactly happens when bufferbloat occurs (i.e., when an excessively big buffer fills up, right?)?
Buffers can easily fill from any traffic, from any protocol. Bufferbloat is when those buffers are not being managed, and therefore are oversize for extended periods, imposing excessive latency on traffic transiting those buffers. When there is bufferbloat, your queues are excessively long. In a packet switched network, queues should, on average, be very short, not running continually full. That is what AQM can do for you.
- Is bufferbloat referring to (1) the fact that an excessively big buffer fills up, or to (2) the fact that a flow can experience excessive delay when an excessively large buffer fills up?
- Is bufferbloat referring to excessively large buffers preventing TCP flow control?
There is still flow control, but again, as there are long delays due to the buffering, the remaining flow control is poor and may become “bursty”. As you can see in my TCP traces, when the buffering adds much more delay than the normal RTT, you’ve destroyed the fast response of the servo system. So TCP flow control also suffers from the imposed latency. This actually induces a certain amount of packet loss (significantly more than you would have had in a good network), but overall bandwidth usage is not destroyed on modern TCP implementations, as SACK and fast retransmit can keep the pipe reasonably full. In the quest to avoid losing bits, we actually lose more bits. What bufferbloat does do, by preventing timely packet loss or ECN marking, is destroy congestion avoidance in TCP and other protocols.
- Does the nefarious effect of bufferbloat only occur under network congestion?
Yes, exactly. When the network is congested, if there is no queue management on that buffer, the buffers fill and bad things happen. You only suffer pain from bufferbloat in the buffers just before a saturated bottleneck link. Congestion routinely occurs in everyday life, whenever you move bulk data; it is therefore very common in our home networks, both due to the commonness of problems in the broadband gear, as ICSI’s Netalyzr work showed, and also in our home routers and computers (typically on 802.11). Similarly we see aggregate forms of bufferbloat both on busy 802.11 networks and 3G networks. And some network operators (both corporate and ISPs) may fail to run with AQM, and therefore inflict pain on their customers.
- What is this AQM thing you keep talking about?
AQM == Active queue management. The Wikipedia article will get you going. I use AQM in this blog in the most general sense: active management of the queues. There are many techniques to control your queue lengths, but many of them may not be available to you or applicable to your circumstances.
- Will TCP by itself fill buffers? Even without bufferbloat?
Yes; this is why queue management was developed in the first place, starting at least 20 years ago.
- What is ECN?
ECN == Explicit Congestion Notification. Traditionally, the only way to signal congestion is by packet drop. But that does not distinguish congestion from other random loss; still, packet drop is the only guaranteed signal of congestion in a packet network.
ECN is a second way to signal congestion, by marking a packet with a bit if it transits a congested node in the network. Its advantage is that the work already expended in moving the packet to that point is preserved. ECN hasn’t been used much, as a bunch of broken hardware was shipped years ago that would crash if it saw an ECN bit. Most of that hardware (mostly in home environments) should have been retired by now. So the question is whether we could/should start using ECN, and if so, in what parts of the network.
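As a rough illustration of marking instead of dropping, the sketch below uses the ECN codepoints defined in RFC 3168, which occupy the two low-order bits of the IP header’s former TOS byte. The function names are hypothetical, and a real router does this in its forwarding path rather than in Python.

```python
ECN_MASK = 0b11      # ECN field: two low-order bits of the TOS / Traffic Class byte
NOT_ECT = 0b00       # sender did not negotiate ECN
ECT_1 = 0b01         # ECN-capable transport, codepoint 1
ECT_0 = 0b10         # ECN-capable transport, codepoint 0
CE = 0b11            # congestion experienced


def ecn_codepoint(tos_byte):
    """Extract the ECN field from the TOS byte."""
    return tos_byte & ECN_MASK


def signal_congestion(tos_byte):
    """Return (new_tos_byte, dropped) for a packet at a congested hop.

    ECN-capable packets are marked CE and forwarded; packets that are
    not ECN-capable can only be signalled by dropping them.
    """
    if ecn_codepoint(tos_byte) == NOT_ECT:
        return tos_byte, True
    return tos_byte | CE, False


print(signal_congestion(ECT_0))    # marked, not dropped
print(signal_congestion(NOT_ECT))  # dropped
```

Marking leaves the upper six bits (the DSCP) untouched, so traffic classification and congestion signalling do not interfere with each other.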
- But I need to manage my queues if I am classifying my traffic?
Traffic classification is orthogonal to AQM. If you don’t signal congestion at a congested hop in the network (by either packet drop or ECN), there is no way the end points will slow down and keep the queue size sane, and transport protocols will fill the buffers at the bottleneck. For whatever class of service that traffic is classified in, the latencies will go to hell. Traffic classification is a useful tool, and may help performance of classes of applications and operation during periods of congestion, but it is fundamentally not a solution in any sense for bufferbloat. For that, you need to enable some way to manage the length of the queues (e.g. RED). All buffers need to be managed in some fashion. If you manage your queues, classification is much less necessary.
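To give a feel for what managing queue length with something like RED involves, here is a simplified sketch of the classic RED idea: track a smoothed average queue length and drop (or mark) with a probability that rises as the average grows. It omits details of the full algorithm, such as spacing out drops by counting packets since the last one, and the thresholds, weight, and function names are assumptions chosen for illustration.

```python
def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop/mark probability as a function of the average queue length.

    Below min_th nothing is dropped, at or above max_th everything is,
    and in between the probability rises linearly up to max_p.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Assumed thresholds, in packets.
for avg in (2, 10, 20):
    p = red_drop_probability(avg, min_th=5, max_th=15, max_p=0.1)
    print(f"average queue {avg:2d} packets -> drop probability {p:.2f}")
```

The point is that the end points receive a congestion signal while the queue is still short, instead of only once a bloated buffer has finally overflowed.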
- Huh? Haven’t we known about AQM for a long time? Isn’t it deployed everywhere?
That was my reaction. As to the story behind why AQM isn’t deployed everywhere, see my blog posting on RED in a Different Light. Be paranoid: “dark buffers” may be lying in wait to trap you. That we need to think about managing buffers in our hosts better has not, I think, been a widespread realization. I only got there by the roundabout investigation as to why I was seeing occasional 802.11 disasters.
- Can we use the increasing RTT to work around bufferbloat?
The other guy’s traffic will kill you (your kid’s, your wife’s, your co-worker’s). So you’ve just fixed your contribution to the problem, so you get to suffer more… Game theory says this isn’t a stable solution, and it is unlikely to be adopted. The technique itself is being explored in the IETF LEDBAT working group to allow for friendlier behavior for bulk transfer protocols like BitTorrent and will be very useful. But as I noted in one of my posts, much of the pain of BitTorrent only occurred because of bufferbloat in the first place. Fundamentally, the queues need to be managed. So we need to fix bufferbloat overall, not just try to work around it.
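The general idea of reacting to increasing RTT can be sketched as follows. This is a simplified illustration of a delay-based sender, not the LEDBAT algorithm itself: it estimates queueing delay as the latest RTT minus the smallest RTT seen, and nudges its sending rate toward a target delay. The target, step size, and function names are assumptions.

```python
def queueing_delay_s(rtt_samples):
    """Estimate queueing delay as latest RTT minus the minimum RTT observed."""
    base_rtt = min(rtt_samples)  # best guess at the unloaded path RTT
    return rtt_samples[-1] - base_rtt


def adjust_rate(rate_bps, rtt_samples, target_delay_s=0.1, step=0.1):
    """Speed up while delay is under target, back off when it is over."""
    delay = queueing_delay_s(rtt_samples)
    off_target = (target_delay_s - delay) / target_delay_s
    return max(0.0, rate_bps * (1 + step * off_target))


rate = 1_000_000.0
print(adjust_rate(rate, [0.020, 0.021, 0.020]))  # little queueing: rate rises
print(adjust_rate(rate, [0.020, 0.200, 0.520]))  # queue building: rate falls
```

A sender like this yields politely, which is exactly why it cannot fix the problem alone: any other flow that keeps filling the unmanaged buffer still imposes the delay on everyone.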
- Am I a bozo to have not understood bufferbloat before?
No more than the rest of us. I think we are all bozos on this bus. And the bozo bus is extremely large. Bufferbloat is something we’ve all caused, by not realizing that latency is (at least as) important as bandwidth. Engineers (myself included) have made the same mistake many times over the years, and failed to internalize that packet networks are different than wires. Moore’s law has made memory really cheap. So you are a bozo only if you don’t start fixing the problem, are unwilling to believe there is a problem in the first place, or call others bozos; we’ve all caused this problem together. We’ve turned the Internet of sports cars into the Internet of the Queen Mary. Just think of driving the Queen Mary on a crowded highway at rush hour someday (with your driveway directly attached), and you’ll get my drift.
- Where can I get more information and/or help solve bufferbloat for me or others?
Head over to bufferbloat.net where you will find mailing lists, wikis, trackers, and other tools to help.