Archive for the ‘Puzzle’ Category

Bufferbloat demonstration videos

February 1, 2012

If people have heard of bufferbloat at all, it is usually just an abstraction to them, despite their having personal experience with it. Bufferbloat can occur in your operating system, your home router, your broadband gear, wireless, and almost anywhere in the Internet.  They still think that experiencing poor Internet speed means they must need more bandwidth, and they take vast speed variation for granted. Sometimes, adding bandwidth can actually hurt rather than help. Most people have no idea what they can do about bufferbloat.

So I’ve been working to put together several demos to help make bufferbloat concrete, and demonstrate at least partial mitigation. The mitigation shown may or may not work in your home router, and you need to be able to set both upload and download bandwidth.

Two of the four cases we all commonly suffer from at home are:

  1. Broadband bufferbloat (upstream)
  2. Home router bufferbloat (downstream)
Rather than attempt to show worst-case bufferbloat, which can easily induce complete failure, I decided to demonstrate these two cases of “typical” bufferbloat as shown by the ICSI data. As the ICSI data shows, bufferbloat varies widely, so your mileage will also vary widely.

There are two versions of the video:

  1. A short bufferbloat video, of slightly over 8 minutes, which includes both demonstrations but elides most of the explanation. Its intent is to get people “hooked” so they will want to know more.
  2. The longer version of the video clocks in at 21 minutes, includes both demonstrations, and gives a simplified explanation of bufferbloat’s cause, to encourage people to dig yet further.
Since bufferbloat only affects the bottleneck link(s), and broadband and WiFi bandwidth are often similar and variable, it’s very hard to predict where you will have trouble. If you understand that the bloat grows just before the slowest link in a path (including in your operating system!), you may be able to improve the situation. You have to take action where the queues grow. You may be able to artificially move the bottleneck from a link that is bloated to one that is not. The first demo moves the bottleneck from the broadband equipment to the home router, for example.
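To make the "slowest link" idea concrete, here is a minimal Python sketch. The hop names and rates are made-up illustrations, not measurements from the demos; the point is only that the queue builds in front of whichever hop has the lowest rate, and that shaping a hop you control to slightly below the broadband rate moves the bottleneck to it.

```python
def bottleneck(links):
    """Return the (name, rate_mbps) pair with the lowest rate on the path.

    Queues (and therefore bloat) build up just before this hop.
    """
    return min(links, key=lambda link: link[1])


# Hypothetical path from a laptop to the Internet; rates in Mbit/s are invented.
path = [("wifi", 50.0), ("home router", 100.0), ("broadband uplink", 2.0)]

# Same path with the router shaped to slightly below the uplink rate
# (1.8 Mbit/s is an assumed setting, not a recommendation).
shaped = [("wifi", 50.0), ("home router (shaped)", 1.8), ("broadband uplink", 2.0)]

print(bottleneck(path))
print(bottleneck(shaped))
```

In the unshaped path the queue forms in the broadband gear, which you usually cannot tune; in the shaped path it forms in the router, where you may be able to.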
To reduce bufferbloat in the home (until the operating systems and home routers are fixed), your best bet is to ensure your actual wireless bandwidth is always greater than your broadband bandwidth (e.g., by using 802.11n and possibly multiple access points) and use bandwidth shaping in the router to “hide” the broadband bufferbloat.  You’ll still see problems inside your house, but at least, if you also use the mitigation demonstrated in the demo, you can avoid problems accessing external web sites.
The most adventurous of you may come help out on the CeroWrt project, an experimental OpenWrt router where we are working on both mitigating and eventually fixing bufferbloat in home routers. Networking knowledge and the ability to reflash routers required!

CACM: BufferBloat: What’s Wrong with the Internet?

December 8, 2011

Communications of the ACM: Bufferbloat: What’s Wrong with the Internet?

February issue of the Communications of the ACM.

A discussion with Vint Cerf, Van Jacobson, Nick Weaver, and Jim Gettys

This is part of an ACM Queue case study, accompanying the article by Kathie Nichols and me that appeared in the January 2012 CACM (Communications of the ACM).

IEEE Internet Computing “Backspace” column on bufferbloat

May 4, 2011

Vint Cerf asked me to write his usual “Backspace” column for IEEE Internet Computing magazine on bufferbloat.  It appeared in the current May/June issue. You can find an online copy of the article on the web site (with permission of the IEEE).

Presentation for the Prague IETF 80 Transport Area Open Meeting

March 28, 2011

I’m on the agenda for the Transport Area meeting of the Prague IETF meeting.  In it, I have 30 minutes to try to convey the gist and severity of the bufferbloat problem to that audience. I have had the opportunity to give this presentation three times in preparation: once at BattlemeshV4 and twice internally at Bell Labs, so it is much more polished than the original Murray Hill presentation.

Due to the preciousness of meeting time at the IETF, I had to choose what to elide from the much longer original presentation, which includes information on how to mitigate bufferbloat and much additional detail.  On the other hand, I will attempt to speak more slowly at the IETF, so it may be more understandable to people listening (or so I hope!).

If you are attending IETF 80, I urge you to attend, whether or not you are interested in transport.  Bufferbloat is terribly damaging to applications (particularly interactive and low latency applications) and general network operations. The draft of the talk itself is already available, and the audio should be available as well as part of the IETF 80 activities. It is currently scheduled (subject to change) for Wednesday morning (Prague time) in the Congress Hall III room. I’m sure hallway conversations will cause me to tweak the talk before I present it Wednesday, but it’s getting close.

Goings on at…

February 10, 2011

Kathie Nichols wrote in pointing out an insightful talk of Van Jacobson’s entitled A Rant on Queues. She also points out:

Also, as someone who spent a lot of time examining the dysfunctionality of 93 RED and various changes Van came up with in response to the problems: examine any AQM very carefully. It’s likely not working the way you think it is. What I learned is that almost anything works with nicely behaved long-lived TCPs and that almost nothing works well with mixtures of mice and elephants. It’s also instructive to examine what is really happening when your AQM drops packets. We found that 93 RED spent a lot of time doing what we called “forced drops” which means you are in the “drop everything” part of your control law. What we would find with 93 RED were a bunch of back to back drops. There’s tons of stuff on this in our unfinished “RED in a different light” paper, but look at the first 10 slides or so of “We have Met the Enemy and [S/]He is Us”: A View of Internet Research and Analysis and you’ll see some of it without wading through a lot of boring prose. So, please, please, don’t make a “fix” that is perhaps worse than the original problem.

I think we need to take Kathie’s first hand experience to heart.
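For readers who have not seen the control law Kathie refers to, the following is a simplified Python sketch of the drop-probability curve from the classic 1993 RED design. It omits RED's averaging filter and its count-based spacing of drops, and the threshold values used below are arbitrary illustrations, not tuning advice. The branch returning 1.0 is the "drop everything" region behind the forced drops she describes.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Simplified classic-RED control law (no averaging, no drop spacing).

    avg_queue: average queue length (packets)
    min_th, max_th: lower and upper thresholds (packets)
    max_p: drop probability reached just below max_th
    """
    if avg_queue < min_th:
        return 0.0  # queue is short: never drop
    if avg_queue >= max_th:
        return 1.0  # "forced drop" region: every arriving packet is dropped
    # Between the thresholds the probability rises linearly from 0 to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Arbitrary illustrative parameters.
for q in (3, 10, 20):
    print(q, red_drop_probability(q, min_th=5, max_th=15, max_p=0.1))
```

Note the jump from max_p straight to 1.0 at max_th: once the average queue crosses it, drops come back to back.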

Richard Scheffenegger has done a quick NS2 simulation/animation of bufferbloat well worth a quick look.  Helping him out on that would be a great service.

Dave Täht came up with an interesting idea of possibly using NTP data to get a view into bufferbloat on a global scale. In short, bloat badly disturbs RTTs, and that may be a way to see what’s going on. He’s calling it the “cosmic background bufferbloat detector”.
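The underlying observation can be sketched in a few lines of Python. This is not Dave's detector, only an illustration of the idea: treat the smallest RTT seen on a path as the unloaded baseline and look at how far other samples rise above it. The sample values and the 100 ms threshold are invented for the example.

```python
def induced_latency_ms(rtt_samples_ms):
    """Return each sample's excess over the minimum RTT seen on the path.

    The minimum is used as a stand-in for the unloaded (empty-queue) RTT.
    """
    baseline = min(rtt_samples_ms)
    return [rtt - baseline for rtt in rtt_samples_ms]


# Invented RTT samples in milliseconds for a single path.
samples = [22.0, 23.5, 310.0, 980.0, 24.0]
inflation = induced_latency_ms(samples)

# Assumed threshold: flag samples more than 100 ms above the baseline.
suspect = [extra for extra in inflation if extra > 100.0]
print(inflation)
print(suspect)
```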

Bufferbloat talk & is up

January 28, 2011

I gave a talk at Bell Labs in Murray Hill last week, pulling together most of the threads of bufferbloat into a single presentation. You can find the talk and audio of the talk. I still have areas I haven’t gone into in detail on the blog; they will be coming over the next month or two.

Courtesy of a large amount of generosity from Dave Täht and ISC in hosting the systems, we have software installed to support this work. Mailing lists are there along with a wiki, tracking system, news and similar facilities.  Please help solve bufferbloat, wherever it may be found.

Bufferbloat in 802.11 and 3G Networks

January 3, 2011

Any network system with buffering shared among many users is much like a congested highway.  We’ll call them “big fat networks”. Two such network technologies which show this problem are 802.11 (a/b/g/n) and 3G wireless.  In one, the buffers are distributed among the clients (and may also be in the access points and routers); in the other, they are possibly both in the clients and in the radio controllers they talk to, but also possibly in the backhaul networks.

You have suffered unusable networks at conferences.  Wonder why no more. You can make your life less painful by mitigating your operating system’s and access point’s buffering.

Moral of the Story

Whether you call what we see on 802.11 and 3G networks “congestion collapse” as the 1980s NSFnet event was called (with high packet loss rates), or something different such as bufferbloat (exhibiting much lower, but still significant packet losses), the effect is the same: horrifyingly bad latency and the resulting application failures. Personally, I’m just as happy with “congestion collapse” as with bufferbloat.

The moral of the story is clear: when the network is running slowly, we really need to minimize the amount of buffering to achieve anything like decent latencies on shared media. Yet when the network is unloaded, we want to fill a network pipe that may be a hundred megabits or more in size. On such a shared, variable-performance network, there is no single right answer for buffering. You cannot just “set it, and forget it”. Read on…
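The arithmetic behind "no single right answer" is simple: a full buffer adds a delay equal to its size divided by the link rate. The Python sketch below works this out for one fixed buffer at three rates; the 256 KiB buffer size and the rates are assumptions chosen for illustration, not figures from any particular device.

```python
def queue_delay_ms(buffer_bytes, link_rate_bps):
    """Time in milliseconds to drain a full buffer at the given link rate."""
    return buffer_bytes * 8 / link_rate_bps * 1000.0


BUFFER_BYTES = 256 * 1024  # hypothetical fixed buffer size

# The same buffer at a fast, medium, and slow effective link rate.
for rate_bps in (100e6, 10e6, 1e6):
    delay = queue_delay_ms(BUFFER_BYTES, rate_bps)
    print(f"{rate_bps / 1e6:6.0f} Mbit/s -> {delay:8.1f} ms when full")
```

A buffer that costs about 21 ms at 100 Mbit/s costs over two seconds when the same link is only delivering 1 Mbit/s.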


RED in a Different Light

December 17, 2010

Update May 8, 2012: “Controlling Queue Delay” describes a new AQM algorithm by Kathie Nichols and Van Jacobson.

Congestion and queue management in Internet routers has been a topic since the early years of the Internet and its first congestion collapse. “Congestion collapse” does not have the same visceral feel to most using the Internet today as it does for a few of us older people. Large parts of the early Internet actually stopped functioning almost entirely, and a set of algorithms was added to TCP/IP to ensure that collapse would not happen in the future.  These include slow start, congestion avoidance, fast recovery, and at a later date, ECN (Explicit Congestion Notification), which has not so far seen wide use and is a subject of ongoing research to determine if it can be deployed.
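As a rough picture of how slow start and congestion avoidance interact, here is a heavily simplified Python sketch of a Reno-style congestion window, counted in segments per round trip. It is a teaching toy under stated assumptions, not real TCP: losses are supplied by hand as a set of round numbers, and timeouts, duplicate ACKs, and ECN are ignored.

```python
def simulate_cwnd(rounds, ssthresh, loss_rounds):
    """Return the congestion window (in segments) at the start of each round.

    rounds: number of round trips to simulate
    ssthresh: initial slow-start threshold (segments)
    loss_rounds: set of round indices in which a loss is detected (assumed input)
    """
    cwnd = 1
    history = []
    for rnd in range(rounds):
        history.append(cwnd)
        if rnd in loss_rounds:
            # Loss: halve the window and continue from there (fast-recovery style).
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: double every round trip, up to ssthresh.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: grow by one segment per round trip.
            cwnd += 1
    return history


print(simulate_cwnd(rounds=10, ssthresh=8, loss_rounds={6}))
```

The sender only backs off when it learns of a loss, so anything that delays that signal also delays the correction.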

Bufferbloat much larger than the RTTs of the paths destroys the fundamental congestion avoidance of the TCP protocol’s servo system, as I documented. We have destroyed congestion avoidance, and as I’ll discuss soon when I return to the topic of web browsers and servers, we are playing with fire. Even if nothing as bad as I fear ever happens, the bufferbloat situation today is bad, with multiple second latencies being common.

Bufferbloat was first encountered in very early experiments with satellite networking, and active queue management has been a very active area of research since the 1980s, continuing to this day.  With the invention and wide deployment of algorithms such as RED and others, I had thought that the problem was solved. To my surprise (I am not in the field, but due to history have been a somewhat interested bystander), I was wrong: queue management is often not enabled even on significant routers in both enterprise networks and the Internet. The reasons why and the limitations of existing AQM algorithms shed light on this aspect of today’s problems.

Conclusions: Active Queue Management is often not enabled or tuned in today’s Internet and corporate networks. Broadband (and some networks’) performance is therefore often significantly worse than necessary since your ISP may never have enabled AQM. If you are operating a network, check that you have correctly enabled and tuned your AQM on all types of your gear. You can have happier customers and fewer service calls. Finally, we need better queue management algorithms than “classic RED” or closely related algorithms for today’s wireless routers and operating systems.

Read on for more detail…


Mitigations and Solutions of Bufferbloat in Home Routers and Operating Systems

December 13, 2010

As discussed several days ago, we can mitigate (but not solve) broadband bufferbloat to a decent, if not ideal, degree by using bandwidth shaping facilities found in many recent home routers. Unfortunately, life is more complicated, and home routers themselves are often at fault (if you find a recently designed home router that works right, it may want to be enshrined in a museum where its DNA and evolution can be analyzed, and its implementors both admired for their accomplishment and despised for not telling us about what they discovered). Complete robust solutions, unfortunately, will be difficult in the short term (wireless makes it an “interesting” problem) for reasons I’ll get to in this and future posts.

Confounding the situation further, your computer’s/smartphone’s/netbook’s/tablet’s operating system may also be suffering from bufferbloat, and its severity may (and almost certainly does) depend upon the hardware. Your mileage will vary.

You may or may not have enough access to the devices to even manipulate the bufferbloat parameters. Locked down systems come back to bite you. But again, you can probably make the situation much better for you personally, if you at a minimum understand what is causing your pain, and are willing to experiment.


Since any number you pick for buffering is guaranteed to be wrong for many use cases we care about, the general solution will await operating systems implementers revisiting buffering strategies to deal with the realities of the huge dynamic range of today’s networks, but we can mitigate the problem (almost) immediately by tuning without waiting for nirvana to arrive.


Mitigations versus Solutions of Bufferbloat in Broadband

December 8, 2010

I have distinguished in my writing between what I call “mitigations” and “solutions”.

  • mitigations are actions we can take, often immediately, which improve (possibly greatly) the current grim situation.  Since they may only work some of the time, may require conscious thought, tuning, and action by network operators and users, or have other limitations that are often far from optimal, they won’t work in some circumstances or necessarily be implemented everywhere. Often these mitigations will come at some cost, as in the case of today’s posting below.
  • solutions are full solutions for a problem that get behavior to something approximating optimal.  Sometimes they may be mitigations that can be widely applied in an ISP, even though they may require thought there. They “just work” for everyone.

But observed facts (e.g. RED or other AQM is far from universally used; more about this in a future post) show that anything that does not “just work” is often distrusted and under-used (and seldom enabled by default), so such a solution is seldom the optimal solution we should be looking for: really “solving” the problem once and for all.  As good engineers and scientists, we should always be striving for “just works” quality solutions, which we don’t have for bufferbloat in all its forms.

The full “solution” for the entire Internet is going to be hard; we need to solve too many different problems (as you will see) at too many points in all paths your data may traverse to wave a wand and eliminate bufferbloat overnight.  Some of the point solutions will actually require replacement of hardware, and researching and engineering such hardware, along with the economics, will often take time.  Does that mean we should do nothing?  Of course not: we can immediately make the situation much better than it is, particularly for consumer home Internet service. And remember, your competitor will eventually beat you if you sit on your hands.

Gamers and others have been mitigating bufferbloat in broadband for years. Read on. You’ll suffer much less. Mitigation of home router bufferbloat itself will be tomorrow’s installment.