A Milestone Reached: CoDel is in Linux!

May 22, 2012

Some puzzle pieces of a picture puzzle.The CoDel AQM algorithm by Kathie Nichols and Van Jacobson provides us with an essential missing tool to control queues properly. This work is the culmination of their at three major attempts to solve the problems with AQM algorithms over the last 14 years.

 

Eric Dumazet wrote the codel queuing discipline (based on a quick prototype by Dave Täht, who spent the last year working 60 hour weeks on bufferbloat) which landed in net-next a week or two ago; yesterday, net-next was merged into the Linux mainline for inclusion in the next Linux release.  Eric also implemented a fq_codel queuing discipline, combining fair queuing and CoDel  (pronounced “coddle”), and it works very well.  The CoDel implementation was dual licensed BSD/GPL to help the *BSD community. Eric and others have tested CoDel on 10G Ethernet interfaces; as expected, CoDel performance is good in what’s been tested to date.

Linux 3.5 will likely release in August. So it was less than a month from first access to the algorithm (which was formally published in the AQM Queue May 6) to Linux mainline; it should be about four total from availability of the algorithm to Linux release.  Not bad at all :-).

Felix Fietkau merged both the codel and fq_codel into the OpenWrt mainline last week for release this summer. 37 architectures, 150 separate routing platforms, no waiting…

The final step should be to worry about all the details, and finally kill pfifo_fast once we’ve tested CoDel enough.

While I don’t think that this is the end of the story, fair queuing in combination with CoDel, and Tom Herbert’s great BQL work together go a very long way toward dealing with bufferbloat on Ethernet devices in Linux.  I think there is more to be done, but we’re much/most of the way to what is possible.

Some Ethernet hardware (both NIC’s and many Ethernet switches) has embedded bufferbloat (in this case, large FIFO buffers) that software may not be able to easily avoid; as always you must test before you are sure you are bloat free! Unfortunately, we’ll see a lot of this:  a very senior technologist of a major router vendor started muttering “line cards” in a troubled voice at an IETF meeting when he really grokked bufferbloat. Adding AQM, simple as CoDel is, as an afterthought may be very hard: do not infer from the speed of implementation in Linux on Ethernet that all operating systems, drivers, and hardware will it will be so easy or retrofit possible; for example, if your OS has no equivalent to BQL, you’ll have that work to do (and easily other work too).

Wireless is much more of a challenge than Ethernet for Linux, particularly 802.11n wireless; the buffering and queuing internal to these devices and drivers is much more complex, and the designers of those software stacks are still understanding the implications of AQM, particularly since the driver boundary has partitioned the buffering in unfortunate ways. I think it will be months, to even a year or two before those have good implementations that get us down to anything close to theoretical minimum latency. But running CoDel is likely a lot better than nothing anyway if you can: that’s certainly our CeroWrt/OpenWrt experience, and the tuning headaches Dave Täht was having trying to use RED go away. So give it a try….

Let me know of CoDel progress in other systems please!

The Next Nightmare is Coming

May 14, 2012

BitTorrent was NEVER the Performance Nightmare

BitTorrent is a lightning rod on two fronts: it is used to download large files, which Some puzzle pieces of a picture puzzle.the MPAA sees as a nightmare to their business model, and BitTorrent has been a performance nightmare to ISP’s and some users. Bram Cohen has taken infinite grief for BitTorrent over the years, when the end user performance problems are not his fault.

Nor is TCP the performance problem, as Bram Cohen recently flamed about TCP on his blog.

I blogged about this before but several key points seem to have been missed by most: BitTorrent was never the root cause of most of the network speed problems BitTorrent triggered when BitTorrent deployed. The broadband edge of the Internet was already broken when BitTorrent deployed, with vastly too much uncontrolled buffering, which we now call bufferbloat. As my demonstration video shows, even a single simple TCP file copy can cause horrifying speed loss in an overbuffered network.  Speed != bandwidth, despite what the ISP’s marketing departments tell you.

But almost anything can induce bufferbloat suffering (filling bloated buffers) too: I can just as easily fill the buffers with UDP or other protocols as with TCP. So long as uncontrolled, single queue devices pervade the broadband edge, we will continue to have problems.
But new nightmares will come….
Read the rest of this entry »

Fundamental Progress Solving Bufferbloat

May 8, 2012

Kathie Nichols and Van Jacobson today published an article entitled “Controlling Queue Delay” in the ACM Queue. which describes a new adaptive active queue management algorithm (AQM), called CoDel (pronounced “coddle”). This is their third attempt over a 14 year period to solve the adaptive AQM problem and it is finally a successful solution. The article will appear sometime this summer in the Communications of the ACM. Additionally, another independent adaptive AQM algorithm by other authors is also working its way through the academic publication cycle.

A working adaptive AQM algorithm is essential to any full solution  to bufferbloat. Existing AQM algorithms are inadequate, particularly in wireless with its very rapid changes in bandwidth.

Everyone working in networking, not just those interested in AQM systems, should read the article, as it dispels common misunderstandings about how TCP interacts with queuing.

Read the rest of this entry »

Bufferbloat goings on…

May 1, 2012

The bufferbloat front has appeared quiet for several months since two publications hit CACM (1), (2) and several videos hit YouTube, though I have one more article to write for IEEE Spectrum (sigh…).

There has been a lot going on behind the lines, however, and some major announcements are imminent on ways to really fix bufferbloat. But I wanted to take a moment to acknowledge other important work in the meanwhile so they do not get lost in the noise, and to get your juices flowing.

  1. First off, Linux 3.3 shipped with BQL (byte queue limits) done by Tom Herbert of Google.  This is good stuff: finally, the transmit rings in Linux network device drivers won’t cause hundreds of packets of buffering.
  2. Dave Taht has had good success prototyping in CeroWrt a combination of Linux’s SFQ and RED to good effect: SFQ ensures decent sharing among short lived interactive flows which receive preference to long lived elephant flow TCP sessions. As transient bufferbloat and TSO/GSO GRO/LRO smart NIC’s make clear, no comprehensive solutions for achieving good latency are possible without some sort of “fair” queuing and/or classification. As in all RED based AQM algorithms, tuning SFQRED is a bitch and a better AQM is badly needed; news at 11 on that front. CeroWrt is approaching its first release with all sorts of nice features and I’ll blog about it when it’s soup. In the meanwhile, adventurers can find all they want to know about CeroWrt at the links here.
  3. The DOCSIS changes to mitigate bufferbloat in cable modems continues on its way.  While I haven’t checked in to see when deployment really starts (driven by modification to cable carrier deployment systems), we should see this major improvement later this year.

And, as outlined in other writings on this blog, and demonstrated in this video, you can do things about bufferbloat in your home today.

So there is hope.  Really…  Stay tuned…

I’ll be attending Penguicon on April 28, 29.

April 19, 2012

I’ve never been to Penguicon before; but they invited me and John Scalzi (one of my favorite recent SF authors) to be guests of honor, so how could I possibly say no? I haven’t been to a SF con for decades; much less one crossed with a Linux conference.  I think it should be fun; certainly there are a lot of interesting topics. There are quite a few other fun people attending, including Bruce Schneier, and I’ll get to embarrass myself about Heinlein with Eric Raymond as well as doing a little different take on bufferbloat than my usual talk, more what people can do about it themselves.

An Minor Diversion into DNSSEC….

March 21, 2012

I realised recently an interesting milestone has been reached, that will thrill the people who have slaved on DNSSEC for over a decade: DNSSEC running end-to-end, into the house the way “it should really work” without requiring any configuration or intervention has happened.  After all the SOPA/PIPA anguish, seeing DNSSEC come to life is really, really nice. This post may also interest router hackers who’d like a real home router they can afford, particularly one that will do mesh routing.

Read the rest of this entry »

Diagnosing Bufferbloat

February 20, 2012

People (including in my family) ask how to diagnose bufferbloat.

Bufferbloat’s existence is pretty easy to figure out; identifying which hop is the current culprit is harder.  For the moment, let’s concentrate on the edge of the network.

The ICSI Netalyzr project is the easiest way for most to identify problems: you should run it routinely on any network you visit. as it will tell you of lots of problems, not just bufferbloat.  For example, I often take the Amtrak Acela express, which has WiFi service (of sorts).  It’s DNS server did not randomize its ports properly, leaving you vulnerable to man-in-the-middle attacks (so it would be unwise to do anything that requires security); this has since been fixed, as today’s report shows (look at the “network buffer measurements”).  This same report shows very bad buffering, in both directions, of about 6 seconds up, and 1.5 seconds downstream.  Other runs today show much worse performance, including an inability to determine the buffering entirely (netalyzr cannot always determine the buffering in the face of cross traffic or other problems; it conservatively only reports buffering if it makes sense).

Netalyzer Uplink buffer test results

Netalyzer Uplink buffer test results

As you’d expect, performance is terrible (you can see what even “moderate” bufferbloat does in my demo video on a fast cable connection).  The train buffering is similar to what my brother has on his DSL connection at home; but as the link is busy with other users, the performance is continually terrible, rather than intermittently terrible.  6 seconds is commonplace; but the lower right hand netalyzr data is cut off since ICSI does not want their test to run for too long.

In this particular case,  with only a bit more investigation, we can guess most of the problems are in the train<->ISP hop, because my machine reports high bandwidth on its WiFi interface (130Mbps 802.11n), with the uplink speeds a small fraction of that, so the bottleneck to the public internet is usually in that link, rather than the WiFi hop (remember, it’s just *before* the lowest bandwidth hop that the buffers fill in either direction).  In your home (or elsewhere on this train), you’d have to worry about the WiFi hop as well unless you are plugged directly into the router. But further investigation shows additional problems.

If netalyzr isn’t your cup of tea, you may be able to observe what is happening with “ping”, while you (or others) load your network.

By “ping”ing the local router on the train and also somewhere else, you can glean additional information. As usual, a dead giveaway for bufferbloat is high and variable RTT’s with little packet loss (but sometimes packets are terribly delayed and out of order; packets stuck in buffers for even 10′s of seconds are not unusual). Local pings vary much more that you might like, sometimes as much as several hundred milliseconds, but occassionally even multiple seconds on occasion.  Here, I hypothesize bloat in the router on the train, just as I saw inside my house when I first understood that bufferbloat was a generic problem with many causes. Performance is terrible at times due to the train’s connection; but also a fraction of the time due to serving local content with bloat in the router.

Home router bloat

Specifically, if the router has lots of buffering (as most modern routers do; often 256-1250 packets), and is using a default FIFO queuing discipline, it is easy for a router to fill these buffers with packets all destined for the same machine that is operating at a fraction of the speed that WiFi might go.  Ironically, modern home routers tend to have much larger buffering than old routers, due to changes in upstream operating systems optimized toward bandwidth, whose systems were not tested for latency.

Even if “correct” buffering were present (actually an oxymoron), the bandwidth can drop from the 130 Mbps I see to the local router all the way down to 1Mbps, the minimum speed WiFi will operate at, so your buffering can be very much too high even at the best of times.  Moving your laptop/pad/device a few centimeters can make a big difference in bandwidth. But since we have no AQM algorithm to control the amount of buffering, recent routers have been tuned (to the extent they’ve been tuned at all) to operate at maximum bandwidth, even though this means the buffering available can easily be 100 times too much when running slowly (which all turns into delay).  One might also hope that a router would prevent starvation to other connections in such circumstances, but as these routers are typically running with a FIFO queuing disciple, they won’t.  A local (low RTT) flow can get a much higher fraction of bandwidth than a long distance flow.

To do justice to the situation, it is also possible that the local latency variation is partially caused by device driver problems in the router: Dave Taht’s experience has been that 802.11n WiFi device drives often buffer many more packets than they should (beyond that required for good performance when aggregating packets for 802.11n), and he, Andrew McGregor, and Felix Fietkau spent a lot of time last fall reworking one of those Linux device drivers. Since wireless on the train supports 802.11n, we know implies that these device drivers are in play; fixing these problems for the CeroWrt project was a prerequisite for later work on queuing and AQM algorithms.

Bufferbloat demonstration videos

February 1, 2012

If people have heard of bufferbloat at all, it is usually just an abstraction despite having personal experience with it. Bufferbloat can occur in your operating system, your home router, your broadband gear, wireless, and almost anywhere in the Internet.  They still think that if experience poor Internet speed means they must need more bandwidth, and take vast speed variation for granted. Sometimes, adding bandwidth can actually hurt rather than help. Most people have no idea what they can do about bufferbloat.

So I’ve been working to put together several demos to help make bufferbloat concrete, and demonstrate at least partial mitigation. The mitigation shown may or may not work in your home router, and you need to be able to set both upload and download bandwidth.

Two  of four cases we commonly all suffer from at home are:

  1. Broadband bufferbloat (upstream)
  2. Home router bufferbloat (downstream)
Rather than attempt to show worst case bufferbloat which can easily induce complete failure, I decided to demonstrate these two cases of “typical” bufferbloat as shown by the ICSI data. As the bufferbloat varies widely as the ICSI data shows, your mileage will also vary widely.

There are two versions of the video:

  1. A short bufferbloat video, of slightly over 8 minutes, which includes both demonstrations, but elides most of the explanation. It’s intent is to get people “hooked” so they will want to know more.
  2. The longer version of the video clocks in at 21 minutes, includes both demonstrations, but gives a simplified explanation of bufferbloat’s cause, to encourage people to dig yet further.
Since bufferbloat only affects the bottleneck link(s), and broadband and WiFi bandwidth are often similar and variable, it’s very hard to predict where you will have trouble. If you to understand that the bloat grows just before the slowest link in a path, (including in your operating system!) you may be able to improve the situation. You have to take action where the queues grow. You may be able to artificially move the bottleneck from a link that is bloated to one that is not. The first demo moves the bottleneck from the broadband equipment to the home router, for example.
To reduce bufferbloat in the home (until the operating systems and home routers are fixed), your best bet is to ensure your actual wireless bandwidth is always greater than your broadband bandwidth (e.g., by using 802.11n and possibly multiple access points) and use bandwidth shaping in the router to “hide” the broadband bufferbloat.  You’ll still see problems inside your house, but at least, if you also use the mitigation demonstrated in the demo, you can avoid problems accessing external web sites.
The most adventurous of you may come help out on the CeroWrt project, an experimental OpenWrt router where we are working on both mitigating and eventually fixing bufferbloat in home routers. Networking and ability to reflash routers required!


CACM: BufferBloat: What’s Wrong with the Internet?

December 8, 2011

Communications of the ACM: Bufferbloat: What’s Wrong with the Internet?

February issue of the Communications of the ACM.

Some puzzle pieces of a picture puzzle.

A discussion with Vint Cerf, Van Jacobson, Nick Weaver, and Jim Gettys

This is part of an ACM Queue case study, accompanying Kathie Nichols and my article that appeared in the January 2012 CACM (Communications of the ACM).

CACM: Bufferbloat: Dark Buffers in the Internet

December 6, 2011

Vint Cerf recommended that I start immediately blogging about bufferbloat a year or so ago, given the severity of the problem to avoid the usual publication delays; Some puzzle pieces of a picture puzzle.that’s why things appeared here first.

But more formal publication has its merits; in particular, having articles for less directly involved in networking and/or more managerially oriented technical managers is very important. So I’ve been working with/in ACM queue to put together a case study. It has now appeared as an article in the January 2012 issue of CACM (Communications of the ACM) in dead-tree form.  There will also be a full paper posted in ACM queue, but to make the January CACM, we put that aside to finish the (much shorter) article.


Follow

Get every new post delivered to your Inbox.

Join 685 other followers