Archive for the ‘Puzzle’ Category

Home products that fix/mitigate bufferbloat…

February 2, 2017

My New Year’s resolution is to restart blogging.

Bufferbloat is the most common underlying cause of most variable bad performance on the Internet; it is called “lag” by gamers.

Trying to steer anything the size of the Internet into a better direction is very slow and difficult at best. The time from when changes in the upstream operating systems are complete to when consumers can buy new products is typically four years, a delay caused by the broken and insecure ecosystem in the embedded device market. Chip vendors, box vendors, I’m looking at you… So much of what is now finally appearing in the market is based on work that is often four years old. Market pull may do what push has not.

See What to do About Bufferbloat for general information. And the DSLReports Speedtest makes it easy to test for bufferbloat. But new commercial products are becoming increasingly available. Here are some of them.

Introduction

The fq_codel & cake work going on in the bufferbloat project is called SQM – “smart queue management.” This SQM work is specifically targeted at mitigating the bufferbloat in the “last mile,” your cable/DSL/fiber connection, by careful queue management and an artificial bandwidth bottleneck added in your home router (since most modems do not perform flow control to the home router, unfortunately).
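
To see why an artificial bottleneck in the router helps, here is a minimal sketch of two queues in series, the router's shaper followed by the modem. It is a crude fluid model with made-up rates (20 Mbit/s offered, a 10 Mbit/s modem link), not a description of any real SQM implementation; the function name and parameters are hypothetical.

```python
def simulate_backlog(offered_mbps, shaper_mbps, modem_mbps, seconds):
    """Fluid model: traffic passes the router's shaper, then the modem.

    Returns (router_backlog, modem_backlog) in megabits after `seconds`
    one-second steps. All rates are illustrative assumptions.
    """
    router_backlog = 0.0  # waits in the router, where SQM can manage it
    modem_backlog = 0.0   # waits in the modem, where nothing manages it
    for _ in range(seconds):
        router_backlog += offered_mbps
        sent = min(router_backlog, shaper_mbps)
        router_backlog -= sent
        modem_backlog += sent
        modem_backlog -= min(modem_backlog, modem_mbps)
    return router_backlog, modem_backlog


# Shaper set just below the modem rate: the queue forms in the router.
shaped = simulate_backlog(20, 9, 10, 10)
# Effectively no shaper: the queue forms in the modem instead.
unshaped = simulate_backlog(20, 1000, 10, 10)
```

With the shaper slightly below the modem rate, all of the backlog sits in the router, where a smart queue can schedule and drop sensibly. Without it, the same backlog piles up in the modem's unmanaged buffer.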

Modems require built in AQM algorithms, such as those just beginning to reach the market in DOCSIS 3.1. I just ordered one of these for my house to see if it functions better than the SQM mitigation (almost certainly not), but at least these should not require the manual tuning that SQM does.

To fix bufferbloat in WiFi requires serious changes in the WiFi driver in your home router (which typically runs Linux), and in your device (laptop/phone/tablet).  The device driver work was first released as part of the LEDE project, in January 2017 for initially just a couple of WiFi chip types.

Evenroute IQrouter

First up, I’d like to call out the Evenroute IQrouter, which has a variant of SQM that deals with “sag”.

DSL users have often suffered more than other broadband users, due to bad bloat in the modems compounded by minimal bandwidth, so the DSL version of the IQrouter is particularly welcome. DSL ISPs seem to have a tendency (seemingly more often than ISPs with other technologies) to underprovision their backhaul, causing “sag” at different times of day/week. This makes the static configuration techniques we’ve used in LEDE/OpenWrt SQM ineffective, as you have to give away too much bandwidth if a fixed bandwidth is used. I love the weasel words “up to” some speed used by many ISPs. It is one thing for your service to degrade for a short period of days or weeks while an ISP takes action to provision more bandwidth to an area; it is another for your bandwidth to routinely vary by large factors for weeks/months and years.
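
To illustrate why a fixed shaper rate copes badly with sag, here is a minimal sketch of one generic alternative: nudge the shaper rate down when latency under load climbs, and creep back up when it does not. This is purely a hypothetical illustration and not how the IQrouter works; the function name, thresholds, and step sizes are all assumptions.

```python
def adjust_shaper_rate(rate_kbps, loaded_latency_ms, *,
                       latency_limit_ms=30.0,   # assumed "too laggy" threshold
                       floor_kbps=500.0,        # never shape below this
                       ceiling_kbps=10000.0,    # the ISP's "up to" rate
                       backoff=0.9,             # multiplicative decrease
                       step_kbps=100.0):        # additive increase
    """Return the next shaper rate given latency measured under load."""
    if loaded_latency_ms > latency_limit_ms:
        # The link has sagged below our shaper rate: back off.
        new_rate = rate_kbps * backoff
    else:
        # Latency is fine: probe gently for bandwidth we may have given away.
        new_rate = rate_kbps + step_kbps
    return max(floor_kbps, min(ceiling_kbps, new_rate))
```

A static configuration amounts to picking one rate low enough for the worst sag and living with it the rest of the time, which is exactly the bandwidth giveaway described above.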

I sent a DSL Evenroute IQrouter to my brother in Pennsylvania recently and arranged for one for a co-worker, and they are working well, and Rich Brown has had similarly good experiences. Evenroute has been working hard to make the installation experience easy. Best yet, is that the IQrouter is autoconfiguring and figures out for you what to do in the face of “sag” in your Internet service, something that may be a “killer feature” if you suffer lots of “sag” from your ISP. The IQrouter is therefore the first “out of the box” device I can recommend to almost anyone, rather than just my geek friends.

The IQrouter does not yet have the very recent wonderful WiFi results of Toke and Dave (more about this coming in a separate post), but has the capability for over the air updates and one hopes debloated WiFi and ATF will come to it reasonably soon. The new WiFi stack is just going upstream into Linux and LEDE/OpenWrt as I write this post. DSL users seldom have enough bandwidth for the WiFi hop to be the bottleneck; so the WiFi work is much more important for cable and fiber users at higher bandwidth than for DSL users stuck at low bandwidth.

The Evenroute is effective on all technologies, not just DSL. It is just particularly important for DSL users, who suffer from sag more than most…

Ubiquiti Edgerouter

I’ve bought a Ubiquiti Edgerouter X on the recommendation of Dave Taht but have not yet put it into service. Router performance can be an issue on high end cable or fiber service. It is strictly an Ethernet router, lacking WiFi interfaces; but in my house, where the wiring is down in the basement, that’s what I need. The Edgerouter starts at around $50; the POE version I bought was around $75.

The Edgerouter story is pretty neat – Dave Taht did the backport 2? years back. Ubiquiti’s user community jumped all over it and polished it up, adding support to their conf tools and GUI, and Ubiquiti recognized what they had and shipped it as part of their next release.

SQM is available in recent releases of Ubiquiti’s Edgerouter firmware. SQM itself is easy to configure. The Edgerouter overall, however, requires considerable configuration before it is useful in the home environment, and its firmware web interface is aimed at IT people rather than most home users. I intend this to replace my primary router, a TP-Link Archer C7v2, someday soon, as it is faster than the TP-Link and Comcast keeps increasing my bandwidth without asking me. I wish the Ubiquiti had a “make me into a home router” wizard that would make it immediately usable for most people, as its price is low enough for some home users to be interested in it. I believe one can install LEDE/OpenWrt on the Edgerouter, which I may do if I find its IT staff oriented web interface too unusable.

LEDE/OpenWrt and BSD for the Geeks

If you are adventurous enough to reflash firmware, anything runnable on OpenWrt/LEDE of the last few years has SQM available. You can take the new LEDE release for a spin. If your router has an Ath9k WiFi chip (or a later version of the Ath10k WiFi chip), or you buy a new router with the right chips in it, you can play with the new WiFi goodness now in LEDE (noted above). There is a very wide variety of home routers that can benefit from reflashing. Its web UI is tolerably decent, better than many commercial vendors I have seen.

WiFi chip vendors should take careful note of the stupendous improvements available in the Linux mac80211 framework for bufferbloat elimination and air time fairness. If you don’t update to the new interfaces and get your code into LEDE, you’re going to be at a great disadvantage to Atheros in the market.

dd-wrt, asuswrt, ipfire, all long ago added support for SQM. It will be interesting to see how long it takes them to pick up the stunning WiFi work.

The pcengines APU2 is a good “DIY” router for higher speeds. Dave has not yet tried LEDE on it, but will. He uses it presently on Ubuntu….

BSD users recently got fq_codel in opnsense, so the BSD crowd are making progress.

Other Out of the Box Devices

The Turris Omnia is particularly interesting for very fast broadband service and can run LEDE as well; but unfortunately,  it seems only available in Europe at this time.  We think the Netduma router has SQM support, though it is not entirely clear what they’ve done; it is a bit pricey for my taste, and I don’t happen to know anyone who has one.

Cable Modems

Cable users may find that upgrading to a new DOCSIS 3.1 modem is helpful (though that does not solve WiFi bufferbloat). The new DOCSIS 3.1 standard requires AQM. While I don’t believe PIE is anywhere near as good as fq_codel (lacking flow queuing), the DOCSIS 3.1 standard at least requires an AQM, and PIE should help and does not require manual upstream bandwidth tuning. Maybe someday we’ll find some fq_codel (or fq_pie) based cable modems. Here’s hoping…

Under the Covers, Hidden

Many home router vendors make bold claims they have proprietary cool features, but these are usually smoke and mirrors. Wireless mesh devices without bufferbloat reduction are particularly suspect and most likely to require manual RF engineering beyond most users. They require very high signal strength and transfer rates to avoid the worst of bufferbloat. Adding lots more routers without debloating and not simultaneously attacking transmit power control is a route to WiFi hell for everyone. The LEDE release is the first to have the new WiFi bits needed to make wireless mesh more practical. No one we know of has been working on minimizing transmit power to reduce interference between mesh nodes. So we are very skeptical of these products.

There are now a rapidly increasing number of products out there with SQM goodness under the covers, sometimes implemented well, and sometimes not so well, and more as the months go by.

One major vendor put support for fq_codel/SQM under the covers of one product using a tradename, promptly won an award, but then started using that tradename on inferior products in their product line that did not have real queue management. I can’t therefore vouch for any product line tradename that does not acknowledge publicly how it works and that the tradename means that it really has SQM under the covers. Once burned, three times shy. That product therefore does not deserve a mention due to the behavior of the vendor. “Bait and switch” is not what anyone needs.

Coming Soon…

We have wind of a number of vendors’ plans who have not quite reached the market, but it is up to them to announce their products.

If you find new products or ISP’s that do really well, let us know, particularly if they actually say what they are doing. We need to start some web pages to keep track of commercial products.

Traditional AQM is not enough!

July 10, 2013

Note: Updated October 24, 2013, to fix some editorial nits, and to clarify the intended point that it is the combination of a working mark/drop algorithm with flow scheduling that is the “killer” innovation, rather than the specifics of today’s fq_codel algorithm.

Latency (called “lag” by gamers), once incurred, cannot be undone, as best first explained by Stuart Cheshire in his rant: “It’s the latency, Stupid.” and more formally in “Latency and the Quest for Interactivity,” and noted recently by Stuart’s 12 year old daughter, who sent Stuart a link to one of the myriad “Lag Kills” tee shirts, coffee mugs, and other items popular among gamers.

Out of the mouth of babes…

Any unnecessary latency is too much latency.

Many networking engineers and researchers express the opinion that 100 milliseconds latency is “good enough”. If the Internet’s worst latency (under load) was 100ms, indeed, we’d be much better off than we are today (and would have space warp technology as well!). But the speed of light and human factors research easily demonstrate this opinion is badly flawed.
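
To put a rough number on the speed-of-light part of that argument, here is a back-of-the-envelope sketch. The 4,000 km path length is a hypothetical figure, and light travelling in fiber at about two-thirds of its vacuum speed is a common approximation rather than anything measured here.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_FRACTION = 2 / 3             # assumed propagation speed in fiber, relative to c


def min_round_trip_ms(path_km):
    """Lower bound on round-trip time over `path_km` of fiber, with no queuing."""
    one_way_s = path_km / (SPEED_OF_LIGHT_KM_S * FIBER_FRACTION)
    return 2 * one_way_s * 1000


# Hypothetical 4,000 km path: roughly 40 ms per round trip before any queuing.
rtt_ms = min_round_trip_ms(4000)
```

On those assumptions, propagation alone eats a large slice of a 100 ms budget on a single round trip, and most interactions need several round trips.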

Many have understood bufferbloat to be a problem that primarily occurs when a saturating “elephant flow” is present on a link; testing for bufferbloat using elephants is very easy, and even a single elephant TCP flow from any modern operating system may fill any size uncontrolled buffer given time, but this is not the only problem we face. The dominant application, the World Wide Web, is anti-social to any other application on the Internet, and its collateral damage is severe.

Solving the latency problem requires a two prong attack.

(more…)

TCP Small Queues

October 1, 2012

Linux 3.6 just shipped. As I’ve noted before, bloat occurs in multiple places in an OS stack (and applications!). If your OS TCP implementation fills transmit queues more than needed, full queues will cause the RTT to increase, etc., causing TCP to misbehave. Net result: additional latency, with no increase in bandwidth performance. TCP small queues reduces the buffering without sacrificing performance, reducing latency.

To quote the Kernel Newbies page:

TCP small queues is another mechanism designed to fight bufferbloat. TCP Small Queues goal is to reduce number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender : < 1ms on Gbit (instead of 50ms with TSO) and < 8ms on 100Mbit (instead of 132 ms).

Eric Dumazet (now at Google) is the author of TSQ. It is covered in more detail at LWN.  Thanks to Eric for his great work!

The combination of TSQ, fq_codel and BQL (Byte Queue Limits) gets us much of the way to solving bufferbloat on Ethernet in Linux. Unfortunately, wireless remains a challenge (the drivers need to have a bunch of packets for 802.11n aggregation, and this occurs below the level that fq_codel can work on), as do other device types.  For example, a particular DSL device we looked at last week has a minimum ring buffer size of 16, again, occurring beneath Linux’s queue discipline layer.  “Smart” hardware has become a major headache. So there is much to be done yet in Linux, much less other operating systems.

I’m attending the International Summit for Community Wireless Networks

September 24, 2012

I will be giving an updated version of my bufferbloat talk there on Saturday, October 6. The meeting is about community wireless networks (many of which are mesh wireless networks) on which bufferbloat is a particular issue. It is in Barcelona, Spain, October 4-7.

We tried (and failed) to make ad-hoc mesh networking work when I was at OLPC, and I now know that one of the reasons we failed was bufferbloat.

I’ll also be giving a talk at the UKNOF (UK Network Operators’ Forum) in London on October 9, but that is now full and there is no space for new registrants.

The Internet is Broken, and How to Fix It

June 26, 2012

Many real time applications such as VOIP, gaming, teleconferencing, and performing music together, require low latency. These are increasingly unusable in today’s Internet, not because there is insufficient bandwidth, but because we’ve failed to look at the Internet as an end to end system. The edge of the Internet now often runs congested. When it does, bufferbloat causes performance to fall off a cliff.

Where once a home user’s Internet connection consisted of a single computer, it now consists of a dozen or more devices – smart phones, TV’s, Apple TV’s/Roku devices, tablet devices, home security equipment, and one or more computers per household member. More Internet connected devices are arriving every year, which often perform background activities without the user’s intervention, inducing transients on the network. These devices need to effectively share the edge connection, in order to make each user happy. All can induce congestion and bufferbloat that baffle most Internet users.

The CoDel (“coddle”) AQM algorithm provides the “missing link” necessary for good TCP behavior and solving bufferbloat. But CoDel by itself is insufficient to provide reliable, predictable low latency performance in today’s Internet.

Bottlenecks are most common at the “edge” of the Internet and there you must be very careful to avoid queuing delays of all sorts. Your share of a busy 802.11 conference network (or a marginal WiFi connection, or one in a congested location) might be 1Mb/second, at which speed a single packet represents 13 milliseconds. Your share of a DSL connection in the developing world may be similarly limited. Small businesses often support many people on limited bandwidth. Budget motels commonly share a single broadband connection among all guests.

Only a few packets can ruin your whole day!  A single IW10 TCP open has immediately blown any telephony jitter budget at 1Mbps (which is about 16x the bandwidth of conventional POTS telephony).
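
The arithmetic behind that claim is easy to check. Here is a minimal sketch; the 1,500 byte packet size is an assumption (a typical Ethernet-sized payload, ignoring link-layer framing), and the function name is made up for illustration.

```python
PACKET_BYTES = 1500    # assumed full-size packet, without link-layer framing
IW10_SEGMENTS = 10     # initial window of ten segments


def serialization_delay_ms(size_bytes, link_bps):
    """Time to clock `size_bytes` onto a link running at `link_bps` bits/second."""
    return size_bytes * 8 * 1000 / link_bps


one_packet_ms = serialization_delay_ms(PACKET_BYTES, 1_000_000)
iw10_burst_ms = serialization_delay_ms(PACKET_BYTES * IW10_SEGMENTS, 1_000_000)
```

At 1Mbps one such packet takes 12 ms to serialize (framing overhead pushes it a little higher, toward the figure quoted above), and a ten-packet initial window occupies the link for 120 ms. Any voice packet that arrives behind that burst in a single FIFO queue waits for all of it.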

Ongoing technology changes make the problem more challenging. These include:

  • Changes to TCP, including the IW10 initial window changes and window scaling.
  • NIC Offload engines generate bursts of line rate packet streams at multi-gigabit rates. These features are now “on” by default even in cheap consumer hardware including home routers, and certainly in data centers. Whether this is advisable (it is not…) is orthogonal to the reality of deployed hardware and current device drivers and default settings.
  • Deployment of “abusive” applications (e.g. HTTP/1.1 using many (> 2) TCP connections, sharded web sites, BitTorrent). As systems designers, we need to remove the incentives for such abusive application behavior, while protecting the user’s experience. Network engineers must presume software engineers will optimize their application performance, even to the detriment of other uses of the Internet, as the abuse of HTTP by web browsers and servers demonstrates.
  • The rapidly increasing number of devices sharing home and small office links.
All of these factors contribute to large line rate bursts of packets crossing the Internet to arrive at a user’s edge network, whether in the broadband connection or, more commonly, in the home router.
(more…)

The Bufferbloat Bandwidth Death March

May 23, 2012

Latency much more than bandwidth governs actual internet “speed”, as best expressed in written form by Stuart Cheshire’s It’s the Latency, Stupid rant and more formally in Latency and the Quest for Interactivity.

Speed != bandwidth despite all of what an ISP’s marketing department will tell you. This misconception is reflected up to and including FCC Commissioner Julius Genachowski, and is common even among technologists who should know better, and believed by the general public. You pick an airplane to fly across the ocean, rather than a ship, even though the capacity of the ship may be far higher.
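
A crude model makes the point concrete. The sketch below treats a fetch as a fixed number of round trips plus the time to push the bytes through the link. All the numbers (object size, round-trip counts, link rates, latencies) are hypothetical, and real protocols are more complicated than this.

```python
def fetch_time_s(size_bytes, bandwidth_bps, rtt_s, round_trips):
    """Very rough fetch time: protocol round trips plus transfer time."""
    return round_trips * rtt_s + size_bytes * 8 / bandwidth_bps


# A 100 KB object needing four round trips (all figures assumed):
fat_but_laggy = fetch_time_s(100_000, 100e6, 0.500, 4)  # 100 Mbit/s, bloated 500 ms RTT
thin_but_quick = fetch_time_s(100_000, 10e6, 0.020, 4)  # 10 Mbit/s, 20 ms RTT
```

In this model the link with ten times the bandwidth takes about 2 seconds, while the slower but low latency link finishes in about 0.16 seconds. The ship has the capacity; the airplane gets you there.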

(more…)

A Milestone Reached: CoDel is in Linux!

May 22, 2012

The CoDel AQM algorithm by Kathie Nichols and Van Jacobson provides us with an essential missing tool to control queues properly. This work is the culmination of their three major attempts to solve the problems with AQM algorithms over the last 14 years.

 

Eric Dumazet wrote the codel queuing discipline (based on a quick prototype by Dave Täht, who spent the last year working 60 hour weeks on bufferbloat) which landed in net-next a week or two ago; yesterday, net-next was merged into the Linux mainline for inclusion in the next Linux release.  Eric also implemented a fq_codel queuing discipline, combining fair queuing and CoDel  (pronounced “coddle”), and it works very well.  The CoDel implementation was dual licensed BSD/GPL to help the *BSD community. Eric and others have tested CoDel on 10G Ethernet interfaces; as expected, CoDel performance is good in what’s been tested to date.
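
For readers who want a feel for what CoDel does, here is a deliberately simplified sketch in Python. It is not Eric's kernel code and not the full published algorithm: it keeps the core idea (drop only once packet sojourn time has stayed above a target for a whole interval, then drop at a rate that increases with the square root of the drop count) but omits the small-queue check and the drop-count hysteresis. The 5 ms target and 100 ms interval are the usual CoDel defaults; the class and method names are made up.

```python
import math
from collections import deque

TARGET = 0.005    # acceptable standing queue delay, in seconds
INTERVAL = 0.100  # how long delay must stay above TARGET before dropping


class SimpleCoDel:
    def __init__(self):
        self.queue = deque()     # entries are (enqueue_time, packet)
        self.first_above = None  # time at which delay will have been high for INTERVAL
        self.dropping = False
        self.count = 0           # drops in the current dropping episode
        self.drop_next = 0.0     # time of the next scheduled drop

    def enqueue(self, now, packet):
        self.queue.append((now, packet))

    def _pop(self, now):
        """Remove the head packet; return (packet, ok_to_drop)."""
        if not self.queue:
            self.first_above = None
            return None, False
        enqueue_time, packet = self.queue.popleft()
        if now - enqueue_time < TARGET:
            self.first_above = None          # delay is fine again
            return packet, False
        if self.first_above is None:
            self.first_above = now + INTERVAL  # start the clock
            return packet, False
        return packet, now >= self.first_above

    def dequeue(self, now):
        packet, ok_to_drop = self._pop(now)
        if self.dropping:
            if not ok_to_drop:
                self.dropping = False
            while self.dropping and now >= self.drop_next:
                self.count += 1
                packet, ok_to_drop = self._pop(now)  # discard old head, take next
                if ok_to_drop:
                    self.drop_next += INTERVAL / math.sqrt(self.count)
                else:
                    self.dropping = False
        elif ok_to_drop:
            packet, _ = self._pop(now)               # discard head, take next
            self.dropping = True
            self.count = 1
            self.drop_next = now + INTERVAL / math.sqrt(self.count)
        return packet
```

Note that nothing here needs to know the link rate or be tuned per link: the only input is how long each packet sat in the queue.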

Linux 3.5 will likely release in August. So it was less than a month from first access to the algorithm (which was formally published in ACM Queue on May 6) to Linux mainline; it should be about four months total from availability of the algorithm to Linux release.  Not bad at all :-).

Felix Fietkau merged both the codel and fq_codel into the OpenWrt mainline last week for release this summer. 37 architectures, 150 separate routing platforms, no waiting…

The final step should be to worry about all the details, and finally kill pfifo_fast once we’ve tested CoDel enough.

While I don’t think that this is the end of the story, fair queuing in combination with CoDel, and Tom Herbert’s great BQL work together go a very long way toward dealing with bufferbloat on Ethernet devices in Linux.  I think there is more to be done, but we’re much/most of the way to what is possible.

Some Ethernet hardware (both NIC’s and many Ethernet switches) has embedded bufferbloat (in this case, large FIFO buffers) that software may not be able to easily avoid; as always you must test before you are sure you are bloat free! Unfortunately, we’ll see a lot of this:  a very senior technologist of a major router vendor started muttering “line cards” in a troubled voice at an IETF meeting when he really grokked bufferbloat. Adding AQM, simple as CoDel is, as an afterthought may be very hard: do not infer from the speed of implementation in Linux on Ethernet that it will be so easy, or that a retrofit will even be possible, on all operating systems, drivers, and hardware; for example, if your OS has no equivalent to BQL, you’ll have that work to do (and quite likely other work too).

Wireless is much more of a challenge than Ethernet for Linux, particularly 802.11n wireless; the buffering and queuing internal to these devices and drivers is much more complex, and the designers of those software stacks are still understanding the implications of AQM, particularly since the driver boundary has partitioned the buffering in unfortunate ways. I think it will be months, to even a year or two before those have good implementations that get us down to anything close to theoretical minimum latency. But running CoDel is likely a lot better than nothing anyway if you can: that’s certainly our CeroWrt/OpenWrt experience, and the tuning headaches Dave Täht was having trying to use RED go away. So give it a try….

Let me know of CoDel progress in other systems please!

The Next Nightmare is Coming

May 14, 2012

BitTorrent was NEVER the Performance Nightmare

BitTorrent is a lightning rod on two fronts: it is used to download large files, which the MPAA sees as a nightmare to their business model, and BitTorrent has been a performance nightmare to ISP’s and some users. Bram Cohen has taken infinite grief for BitTorrent over the years, when the end user performance problems are not his fault.

Nor is TCP the performance problem, despite Bram Cohen’s recent flame about TCP on his blog.

I blogged about this before but several key points seem to have been missed by most: BitTorrent was never the root cause of most of the network speed problems BitTorrent triggered when BitTorrent deployed. The broadband edge of the Internet was already broken when BitTorrent deployed, with vastly too much uncontrolled buffering, which we now call bufferbloat. As my demonstration video shows, even a single simple TCP file copy can cause horrifying speed loss in an overbuffered network.  Speed != bandwidth, despite what the ISP’s marketing departments tell you.

But almost anything can induce bufferbloat suffering (filling bloated buffers) too: I can just as easily fill the buffers with UDP or other protocols as with TCP. So long as uncontrolled, single queue devices pervade the broadband edge, we will continue to have problems.
But new nightmares will come….
(more…)

Bufferbloat goings on…

May 1, 2012

The bufferbloat front has appeared quiet for several months since two publications hit CACM (1), (2) and several videos hit YouTube, though I have one more article to write for IEEE Spectrum (sigh…).

There has been a lot going on behind the lines, however, and some major announcements are imminent on ways to really fix bufferbloat. But I wanted to take a moment to acknowledge other important work in the meanwhile so they do not get lost in the noise, and to get your juices flowing.

  1. First off, Linux 3.3 shipped with BQL (byte queue limits) done by Tom Herbert of Google.  This is good stuff: finally, the transmit rings in Linux network device drivers won’t cause hundreds of packets of buffering.
  2. Dave Taht has had good success prototyping a combination of Linux’s SFQ and RED in CeroWrt: SFQ ensures decent sharing among short lived interactive flows, which receive preference over long lived elephant TCP sessions. As transient bufferbloat and TSO/GSO and GRO/LRO smart NIC’s make clear, no comprehensive solutions for achieving good latency are possible without some sort of “fair” queuing and/or classification. As in all RED based AQM algorithms, tuning SFQRED is a bitch and a better AQM is badly needed; news at 11 on that front. CeroWrt is approaching its first release with all sorts of nice features and I’ll blog about it when it’s soup. In the meanwhile, adventurers can find all they want to know about CeroWrt at the links here.
  3. The DOCSIS changes to mitigate bufferbloat in cable modems continue on their way.  While I haven’t checked in to see when deployment really starts (driven by modification to cable carrier deployment systems), we should see this major improvement later this year.
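
The “fair” queuing idea in item 2 can be sketched in a few lines. This is a toy round-robin over per-flow queues, keyed by an explicit flow identifier, and not Linux’s SFQ (which hashes packet headers into a fixed number of buckets, among other differences). The class name and flow labels are made up.

```python
from collections import OrderedDict, deque


class FlowQueue:
    """Toy flow queuing: one FIFO per flow, served one packet at a time in rotation."""

    def __init__(self):
        self.flows = OrderedDict()  # flow_id -> deque of packets, in service order

    def enqueue(self, flow_id, packet):
        self.flows.setdefault(flow_id, deque()).append(packet)

    def dequeue(self):
        if not self.flows:
            return None
        # Serve the flow at the head of the rotation.
        flow_id, queue = self.flows.popitem(last=False)
        packet = queue.popleft()
        if queue:
            self.flows[flow_id] = queue  # still backlogged: go to the back of the line
        return packet


fq = FlowQueue()
for n in range(5):
    fq.enqueue("elephant", f"e{n}")   # a bulk transfer builds a backlog
fq.enqueue("voip", "v0")              # then one interactive packet arrives
order = [fq.dequeue() for _ in range(6)]
```

In a single FIFO the voice packet would leave sixth, behind the entire elephant backlog. With per-flow queues it leaves second, no matter how deep the elephant’s queue gets.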

And, as outlined in other writings on this blog, and demonstrated in this video, you can do things about bufferbloat in your home today.

So there is hope.  Really…  Stay tuned…

Bufferbloat demonstration videos

February 1, 2012

If people have heard of bufferbloat at all, it is usually just an abstraction despite their having personal experience with it. Bufferbloat can occur in your operating system, your home router, your broadband gear, wireless, and almost anywhere in the Internet.  People still think that experiencing poor Internet speed means they must need more bandwidth, and take vast speed variation for granted. Sometimes, adding bandwidth can actually hurt rather than help. Most people have no idea what they can do about bufferbloat.

So I’ve been working to put together several demos to help make bufferbloat concrete, and demonstrate at least partial mitigation. The mitigation shown may or may not work in your home router, and you need to be able to set both upload and download bandwidth.

Two  of four cases we commonly all suffer from at home are:

  1. Broadband bufferbloat (upstream)
  2. Home router bufferbloat (downstream)
Rather than attempt to show worst case bufferbloat which can easily induce complete failure, I decided to demonstrate these two cases of “typical” bufferbloat as shown by the ICSI data. As the bufferbloat varies widely as the ICSI data shows, your mileage will also vary widely.

There are two versions of the video:

  1. A short bufferbloat video, of slightly over 8 minutes, which includes both demonstrations, but elides most of the explanation. Its intent is to get people “hooked” so they will want to know more.
  2. The longer version of the video clocks in at 21 minutes, includes both demonstrations, but gives a simplified explanation of bufferbloat’s cause, to encourage people to dig yet further.
Since bufferbloat only affects the bottleneck link(s), and broadband and WiFi bandwidth are often similar and variable, it’s very hard to predict where you will have trouble. If you understand that the bloat grows just before the slowest link in a path (including in your operating system!), you may be able to improve the situation. You have to take action where the queues grow. You may be able to artificially move the bottleneck from a link that is bloated to one that is not. The first demo moves the bottleneck from the broadband equipment to the home router, for example.
To reduce bufferbloat in the home (until the operating systems and home routers are fixed), your best bet is to ensure your actual wireless bandwidth is always greater than your broadband bandwidth (e.g., by using 802.11n and possibly multiple access points) and use bandwidth shaping in the router to “hide” the broadband bufferbloat.  You’ll still see problems inside your house, but at least, if you also use the mitigation demonstrated in the demo, you can avoid problems accessing external web sites.
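
For the curious, bandwidth shaping of this kind is commonly built on a token bucket. The sketch below is a generic textbook token bucket, not the shaper in any particular router firmware; the class name and the numbers in the example are assumptions.

```python
class TokenBucket:
    """Allow traffic at `rate_bytes_per_s` on average, with bursts up to `burst_bytes`."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # time of the previous call, in seconds

    def allow(self, now, size_bytes):
        """Return True if a packet of `size_bytes` may be sent at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill at the configured rate, but never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False  # the caller should hold the packet in its own queue


shaper = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
decisions = [shaper.allow(t, 1500) for t in (0.0, 0.0, 1.0, 1.5)]
```

Setting the rate a little below what the broadband link really delivers keeps the modem’s buffer empty, so packets wait in the router instead, where the queue can be kept short.
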
The most adventurous of you may come help out on the CeroWrt project, an experimental OpenWrt router where we are working on both mitigating and eventually fixing bufferbloat in home routers. Networking knowledge and the ability to reflash routers required!