I received the following question today from Ralph Droms. I include an edited version of my response to Ralph.
On Thu, Jun 20, 2013 at 9:45 AM, Ralph Droms (rdroms) <email@example.com> wrote:
Someone suggested to me that bufferbloat might even be worse
in switches/bridges than in routers. True fact? If so, can
you point me at any published supporting data?
It is hard to quantify whether switches or routers are “worse”; I’ve never tried, nor have I seen any published systematic data. I
wouldn’t believe such data if I saw it, anyway. What matters is whether you have unmanaged buffers before a bottleneck link.
I don’t have first-hand information that would let me just point you at particular product specs; I tend not to try to find out who is particularly guilty, as comparing particular vendors can only get me in hot water. I’ve generally dug into the technology to understand how and why buffering is present, in order to understand what I’ve seen.
You can go look at the specs of switches yourself and figure out from first principles that switches have problems.
Feel free to write a paper!
Here’s what I do know.
- The simplest switch case is where you have a 10G or 1G switch being operated at 1G or 100M; you end up 10x or 100x overbuffered. I’ve never seen a switch that cuts its internal buffering depending on line rate. God forbid you happen to have 10Mbit gear still in that network, and Ethernet flow control can cause cascades between switches that reduce you to the lowest bandwidth….
- Thankfully, enterprise switch gear does not emit Ethernet pause frames (though it honors them if received); but all the commodity switch chips I looked at, the kind used in cheap unmanaged consumer switches, do generate pause frames. Sigh…
- As I remember, when I described this kind of buffering problem to a high end router expert at Prague, he started muttering “line cards” at me; it wouldn’t surprise me if the same situation is present in big routers supporting outputs at different line rates. But I’ve not dug into them.
- We even got caught by this in CeroWrt, where the Ethernet bridge chip was misconfigured and, due to jumbo-grams, was initially accidentally 8x overbuffered (resulting in 80-100ms of latency through the local switch in a cheap router, IIRC; Dave Taht will remember the exact details).
- I then went and looked at the data sheets of a bunch of integrated cheap switch chips (around 10 of them, as I remember): while some (maybe half) were “correctly” buffered (not that I regard any static configuration as correct!), some had 2-4x more SRAM in the switch chip than was required for their bandwidth. So even without the bandwidth switching trap, sometimes the commodity switch chips have too much buffering. Without statistics of what chips are used in what products, it’s impossible to know how much equipment is affected (though all switches *should* run fq_codel or equivalent, IMHO, knowing what I know now)….
- I hadn’t even thought about how VLANs interact with buffering until recently. Think about VLANs (particularly in combination with Ethernet flow control), and get a further Excedrin headache… About 6 months ago I talked to an engineer who had had terrible problems getting decent, reliable latency in a customer’s VoIP system. He tracked it down (miraculously) to the fact that the small business (fewer than 50 employees) was sharing an enterprise switch, using VLANs for isolation from other tenants in the building. The other tenants sometimes saturated the switch, and the customer’s VLAN performance for their VoIP traffic would go to hell in a handbasket (see below about naive sysops not configuring different classes of service correctly). As the customer was a call center, you can imagine, they were upset.
Ethernet is actually very highly variable bandwidth: we can’t safely treat it as fixed bandwidth! Yet switch designers make this completely unwarranted presumption routinely.
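To make the 10x and 100x figures concrete, here is a minimal sketch of the arithmetic. The buffer size is a hypothetical one, chosen to hold 1 ms of traffic at 10 Gbit/s; it is not taken from any data sheet, and the function and variable names are mine. The point is only that a fixed buffer holds proportionally more time as the line rate drops.

```python
# Sketch: how long a fixed-size buffer takes to drain at different line rates.
# BUFFER_BYTES is a hypothetical value (1 ms worth of data at 10 Gbit/s),
# not a figure from any real switch.

BUFFER_BYTES = 1_250_000  # 10e9 bit/s * 0.001 s / 8 bits per byte

RATES_BPS = {"10G": 10e9, "1G": 1e9, "100M": 100e6, "10M": 10e6}


def queue_delay_ms(buffer_bytes: int, line_rate_bps: float) -> float:
    """Worst-case queueing delay (ms) when the buffer is full."""
    return buffer_bytes * 8 / line_rate_bps * 1000.0


def delays_by_rate(buffer_bytes: int = BUFFER_BYTES) -> dict:
    """Delay of a full buffer at each line rate in RATES_BPS."""
    return {name: queue_delay_ms(buffer_bytes, bps)
            for name, bps in RATES_BPS.items()}


if __name__ == "__main__":
    for name, ms in delays_by_rate().items():
        print(f"{name:>4}: {ms:7.1f} ms when the buffer is full")
```

Under these assumptions the same memory that holds 1 ms at 10G holds about 10 ms at 1G, 100 ms at 100M and a full second at 10M.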
This is part of why I see conventional QOS as a dead-end; most of the need for classic QOS goes away if we properly manage buffers in the first place. Our job as Internet engineers is to build systems that “just work”: systems that operators can’t misconfigure and that do not, even worse, come from the factory misconfigured to fail under load (which is never properly tested at most customers’ sites).
Enterprise Ethernet Switches
Some enterprise switches sell additional buffer memory as a “feature”! And some of those switches require configuration of their buffer memory across various QOS classes; if you foolishly do nothing, some of them leave all memory configured to a single class and disaster ensues.
What do you think a naive sysop does???? Particularly one who listens to the salesman or literature of the switch vendor about the “feature” of more buffering to avoid dropping packets, and buys such additional RAM?
So the big disasters I’ve heard of are those switches, where deluded naive people have bought yet more buffer memory, and particularly if they fail to configure the switches for QOS classes. That report came off the NANOG list, as I remember, but it was a couple of years ago and I didn’t save the message.
After reading that report I looked at the specs for two or three such enterprise switches and confirmed that this scenario was real, resulting in potentially *very* large buffering (multiple hundreds of milliseconds reaching even to seconds). IIRC, one switch had decent defaults, but another defaulted to insane behavior.
So the NANOG report of such problems was not only plausible, but certain to happen, and I stopped digging further. Case closed. But I don’t know how common it is, nor whether it is more common than in the associated routers in the network.
Router Bufferbloat problems
I *think* the worst router problems are in home routers, where we have uncontrolled buffering (often 1280 packets’ worth) and highly variable bandwidth before the WiFi links. Classic AQM algorithms such as WRED are not present there and, even if they were present, would not be of any use because the bandwidth is so variable. Home routers sit at one of the common bottlenecks in the path and are therefore extremely common offenders. Whether they are better or worse than the broadband hop next to them is also impossible to quantify.
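For a sense of scale, here is a rough worked example of what 1280 queued packets can mean. It assumes every packet is a full 1500 bytes and uses a few illustrative effective WiFi rates; both are my assumptions, not measurements, and the function name is hypothetical.

```python
# Sketch: latency added by a full 1280-packet queue at various link rates.
# PACKET_BYTES and the rates below are illustrative assumptions.

QUEUE_PACKETS = 1280   # buffering figure mentioned in the text
PACKET_BYTES = 1500    # assumed full-size Ethernet payload


def full_queue_latency_s(rate_mbps: float,
                         packets: int = QUEUE_PACKETS,
                         packet_bytes: int = PACKET_BYTES) -> float:
    """Seconds needed to drain a full queue at rate_mbps (Mbit/s)."""
    return packets * packet_bytes * 8 / (rate_mbps * 1e6)


if __name__ == "__main__":
    for rate in (54.0, 11.0, 2.0, 1.0):  # illustrative effective rates
        print(f"{rate:5.1f} Mbit/s -> {full_queue_latency_s(rate):6.2f} s")
```

With these assumptions a full queue is about 0.28 s deep at 54 Mbit/s but 7.68 s deep at 2 Mbit/s, so a station that falls back to a low rate can see multi-second delays from the very same buffer.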
I’ve personally measured up to 8 seconds of latency in my own home without deliberate experiments. In deliberate experiments I can make latency as large as you like. That’s why we like CoDel (fq_codel in particular) so much: it responds very rapidly to changes in bandwidth, which are perpetual in wireless. Fixing Linux and Linux’s WiFi stack is therefore where we’ve focused (not to mention that the code is available, so we can actually do work rather than try to persuade clueless people of their mistakes, which is a difficult row to hoe). This one is the one we seem to see the most often, along with the hosts and either side of the broadband hop.
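The reason CoDel copes with changing bandwidth is that it looks at how long packets sit in the queue (their sojourn time) rather than at how many bytes or packets are queued. The following is a deliberately simplified sketch of that idea, not the reference implementation: it keeps the published default target (5 ms) and interval (100 ms) and the interval/sqrt(count) drop spacing, but omits many details of the real algorithm. The class and method names are mine.

```python
import math

# Simplified sketch of CoDel's drop decision. NOT the reference algorithm:
# it omits details such as the MTU check and drop-count reuse.

TARGET_S = 0.005    # acceptable standing queue delay (CoDel default: 5 ms)
INTERVAL_S = 0.100  # how long delay may stay above target (default: 100 ms)


class SimplifiedCoDel:
    def __init__(self) -> None:
        self.first_above = None  # time at which sustained delay becomes "bad"
        self.dropping = False    # currently in the dropping state?
        self.count = 0           # drops in the current dropping episode
        self.drop_next = 0.0     # time of the next scheduled drop

    def should_drop(self, now: float, sojourn: float) -> bool:
        """Decide at dequeue time whether to drop a packet.

        now:     current time in seconds
        sojourn: time this packet spent in the queue, in seconds
        """
        if sojourn < TARGET_S:
            # Queue is draining fine; leave any dropping state.
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            # First time above target: start the interval clock.
            self.first_above = now + INTERVAL_S
            return False
        if not self.dropping:
            if now >= self.first_above:
                # Delay stayed above target for a whole interval.
                self.dropping = True
                self.count = 1
                self.drop_next = now + INTERVAL_S / math.sqrt(self.count)
                return True
            return False
        if now >= self.drop_next:
            # Still bad: drop again, with shrinking spacing between drops.
            self.count += 1
            self.drop_next += INTERVAL_S / math.sqrt(self.count)
            return True
        return False
```

Nothing in this decision depends on the link rate or the buffer size, which is why the same two constants can work on a link whose bandwidth changes from moment to moment.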
The depth and breadth of this swamp is immense. In short, there is bufferbloat everywhere: you have to be systematically paranoid….
But which bufferbloat problem is “worst” is, I think, unanswerable. Once we fix one problem, it’s whack-a-mole on the next problem, until the moral sinks home: any unmanaged buffer is one waiting to get you if it can ever be at a bottleneck link. Somehow we have to educate everyone that static buffers are landmines waiting for the next victim and never acceptable.