Bufferbloat and network neutrality – back to the past…

There are at least three topics in which I think bufferbloat intersects the currently hot topic of network neutrality. This is entirely personal opinion, and not that of my employer.

  • by happenstance, broadband ISP’s are enjoying a serious competitive advantage over any other provider of telephony and gaming services. I believe it unlikely that this advantage was put in place with malice aforethought, though I expect conspiracy theorists will enjoy trying to prove otherwise. Either we’d have heard of this by now, or there are a very few, very smart people out there who have managed to figure bufferbloat out and keep their mouths firmly shut. But if they were that smart, why did they not foresee the pain of the next bullet?
  • the impact of bufferbloat on ISP’s needs to be well understood, to understand their motivations. I now believe bittorrent hit ISP’s and their customers much harder than everyone understands, but that ISP’s diagnosis of the root cause was flawed.
  • bufferbloat must be solved to preserve future innovation in new applications and services.

We should not set public policy going forward without understanding what may actually have happened, rather than on the basis of a possibly flawed understanding of technical problems.

Unfortunately, everyone has now taken very public positions based on a probably flawed analysis of the very real, painful problems they have experienced. Getting everyone to stop and revisit their presumptions, and rethink and possibly change their positions, is going to be hard. Please help!

Sherman, please set the Wayback machine to when bittorrent first deployed (2004).

Telephony and Bufferbloat

If you get your conventional telephone service from your broadband carrier, it may be (and probably is) provisioned independently of your data service. This is certainly typically true for DSL (that has been one of its easy upgrade path features), and I believe it is typically true of telephone services provided by cable providers. I don’t personally have any clue how fiber services typically provision telephony. Perhaps some of you know the answers.

You can think of these systems as enabling telephony to access a separately provisioned “class of service”, over a different channel on the last mile (and it may be implemented as IP QOS classification internally, though I gather they may also use different signalling channels in the broadband access itself). I personally don’t think that some traffic classification is entirely bad. To get really reliable telephony under high load, some sort of traffic classification may be needed (at times), and QOS also enables strong guarantees that are hard to provide otherwise. But at current common broadband speeds, there is enough bandwidth available that even without traffic classification, we would be able to get a lot of calls to work very well even with a lot of competing flows, if we did not suffer from bufferbloat (but see my previous and future posts regarding recent changes to web browser behavior).

The problem that arises is that (I believe unintentionally by all concerned) bufferbloat in broadband services has put independently provided VOIP or Skype telephony over the IP data services at a serious disadvantage, since this QOS classification is not available to the user’s devices; they have to fight against the high latency and jitter imposed by the stupid and bufferbloated broadband data devices that provide no traffic classification to customer devices. Mitigating or solving bufferbloat makes alternative telephony services much more viable and more competitive. Whether this is good, or bad, depends on where you sit. Certainly mitigating bufferbloat in your home router can and does make such service work very much better (I’m much less unhappy using Skype than I used to be), and that is the subject of tomorrow’s installment.

Bittorrent and Bufferbloat

I’ve heard claims made (by people independent of Comcast) that blocking bittorrent completely was unintentional on their part. I do know first hand that when the controversy hit, I personally tried testing bittorrent at home and found I could not make it function at all, and that Comcast were the responsible party that acted without disclosure. I do strongly believe that new applications should be able to deploy without playing “Mommy May I” with all the different ISP’s, which stifles innovation.

Please also remember in this discussion that bittorrent can induce several problems (e.g., transit and traffic stability problems when bittorrent’s guess for locality is poor), and bittorrent’s issues are certainly not limited to customer and operator suffering caused directly by bufferbloat. For an ISP, there are multiple bittorrent pain points, not visible to most home users.

Start by remembering that any protocol can trigger bufferbloat suffering if it saturates a link; what I demonstrated was that a single TCP connection could/would/does do so. Personally, I started using bittorrent to download Linux distributions and similar large images somewhat later than 2004, and have educated my kids carefully about copyright. In my household, it has clearly been Dad who usually did in the wife and kids, rather than vice versa; this may be common among many other readers of this blog. In most households, however, it’s likely been the reverse, with the kids inflicting pain on their parents.

Video uploading to YouTube was in the future; video downloads were mostly in the future too, certainly downloads of the large HD content streamed to disk that is most likely to saturate a customer’s link. Uploading of dead application carcasses for crash analysis was less common. Uplink bandwidth was so low that using cloud storage for backup was infeasible for most. So many of today’s applications that trigger bufferbloat misbehavior were significantly less common. The dominant desktop operating system was overwhelmingly Windows XP (and older), with > 90% market share. Browsers were still primarily obeying the RFC 2068 rule on the number of active connections (no more than 2). Windows XP and earlier never had more than 64KB in flight at once, so with at most 2 TCP connections the browsers of that era would be unlikely to fill buffers (though they might cause significant latency, a single user would not routinely saturate connections due to this limitation). When bittorrent deployed, it was the first time many uplinks were routinely filled for long periods. Any other application or web service with similar characteristics (e.g. YouTube uploads) could have triggered the problem.

By the 2004-2005 era, bufferbloat was already well established in broadband networks.

The Motorola SB5100 series modems I experimented with were introduced in April 2003, for example; this was one of the standard cable modems provided by Comcast (until prices went up recently, I rented my modem). Bufferbloat had already been noticed by some, though not recognized as a generic problem. DSL has similar trouble and a similar history; I don’t know the fiber history. Both cable and DSL broadband services are very asymmetric; the uplink bandwidth in that era was very low (but the buffer sizes were the same as I observed). Uplink speeds of 384 or 768Kbps were commonplace; IIRC, as a computer person I had paid for 768Kbps uplink service in that era (and was happy when, at the same cost, it was increased greatly a few years later). Many/most customers only paid for 384Kbps uplinks. The buffer sizes I see, as the Netalyzr data shows, are not unusual. And I’m really not picking on Motorola here; they just happen to be the vendor of the modems I have used. I have no reason to believe their bufferbloat is any larger or smaller than their competitors’: I have no information at all there, and what tiny anecdotal information I have is that there are likely far worse vendors. Netalyzr shows many different buffer sizes are present.

What happened to customers when their kids (or they) started using bittorrent for whatever purpose?

Bad things™.

Let’s examine my data, and make the direct extrapolation based on my experiments. Where I saw 1 to 1.5 second latency on that hardware, I would have had 3-4.5 seconds latency in 2004 on my 768Kbps uplink, as the buffer in the modem was the same size. My uplink bandwidth then was 1/3 of what I had when I took my ConPing2 dataset recently on the SB5101 modem. Customers only buying 384Kbps uplinks would have been completely dying with latencies in the 10 second range.
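The extrapolation above is simply the drain time of a fixed-size buffer at different link rates. Here is a minimal sketch in Python, assuming the modem buffer stays the same size and is kept full by the sender. The function name is made up for illustration, and the 3x rate ratio is taken from the figures in the text rather than from any measurement tool.

```python
def scaled_latency(measured_latency_s: float,
                   measured_rate_kbps: float,
                   target_rate_kbps: float) -> float:
    """Queueing delay of the same full buffer drained at a different uplink rate."""
    # A full buffer of B kilobits drains in B / rate seconds, so for a
    # fixed B the latency is inversely proportional to the link rate.
    buffer_kbits = measured_latency_s * measured_rate_kbps
    return buffer_kbits / target_rate_kbps


# The text says the 2004 uplink was 1/3 of the recently measured one.
RECENT_RATE_KBPS = 3 * 768

for latency in (1.0, 1.5):
    for rate in (768, 384):
        result = scaled_latency(latency, RECENT_RATE_KBPS, rate)
        print(f"{latency:.1f} s measured -> {result:.1f} s at {rate} Kbps")
```

Under these assumptions a 384Kbps uplink lands at roughly 6 to 9 seconds, which is the same ballpark as the 10 second figure above.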

Here’s serious speculation: I know the problems I saw over a year ago were enough to cause me to log multiple service calls, and I attempted to debug them with second and third level support. I now think it likely, but not certain, that those problems were bufferbloat in some guise. Unfortunately, that equipment is now scrap due to lightning, so I can’t attempt to reproduce what I saw then now that I have a better understanding. Alternatively, it could in fact have been the cause I diagnosed at the time: hidden damage from lightning in a NIC either in my home router or the cable modem. In my recent experiments, I’ve not seen the hideously high loss rates I sometimes observed then (though I’ve also had a few reports of others reproducing my bufferbloat result of extreme loss rates; but nothing reproducible so far). But I’ve not run enough experiments looking for bufferbloat packet loss to rule out much worse behavior than you see in my traces. I don’t know if I’ll ever know for sure.

I do know with 100% certainty I’d have been on the phone with support incessantly with 3-10 second latencies and multi-percent observable packet loss.

I believe many ISP’s with limited uplink bandwidth and bufferbloated infrastructure started to see a serious rise where it hurts them in the pocketbook most: in service calls from customers (in addition to the other bittorrent issues, which I’m not trying to minimize). No company wants to admit to severe problems in their product in public. But I also believe they mis-diagnosed the root cause of their pain, and shot the messenger of the broken network (bittorrent) rather than fixing the network. So I hypothesize that part of ISP’s motivations (far from all; I can’t see the CEO of a shareholder-beholden corporation ignoring the opportunity to extract rent out of everyone) came from the very real pain they felt when a major new application deployed and caused major headaches for both them and their customers.

At a later date, broadband ISP’s upped their uplink bandwidth; this brought the effects of bufferbloat back to a semi-manageable situation (by reducing service calls). But bufferbloat is getting worse again: the same phenomenon that has pushed ethernet and wireless toward larger buffers is at work again. The static buffers on the latest equipment appear to be sized for the absolute highest delay/bandwidth product that the devices could ever possibly need (and then some), and sized to paper over whatever performance bugs they may have. As you will see when I double back to cover why the ethernet and wireless NIC buffers have been growing, buffers are often/usually being used to cover a number of sins.

If my bittorrent/bufferbloat hypothesis is correct, it helps explain ISP’s wish to control applications; they may be seeing control as an existential problem. But I believe the underlying undiagnosed bufferbloat problem made a situation much worse in a way most of us have not appreciated. And there are indeed valid times when traffic management may be needed to protect the network and even be imposed quickly; in the modern era, applications can deploy much faster than in the past, and I can certainly see emergency action may be needed (but not in secret). Whatever you may think of network neutrality, I believe you need to understand that the very real pain that I think ISP’s and some of you endured had a different cause than you may have understood.

Chilling of Innovation

Beyond the concrete example of harm of bufferbloat to the competitive market illustrated by the telephony example, let me note the following: so long as there is bufferbloat in broadband (and home routers), and only carriers have mechanisms to separately provision different classes of service (as happens with telephony on broadband today) without users having any access to that quality of service provisioning, any deployment of new innovative low latency applications (such as the immersive teleconferencing I work on for Bell Labs) will be greatly slowed. In such systems as ours, carriers have a somewhat advantageous position in any case (they are closer to most users than the rest of the network, and it makes sense for them to host much of the infrastructure that can optimize what we are doing; their advantage is locality and the speed of light!).

If bufferbloat is not solved in the Internet, not only do current low latency applications such as telephony and gaming remain problematic and raise fair competition issues, but so will future applications. If deployment of a system like ours requires separate infrastructure and provisioning to work well, I fear what we are inventing can never succeed. Separate paths and provisioning are less efficient and more expensive, prone to abusive rents, and deployment of new applications may languish for years or decades. If we can only make immersive teleconferencing work with such separate provisioning as has been done for telephony, I feel our project is doomed. If broadband worked properly today, we would face none but the usual problems in taking our innovations to market.

ISP’s would have incentive to invest, as they would have both additional service opportunities and ways to reduce load on the network. I see arguments that separate provisioning of low latency services such as telephony is a “good thing” as being fundamentally flawed. I want a single pipe that works well, and for which I (the consumer) can decide how much to pay for what level of service. I have no problems at all with congestion pricing: if I want to do my immersive teleconference at peak hours, and object to flaws in the service, I am happy to pay for the privilege. I am happy to pay for additional “added value” services (when priced fairly and competitively).

We must make the Internet work well to preserve innovation; to do that, bufferbloat must be overcome.

Conclusions

I may be incorrect about details in the above points; but I think I’m right. So my personal conclusions are:

We should not set public policy going forward without understanding what may actually have happened, rather than on the basis of a possibly flawed understanding of technical problems.

Unfortunately, everyone has now taken very public positions based on a probably flawed analysis of the very real, painful problems they have experienced. Getting everyone to stop and revisit their presumptions, and rethink and possibly change their positions, is going to be hard. Please help!

 

21 Responses to “Bufferbloat and network neutrality – back to the past…”

  1. Whose house is of glasse, must not throw stones at another. « jg's Ramblings Says:

    […] on random topics, and occasional rants. « The criminal mastermind: bufferbloat! Bufferbloat and network neutrality – back to the past… […]

  2. Nicholas Weaver Says:

    I think it’s slightly (but only slightly) different:

    The problem with BitTorrent was particularly hard on cable networks.

    Since the DOCSIS media is a shared uplink (with an access mechanism), a saturated common uplink will affect all users on the segment, but the buffering ends up being in the user’s local cable modem (which is already bad-bad-bad) and if you look at the aggregate buffering on the uplink as distributed to these endpoints it becomes absolutely insane!

    So it wasn’t just a matter of one user shooting himself in the foot, it was a few users able to shoot the whole neighborhood in the foot. (There is a simulation at http://www.cs.clemson.edu/~jmarty/papers/bittorrentBroadnets.pdf and those simulations used buffers of 20-40 packets (32-64 KB buffers). As soon as the common uplink saturated, everything went to heck.)

    DSL didn’t have this problem, since in the case of common congestion the data would sit in a common buffer in a carrier-grade router, and carrier-grade routers are about the only area that doesn’t suffer from insane levels of bloat [1].

    I do think the support cost hypothesis is reasonable, and thus I think that’s part of the reason that, in the US, we ONLY saw cable companies mucking with BitTorrent: no DSL providers.

    And Comcast’s kill bittorrent ‘solution’ did only target uploads: http://www.isoc.org/isoc/conferences/ndss/09/pdf/08.pdf (Figure 1)

    Of course, their new solution is vastly, vastly better ( http://networkmanagement.comcast.net/ ), and it does have significant effects on when users experience buffer issues.

    What it does is: If there is no local congestion, it does nothing. (Note: Exact parameters have not been disclosed, but the rough parameters and operating mechanism have been. Anyway, there’s no good way to game this that I know about even if you do know the parameters, so for the discussion I’m using the sensible ballparks).

    If the uplink is within ~30% of maximum, all users who’ve used ~50% of their rated bandwidth over the past ~15 minutes are put into a QoS low category. And that’s it.

    Which means a user will ONLY see uplink congestion (and thus, saturated cable-modem uplink buffers and the pain that induces) if

    a: They are saturating their rated allowance

    or

    b: There is neighborhood level congestion AND they are or recently were attempting to transfer a lot of data.

    So in both cases, rather than being a problem affecting everybody, it only affects those who are actively saturating their uplinks (the user’s network can’t “walk and chew gum”, but this no longer affects everybody else).
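The rule described in this comment can be sketched in a few lines of Python. This is only an illustration of the rough mechanism as stated here, using the ballpark figures from the comment; the real parameters are undisclosed, and the `User` type, field names, and `classify` function are invented for the example.

```python
from dataclasses import dataclass

# Ballpark parameters from the comment; the actual values are not public.
CONGESTION_THRESHOLD = 0.70  # uplink "within ~30% of maximum"
HEAVY_USER_THRESHOLD = 0.50  # "~50% of their rated bandwidth"
# The ~15 minute averaging window is assumed to be applied by the caller.


@dataclass
class User:
    name: str
    rated_kbps: float
    avg_kbps_last_15min: float


def classify(users, link_utilization):
    """Return {name: "normal" or "low"} for one shared uplink segment."""
    congested = link_utilization >= CONGESTION_THRESHOLD
    result = {}
    for user in users:
        heavy = user.avg_kbps_last_15min >= HEAVY_USER_THRESHOLD * user.rated_kbps
        # Nothing happens unless the segment is congested AND the user is heavy.
        result[user.name] = "low" if (congested and heavy) else "normal"
    return result


segment = [User("seeder", 1000, 900), User("browser", 1000, 50)]
print(classify(segment, link_utilization=0.40))  # no congestion: nobody demoted
print(classify(segment, link_utilization=0.85))  # congestion: only the heavy user
```

The point of the design is visible in the two calls: without neighborhood congestion the rule does nothing, and with congestion only the recently heavy user is moved to the low category.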

    [1] When I attended Internet 2 Joint Techs, it was interesting: They worry about buffers a lot. Namely, that they are too small for what they need (an ability to support full-rate single-flow TCP on 10 Gb+ links over 100ms across the country).

    • gettys Says:

      Nick, thanks for your insights: you know vastly more in this area than I do.

      Re: [1] And that is part of how we got so badly into this mess: we’ve sized many of the buffers for the worst case situation (and then some for good measure), and then, without thinking, believe the same buffering will work well on a very slow network (e.g. the txqueuelen setting for many Linux network device drivers).

      Near optimal behavior over high dynamic range networking is as hard a research problem as the “go fast” networking research we do right now.

      Static unmanaged buffers are almost always going to be wrong in today’s mobile internet.

    • gettys Says:

      Nick, does this mean we have the same situation on 802.11? The aggregate buffering across all the contending clients will, I expect, also be mindbogglingly huge.

      • Nicholas Weaver Says:

        Not sure, but I’d bet on it.

        HOWEVER, how many cases do you have where you have more than a couple users on the same AP AND it is the air interface (rather than the upstream) that’s the bottleneck?

        • gettys Says:

          Heh. A schoolroom of kids, on a mesh network or where the AP is directly plugged into a school server.

          So this isn’t the home user case, it’s the school/institution/ad-hoc gathering case.

  3. Mason Says:

    [ 3rd try, posting to blogs with Javascript disabled is *not* easy ]

    For anyone wanting to sniff their switched Ethernet local network, I recommend the following Wireshark wiki article.

    http://wiki.wireshark.org/CaptureSetup/Ethernet

    which led me to this list of NetGear switches

    http://wiki.wireshark.org/SwitchReference/NetGear

    which mentions the GS105E (the trailing “E” is important), an inexpensive (40 USD) 5-port Gigabit switch with port mirroring support.

    Regards

  4. mpz Says:

    FWIW, my cable modem that exhibits this problem is a Motorola Surfboard SB5101E.

    Has it been confirmed that the buffer exists in the modem and not at the ISP’s end? If so, I might just consider buying a different cable modem.

    • gettys Says:

      While I think market forces are vital to the ultimate solution of bufferbloat, it’s premature to spend your money that way.

      1) we have no information about which vendors are better or worse (and Netalyzr, being a Java applet, had no way of getting MAC address information); you could be wasting your money. Until individuals and/or hardware reviewers do systematic testing for bufferbloat, you are shooting in the dark.
      2) the Netalyzr data, wonderful as it is, has some flaws: it under-detected (and under-reported) bufferbloat, particularly on higher speed lines. The incidence may be (almost certainly is) higher than implied by the scatterplot.
      3) We have no good split between home router bufferbloat and broadband gear bufferbloat (Netalyzr asks you to tell it, but that’s not reliable); some of the problems are in the home routers, as I’ve mentioned before.
      4) and most importantly, to actually seriously mitigate bufferbloat latency, a factor of two or four improvement, while nice, doesn’t really succeed. Compare that with the mitigation I was able to do with a home router.

      So no, I wouldn’t go get a different cable modem, at least yet. If you want decent latency, you are much better off working on the home router by bandwidth shaping. Painful as the process I described is, you may actually end up with the broadband hop working correctly. Once there is actual data, *then* I encourage people to vote with their dollars.

    • gettys Says:

      Oh, and to specifically answer your question: yes, in the upstream direction there is known to be bufferbloat in the cable modems. Whether the CMTS’s (the cable head end boxes that talk to the modems) also do, when the next level up network is congested, is unknown.

      In the downstream direction, bufferbloat is also present, as Netalyzr shows. This may be in the CMTS’s or cable modems, particularly if the ISP does not enable AQM (RED); and one of the references in one of the replies indicates (as does the anecdotal information I have) that it is not uncommon for ISP’s to not have RED running in the CMTS’s. I know from talking to a friend at Cisco that their CMTS’s implement RED; whether a particular ISP enables it is a story I’ll shed a bit of light on, probably next week, but it’s clear many do not. I have no data on other vendors of CMTS gear; I would expect they do as a “check off” feature, but it’s possible some might not, for all I know. It’s not an area I have any insight into (other than the one conversation with a Cisco friend).

      Whether it even matters depends on the downstream bandwidth and how much bandwidth you get on your wireless hop; that determines where queues form.

      Again, you have to identify the bottleneck in the path the packets take; the queues form on either side of that bottleneck.

  5. gtmijvxasnk Says:

    Using FIFO instead of RED is stupid, disastrous, and unnecessary, and this is the true problem. That the FIFO buffer is “too big,” that is not the problem. It’s only filling up because it is a FIFO buffer rather than a proper, modern, and unfortunately somewhat rare, RED buffer.

    But even with RED there is a knob for how big the buffer must become before stochastic dropping begins, and how quickly the drop probability ramps up to 100% (FIFO-equivalent). In practice this knob has to be set based on how many TCP flows you expect to share the buffer, and perhaps also on how jittery the path those flows follow is.
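For readers who have not met the knob in question, here is a minimal Python sketch of the classic RED drop curve. It assumes the textbook formulation with a minimum threshold, a maximum threshold, and a maximum probability; the names and the example threshold values are illustrative only, and the packet-count correction used in full RED is left out for brevity.

```python
def ewma_queue(avg: float, sample: float, weight: float = 0.002) -> float:
    """RED works on a smoothed queue length, not the instantaneous one."""
    return (1.0 - weight) * avg + weight * sample


def red_drop_probability(avg_queue: float, min_th: float,
                         max_th: float, max_p: float) -> float:
    """Base drop probability for a given average queue length (in packets)."""
    if avg_queue < min_th:
        return 0.0   # short queue: never drop
    if avg_queue >= max_th:
        return 1.0   # long queue: drop everything, like a full FIFO
    # In between, ramp linearly from 0 up to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Illustrative thresholds only; choosing them is the tuning problem above.
for q in (5, 10, 20, 29, 30):
    print(q, red_drop_probability(q, min_th=10, max_th=30, max_p=0.1))
```

The three parameters `min_th`, `max_th`, and `max_p` are the manual tuning the comment refers to: they decide where dropping starts and how steeply it ramps.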

    BLUE is designed to be ASIC-implementable without being too much harder than RED, but doesn’t require this manual tuning. http://www.thefengs.com/wuchang/blue/41_2.PDF The science on all this is embarrassingly old, yet I know of no ASIC implementation of BLUE. Just getting RED turned on and properly tuned often seems like too much to ask, which is really sad.
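As a rough illustration of why BLUE needs less tuning, here is a small Python sketch of its core idea: a single marking probability that is raised when the buffer overflows and lowered when the link goes idle, with a hold-off time between adjustments. The class name, method names, and default step sizes are assumptions chosen for the example, not values taken from the paper.

```python
class Blue:
    """Single-probability AQM: adapt to loss and idle events, not queue length."""

    def __init__(self, d1: float = 0.02, d2: float = 0.002,
                 freeze_time: float = 0.1):
        self.d1 = d1                    # increment on buffer overflow (assumed value)
        self.d2 = d2                    # decrement on idle link (assumed value)
        self.freeze_time = freeze_time  # minimum seconds between adjustments
        self.p = 0.0                    # current mark/drop probability
        self.last_update = float("-inf")

    def on_buffer_overflow(self, now: float) -> None:
        # Queue overflowed: we are not signalling congestion enough.
        if now - self.last_update > self.freeze_time:
            self.p = min(1.0, self.p + self.d1)
            self.last_update = now

    def on_link_idle(self, now: float) -> None:
        # Link went idle: we are signalling too aggressively.
        if now - self.last_update > self.freeze_time:
            self.p = max(0.0, self.p - self.d2)
            self.last_update = now


blue = Blue()
blue.on_buffer_overflow(now=0.0)   # p rises by d1
blue.on_buffer_overflow(now=0.05)  # ignored: still inside freeze_time
blue.on_link_idle(now=1.0)         # p falls by d2
print(blue.p)
```

Each arriving packet would then be marked or dropped with probability `p`; that per-packet step is omitted here.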

    (more complicated: I’ve heard cable carriers batch upstream ACK’s to save space on their expensive, inefficient upstream channel. Whatever ACK’s they batch turn into much larger microbursts in the downstream direction, which potentially taxes everyone else’s buffers. I don’t know how real or big is this effect, though.)

    The Internet2 discussion IS relevant, but it probably shouldn’t be if it weren’t for duopoly eyeball-ISP’s jackmoves, because for the networks we’re discussing the congestion should always be at the last mile, never in the core or the server farm, and therefore little or no buffering should be needed in the core. Of course this isn’t reality since Comcast is circumventing neutrality conventions by deliberately running their transit hot and then making you pay to push data to them through your private physical channel without going through the deliberately hot transit. but, it should be the case: the large buffer needed to move the 10gig * 100ms TCP would then be in effect spread over many customer-facing head-ends and DSLAMs.

    +1 on the idea that neutrality should involve giving end users $0 control over QoS on the last mile, so that new classes of applications can emerge, and VoIP can be truly unbundled. It is only fair since, with this kind of last-mile QoS, it will work if customers’ packets are only prioritized relative to other of the same customers’ packets. You are not asking for preferential treatment on “the network”, only for proper control of what’s yours, and if we had this a whole new world would open. If your ISP can’t do this, then they’re not acknowledging that you have a right to a certain slice of bandwidth, and even treatment wrt your neighbors even if not necessarily the level of service they claimed to be selling you. I think neighborhood-shared ISP’s should be forced to specify a non-oversubscribed CIR in both directions, and within this CIR there should be a standard architecture letting you mark flows for whatever priority you want.

    I don’t think you needed to ramble on with slovenly boot-licking copyright disclaimers or make these little win-win begging arguments where you presume to talk about another company’s tangential business motivation for treating you fairly, or waste paragraphs naming the specific model of your cable modem and then apologizing for unscientifically besmirching their holy name. seriously, whatever.

    Also I think your implication that buffers are “too big” misses the point of RED and BLUE and QoS in general. They are too primitive (FIFO), not too big. And there needs to be CIR guarantees from shared-bandwidth ISP’s and a mandatory last-mile QoS architecture.

    but thanks.

    • gettys Says:

      I included the vendor’s name solely because I was presenting data: data that you cannot reproduce independently is not data, in my book. Since the buffering and other behavior may differ between vendors, it’s germane to that.

      I agree absolutely that some AQM is needed and essential; but there are reasons why people have been gun-shy about RED, and given the wide dynamic range of wireless, current RED may not be sufficient.

      The buffers are so big now they are beyond sanity. Though given real AQM, it wouldn’t matter what size they are.

  6. Buffering problems in modern TCP/IP networks (in English) | Telecomblog: notes and analysis Says:

    […] “Bufferbloat and network neutrality – back to the past…”: assesses the relationship between buffering and the operation of VoIP telephony and BitTorrent traffic; […]

  7. Analysis of buffering problems in modern TCP/IP networks | AllUNIX.ru – All-Russian portal on UNIX systems Says:

    […] “Bufferbloat and network neutrality – back to the past…”: assesses the relationship between buffering and the operation of VoIP telephony and BitTorrent traffic; […]

  8. Net neutrality possibly an over hasty reaction Says:

    […] Gettys has been investigating an interesting technical I’net problem he calls bufferbloat. It is a result of technology advances being out of step where one aspect […]

  9. AQM/ECN in FreeBSD | Alexander Leidinger Says:

    […] about the problems current buffer sizes of network equipment provoke (which may even have implications in the net neutrality debate), I had a look at which active queue management (AQM) algorithms with or without […]

  10. Phil Says:

    I’m with gtmijvxasnk; the problem isn’t lots of buffering, it’s FIFO buffering. More buffering is fine as long as it’s intelligent.

    For years I’ve had a small Linux box with QoS between my home network and the outside world. The specific implementation has varied; it used to be an IP router, then an Ethernet bridge, and now it’s a “stub” with a single interface through which I loop only my outbound traffic. But the key is that it handles all my outbound traffic, and it rate-limits to what my service can actually accept so there is NEVER any FIFO queuing inside the cable or DSL modem.

    I defined five service priorities and invoke them with the IP DSCP field. DSCP 0 (the default) gets the middle priority; VoIP (DSCP=EF) gets top priority; and DSCP 8 (CS1) gets the lowest. This last one is my “scavenger class”, and I use it for all outbound Bit Torrent and TOR data.

    The priorities aren’t completely strict; e.g., the scavenger class is guaranteed 10 kb/s to keep TCP connections from starving and timing out from sustained higher priority traffic.

    Within each class, I use statistical fair queuing (SFQ) so that when there are multiple flows, each gets a fair share of the link.
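For anyone wanting to mark their own traffic this way, here is a minimal Python sketch of the DSCP side of such a setup. It assumes a Linux-style sockets API where `IP_TOS` is available. The priority numbers and the fallback to the middle class are assumptions, since the comment only names three of its five classes, and the helper names are invented for the example.

```python
import socket

DSCP_EF = 46       # Expedited Forwarding, used here for VoIP
DSCP_DEFAULT = 0   # best effort
DSCP_CS1 = 8       # the "scavenger class"

# 0 is the highest of five priorities, 4 the lowest. Only these three
# mappings come from the comment; the other two classes are not specified.
PRIORITY_BY_DSCP = {DSCP_EF: 0, DSCP_DEFAULT: 2, DSCP_CS1: 4}


def priority_class(dscp: int) -> int:
    """Map a DSCP value to a priority; unknown values fall back to the middle."""
    return PRIORITY_BY_DSCP.get(dscp, 2)


def dscp_to_tos(dscp: int) -> int:
    """DSCP occupies the upper six bits of the old IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def mark_socket(sock: socket.socket, dscp: int) -> None:
    """Ask the kernel to set the DSCP on packets sent from this socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


print(dscp_to_tos(DSCP_CS1), priority_class(DSCP_CS1))
print(dscp_to_tos(DSCP_EF), priority_class(DSCP_EF))
```

An application would call `mark_socket(sock, DSCP_CS1)` on its bulk-transfer sockets; the shaping box then only needs to look at the DSCP field to pick the class.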

    The result is that I can keep my link constantly saturated with Bit Torrent and TOR traffic without bothering anything else I do. I can even start up a few big interactive downloads and place a VoIP call without any harm to it whatsoever.

    It is unfortunate that such an adversarial relationship has developed between users and ISPs, especially with Bit Torrent users. QoS has been around for a while, and it can easily solve these problems if people would just turn it on and use it!

    It would be absolutely wonderful if ISPs would formally support user-controlled QoS at least on upstream links, which are the usual bottleneck. Guarantee a data rate for each user, and let the users contend for whatever capacity is left over from other users who happen to be idle. Encourage the use of the “scavenger class” DSCP for P2P by lifting all arbitrary speed limits and monthly traffic caps for traffic marked as scavenger class.

    Drop scavenger class packets when necessary to satisfy guarantees to other users but ONLY when actually necessary; never let a link go idle when traffic could use it.

    Also implement a higher priority class for, e.g., VoIP traffic. Your total traffic would still be subject to the same guaranteed rate limits so there’d be no incentive to abuse it. E.g., if you flagged all your P2P traffic as high priority, then the ISP could not distinguish it from your real VoIP traffic and your VoIP calls would suffer as a result.

    Network neutrality (or the lack thereof) is just a symptom of the real underlying problem: a lack of competition in the local broadband transmission market. A municipally owned dark fiber network, or alternatively one owned by a regulated common carrier strictly banned from the provision of end-user services, leased to any and all commercial service providers, would give users a meaningful alternative when they don’t like the policies of their current service provider. In a healthy competitive market, there’d be incentive to deploy technical mechanisms like DSCP to solve performance problems. It wouldn’t be necessary to legislate detailed neutrality rules that will undoubtedly be full of exploitable loopholes.

    • gettys Says:

      Again, classification, as you have done, is useful; but all it can do is move the pain point. In this case, you’ve moved it to BitTorrent and TOR. And this classification won’t work well in dynamic situations such as Comcast’s Powerboost, or when your ISP is unable to provide exactly the amount of bandwidth you are supposed to be getting.

      If the buffers weren’t bloated in the first place, and congestion was therefore being correctly signaled in a timely fashion, the end-points would be backing off properly and sharing the link well. But the buffers have destroyed congestion avoidance long since.

      So most of the hair you’ve had to implement would not be necessary were it not for these unmanaged, bloated buffers. I’m not against classification (it can do things that AQM can’t); just pointing out that it can’t solve the real problem here, which is the unmanaged (and almost infinitely sized) buffers.

      OK, let’s say your scavenger class is supposed to use all additional bandwidth that is available: those applications will speed up until the buffers fill, and the next time you try to get some other class through that link, the broadband hop’s buffers are full and will take seconds to drain: not so nice….

      Also note that it is easy for bufferbloat to get you in your 802.11 hops (as soon as your broadband bandwidth exceeds your 802.11 goodput); in this case, you have to also solve the problem on each side of the 802.11 hops. Bandwidth, particularly “goodput” of actual data transferred is highly dynamic in wireless networks. So I think we have a more fundamental issue here, that classification just cannot solve by itself.

  11. Dave Täht Says:

    “For years I’ve had a small Linux box with QoS between my home network and the outside world. ”

    I have had the same, and applied nearly the same levels of traffic shaping as you do. The problem is, with newer kit, the depth of the device driver queues has grown from sane to bloated. Every new device/device driver I have has built-in, unmanageable, FIFO queues in the 256 to 512 buffer range, where 1 would be ideal and 32 (or so) the maximum, for good interactive networking performance.

    In the case of wireless, it’s often worse than the buffer depth alone: some devices retry sending a packet up to 13 times.
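To put numbers on those queue depths, here is a small Python sketch of how long a full driver FIFO takes to drain. It assumes every buffer holds a full-size 1500-byte packet, and the link rates are hypothetical examples rather than measurements.

```python
def fifo_drain_time_s(packets: int, link_rate_bps: float,
                      packet_bytes: int = 1500) -> float:
    """Worst-case wait behind a full FIFO of equal-size packets."""
    return packets * packet_bytes * 8 / link_rate_bps


# Hypothetical link rates: a slow uplink and a fast local link.
for rate_bps in (1_000_000, 100_000_000):
    for depth in (1, 32, 256, 512):
        delay_ms = fifo_drain_time_s(depth, rate_bps) * 1000
        print(f"{depth:>3} packets at {rate_bps / 1e6:>5.0f} Mbps: {delay_ms:8.2f} ms")
```

The same queue depth that is harmless at the fast rate adds seconds of delay at the slow one, which is why a single static size cannot be right for both.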

    I’ve been assembling a list of bloated device drivers. I would bet that the devices you are (to date) using have fairly small dma tx rings in the driver.

    Once the traffic hits a bloated device driver the wheels leave the road, traffic shaping is of little help, and with really big uncontrolled buffers, TCP itself begins to malfunction.

    http://www.bufferbloat.net/projects/bloat/wiki/Bloated_Driver_List

  12. Raphael Jacquot Says:

    Telephony on fiber service is provisioned 2 ways, depending on the technology used:

    if using GPON or similar which is based on some sort of TDMA / ATM, you use a different VPI/VCI equivalent, with priority to the phone carrying. Same goes for TV

    using pure ethernet, internet, telephone and TV are each in a separate VLAN with priorities applied appropriately

  13. Association Internet Libre en Corrèze » Blog Archive » Net neutrality: an expert’s point of view Says:

    […] If you like technical matters, Jim Gettys, who has been fighting for years against oversized input/output buffers in routers, has found a link between this bufferbloat and neutrality, which he explains in an article. […]
