Internet ‘speed’, what is it and am I getting what I pay for?

Problem: ISPs sell Internet service based on line capacity, which they call ‘speed’: the maximum amount of data you can move in a given period. It is NOT equivalent to, say, a vehicle’s top speed. It is more like how much freight can be moved in a given period.

The difference between a 10 megabits per second (Mbps) line and a 100 Mbps line is not that one is 10x ‘quicker’; it is simply that the 100 Mbps line can move ten times more data in aggregate per period than the other.

If we use a truck analogy:

  • A 10 Mbps line is a small pickup that can move 1,000 lbs of earth from point A to point B in one hour.

  • A 100 Mbps line is a large dump truck that can move 10,000 lbs of earth from point A to point B in one hour.

  • They both take an hour to do it, but one can carry more stuff.

So when an ISP sells you more ‘speed’ it’s really just a bigger truck, not a Ferrari.

That’s fine if what you need to do is move a lot of earth around. But what if you need to move 500 lbs of earth every two hours? Either truck will get the job done, and both will take an hour to deliver it. In our example, the bigger truck is not ‘quicker’.

If we need to move 2,000 lbs, then the bigger truck can do it in one trip, while the smaller truck needs two; here the bigger truck really is ‘quicker’.
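The truck arithmetic above can be sketched in a few lines of Python. This is only an illustration of the analogy: the function name `delivery_hours` is made up for this example, and return trips are ignored, just as they are in the text.

```python
import math

# Truck capacities taken from the analogy above (lbs per one-hour trip).
PICKUP_LBS = 1_000       # stands in for the 10 Mbps line
DUMP_TRUCK_LBS = 10_000  # stands in for the 100 Mbps line


def delivery_hours(load_lbs, truck_capacity_lbs, hours_per_trip=1.0):
    """Hours needed to deliver a load when every trip takes the same time.

    Simplification (an assumption of this sketch): return trips are ignored.
    """
    trips = math.ceil(load_lbs / truck_capacity_lbs)
    return trips * hours_per_trip


for load_lbs in (500, 2_000):
    small = delivery_hours(load_lbs, PICKUP_LBS)
    large = delivery_hours(load_lbs, DUMP_TRUCK_LBS)
    print(f"{load_lbs} lbs: pickup {small} h, dump truck {large} h")
```

For the 500 lbs load both trucks need one hour; only when the load exceeds the pickup’s capacity does the bigger truck finish sooner.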

There is another metric in Internet connectivity that is not as well known as capacity but is very relevant to good results: delay, or latency. Delay can dramatically alter results, it shapes the end user’s perception of quality, and it can vary significantly.

As an example, say we need to move a payload of 100 lbs of perishable foodstuffs every minute from point A to B. We know this is a small load, so either the small truck or the big truck can do it, and the choice of truck won’t affect the outcome.

Every second trip, the truck (either one) encounters several traffic lights along the way, and whenever a light is red, the vehicle stops until it turns green again.

If a light stays red for longer than 3 minutes, the load spoils, must be dumped, and has to be re-sent. Every second of delay counts, as it makes the payload arrive later. The longer the red light, the worse the outcome. This affects the perception of ‘quickness’ at the receiving end. When trucks are delayed, they are perceived as slow, because they get there later.

The delay can have a big impact on outcomes:  too much delay, and the loads spoil, and any added delay is immediately noticeable.

On the Internet, most payloads are small and quite sensitive to delay, because the sequencing of these small payloads, and of the responses to them, has much more impact on things like load times when fetching a complex web page. Even if the page has multiple large images, there are hundreds of other smaller items that must be fetched and processed before the page is complete. Any added delay in fetching those hundreds of items accrues and affects both the smoothness of the load and the overall completion time.

Latency Definitions:

  • Latency = delay. It is the amount of time it takes to send information from one point to the next.

  • Baseline latency: the minimum ping time of a connection.

  • Bloat latency: the added, variable latency due to BufferBloat on a loaded line.
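The definitions above can be turned into a small calculation. The sketch below assumes you already have two lists of ping times in milliseconds, one taken on an idle line and one taken while the line is fully loaded. The function names and the sample values are made up for illustration, and using the mean of the loaded samples is a choice made for this sketch, not something the definitions prescribe.

```python
from statistics import mean


def baseline_latency_ms(idle_pings_ms):
    """Baseline latency: the minimum ping time seen on the connection."""
    return min(idle_pings_ms)


def bloat_latency_ms(idle_pings_ms, loaded_pings_ms):
    """Bloat latency: extra delay under load, relative to the baseline.

    Assumption of this sketch: the loaded latency is summarized by its mean.
    """
    added = mean(loaded_pings_ms) - baseline_latency_ms(idle_pings_ms)
    return max(0.0, added)


# Illustrative, made-up samples (not measurements from any real line).
idle_pings = [21.0, 19.5, 20.2, 23.1]
loaded_pings = [180.0, 240.5, 210.0, 199.5]

print("baseline:", baseline_latency_ms(idle_pings), "ms")
print("bloat:   ", bloat_latency_ms(idle_pings, loaded_pings), "ms")
```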

[Figure: Capacity vs. time]

As an example, on a 10 Mbps line, loading a news site page can take anywhere from 5 seconds on a low-latency line to over a minute on a high-latency, bloated line. The impact of latency on the load, including re-transmissions of expired requests, can be orders of magnitude greater than that of raw ‘speed’ or capacity.

For example, that same page, which has 180 elements, loads at best 2.5 times quicker on a supposedly 10x faster 100 Mbps line (with both lines having low latency), as most of the payloads are actually small. Only a few are large enough that the bigger capacity allows for quicker delivery, and often the load times are the same, as latencies affect the outcome more than data sizes.

This non-linear behavior continues to apply as the ‘speed’ or capacity of a line goes up. A 300 Mbps line loads that same page only 2.8 times faster than the 10 Mbps line.

[Figure: Latency vs. time]

Latencies, in contrast, tend to have a linear effect on outcomes, as they impact small transactions (the majority of Internet traffic) the most and add up over time. If severe enough, they cause additional wait time for re-transmissions of expired traffic.

Latencies have a larger impact than capacity on most internet activities, thus keeping latencies minimized is critical to a better outcome.
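A toy model can make the shape of these two effects visible. The sketch below is not how the figures in this article were produced, and it will not reproduce the 2.5x and 2.8x numbers. It assumes a hypothetical 180-element page (170 small items and 10 large ones, sizes invented), fetched in waves of six parallel requests, where each wave costs one round trip and the data itself takes size divided by capacity. Re-transmissions are ignored.

```python
import math


def page_load_seconds(element_bytes, capacity_mbps, rtt_ms, parallel_fetches=6):
    """Toy estimate: (waves of requests x round-trip time) + raw transfer time.

    All parameters are assumptions of this sketch, not measured values.
    """
    waves = math.ceil(len(element_bytes) / parallel_fetches)
    latency_s = waves * rtt_ms / 1000
    transfer_s = sum(element_bytes) * 8 / (capacity_mbps * 1_000_000)
    return latency_s + transfer_s


# Hypothetical page: 170 small items of 10 kB and 10 images of 300 kB.
page = [10_000] * 170 + [300_000] * 10

for capacity_mbps in (10, 100, 300):
    for rtt_ms in (20, 200):
        seconds = page_load_seconds(page, capacity_mbps, rtt_ms)
        print(f"{capacity_mbps:>3} Mbps, {rtt_ms:>3} ms round trip: {seconds:.2f} s")
```

In this model, raising capacity only shrinks the transfer term, so the gain flattens out quickly, while every extra millisecond of round-trip time is multiplied by the number of request waves.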

BufferBloat is a phenomenon that has great impact on Internet usability, and yet is not often measured or controlled. Measured as latency build-up under load, BufferBloat indicates how much delay a line sees when its capacity is fully utilized. There can be 10x to 100x variance in these metrics between excellent and inferior performance. This is much greater than speed / capacity variances, and as noted above, more impactful on average transactions.

Managing the traffic to ensure low-latency results at all times is critical to well-performing and consistent Internet. With a traffic manager well-tuned to the line, we can ensure latencies remain at a low-enough level to not impact time-sensitive transactions like Voice over IP (VoIP), gaming and other interactive applications. A traffic manager is typically run in the router connected to the line.

As a line becomes congested, and the danger of latency build-up looms, a traffic manager controls the pace of traffic to keep latencies under control. This pacing means that the line capacity will never be fully consumed, as a reserve must be kept available for high-priority small transactions and the line must stay clear of the limits at which BufferBloat sets in.
Therefore, capacity or ‘speed’ measurements taken behind a traffic manager will never report the full line capacity that would be measured without one, because reaching that point also means latencies build up.

Traffic managers, such as those in an IQrouter, are typically set at 5 to 10% lower than total line capacity. Furthermore, metrics for capacity run through routers with those settings can read a further 5 to 10% lower (for a variety of technical reasons we won’t go into now).
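As a rough sketch of that arithmetic, the snippet below estimates the range a speed test might report on a traffic-managed line. It uses the 5 to 10% figures from the paragraph above and assumes the two reductions simply multiply, which is a simplification made for this example. The function name is hypothetical, and real results depend on the router’s actual settings.

```python
def expected_speed_test_range(line_mbps,
                              shaper_margin=(0.05, 0.10),
                              measurement_loss=(0.05, 0.10)):
    """Return (low, high) Mbps a speed test might show behind a traffic manager.

    shaper_margin:    how far below line capacity the traffic manager is set.
    measurement_loss: how much lower the measured figure reads on top of that.
    Assumption of this sketch: the two percentages compound multiplicatively.
    """
    low = line_mbps * (1 - shaper_margin[1]) * (1 - measurement_loss[1])
    high = line_mbps * (1 - shaper_margin[0]) * (1 - measurement_loss[0])
    return low, high


low, high = expected_speed_test_range(100)
print(f"100 Mbps line: expect roughly {low:.1f} to {high:.1f} Mbps in a speed test")
```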

[Figure: Typical max capacity usage]

This means that a 100 Mbps line will likely have the traffic manager limiting max capacity to 95 Mbps, and thus a measurement of ‘speed’ from a laptop connected to that router will read somewhere in the 88 to 91 Mbps range. However, the line will have 10x to 100x lower latencies than the same line behind a router without traffic management. As discussed above, most Internet traffic is more sensitive to latency than it is to capacity. This makes an exchange of 10 to 20% of the capacity metric for a 10x to 100x improvement in latency an excellent trade-off.

Again, capacity is rarely in full use. Look at the graph of average vs. maximum capacity usage for a home with dozens of connected devices and two work-from-home professionals on a 100 Mbps line. Max capacity is rarely reached (mostly when running tests), but latencies affect every single one of the millions of transactions that occur daily.

The discussion up to this point has not considered the fact that most home and business networks have more than one device connected, and more devices join our networks every day. A good traffic manager will therefore ensure that access to Internet capacity is shared fairly. With a traffic manager built into a router such as the IQrouter, no one device can hog all the capacity (for example, by running a speed test) if other devices need concurrent access. Some of that other traffic might be higher-priority traffic, like VoIP from a phone call, forcing the bulk data from an Internet speed test to be queued. Thus, a speed test run from a device on a busy network will not reflect the true line capacity. It will reflect the amount of bulk (low-priority) traffic that device is allowed at that moment, which might be only a fraction of the total available.
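To illustrate why a speed test on a busy network sees only part of the line, here is a sketch of textbook max-min fair sharing. This is a generic idealization chosen for illustration; the article does not describe the IQrouter’s actual algorithm, and this sketch ignores priorities altogether. The device names and demand figures are invented.

```python
def max_min_fair_share(capacity_mbps, demands_mbps):
    """Split capacity so that small demands are met in full and the
    remainder is divided equally among the devices that want more."""
    allocation = {}
    remaining = float(capacity_mbps)
    # Serve the smallest demands first; each gets at most an equal share
    # of whatever capacity is still unallocated.
    pending = sorted(demands_mbps.items(), key=lambda item: item[1])
    for index, (device, demand) in enumerate(pending):
        equal_share = remaining / (len(pending) - index)
        allocation[device] = min(demand, equal_share)
        remaining -= allocation[device]
    return allocation


MANAGED_CAPACITY_MBPS = 95  # e.g. a 100 Mbps line shaped to 95 Mbps

# Hypothetical demands; 1000 stands for "as much as I can get".
quiet_network = {"voip_call": 0.1, "video_stream": 5, "speed_test": 1000}
busy_network = dict(quiet_network, backup_upload=1000)

print(max_min_fair_share(MANAGED_CAPACITY_MBPS, quiet_network))
print(max_min_fair_share(MANAGED_CAPACITY_MBPS, busy_network))
```

On the quiet network the speed test gets nearly everything the call and the video stream leave unused. Once a second bulk transfer appears, the same speed test is held to about half of that, even though the line itself has not changed.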

This is why the IQrouter has a built-in speed test (Configure->Speed Test) that accounts for all traffic flowing through the router. Since the line is traffic managed, refer to ‘Peak throughput’ for what the line might be capable of overall, while ‘Total Traffic’ reflects the actual managed capacity.

Finally, we hope that the above explanations help you understand that the ‘speed’ of an ISP line is really a measure of capacity, not velocity, and that a router with an effective traffic manager, one that ensures fair, low-latency connectivity for all devices, is important to good Internet quality.

Further reading on speed test evaluation.