How Much Does the Hosting Industry Waste on Uplink?
According to various estimates, between a quarter and half of all web traffic today is generated by bad bots. This is what we call parasitic traffic, and it offers no benefit to either website owners or hosting companies. One thing is clear, however: if you pay for uplink and you’re not doing anything about parasitic traffic, then you’re paying for bad bots!
I know you’re probably thinking that most hosting providers pay not for the amount of traffic but for channels with a predetermined bandwidth. That is not always the case and, more often than not, the providers paying for channels actually end up overpaying because of bad bot traffic, sometimes by more than those who pay per use. Let’s try to figure it out. I’ve broken down the main points below.
On average, hosting providers spend 1 to 3% of their revenue on uplink payments. The payment model differs depending on company size and location. Basically, there are four typical cases:
- The hosting company owns its servers and its data center, and has direct contracts with uplink providers. It either pays a flat rate (e.g. for a 100 Gbps line) or pays per use.
- The hosting company colocates its own servers in 3rd party data centers and has direct contracts with uplink providers. The payment is the same as in the case above.
- The hosting company colocates its own servers in 3rd party data centers and buys traffic from the data center operator. Usually it pays per use, but sometimes a transfer package (e.g. 100 GB) is possible.
- The hosting company rents 3rd party servers in 3rd party data centers and buys traffic from the data center operator. For example, you rent a dedicated server from a larger company and resell it as shared web hosting to 500 customers. Usually some transfer is included (10 – 30 TB per month), and anything beyond that is either charged separately or throttled (depending on the company).
The big question here is how much does parasitic traffic cost for those who pay per use?
In web hosting, ingress traffic averages about 10% of the total traffic. That might not sound like much, but let’s take a look at how a typical site scraper bot operates. The bot sends a single request to the site’s index page. It parses the content of the received page and extracts the links to other pages. The same operation is then repeated on each of these pages, and the amount of traffic grows exponentially. Most such bots crawl entire sites, and in the case of an online store, they do not stop until they have fetched the detail page of each and every product. As a result, one initial request from such a visitor turns into tens or even hundreds of thousands of requests to the site.
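To make the amplification concrete, here is a minimal sketch of the crawl described above. The site layout (50 categories with 200 products each), the URL scheme and the function names are hypothetical, chosen only for illustration; a real scraper and a real store will differ.

```python
from collections import deque


def count_crawl_requests(links, start):
    """Count the requests made by a bot that fetches every reachable page once."""
    seen = {start}
    queue = deque([start])
    requests = 0
    while queue:
        page = queue.popleft()
        requests += 1  # one HTTP request per fetched page
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return requests


def build_store(categories, products_per_category):
    """Build a hypothetical store: index -> category pages -> product pages."""
    links = {"/": [f"/c{c}" for c in range(categories)]}
    for c in range(categories):
        links[f"/c{c}"] = [f"/c{c}/p{p}" for p in range(products_per_category)]
    return links


# Assumed layout, not measured data: 50 categories with 200 products each.
store = build_store(categories=50, products_per_category=200)
total = count_crawl_requests(store, "/")
print(total)  # 1 index page + 50 category pages + 10,000 product pages
```

Even in this small made-up store, a single hit on the index page ends up as more than ten thousand requests.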
In our own practice, blocking such bots has typically reduced “pay-per-use” traffic charges by 8 – 12%. However, this is only the tip of the iceberg in assessing the real costs caused by bot traffic. One of the more prominent reasons that bad bots drive up costs is that bot traffic consumes a lot of server resources. Humans have a short attention span, and on average they browse only 3 – 15 pages deep, while for a bot this value can be thousands of times higher. For a shared server hosting hundreds of websites, the impact can be especially severe (and potentially very costly!).
It’s not uncommon to see hosting providers overpaying for server capacity by 25% or more. With bots eliminated, an average server can serve a lot more normal requests, and with the resources freed from bot traffic you may not need to buy that new server you’ve been budgeting for (at least not until you have a real need for it).
With the “pay-per-use” model, everything is relatively simple. But why do those who pay for uplink channels with predetermined bandwidth sometimes have even more to lose? This depends entirely on their specific situation. Typically a hosting provider will connect to small and medium ISPs through an internet exchange point, and in addition it may connect directly to 1 or 2 larger ISPs (usually, that’s a “trade secret”). Each channel has its own bandwidth, so the main question is how much traffic is actually processed and how that would change in the absence of bot traffic. In some cases, the share of bot traffic is so significant that it actually forces the hosting company to purchase more bandwidth, costing it a significant amount of money.
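As a rough illustration of how bot traffic can push a provider into a larger channel, the sketch below picks the smallest channel that fits a given peak load. The tier sizes, the 11 Gbps peak and the function name are assumptions made up for this example; the 25% bot share is simply the low end of the estimates quoted at the start of this article.

```python
def smallest_tier(peak_gbps, tiers=(1, 10, 40, 100)):
    """Return the smallest channel size (Gbps) that can carry the given peak."""
    for tier in sorted(tiers):
        if peak_gbps <= tier:
            return tier
    raise ValueError("peak exceeds the largest available tier")


# Hypothetical numbers, for illustration only.
peak_with_bots = 11.0  # observed peak in Gbps, bots included
bot_share = 0.25       # fraction of that peak generated by bad bots

tier_with_bots = smallest_tier(peak_with_bots)
tier_without_bots = smallest_tier(peak_with_bots * (1 - bot_share))

print(tier_with_bots, tier_without_bots)
```

Under these assumed numbers, the provider needs the 40 Gbps channel only because of the bot traffic; the clean peak of 8.25 Gbps would fit on the 10 Gbps one.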
What most hosting providers want to know (naturally) is whether any of these costs can be reduced by blocking parasitic traffic. The answer is yes!
Working with our own hosting clients, we have seen that direct uplink fees, which are usually treated as monthly operating expenses, drop by around 10% on average. Server costs, which usually count as capital expenditure, can drop by as much as 25%.
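For a back-of-the-envelope estimate, these two percentages can be applied to your own figures. The sketch below is only that; the function name and the input amounts are hypothetical, and the defaults are the averages quoted above (with 25% being an upper bound for servers, not a guarantee).

```python
def estimate_savings(monthly_uplink_fees, planned_server_spend,
                     uplink_saving_pct=10, server_saving_pct=25):
    """Apply the savings percentages quoted in the article to your own costs."""
    return {
        "uplink_per_month": monthly_uplink_fees * uplink_saving_pct / 100,
        "server_spend_upper_bound": planned_server_spend * server_saving_pct / 100,
    }


# Hypothetical inputs: 20,000 per month in uplink fees, 100,000 in planned server purchases.
savings = estimate_savings(monthly_uplink_fees=20_000, planned_server_spend=100_000)
print(savings)
```

Actual results depend on how much of your traffic is really generated by bots, which is why it makes sense to measure first.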
Anyone interested in finding out more about their own traffic is welcome to contact me to have their web traffic measured and the possible savings calculated. Just drop me a message at danica (at) botguard.ee.