How a Start-Up Received a $75,000 Bill for 2 Hours of Google Cloud Services

28-01-2021 | By Sam Brown

Cloud computing services such as Google Cloud and AWS provide developers with expandable resources that can grow with their needs. However, unless a service is deployed correctly, you may face a rather large bill, as one unfortunate start-up company discovered.

What are cloud services?

When the term “cloud” was introduced several years ago, some were left confused (such as myself) about what the term really meant. Did it mean a new type of server, a service, or a new way to connect stuff together? No, the term “cloud” is another name for the internet.

If you want to put this statement to the test, replace the word “cloud” with “internet”, and the phrases still make sense. Internet services, internet computing, and internet backup all read naturally; the word “cloud” simply helps with marketing and sales. However, as many are used to the term cloud, it will be used henceforth in this article.

Cloud services are internet-accessible services that run on some remote server unbeknown to the user. Once that service has completed its task, it will generally inform the user that the task is complete.

An example of a cloud service would be Google Docs; it provides users with a word processor in the browser (which runs client-side) and then allows saving to Google Drive, which itself is the cloud service. Another example of a cloud service would be just about any website; the user connects to the service, the service responds with the content, and the user can then view it.

It should be stated that the vast majority of cloud services are event-driven (i.e. they react to things that happen) and are rarely an ongoing process (like an infinite while(1) loop). Developers write scripts that describe what a cloud service should do when a message is received and provide APIs that allow users to interact with the services.
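The event-driven pattern described above can be sketched in a few lines of Python. This is only an illustration: the `HANDLERS` registry, the `on` decorator, and the `save_document` event are hypothetical names invented for the example and do not belong to any particular cloud platform.

```python
from typing import Callable, Dict

# Hypothetical registry mapping an event name to the function that handles it.
HANDLERS: Dict[str, Callable[[dict], dict]] = {}


def on(event_name: str):
    """Register a handler for one kind of incoming message."""
    def register(func):
        HANDLERS[event_name] = func
        return func
    return register


@on("save_document")
def save_document(payload: dict) -> dict:
    # Work happens only when a message arrives; nothing runs in between.
    return {"status": "saved", "bytes": len(payload.get("text", ""))}


def dispatch(event_name: str, payload: dict) -> dict:
    """Route one incoming message to its handler and return the reply."""
    handler = HANDLERS.get(event_name)
    if handler is None:
        return {"status": "error", "reason": f"no handler for {event_name}"}
    return handler(payload)


print(dispatch("save_document", {"text": "hello"}))
```

There is no while(1) loop anywhere in the sketch; the platform is assumed to call `dispatch` once per incoming message, which is also the unit the developer ends up paying for.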

The Rise of Expandable Cloud Services

Anyone can create their own cloud service at home with relative ease: a few Raspberry Pis can be configured as Node.js servers, a single Raspberry Pi can be configured as a load balancer, and an internet connection provides an access point. Such a setup is excellent for IoT devices that access a server once every so often, but what about website hosting?
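To give a rough idea of what the load-balancing Raspberry Pi does, here is a minimal round-robin sketch in Python. The backend addresses are made up for the example, and a real load balancer would also handle health checks and failed servers.

```python
from itertools import cycle

# Hypothetical addresses of the Raspberry Pi application servers.
BACKENDS = ["192.168.1.11:3000", "192.168.1.12:3000", "192.168.1.13:3000"]


def round_robin(backends):
    """Return a function that hands out the next backend on every call."""
    pool = cycle(backends)
    return lambda: next(pool)


pick_backend = round_robin(BACKENDS)

# Four incoming requests are spread across three servers, then wrap around.
assigned = [pick_backend() for _ in range(4)]
print(assigned)
```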

As the demands on a system increase, more servers are needed, faster local networks are required, and a better internet connection is necessary. This is where most users hit a bottleneck as they lack the funding to purchase many hundreds of Raspberry Pis, lack the space to make such a setup, and are limited by their ISP for using too much data.

Expandable cloud services, such as Amazon Web Services and Microsoft Azure, popped up to help developers with such issues. These systems allow developers to deploy websites and services that automatically expand their resources as required. The expansion needs no manual intervention, and the user only pays for the number of processing hours used.

Such services now drive many thousands of popular online services, and the companies that provide them are seeing incredible growth. However, users should be extremely cautious when using such services, as things can quickly get out of hand.

How a Start-Up Racked Up a $75,000 Bill

Recently, Announce, a start-up company, was at the final stages of developing a web service that shows users announcements near their location on Google Maps. Preliminary testing looked good, so it was time to deploy the service, and Google Cloud was chosen to run it.

An account was made, the company credit card was used across all the various Google services, and the free plan was selected. Of course, the designers understood that their service might need to grow during testing, so they set a $7 cloud budget for the service. The designers also used other free services, on the understanding that the worst-case scenario would see those free services suspended once they reached their daily limits.

The cloud service was deployed, and the developers sat back and decided to rest up and see how things turned out over the next two days. Just two hours into the test, the developers noticed a warning from Google saying they had used up their free Firebase allowance. Still, that seemed OK, because Google had automatically upgraded their Firebase account; after all, cloud services are designed to scale. This was the first red flag.

Almost immediately after receiving the email, another one came through, stating they had gone over their $7 budget. But it was OK because the developers had set a $7 budget limit, or so they thought. As it turns out, the budget is only a warning event that informs a developer when they have exceeded their budget; it does not stop the spending.
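The difference between a budget that only warns and one that actually stops spending can be shown with a toy model. The `SpendTracker` class below is purely illustrative and does not represent how Google's billing system works internally; the $5 charge size is an arbitrary assumption.

```python
class BudgetExceeded(Exception):
    """Raised when a hard cap refuses further spending."""


class SpendTracker:
    def __init__(self, budget: float, hard_cap: bool):
        self.budget = budget
        self.hard_cap = hard_cap
        self.spent = 0.0
        self.alerts = []

    def charge(self, amount: float) -> None:
        if self.hard_cap and self.spent + amount > self.budget:
            # A cap stops the work before the money is spent.
            raise BudgetExceeded(f"refusing charge; cap is ${self.budget:.2f}")
        self.spent += amount
        if self.spent > self.budget:
            # An alert only records that the line was crossed.
            self.alerts.append(f"over budget: ${self.spent:.2f}")


# Alert-only budget: every charge goes through, warnings pile up.
alert_only = SpendTracker(budget=7.0, hard_cap=False)
for _ in range(10):
    alert_only.charge(5.0)

# Hard cap: the second charge would exceed $7, so it is refused.
capped = SpendTracker(budget=7.0, hard_cap=True)
try:
    for _ in range(10):
        capped.charge(5.0)
except BudgetExceeded as err:
    print(err)

print(alert_only.spent, len(alert_only.alerts), capped.spent)
```

In the alert-only case the tracker ends up $50 deep with nine warnings logged, while the capped tracker never gets past $5. The developers in this story expected the second behaviour and got the first.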

The third email stated that all their Cloud Services had been suspended due to the company’s credit card being declined. Why would a credit card be declined on $7? The developers opened the Cloud Billing app to find that they had a total bill of not $7 but $5,000.

The team panicked, trying to figure out why the bill was so large, but as they tried to get the situation under control, the bill updated to $15,000. As the day progressed, the bill reached a final value of $72,000, all for two hours of cloud computing time.

What caused the enormous bill?

Without going into technical details, the cause of the hefty bill was essentially poorly written code. As stated previously, cloud computing is an event-based resource that reacts to messages, performs its task, and then sends the result.

Unbeknown to the developers, they had created a recursive function that would perform many requests before coming to a solution. As soon as the function was called, the cloud service systems had to expand their capabilities very quickly, and the deployed cloud service used over 16,000 hours of CPU time and performed 116,222,164,695 read operations on Firebase.
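The start-up's actual code is not reproduced here, but the general failure mode is easy to simulate. In the Python sketch below, `CountingStore` is a stand-in for a metered database and `crawl` is a hypothetical function that reads one record and then calls itself for each related record; the fan-out, depth, and read limit are assumed values chosen for illustration.

```python
class ReadLimitExceeded(Exception):
    """Raised when the store refuses further reads."""


class CountingStore:
    """Stand-in for a metered database: every read is counted."""

    def __init__(self, max_reads: int):
        self.reads = 0
        self.max_reads = max_reads

    def read(self, key: str) -> str:
        if self.reads >= self.max_reads:
            raise ReadLimitExceeded(f"stopped after {self.reads} reads")
        self.reads += 1
        return key


def crawl(store: CountingStore, key: str, fanout: int, depth: int) -> None:
    """Read one record, then call itself once per related record."""
    store.read(key)
    if depth == 0:
        return
    for i in range(fanout):
        crawl(store, f"{key}/{i}", fanout, depth - 1)


# With an effectively unlimited store, one call triggers 11,111 reads.
unguarded = CountingStore(max_reads=10**9)
crawl(unguarded, "root", fanout=10, depth=4)
print(unguarded.reads)  # 1 + 10 + 100 + 1000 + 10000 = 11111

# With a read limit in place, the same call is cut off at 500 reads.
guarded = CountingStore(max_reads=500)
try:
    crawl(guarded, "root", fanout=10, depth=4)
except ReadLimitExceeded as err:
    print(err)
```

Each extra level of recursion multiplies the number of reads by the fan-out, so a function that looks harmless on a small test data set can generate an enormous number of billable operations once it meets real data and an auto-scaling platform.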

The good news: Google waived the bill!

If the bill had been enforced, which Google would have been within its rights to do, the start-up would have gone into bankruptcy. However, after contacting Google about the situation, the start-up was able to have the bill waived.

While Google is a mammoth of a tech company, it has no interest in crushing customers who may provide long-term business, especially those that could someday become gigantic. It is also likely that Google reviewed its system and recognised a flaw in a budgeting system that offered no way to prevent services from consuming large amounts of processor time.

This does not mean that Google, or other companies, will forgive every mistakenly overloaded recursive function, but let this be an example of how cloud services should be used. Developers on such platforms should always take care when writing code and ensure that all functions operate quickly, with no uncontrolled calls to themselves. They should also avoid repeating large operations that could be done once, with the result shared among all other users who need the same data.

Should you create your own cloud platform?

While this question may seem odd, creating a personal cloud platform using low-cost machines and load balancers is not a bad idea. In fact, such a setup can be ideal for developmental stages as it allows developers to see how their system reacts when constrained to a small framework.

If recursive functions crop up, the service will grind to a halt, but the bank balance will remain untouched. The development of such platforms also allows developers to appreciate how such hardware works and how to make the most of it.
