Tame costs in the cloud: subscriptions, billing, services

Which decisions have a major impact on the price of cloud services? You can also turn your costs in the cloud to your advantage.

Jakub Procházka

Yes, that’s right, the cloud is not free – except for some services, depending on the type of billing and the volume of the service… Did I confuse you with the first sentence? I wouldn’t be surprised. Taming costs in the cloud is often a science (even alchemy in some cases) even for seasoned professionals. In today’s article, I’ll give you a closer look at how to understand cloud costs, grasp them, and turn them to your advantage.

Necessary introduction: basic theory

The general rule of thumb for the cloud is “The more consumption, the better the price.” This “economy of scale” is based on the fact that with higher consumption, the cloud provider has lower prices from its suppliers (e.g. for energy). These discounts are then passed on to its clients. This is one of the reasons why the price for the same service from the same operator can vary from region to region.

Before we dive into trendy terms like CapEx, OpEx, pay-as-you-go and more, there is one thing to keep in mind. Costing in the cloud is significantly more complex than for on-premise solutions because more variables come into play. These need to be looked at holistically, monitored and kept continuously under control. Otherwise, they tend to go their own way, and that way always leads upwards in terms of costs.

CapEx vs. OpEx

One of the many reasons why so many companies are moving to the cloud these days is to save money. But how can this be achieved?

For startups and early-stage companies, it’s simple. If they start on a so-called greenfield site, the cloud is often the obvious choice precisely because the initial start-up costs are minimal.

This brings us to CapEx: the capital expenditures that a company must make to purchase, maintain and regularly upgrade its assets (and not only the physical ones). In our context, this can be the purchase of servers, network elements and other HW, associated premises and equipment (building, hall, racks, UPS), but also the purchase of software, licenses, etc.

As tangible assets depreciate over time, further investment is needed after a certain period of time to maintain the standard. A company can elegantly avoid this costly investment by running the infrastructure in the cloud, shifting these concerns to the vendor.

In the case of an existing in-house datacenter, a company often decides to move to the cloud before the end of the infrastructure lifecycle. Would it be better to go the route of buying HW for the next 5 years or move the costs in the accounting statement one box over to the OpEx section?

OpEx stands for operating expenses, sometimes referred to as non-investment expenses. They are typically charged on a monthly basis. By operating the infrastructure in the cloud, the company replaces large one-time investment costs with smaller recurring operating costs.
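To make the trade-off concrete, here is a minimal Python sketch that compares cumulative spend under the two models. All figures (purchase price, yearly maintenance, monthly cloud bill) are hypothetical assumptions chosen for illustration, not numbers from any provider or customer.

```python
def capex_cumulative(purchase, yearly_maintenance, months):
    """Cumulative on-premise spend: one-time purchase plus monthly share of maintenance."""
    return [purchase + yearly_maintenance / 12 * m for m in range(1, months + 1)]


def opex_cumulative(monthly_fee, months):
    """Cumulative cloud spend: the same recurring fee every month."""
    return [monthly_fee * m for m in range(1, months + 1)]


def first_month_opex_exceeds(capex, opex):
    """Return the first month (1-based) in which cloud spend overtakes on-premise, or None."""
    for month, (c, o) in enumerate(zip(capex, opex), start=1):
        if o > c:
            return month
    return None


# Hypothetical figures: 120,000 hardware purchase, 12,000 yearly maintenance,
# 4,000 monthly cloud bill, 5-year horizon.
capex = capex_cumulative(120_000, 12_000, 60)
opex = opex_cumulative(4_000, 60)
print(first_month_opex_exceeds(capex, opex))  # 41
```

With these assumed numbers, the recurring model stays cheaper in cumulative terms for the first 40 months. The point is the shape of the two curves (one large step up front versus a steady slope), not the specific values.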

Benefits of OpEx in the cloud

No initial cost
Speeding up and facilitating budgeting
Smoother cash flow and distribution of costs over time
Relieving internal staff so they can focus on other work (e.g. transforming the operational infrastructure team into a DevOps team)

Subscription options, or I don’t want a discount for free

The first discounts are decided at the time of subscription, when the company can enter into two basic partnership options with the provider. Their specific form varies by cloud provider, but the idea remains the same. For example, for MS Azure we distinguish:

Cloud Solution Provider (CSP) is a subscription arranged through a partner. Typically, it is a partner company that helps the company with cloud adoption or otherwise takes care of the environment. The partner company has multiple similar clients and has negotiated lower prices than the company could achieve on its own. We at ORBIT also offer our clients the CSP variant with interesting discounts.
The Enterprise Agreement (EA) is particularly suitable for really large volumes and is concluded directly with the provider – in this case Microsoft. If the provider has at least a vision of a potentially large business opportunity, it is willing to offer companies interesting discounts on its services and support in cloud adoption (either on its own or more often with the help of a partner such as ORBIT).

Both CSP and EA benefits are applied across the entire subscription (or even across multiple subscriptions) without the need to set up or change anything further. These discounts can also be combined with other discounts. Other providers such as AWS and GCP practice a similar principle.

Subscriptions can of course be arranged without partner discounts, but a subsequent move to EA or CSP does not happen just “on paper”. In some cases, it also requires moving resources to a new subscription. That’s why I recommend deciding on a subscription type at the beginning.

Forms of billing: how not to get lost in them?

Public cloud providers offer multiple billing options for their services, and it’s impossible to say which is better. It depends on the type of services provided, but also on the company’s financial strategy. In the vast majority of cases, we are talking about two basic forms: pay-as-you-go and reserved instances.

Pay-as-you-go

In the case of pay-as-you-go (PAYG), the client pays for services according to actual consumption and usage. Although this option does not offer any discounts in its basic form, it is still possible to save money thanks to its flexibility.

PAYG brings savings in the following cases:

Automatic scaling – increase and decrease resources as needed
Switching off resources outside working hours – e.g. dev environments
Reducing allocated capacity and using the service more efficiently
Tests and short-term projects

Logically, the pay-as-you-go option is useful even if it is not clear at first how much the application will consume or if it is prone to frequent operational peaks (which need to be responded to flexibly).
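The effect of switching resources off outside working hours can be estimated with simple arithmetic. The sketch below is an illustration only: the hourly rate and the working schedule are assumed values, and it covers compute charges only (storage attached to a stopped VM is typically still billed).

```python
HOURS_PER_WEEK = 24 * 7


def weekly_cost(hourly_rate, hours_running):
    """Compute-only PAYG cost for the hours the resource actually runs."""
    return hourly_rate * hours_running


def shutdown_saving(hourly_rate, hours_per_day, days_per_week):
    """Fraction of the always-on compute cost saved by running only on a schedule."""
    always_on = weekly_cost(hourly_rate, HOURS_PER_WEEK)
    scheduled = weekly_cost(hourly_rate, hours_per_day * days_per_week)
    return 1 - scheduled / always_on


# Assumed schedule: a dev environment running 10 hours a day, 5 days a week,
# at a hypothetical rate of 0.20 per hour.
print(f"{shutdown_saving(0.20, 10, 5):.1%}")  # 70.2%
```

Note that the saved fraction does not depend on the hourly rate at all, only on the schedule: 50 running hours out of 168 means roughly 70% of the compute bill disappears.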

PAYG is the standard option when creating a cloud subscription. For some types of services, it may even be the only option available, or it may be combined with a volume discount.

Reserved instances

The second form that the client can use for billing (usually for virtual servers) is the Reserved Instance (RI) option. As the name suggests, the client commits to consuming (usually) a specific type of instance (VM) for a fixed term, typically one or three years.

The provider applies a discount to this type of instance for the duration of the commitment. Interesting discounts of tens of percent can be achieved. RI is good to combine with PAYG and other discounts, but it is only suitable for predictable loads and long-term projects.
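Why RI only pays off for predictable, long-running loads can be shown with a simple break-even calculation. This is a sketch under stated assumptions: the reservation is billed for every hour of the term at a discounted rate, PAYG is billed only for hours actually used, and the 40% discount and the hourly rate are hypothetical values.

```python
def break_even_utilization(discount):
    """Share of hours an instance must run before a reservation beats PAYG."""
    return 1 - discount


def cheaper_option(payg_hourly, discount, utilization):
    """Compare average cost per calendar hour under both billing forms."""
    payg = payg_hourly * utilization        # pay only for hours actually used
    reserved = payg_hourly * (1 - discount)  # pay for every hour, at a discount
    return "reserved" if reserved < payg else "payg"


# Hypothetical 40% reservation discount on a 0.10/hour instance.
print(break_even_utilization(0.40))     # 0.6
print(cheaper_option(0.10, 0.40, 0.3))  # payg
print(cheaper_option(0.10, 0.40, 0.9))  # reserved
```

Under these assumptions, a server that runs less than 60% of the time is cheaper on PAYG, which is why dev environments and bursty workloads are usually poor candidates for reservation.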

Even though the client commits to consuming certain resources, Azure, for example, is quite lenient and allows you to change (up to a certain amount) or even cancel RI without much hassle or penalty. In this case, it is advisable to check the provider’s current refund policy in detail.

AWS also has a relatively newer pricing model called Savings Plans, which is more flexible than RI and can be applied to Fargate in addition to EC2, EKS and ECS.

Transfer of licences

Another way to save money when moving to the cloud is to transfer existing licenses. AWS offers Bring-Your-Own-License (BYOL) for Windows Server and SQL Server. Azure in turn provides the Azure Hybrid Benefit, which in addition to Windows Server and SQL Server also supports SUSE and Red Hat. This allows companies to bring existing licenses covered by Software Assurance to the cloud, thereby reducing the cost of the cloud service.

Source: https://azure.microsoft.com/en-us/pricing/reserved-vm-instances/

What about data transfers, have you calculated them?

The big alchemy comes when you need to estimate such unknowns as data transfer volumes. Fortunately, this is one of the cheapest items ever.

In the cloud, ingress, i.e. data flowing into the cloud, is mostly free (not counting the cost of a dedicated line, VPN, etc.). By contrast, data flowing out of the cloud, referred to as egress, is charged, in some cases even for internal communication between datacentres, availability zones (AZ), etc.

You should always refer to the provider’s price list. Just to give you an idea: in Azure, traffic between the data origin and the CDN is free, but communication between AZs is charged at €0.009 per GB.

Source: https://azure.microsoft.com/en-us/pricing/details/bandwidth/
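A small rate looks harmless until it is multiplied by volume. The sketch below uses the inter-AZ price quoted above. The monthly volume is a hypothetical assumption, 1 TB is taken as 1,000 GB for simplicity, and the rate should be checked against the current price list before it is used for real estimates.

```python
# Price quoted in the text above; verify against the provider's current price list.
INTER_AZ_EUR_PER_GB = 0.009
INGRESS_EUR_PER_GB = 0.0  # ingress is mostly free


def transfer_cost(volume_gb, price_per_gb):
    """Cost of moving a given volume at a flat per-GB rate."""
    return volume_gb * price_per_gb


# Hypothetical workload: 50 TB per month replicated between availability zones.
monthly_gb = 50 * 1_000
print(f"{transfer_cost(monthly_gb, INTER_AZ_EUR_PER_GB):.2f} EUR")  # 450.00 EUR
print(f"{transfer_cost(monthly_gb, INGRESS_EUR_PER_GB):.2f} EUR")   # 0.00 EUR
```

Real price lists are often tiered by volume and differ by region, so a flat rate like this is only good for a first estimate.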

If you have petabytes of data, you’re probably wondering how to move it to the cloud as cheaply as possible and in a reasonable amount of time.

In the case of large volumes of data for which even a dedicated line is not sufficient, cloud providers allow the transfer of physical disks. The client sends the disks directly to the provider or orders special HW with capacities ranging from a few TB (Azure Data Box, AWS Snow Family) to 100 PB, i.e. literally a 14-meter truck at their own door (AWS Snowmobile).
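A back-of-the-envelope estimate shows why a network link stops being practical at this scale. The sketch assumes decimal units (1 TB = 10^12 bytes) and a hypothetical 80% effective link utilization. Both are assumptions for illustration, not measured values.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb, link_gbps, efficiency=0.8):
    """Days needed to push data_tb terabytes through a link of link_gbps gigabits/s."""
    bits = data_tb * 1e12 * 8                         # decimal terabytes to bits
    effective_bits_per_second = link_gbps * 1e9 * efficiency
    return bits / effective_bits_per_second / SECONDS_PER_DAY


# 1 PB (1,000 TB) over a 1 Gbps and a 10 Gbps line.
print(round(transfer_days(1_000, 1), 1))   # 115.7
print(round(transfer_days(1_000, 10), 1))  # 11.6
```

Almost four months of a fully occupied 1 Gbps line for a single petabyte is the kind of number that makes shipping disks look reasonable.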

Cloud cost calculator won’t be enough

As I mentioned at the beginning, there can be many variables in estimating the future cost of running in the cloud. Writing everything in your own Excel spreadsheets would be difficult and inefficient. So how do you get meaningful numbers?

Fortunately, all major cloud providers offer their own advanced tools to calculate how much a client’s cloud operation will cost them. For example, with Azure, a client can use a quick calculator for an instant overview and a slightly more advanced TCO (total cost of ownership) calculator to calculate the total cost.

We’re in the cloud, now what?

Even if the company has gone through everything mentioned so far, its journey is far from over. For life in the cloud, it is essential to keep an overview of the environment, which involves quality reporting, creating a budget and associated alerts.
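The logic behind a budget with alerts is simple enough to sketch in a few lines. The example below is not any provider's budgeting API: the thresholds, the budget and the linear month-end forecast are all assumptions chosen for illustration.

```python
def forecast_month_end(spend_to_date, day_of_month, days_in_month):
    """Naive linear forecast: assume the daily spend so far continues unchanged."""
    return spend_to_date / day_of_month * days_in_month


def triggered_alerts(spend, budget, thresholds=(0.5, 0.8, 1.0)):
    """Return the budget thresholds (as fractions) that the given spend has reached."""
    return [t for t in thresholds if spend >= budget * t]


# Hypothetical month: 10,000 budget, 4,500 spent by day 10 of 30.
budget = 10_000
actual = 4_500
forecast = forecast_month_end(actual, 10, 30)

print(triggered_alerts(actual, budget))    # []
print(forecast)                            # 13500.0
print(triggered_alerts(forecast, budget))  # [0.5, 0.8, 1.0]
```

The example shows why alerting on the forecast matters as much as alerting on actual spend: on day 10 no threshold has been crossed yet, but the trend already points to a 35% overrun.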

Continuous optimization or pay only for what you need

To optimize already running servers, providers offer native tools in the form of advisors that alert clients to excess or insufficient performance. In both cases, the providers leave more than enough margin and the recommendations made are considered very conservative.

That’s why there are third-party tools. Some are just advanced bill readers, others, like Densify, use proprietary machine learning for the best possible recommendations.

ORBIT has been a partner of Densify for more than 10 years and we have successfully operated it for several major customers. We know from experience that, compared to conventional advisors or the best efforts of an internal infrastructure team, we are able to achieve on average 20% higher savings with this tool.

Optimization applies not only to virtual servers at the type and family level, but also to platform services and especially to containers. Without advanced intelligence and long-term measurements, the right combination of the underlying VM and the container itself in the pod cannot be calculated correctly.

I would like to stress here that this should be a continuous process, not a one-off clean-up. Unlike on-premise infrastructure, the public cloud is constantly changing technically and in terms of price, and you can pay less for the same services with the right combination of services. Also, the longer the measurement, the more accurate the results.
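The basic idea behind rightsizing from measurements can be sketched as follows. This is a deliberately simplified illustration, not the algorithm used by any provider's advisor or by Densify: the size catalog is hypothetical, only CPU is considered, and the 95th percentile with 20% headroom is an assumed policy.

```python
import math

# Hypothetical catalog of (name, vCPUs), ordered from smallest to largest.
SIZES = [("small", 2), ("medium", 4), ("large", 8)]


def percentile(samples, pct):
    """Nearest-rank percentile of a list of measurements."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend(cpu_used_vcpus, headroom=0.2, pct=95):
    """Smallest size that covers the chosen percentile of measured usage plus headroom."""
    needed = percentile(cpu_used_vcpus, pct) * (1 + headroom)
    for name, vcpus in SIZES:
        if vcpus >= needed:
            return name
    return SIZES[-1][0]  # nothing is large enough: fall back to the largest size


# Assumed measurements of vCPUs actually in use, including one short peak.
samples = [1.0, 1.2, 0.8, 2.5, 1.1, 1.3, 0.9, 1.0, 1.4, 1.2]
print(recommend(samples))  # medium
```

With only ten samples, the single peak decides the result, which illustrates the point above: the longer the measurement, the less a recommendation depends on individual outliers.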

In the end, the entire configuration optimization process can be automated, for example using Terraform, letting the AI decide the instance size.

Initial sizing

Just like in the on-premise world, people in the cloud tend to overestimate the amount of performance needed during development or migration, mainly because it is unclear how much will actually be needed. We cannot come up with the exact sizing ourselves; it must be measured. Otherwise, we run the risk not only of unnecessarily higher costs but also of poor performance.

Public cloud providers again offer their own tools for this. In the case of Azure, this is Azure Migrate, which, after an appropriate measurement period, tells you which instances are suitable to cover the required performance. Amazon’s equivalent service is the AWS Application Discovery Service.

Costs in the cloud – more tips

Wouldn’t it be more profitable and easier to move applications to PaaS or SaaS? That is for the client to decide. Likewise, they must put in their own effort and decide together with the application team whether they can use burstable instances for their applications, or even spot instances, which are significantly cheaper.

Burstable instances are suitable for non-critical workloads whose performance needs are minimal most of the time and which only need full performance in bursts (hence the name). It is good to be careful with burstable instances, but for testing or development they may be the best choice.

Spot instances, on the other hand, are suitable for stateless applications or other services that don’t mind suddenly losing capacity. These are servers that the providers currently have unused, so they offer them at a very interesting price. But the moment this capacity is needed elsewhere, you can suddenly lose it.

Similarly, ephemeral disks can be used: local disks (similar to temporary disks) that come with the VM at no extra charge, offer lower read/write latency, and are thus ideal for stateless applications.

Conclusion

Cloud costs (and especially their optimization) are a broad topic. In this article, I have only touched the tip of the iceberg by mentioning the most important decisions and considerations you may encounter. I trust you will not be caught off guard now.

If you’re interested in other topics related to the cloud, check out our Cloud Encyclopedia series – a quick guide to the cloud.

This is a machine translation. Please excuse any possible errors.