Can you restore your company’s IT services within a specific timeframe? In this cloud encyclopedia article, we look at how backups can be handled with cloud services and explain the benefits of backing up to the cloud.
Backup and subsequent data recovery or more advanced disaster recovery (DR) scenarios are still an underrated topic in many organizations. Companies that have never lost data often treat backups and recovery tests as a fringe issue.
Statistically speaking, however, a failure will occur sooner or later, whether it is caused by hardware, human error or a hacker attack. You have probably seen media reports about ransomware attacks that encrypted the data of many organizations, which then learned how important it is to be able to recover data quickly.
So in every organization it is not a question of if, but when something will happen. How can cloud backup help us prepare for it?
First a little theory
Deploying backup or a more complex DR solution in your organization must be preceded by a planning phase. First, we define basic parameters such as the maximum recovery time (RTO, recovery time objective) and the maximum tolerable data loss measured in time (RPO, recovery point objective).
In larger organizations, this phase is usually part of business continuity planning (BCP) or business impact analysis (BIA). We define the criticality of each service and determine how much an outage would cost us.
The SLA of the service is derived from these parameters. For example, if we have an SLA of 99.9%, we can afford a maximum of 8 h 45 min 56 s of service outage per year.
However, this budget must also cover planned activities such as OS updates, service upgrades, migrations, etc. Therefore, we would set the MTPOD (maximum tolerable period of disruption) to at most 8 hours and the RTO to 4-6 hours. The RPO, i.e. how much data, measured in time, we can afford to lose, is based on the BIA.
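The downtime budget quoted above can be checked with a few lines of Python. This is a minimal sketch: the function name `allowed_downtime` is illustrative, and the year length (an average Gregorian year of 365.2425 days) is an assumption that happens to reproduce the 8 h 45 min 56 s figure when fractions of a second are truncated.

```python
from decimal import Decimal

# Average Gregorian year: 365.2425 days * 86,400 s. This year length is an
# assumption; a 365-day year gives a slightly smaller budget.
SECONDS_PER_YEAR = 31_556_952


def allowed_downtime(sla_percent: str) -> tuple[int, int, int]:
    """Return the yearly downtime budget as (hours, minutes, seconds)."""
    unavailable = (Decimal(100) - Decimal(sla_percent)) / Decimal(100)
    total_seconds = int(unavailable * SECONDS_PER_YEAR)  # truncate fractions
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


print(allowed_downtime("99.9"))   # (8, 45, 56)
print(allowed_downtime("99.99"))  # budget for a stricter SLA
```

The SLA is passed as a string so that `Decimal` can represent it exactly and the result does not depend on floating-point rounding.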
Backup is chosen when a longer RTO is acceptable and we are able to restore the data within the defined period of time.
Disaster recovery is used when the RTO is short (usually a few minutes) and the service is restored by switching to a backup site. The disaster recovery option is more expensive than backup, but covers more scenarios (such as loss of hardware or an entire site).
Example: if the service runs on SQL Server and an incremental database backup is taken every 15 minutes, then the data loss is in the range of 0-15 min and the RPO is at most 15 min. The RTO depends on how long it takes to restore the database.
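The relationship between the backup interval and the data loss in this example can be illustrated with a small sketch. The function `data_loss_minutes` is hypothetical, and it assumes that a backup completes exactly every 15 minutes starting at midnight, which a real backup job will only approximate.

```python
def data_loss_minutes(failure_minute: int, backup_interval: int = 15) -> int:
    """Minutes of data lost if the service fails `failure_minute` minutes
    after midnight and a backup completes every `backup_interval` minutes,
    starting at minute 0 (a simplifying assumption)."""
    return failure_minute % backup_interval


# Failure at 10:07 -> last backup at 10:00 -> 7 minutes of data lost.
print(data_loss_minutes(10 * 60 + 7))   # 7

# Checked minute by minute over a whole day, the loss stays below
# the 15-minute RPO.
worst = max(data_loss_minutes(m) for m in range(24 * 60))
print(worst)                            # 14
```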
First, let’s review the most common backup scenarios in an on-premise environment:
- USB drives
We copy data to USB drives, which we rotate regularly. Cheap solution, high RTO and RPO.
- NAS
We copy data to a NAS using the native NAS software or scripts. Cheap solution, medium RTO, higher RPO.
- Backup SW
We have implemented specialized backup software with the possibility of granular data recovery. We also protect important data by off-site backup to another location or server room. Medium cost solution, medium RTO, medium/lower RPO.
- Backup & Disaster recovery scenario
It is an enterprise scenario that consists of a comprehensive backup solution and a disaster recovery solution with the ability to quickly restore data or switch operations to a backup site. It is suitable for large companies operating mission-critical workloads or companies subject to audits and regulations. Expensive solution; when services are switched over to the DR site, the RTO is very low and the RPO is low.
Why is cloud backup worth it?
The advantage of using the cloud for backup and disaster recovery is the large pool of available resources that we can quickly start using in the event of a disaster.
On the other hand, in an on-premise environment, we need to keep an appropriate amount of HW ready. In the case of DR, we then reserve the entire second site, which is not fully used most of the time, and where we also deal with connectivity, cooling, network elements, etc. This represents a huge extra cost compared to the cloud, where we only pay for the stored data and only pay for the resources when they are used.
In the cloud, inbound data transfer is generally free, while outbound data transfer is charged. This can be used to our advantage in any off-site backup scenario.
An example is NAS storage, which today can natively connect to both Azure Blob Storage and AWS S3 Glacier storage and automatically offload data from the NAS to them. Similarly, backup software such as Veeam, Commvault, Acronis, etc. supports cloud storage connections. The advantage is the ability to connect via the internet without the need to set up a VPN.
Long-term cloud storage on tiers such as the Azure Blob Storage archive tier and AWS S3 Glacier enables long-term data retention at a very good price/performance ratio.
When using Azure Backup, data transfers are not charged at all. We only pay for the service based on the number of protected VMs in the on-premise environment and the data stored in Azure.
Within AWS Backup, we also pay only for the space used. Prices vary slightly depending on the type of service such as EFS, RDS Database, etc. Generally, storing 1 TB of data with AWS Backup costs $1 per month.
Cloud backup options
Each cloud provider offers its own backup technologies and also supports third-party solutions:
- for Azure, these are Azure Backup and Azure Site Recovery,
- for AWS, these are AWS Backup and AWS Elastic Disaster Recovery,
- third-party solutions such as Commvault, Veeam, Acronis, Avamar and others.
Azure Backup enables backup of on-premise resources such as physical servers and virtual servers on the Hyper-V and VMware platforms.
Backup is performed via an appliance deployed in the local environment, which stores backups partly locally and then in the cloud. How much is kept locally depends on the backup policy settings and on the instant recovery feature, which allows recovery from the local server without the need to transfer data from Azure.
In addition to on-premise resources, Azure resources are natively supported, such as:
- virtual machines
- managed disks
- SQL Server on VM
- blob storage
In the event of a restore, we can restore to both on-premise environments and Azure. The scenario of restoring resources to Azure is interesting in case we have lost local hardware in some disaster.
Azure Site Recovery (ASR)
This service uses similar components to Azure Backup. However, it doesn’t just perform policy-based backups, but continuously replicates data from protected resources such as physical servers or VMs hosted on the VMware and Hyper-V hypervisors.
Azure resources are supported natively. When ASR is deployed here, data is replicated to another region with the option of switching traffic to the replicated site. We wrote about this in detail in the article High availability of services in the cloud.
Another advantage of ASR is the ability to deploy replicated resources on an isolated virtual network (VNet). This can be used to verify that the DR works, or to create a test environment that is a 1:1 copy of the production environment.
AWS Backup supports many types of AWS cloud resources, such as:
- Amazon S3
- Amazon EBS
- Amazon RDS including Amazon Aurora
- Amazon DynamoDB
- Amazon Neptune
- Amazon EFS
- Amazon EC2
AWS Backup also allows you to back up VMware workloads running on-premise and in VMware Cloud on AWS. Backup of standard physical servers and other hypervisors is not supported. For physical server protection, we can use AWS Elastic Disaster Recovery.
AWS Elastic Disaster Recovery
Like Azure Site Recovery, this technology can protect a wide range of on-premise resources, with an RPO in seconds and an RTO in minutes. The replication agent works at the OS level and can be used on both physical and virtual servers.
AWS Elastic Disaster Recovery also supports replicating resources from other cloud providers. For AWS resources, failover and failback between regions are supported natively.
Third-party backup solutions
Besides native solutions, we can use any third-party technology. It is usually available in the form of an appliance or some off-the-shelf solution where you only need to supply your own configurations and license in the BYOL model.
The result is fast deployment and configuration of resources and the same custom licensing as on-premise. Here we have a choice of two scenarios:
1) We connect cloud storage such as Azure Blob Storage or AWS S3 Glacier to the existing solution (on AWS, for example, via AWS Storage Gateway) and use the cloud only as off-site storage.
2) We extend the backup infrastructure to the cloud and deploy backup servers and storage there.
Recommended cloud backup scenarios
Long-term data archiving on long-term storage such as Azure Blob Storage in the Archive tier or AWS S3 Glacier archive storage, where prices start at 1-2 EUR per stored TB of data per month.
The cost of transactional operations such as writes and reads must be added, but it is negligible for data that is written once. Archives are not suitable for data that we modify frequently. In the case of Azure, data must be retained in the Archive tier for at least 180 days.
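The effect of the minimum retention period on the bill can be estimated with a short sketch. Everything here is illustrative: the function name `archive_storage_cost` is hypothetical, the default price of 2 EUR per TB per month is taken from the upper end of the range above, a month is assumed to have 30 days, and data deleted early is assumed to be billed as if it had stayed for the full 180 days. Transaction and retrieval fees are left out, so current provider pricing should always be checked.

```python
MIN_RETENTION_DAYS = 180  # Azure archive tier minimum named in the text


def archive_storage_cost(tb: float, days_stored: int,
                         eur_per_tb_month: float = 2.0) -> float:
    """Estimate the storage cost in EUR for data kept in an archive tier.

    Assumes a 30-day billing month and that data deleted early is still
    billed for the full minimum retention period. Transaction and
    retrieval fees are not included.
    """
    billed_days = max(days_stored, MIN_RETENTION_DAYS)
    return round(tb * eur_per_tb_month * billed_days / 30, 2)


print(archive_storage_cost(10, 365))  # 10 TB kept for a year
print(archive_storage_cost(10, 45))   # deleted after 45 days, billed for 180
```

Under these assumptions, data removed after 45 days costs the same as data kept for 180 days, which is why the archive tier only pays off for data that really stays put.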
Off-site backup for NAS storage using cloud storage such as Azure Blob Storage in the Cool and Archive tiers or AWS S3 Glacier in its cold and archive tiers.
Here you need to properly design and configure the replication agent on the NAS so that data destined for long-term archiving is placed on the Archive tier, while data that is modified more frequently and mirrored 1:1 ends up on the Cool or Hot tier.
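The tiering rule described above can be sketched as a small decision function. This is only an illustration of the idea, not the configuration format of any NAS vendor: the `NasShare` type, the `choose_tier` function and the threshold of seven changes per week are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class NasShare:
    name: str
    long_term_archive: bool   # data kept only for retention purposes
    changes_per_week: int     # rough modification frequency


def choose_tier(share: NasShare, hot_threshold: int = 7) -> str:
    """Map a NAS share to a storage tier (the threshold is illustrative)."""
    if share.long_term_archive:
        return "archive"
    if share.changes_per_week >= hot_threshold:
        return "hot"
    return "cool"


shares = [
    NasShare("scanned-contracts-2015", long_term_archive=True, changes_per_week=0),
    NasShare("project-files", long_term_archive=False, changes_per_week=40),
    NasShare("department-templates", long_term_archive=False, changes_per_week=1),
]
for share in shares:
    print(share.name, "->", choose_tier(share))
```

In practice the same decision is usually expressed in the replication job settings on the NAS or in lifecycle rules on the cloud storage side.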
Off-site backup using existing backup software and connecting cloud storage such as Azure Blob storage or AWS S3 Glacier.
This scenario is used most often as it is the cheapest off-site backup option. If we only need to deploy storage on the cloud side, we can connect it over the internet without the need for a VPN.
Off-site backup + DR using existing backup software and extending the backup infrastructure to the cloud, either by installing it on a VM or by deploying the appropriate appliance or secondary platform node.
Next, it is necessary to have the basic infrastructure ready: networks, VPNs, servers (deallocated) or pipelines and scripts that can be used to deploy the necessary storage and increase the computing resources for data recovery and the creation of a DR-Site with the ability to switch traffic.
In this scenario, the architecture of the entire solution is important because of network access rules, traffic redirection, integrations, firewall settings, DNS, etc.
Off-site backup + DR using native services like Azure Backup and AWS Backup.
This scenario requires basic cloud infrastructure such as VPNs and servers (deallocated), as well as prepared deployments of the necessary resources via pipelines, scripts or IaC. These can be used to add the storage and computing power needed for data recovery and to create a DR site with the ability to switch traffic. RPO and RTO times reach tens of minutes.
Full DR solution using Azure Site Recovery or AWS Elastic Disaster Recovery.
In this scenario, it is possible to achieve minimum RTO and RPO times of a few minutes. It all depends on the architecture of the solution and the components chosen, where all resources are protected by the replication agent.
Finally, on cloud backup
In today’s world of digitized processes and ever-increasing business data, there is an increasing emphasis on service availability defined in SLAs. From this comes the push for more advanced backup scenarios and BCDR – business continuity & disaster recovery.
As we have shown in the article, many scenarios can be implemented in the cloud: from simple ones, where we only use storage, through advanced ones, where we use cloud-based data recovery with native cloud tools or specialized third-party software, all the way to complex enterprise scenarios providing very low RTOs, where a switchover automatically deploys all necessary assets and switches operations.
When running backup or DR in the cloud, we don’t have to deal with a lot of unused hardware, its maintenance, its placement in another datacenter or server room, and everything associated with connectivity, operation and maintenance. Solutions built on cloud services therefore achieve savings in the tens of percent.