ibee
Disaster Recovery Is Not a Backup. Here Is the Difference, and Why Teams Confuse the Two.

Disaster Recovery Is Not a Backup. Here Is the Difference, and Why Teams Confuse the Two.

Ravi Teja
Ravi TejaInfrastructure Engineer
April 24, 20267 min read

The Backup That Was Not a Recovery Plan

A logistics company. The IT team had been diligent about backups: daily snapshots of the database, weekly full backups to cloud storage, retention going back 90 days. When ransomware encrypted the primary systems in the early hours of a Monday morning, the team felt, initially, that the situation was recoverable.

It took four days to get back online. Not because the backups were missing. The backups were there. It took four days because nobody had ever tested the restore process, the backup storage was in the same cloud account as the primary infrastructure and the attacker had compromised the account, the restore procedure for the database server had never been documented, and there was no secondary environment to restore into. The team spent two days provisioning infrastructure before they could even begin restoring data.

The backups existed. The disaster recovery plan did not. That is the gap that this article addresses, and it is a gap we see repeatedly across companies of every size, in every industry, and on every major cloud platform.

RTO and RPO: The Two Numbers That Define Your DR Requirements

Before any architecture decision, disaster recovery planning starts with two numbers.

Recovery Time Objective, or RTO, is how long the business can be offline before the damage becomes unacceptable. For some businesses the answer is four hours. For a payment processor during business hours it might be fifteen minutes. For a government portal it might be twenty-four hours. RTO is a business decision, not a technical one, and it needs to be agreed with business leadership before any infrastructure choices are made.

Recovery Point Objective, or RPO, is how much data loss is tolerable. If the database is backed up every 24 hours and the RPO is 24 hours, those are consistent. If the RPO is one hour but backups run daily, there is a gap between the stated tolerance and the actual recovery capability. That gap only becomes visible during an incident.

These two numbers drive every subsequent architecture choice. A four-hour RTO with a one-hour RPO requires fundamentally different infrastructure than a 24-hour RTO with a 24-hour RPO.

The Three Tiers of Disaster Recovery Architecture

Three tiers of disaster recovery architecture

Choose the tier that matches your RTO and RPO, not the one that sounds the most resilient.

Building the DR Architecture on Cloud Object Storage

Cloud object storage is the foundation layer for all three DR tiers. The choices made here determine whether the rest of the plan functions when it is needed.

Cross-region replication is the first requirement. Backups should be stored in a geographically separate location from primary data. If a data centre outage, natural disaster, or regional network failure affects the primary location, the backup must be in a location with no shared infrastructure dependencies. IBEE's object storage supports cross-region replication, maintaining a continuously updated copy of data in a secondary region automatically.

Immutable backups using Object Lock are the second requirement, and they are non-negotiable for ransomware protection. Ransomware attacks increasingly target backup storage as part of the attack strategy: encrypt or delete the backups first, then encrypt the primary data. Object Lock, also known as WORM (Write Once Read Many), prevents any process from deleting or overwriting backup objects for a defined retention period, even if account credentials are compromised. A backup that can be deleted by an attacker who has obtained cloud credentials is not a backup in any meaningful sense.

Even without Object Lock, bucket versioning provides a meaningful recovery path. When ransomware overwrites files with encrypted versions, the previous unencrypted versions are retained as prior object versions and remain recoverable. Versioning is not a substitute for Object Lock but it is a worthwhile additional layer.

The DR storage account must be completely separate from the primary infrastructure account. Different credentials, different IAM configuration, and no trust relationship to the primary account. This prevents an account-level compromise from destroying both primary and backup data simultaneously. The logistics company described at the start of this article kept backups in the same account. When the attacker compromised the account, the backups were as accessible to the attacker as the primary data.

The Restore Test Most Teams Skip

A backup that has never been tested is not a backup. It is an optimistic assumption.

Restore testing should be scheduled, documented, and treated as a routine operational activity with the same priority as any other infrastructure maintenance. A practical testing cadence runs at three levels.

Monthly, restore a random selection of files from backup storage and verify their integrity. This confirms that the backup process is writing valid data and that the restore mechanism functions at the file level.

Quarterly, run a full restore of a non-production system from backup. Time the process. Document every step. Identify anything that took longer than expected or required knowledge that exists only in one team member's head. The output of this test is not just confirmation that the restore works — it is a documented restore runbook that can be followed by anyone on the team.

Annually, run a full DR exercise. Simulate a complete primary environment failure and execute the DR plan from scratch, including infrastructure provisioning. This is the test that reveals gaps in documentation, undocumented dependencies, and skills that live in a single person's memory. When primary systems are offline at 3 AM and the lead engineer is unreachable, the person executing the restore needs documented procedures, not tribal knowledge.

DR for Regulated Industries

Disaster recovery is increasingly a compliance requirement across regulated industries globally, with specific documentation and testing standards that vary by jurisdiction and sector.

In financial services, regulators including RBI in India, FCA in the UK, and SEC-regulated entities in the US require payment system operators and financial institutions to maintain business continuity plans with defined RTOs for critical systems, subject to periodic audit.

In insurance, IRDAI in India and equivalent bodies in other jurisdictions require documented business continuity and DR plans as part of IT governance frameworks.

In securities markets, SEBI-regulated entities in India including stock brokers and asset managers are required to maintain DR infrastructure and test recovery procedures with documentation available for regulatory inspection.

In healthcare, HIPAA in the US and equivalent frameworks in other markets require covered entities to maintain contingency plans including DR procedures and regular testing.

For organisations subject to ISO 22301 (Business Continuity Management) or SOC 2 Type II certification, documented and tested DR procedures are a direct audit requirement regardless of industry.

CERT-In's 2022 mandatory directions in India require organisations in critical sectors to maintain incident response plans and report major incidents within prescribed timelines. A tested DR plan is a prerequisite for meaningful incident response capability.

IBEE's India-sovereign infrastructure provides the jurisdiction-clean storage foundation for DR environments in regulated sectors operating in India: audit logs retained within Indian jurisdiction, data governed under Indian law, and 180-day log retention included by default.

A Practical DR Checklist

Before the next infrastructure review, confirm the following are in place. Backups are stored in a separate cloud account from primary infrastructure with no shared trust relationship. Cross-region replication is configured for all critical data. Object Lock or versioning is enabled on all backup buckets. RTO and RPO have been formally defined and agreed with business leadership. A restore runbook exists, is documented, and is accessible offline without requiring access to the systems being recovered. Restore tests have been run within the past 90 days with results recorded. A DR environment at minimum Pilot Light tier exists and has been tested under realistic conditions. Regulatory DR requirements specific to the organisation's sector and operating jurisdictions have been reviewed and mapped to the current architecture.

Related articles