For IT heads, CTOs, and infrastructure engineers at Indian businesses that need documented, tested disaster recovery plans for cloud workloads.
The Backup That Wasn't a Recovery Plan
A mid-size logistics company in Pune. Their IT team had been diligent about backups — daily snapshots of their database, weekly full backups to cloud storage, retention going back 90 days. When ransomware encrypted their primary systems in the early hours of a Monday morning, their initial assessment was that the situation was recoverable.
It took them four days to get back online. Not because the backups were missing. The backups were there. It took four days because:

- Nobody had ever tested the restore process.
- The backup storage was in the same cloud account as the primary infrastructure, and the attacker had compromised the account.
- The restore procedure for their database server had never been documented.
- There was no secondary environment to restore into — the team spent two days provisioning infrastructure before they could even begin restoring data.
The backups existed. The disaster recovery plan did not. That is the gap.
RTO and RPO: The Two Numbers That Define Your DR Requirements
Before any architecture decision, disaster recovery planning starts with two numbers.
Recovery Time Objective (RTO) — How long can your business be offline before the damage becomes unacceptable? For some businesses, the answer is four hours. For a payment processor during business hours, the answer might be fifteen minutes. For a government portal, it might be twenty-four hours. RTO is a business decision, not a technical one.
Recovery Point Objective (RPO) — How much data loss can you tolerate? If your database is backed up every 24 hours and your RPO is 24 hours, that's consistent. If your RPO is one hour but you're only taking daily backups, there's a gap between your tolerance and your actual recovery capability.
These two numbers drive every subsequent architecture choice. A four-hour RTO with a one-hour RPO requires different infrastructure than a 24-hour RTO with a 24-hour RPO.
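The consistency check between RPO and backup frequency can be expressed directly. This is a minimal sketch — the function name and the worst-case-loss assumption (one full backup interval) are ours, not from any particular tool:

```python
from datetime import timedelta

def rpo_gap(rpo: timedelta, backup_interval: timedelta) -> timedelta:
    """Return how far the backup schedule falls short of the stated RPO.

    Worst-case data loss is assumed to be one full backup interval.
    A positive result means actual recovery capability does not meet
    the tolerance the business has agreed to.
    """
    return backup_interval - rpo

# Daily backups against a 24-hour RPO: consistent, no gap.
assert rpo_gap(timedelta(hours=24), timedelta(hours=24)) <= timedelta(0)

# Daily backups against a 1-hour RPO: a 23-hour gap.
print(rpo_gap(timedelta(hours=1), timedelta(hours=24)))  # 23:00:00
```

Running this check against every critical system's declared RPO is a quick way to surface the tolerance-versus-capability gaps described above.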
The Three Tiers of Disaster Recovery Architecture
Tier 1: Backup and Restore
Your data is backed up to a secondary location. In a disaster, you provision new infrastructure and restore from backup.
RTO: Hours to days. Depends entirely on how fast you can provision and restore.
RPO: Determined by backup frequency. Daily backups = up to 24 hours of data loss.
Cost: Low. You're paying for storage only, not for a running secondary environment.
Best for: Non-time-critical systems, development environments, archival workloads.
Tier 2: Pilot Light
A minimal version of your infrastructure is always running in a secondary environment — database replication is active, configuration is maintained, but compute is at minimum capacity. In a disaster, you scale up the secondary environment.
RTO: 30 minutes to a few hours, depending on how much needs to be scaled up.
RPO: Near-zero for databases with active replication. Determined by replication lag.
Cost: Moderate. You pay for the minimal running environment plus storage.
Best for: Most production business applications — the balance point between cost and recovery speed.
Tier 3: Warm Standby / Active-Active
A full-scale secondary environment runs continuously. In warm standby it is kept fully synchronised but idle until failover; in active-active it also handles a share of live traffic. In a disaster, all traffic shifts to the secondary.
RTO: Minutes or seconds.
RPO: Near-zero.
Cost: High. You're running two full environments simultaneously.
Best for: Mission-critical systems where any downtime is unacceptable — payment processors, trading platforms, health infrastructure.
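The three tiers above can be read as a decision function over RTO and RPO. The thresholds below are illustrative — the tier boundaries described above are ranges, not hard cut-offs, and the right answer also depends on budget:

```python
from datetime import timedelta

def suggest_dr_tier(rto: timedelta, rpo: timedelta) -> str:
    """Map agreed RTO/RPO targets to the cheapest tier that can plausibly
    meet them. Thresholds are illustrative, not prescriptive."""
    if rto <= timedelta(minutes=15) or rpo <= timedelta(minutes=1):
        return "warm standby / active-active"
    if rto <= timedelta(hours=4) or rpo <= timedelta(hours=1):
        return "pilot light"
    return "backup and restore"

print(suggest_dr_tier(timedelta(hours=24), timedelta(hours=24)))  # backup and restore
print(suggest_dr_tier(timedelta(hours=4), timedelta(hours=1)))    # pilot light
print(suggest_dr_tier(timedelta(minutes=15), timedelta(0)))       # warm standby / active-active
```

The point is not the specific numbers but the direction of the mapping: tighter RTO/RPO targets force you up the tiers, and each step up roughly doubles the running cost.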
Building Your DR Architecture on Cloud Object Storage
Cloud object storage is the foundation layer for all three DR tiers, and the choices you make here determine whether the rest of the plan works.
Cross-Region Replication
Store your backups in a geographically separate location from your primary data. If a data centre outage, natural disaster, or regional network failure affects your primary location, your backup should be in a location with no shared infrastructure dependencies.
IBEE's object storage supports cross-region replication, allowing you to maintain a continuously updated copy of your data in a secondary region automatically.
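On S3-compatible storage, cross-region replication is expressed as a bucket-level replication configuration. The sketch below shows the shape of that configuration only — the bucket names and IAM role ARN are hypothetical, and the exact API on any given platform (including IBEE's) should be confirmed against its documentation:

```python
# Hypothetical names throughout; treat this as the shape of an
# S3-compatible replication rule, not a copy-paste command.
REPLICATION_CONFIG = {
    "Role": "arn:aws:iam::111111111111:role/dr-replication",  # hypothetical role
    "Rules": [
        {
            "ID": "replicate-critical-data",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {"Prefix": ""},  # empty prefix: replicate every object
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {
                # Secondary-region bucket, ideally in a separate account
                "Bucket": "arn:aws:s3:::dr-backups-secondary-region",
            },
        }
    ],
}

# With boto3 against an S3-compatible endpoint, this would be applied as:
#   s3.put_bucket_replication(Bucket="primary-data",
#                             ReplicationConfiguration=REPLICATION_CONFIG)
```

Note that replication requires versioning to be enabled on both buckets, and that disabling delete-marker replication (as above) means deletions in the primary do not automatically propagate to the DR copy — usually the behaviour you want for backups.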
Immutable Backups and Object Lock
Ransomware attacks increasingly target backup storage as part of the attack strategy — encrypt or delete the backups first, then encrypt the primary data. Object Lock (also called WORM — Write Once Read Many) prevents any process from deleting or overwriting backup objects for a defined retention period, even if the account credentials are compromised.
For ransomware protection, immutable backups are not optional. A backup that can be deleted by an attacker with your cloud credentials is not a backup in any meaningful sense.
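On S3-compatible storage, this is a default retention rule applied at the bucket level. The sketch below shows the configuration shape; the 35-day period is illustrative, and Object Lock itself must be enabled at bucket creation time:

```python
# Illustrative values; Object Lock must be enabled when the bucket is
# created, after which this default retention applies to new objects.
OBJECT_LOCK_CONFIG = {
    "ObjectLockEnabled": "Enabled",
    "Rule": {
        "DefaultRetention": {
            # COMPLIANCE mode: nobody, including the account root user,
            # can shorten the retention period or delete the object early.
            # GOVERNANCE mode allows privileged override and is weaker
            # against a credential compromise.
            "Mode": "COMPLIANCE",
            "Days": 35,
        }
    },
}

# With boto3 this would be applied as:
#   s3.put_object_lock_configuration(Bucket="dr-backups",
#                                    ObjectLockConfiguration=OBJECT_LOCK_CONFIG)
```

Size the retention period to cover your longest realistic detection window: 35 days, for example, protects daily backups even if a ransomware intrusion goes unnoticed for a month.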
Versioning as a Ransomware Defence
Even without Object Lock, bucket versioning provides a recovery path for ransomware events. When ransomware overwrites files with encrypted versions, the previous unencrypted versions are retained as previous object versions, recoverable by anyone with appropriate access.
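The recovery step amounts to: for each object, find the newest version written before the attack began and restore it. A minimal sketch of that selection logic, operating on version records shaped like the `Versions` list an S3-compatible `list_object_versions` call returns (the data here is invented for illustration):

```python
from datetime import datetime

def last_clean_version(versions, attack_started):
    """Pick the newest object version written before the attack began.

    `versions` mirrors the 'Versions' list from an S3-compatible
    list_object_versions response: dicts with at least 'VersionId'
    and 'LastModified'. Returns None if no pre-attack version exists.
    """
    clean = [v for v in versions if v["LastModified"] < attack_started]
    if not clean:
        return None
    return max(clean, key=lambda v: v["LastModified"])

# Illustrative version history: v3 is the ransomware-encrypted overwrite.
versions = [
    {"VersionId": "v3", "LastModified": datetime(2024, 6, 3, 2, 15)},
    {"VersionId": "v2", "LastModified": datetime(2024, 6, 2, 23, 0)},
    {"VersionId": "v1", "LastModified": datetime(2024, 6, 1, 23, 0)},
]
attack = datetime(2024, 6, 3, 1, 0)
print(last_clean_version(versions, attack)["VersionId"])  # v2
```

In practice the restore would then copy each selected version back over the current one (for example with `copy_object` specifying the version ID) — but the hard part during an incident is establishing the attack timestamp reliably, which is why versioning complements rather than replaces Object Lock.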
Separate Cloud Account for DR Storage
Your DR storage should live in a completely separate cloud account from your primary infrastructure — ideally with different credentials, different IAM configuration, and no trust relationship to the primary account. This prevents an account-level compromise from destroying both primary and backup data simultaneously.
The Pune logistics company's fundamental mistake was keeping backups in the same account. When the attacker compromised the account, the backups were as accessible as the primary data.
The Restore Test Most Teams Skip
A backup that has never been tested is not a backup. It is an optimistic assumption.
Restore testing should be scheduled, documented, and treated as a regular operational activity:
Monthly: Restore a random selection of files from backup storage and verify integrity.
Quarterly: Full restore test of a non-production system from backup. Time the process. Document the steps. Identify anything that took longer than expected or required undocumented knowledge.
Annually: Full DR exercise — simulate a complete primary environment failure and execute the DR plan from scratch. This is the test that reveals gaps in documentation, dependencies that were assumed but not captured, and skills that only exist in one person's head.
The quarterly and annual tests also produce something equally valuable: a documented restore runbook. When your primary systems are offline at 3 AM and your lead engineer is unreachable, the person doing the restore needs documented procedures, not tribal knowledge.
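The monthly integrity check is easy to automate. A minimal sketch, comparing a restored file against the original by SHA-256 (the function names are ours; temp files stand in for real backup data):

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large backups don't load into RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore(source: Path, restored: Path) -> bool:
    """A restore test passes only if the restored file matches byte-for-byte."""
    return sha256_of(source) == sha256_of(restored)

# Illustration with temp files standing in for original and restored data:
with tempfile.TemporaryDirectory() as d:
    src = Path(d) / "orders.db"
    dst = Path(d) / "orders.restored.db"
    src.write_bytes(b"order data")
    dst.write_bytes(b"order data")
    print(verify_restore(src, dst))  # True
```

A better variant stores the hash alongside the backup at backup time, so the monthly check verifies the restored file against a recorded value rather than against a primary copy that may itself have changed since.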
DR for Regulated Industries in India
For Indian businesses in regulated sectors, disaster recovery is increasingly a compliance requirement with specific documentation standards.
SEBI — Regulated entities including stock brokers, depositories, and asset managers are required to maintain DR infrastructure and test recovery procedures, with documentation available for regulatory inspection.
RBI — Payment system operators and NBFCs are required to maintain business continuity plans with defined RTOs for critical systems.
CERT-In — The 2022 mandatory directions require organisations in critical sectors to maintain incident response plans and report major incidents within prescribed timelines. A DR plan is a prerequisite for meaningful incident response.
IRDAI — Insurance companies are required to maintain documented business continuity and DR plans as part of IT governance frameworks.
IBEE's India-sovereign infrastructure provides the jurisdiction-clean storage foundation for DR environments in regulated sectors — audit logs retained within India, data under Indian law, 180-day log retention included by default.
A Practical DR Checklist for Indian Cloud Workloads
Before your next infrastructure review, verify:
- Backups are stored in a separate cloud account from primary infrastructure
- Cross-region replication is configured for critical data
- Object Lock or versioning is enabled on backup buckets
- RTO and RPO have been formally defined and agreed with the business
- A restore runbook exists, is documented, and is accessible offline
- Restore tests have been run within the past 90 days with results recorded
- DR environment (at minimum Pilot Light) exists and has been tested
- Regulatory DR requirements have been reviewed for your specific sector
