Disaster Recovery Architecture Using Object Storage: RTO, RPO, and Backup Strategy

For CTOs, DevOps leads, and engineering heads at Indian businesses responsible for business continuity and infrastructure resilience.

RTO and RPO: The Two Numbers That Define Your DR Strategy

Before choosing a DR architecture, a business must answer two questions with specific numbers.

Recovery Time Objective (RTO) — how long can the business operate without the affected system? An e-commerce platform that loses its order management system has an RTO measured in hours. A payment processing platform has an RTO measured in minutes. A content archive has an RTO measured in days. The RTO determines how much you need to invest in standby infrastructure and automation.

Recovery Point Objective (RPO) — how much data can the business afford to lose? An RPO of zero means no data loss is acceptable — every transaction must be committed to a secondary location before it is confirmed to the user. An RPO of 1 hour means the business can recover from a failure by restoring the state from up to 1 hour ago. An RPO of 24 hours means daily backups are sufficient.

RPO drives backup frequency. RTO drives recovery automation. Together they determine the DR architecture and its cost.

Most Indian SMBs and startups have realistic RTOs of 4–24 hours and RPOs of 1–24 hours. This tier, usually called Backup and Restore (the tier below Pilot Light and Warm Standby, both of which keep some infrastructure running), is achievable with automated backups to object storage and documented recovery runbooks, without the cost of standby infrastructure.
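The two objectives become concrete once they are measured for a single incident. The sketch below is illustrative only: the `Objectives` and `measure_incident` names and the example timestamps are assumptions, not part of any tool mentioned in this guide. It computes the data-loss window (failure time minus last recoverable point) and the downtime (restore time minus failure time), then compares each with its target.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Objectives:
    rto: timedelta  # maximum tolerable downtime
    rpo: timedelta  # maximum tolerable data-loss window


def measure_incident(objectives, last_recovery_point, failed_at, restored_at):
    """Return the achieved data loss and downtime, and whether each met its target."""
    data_loss = failed_at - last_recovery_point
    downtime = restored_at - failed_at
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= objectives.rpo,
        "rto_met": downtime <= objectives.rto,
    }


# Hypothetical incident: last backup at 10:00, failure at 10:40, service back at 13:10.
targets = Objectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
report = measure_incident(
    targets,
    last_recovery_point=datetime(2024, 1, 15, 10, 0),
    failed_at=datetime(2024, 1, 15, 10, 40),
    restored_at=datetime(2024, 1, 15, 13, 10),
)
```

In this made-up incident 40 minutes of data is lost and the service is down for 2.5 hours, so both a 1-hour RPO and a 4-hour RTO are met.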

Object Storage as the DR Foundation

Object storage is the ideal backup target for DR because it provides three things that other backup targets do not combine:

Durability: IBEE's Tier 4 infrastructure stores data with 11 nines (99.999999999%) durability, so data written to IBEE is designed to survive individual hardware failures. The same data on an attached disk or a single-server NFS mount is one hardware failure away from loss.

Accessibility: backups stored in object storage are accessible from any server, in any data centre, using standard HTTP — without mounting a volume, without VPN, without special drivers. Recovery starts by provisioning a new server and pointing it at the backup bucket.

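Cost: at Rs.1.50/GB-month, long-retention backup storage on IBEE is economically feasible. A 500 GB database backup costs Rs.750 per month to store, so holding one copy for 12 months costs Rs.9,000 in total; keeping twelve monthly 500 GB copies at once costs Rs.9,000 per month. Either figure is a fraction of the cost of the business continuity event it exists to prevent.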
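The arithmetic behind a storage estimate like this is simple enough to script. The sketch below assumes the Rs.1.50/GB-month rate quoted above, full (non-incremental, non-deduplicated) copies, and no request or egress charges; the function name is made up for the example.

```python
RATE_RS_PER_GB_MONTH = 1.50  # storage rate quoted in the text


def monthly_storage_cost(backup_size_gb: float, copies_retained: int,
                         rate: float = RATE_RS_PER_GB_MONTH) -> float:
    """Monthly bill in rupees for keeping `copies_retained` full copies."""
    return backup_size_gb * copies_retained * rate


one_copy = monthly_storage_cost(500, 1)        # a single 500 GB backup
twelve_copies = monthly_storage_cost(500, 12)  # twelve monthly full backups kept at once
one_copy_for_a_year = one_copy * 12            # total paid to hold one copy for 12 months
```

Incremental or deduplicating tools such as Restic usually store far less than a full copy per backup, so treat these figures as an upper bound.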

The Three-Layer Backup Architecture

Layer 1 — Database Backups (Object Storage)

Application databases — PostgreSQL, MySQL, MongoDB — generate the most critical and irreplaceable data in most businesses. Database backups to object storage are covered in detail in our database backup guide, using pgBackRest for PostgreSQL and Restic for general workloads.

For DR purposes, the critical configuration is WAL archiving for PostgreSQL (which enables point-in-time recovery to any moment covered by the archive, not just to backup snapshots) and a backup frequency that matches the RPO. For an RPO of 4 hours, backups must run at least every 4 hours or WAL archiving must be continuous. Retention is a separate decision: it determines how far back you can restore, not how much recent data you can lose.
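The relationship between schedule and RPO can be checked mechanically. This is a rough sketch under simplifying assumptions: it ignores how long a backup takes to run and upload, and it models continuous WAL archiving with a single number, the longest time an unfinished WAL segment waits before being archived (in PostgreSQL this is bounded by the `archive_timeout` setting). The function names are hypothetical.

```python
from datetime import timedelta
from typing import Optional


def worst_case_data_loss(backup_interval: timedelta,
                         wal_archive_timeout: Optional[timedelta] = None) -> timedelta:
    """Upper bound on lost data if the primary fails just before the next copy is taken.

    Without WAL archiving the exposure is the full gap between backups.
    With it, the exposure shrinks to the wait before a WAL segment is archived.
    """
    if wal_archive_timeout is not None:
        return min(wal_archive_timeout, backup_interval)
    return backup_interval


def satisfies_rpo(rpo: timedelta, backup_interval: timedelta,
                  wal_archive_timeout: Optional[timedelta] = None) -> bool:
    """True when the worst-case exposure fits inside the RPO."""
    return worst_case_data_loss(backup_interval, wal_archive_timeout) <= rpo
```

For example, a nightly backup alone fails a 4-hour RPO, while the same nightly backup combined with WAL archiving on a 60-second timeout passes it.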

Layer 2 — Application Data Backups (Object Storage Sync)

This layer covers user-uploaded files, generated documents, media assets, and any other application data that lives in object storage or on server filesystems. If this data already lives in IBEE, the DR strategy is cross-bucket replication: syncing the primary bucket's contents to a secondary bucket with rclone on a schedule.
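A minimal sketch of such a sync, wrapped in Python so it can be called from a scheduler. The remote name `ibee` and the bucket names in the usage note are placeholders for whatever is defined in your rclone configuration; `rclone sync`, `--checksum`, `--transfers`, `--dry-run`, `--log-file`, and `--log-level` are standard rclone options, but the specific values chosen here are assumptions.

```python
import subprocess
from typing import List, Optional


def build_sync_command(source: str, dest: str, *, dry_run: bool = False,
                       log_file: Optional[str] = None) -> List[str]:
    """Build an rclone command that makes `dest` match `source`."""
    cmd = ["rclone", "sync", source, dest, "--checksum", "--transfers", "8"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without changing it
    if log_file:
        cmd += ["--log-file", log_file, "--log-level", "INFO"]
    return cmd


def run_sync(source: str, dest: str, **options) -> int:
    """Run the sync and return rclone's exit code (0 means success)."""
    return subprocess.run(build_sync_command(source, dest, **options), check=False).returncode
```

A scheduled job might call `run_sync("ibee:app-uploads", "ibee:app-uploads-dr", log_file="/var/log/dr-sync.log")` and alert on a non-zero exit code. Note that `rclone sync` also deletes files from the destination when they disappear from the source, so an accidental deletion propagates on the next run; a `dry_run=True` pass shows what would change first.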

Run the sync as a scheduled job (cron or a dedicated pipeline step). The sync is incremental: only new and changed files are transferred on subsequent runs. For an RPO of 1 hour, run the sync at least hourly.

Layer 3 — Configuration and Infrastructure Backups

This layer covers Nginx configuration, application environment files, Terraform state, database schemas, and deployment scripts: the configuration that tells a new server how to run the application. Store these in a dedicated, versioned IBEE bucket with access limited to the on-call team.

A complete configuration backup means recovery from a total infrastructure loss requires only: provisioning new servers, restoring database from backup, pulling configuration from IBEE, and running the deployment script. No tribal knowledge required.
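One way to capture this layer is to pack the relevant files into a timestamped archive before uploading it to the configuration bucket. The sketch below uses only the Python standard library; the function name, the archive naming scheme, and the example paths in the usage note are assumptions for illustration. The upload itself (for example with rclone) is left out.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable, Optional


def bundle_config(paths: Iterable[str], out_dir: str,
                  now: Optional[datetime] = None) -> Path:
    """Pack the given files or directories into a timestamped .tar.gz and return its path."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    out_path = Path(out_dir) / f"config-{stamp}.tar.gz"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(out_path, "w:gz") as archive:
        for item in paths:
            source = Path(item)
            if source.exists():  # skip paths that are absent on this host
                # Store paths relative to the filesystem root so they restore predictably.
                archive.add(source, arcname=source.as_posix().lstrip("/"))
    return out_path
```

A nightly job could call `bundle_config(["/etc/nginx", "/srv/app/.env"], "/var/backups/config")` and then upload the result. Environment files usually contain secrets, which is why the bucket's access should stay restricted as described above.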

Cross-Provider DR Replication

For businesses where a provider-level failure is a DR scenario — unlikely at Tier 4 reliability, but relevant for regulated businesses that require geographic and provider diversity — IBEE's S3 compatibility enables cross-provider replication using rclone.

Configure rclone with two remotes, one for IBEE and one for a secondary provider (another S3-compatible provider, or AWS S3 in a different region), and run a daily or hourly sync from the first to the second.
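As a sketch of that setup, rclone remotes can be defined entirely through `RCLONE_CONFIG_<NAME>_*` environment variables instead of a config file, which suits scheduled jobs. Everything specific below is a placeholder: the remote names, the endpoint URL, the bucket names, and the credentials. The `Other` provider value is rclone's generic setting for S3-compatible services; whether it is the right choice for IBEE is an assumption.

```python
from typing import Dict, List, Optional


def s3_remote_env(name: str, provider: str, access_key: str, secret_key: str,
                  endpoint: Optional[str] = None,
                  region: Optional[str] = None) -> Dict[str, str]:
    """Environment variables that define an rclone S3 remote called `name`."""
    prefix = f"RCLONE_CONFIG_{name.upper()}_"
    env = {
        prefix + "TYPE": "s3",
        prefix + "PROVIDER": provider,
        prefix + "ACCESS_KEY_ID": access_key,
        prefix + "SECRET_ACCESS_KEY": secret_key,
    }
    if endpoint:
        env[prefix + "ENDPOINT"] = endpoint
    if region:
        env[prefix + "REGION"] = region
    return env


def cross_provider_sync_command(src_remote: str, src_bucket: str,
                                dst_remote: str, dst_bucket: str) -> List[str]:
    """rclone command that copies the primary bucket's state to the secondary provider."""
    return ["rclone", "sync", f"{src_remote}:{src_bucket}",
            f"{dst_remote}:{dst_bucket}", "--checksum"]
```

To run it, merge both remotes' variables into the process environment, for example `subprocess.run(cmd, env={**os.environ, **primary_env, **secondary_env})`, and load the keys from a secrets store rather than hard-coding them.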

The secondary bucket holds a point-in-time copy of the primary, updated on every sync run. In a DR scenario, update the application configuration to point at the secondary bucket and recovery proceeds from there.
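Before relying on the secondary copy, it is worth confirming that it actually matches the primary. One hedged way to do that is `rclone check`, which compares the two sides and exits non-zero when files are missing or differ; `--one-way` limits the comparison to files present on the primary. The helper names and the injectable `run` parameter below are assumptions made so the logic can be exercised without rclone installed.

```python
import subprocess
from typing import Callable, List


def build_check_command(primary: str, secondary: str) -> List[str]:
    """rclone command that verifies every primary file exists unchanged on the secondary."""
    return ["rclone", "check", primary, secondary, "--one-way"]


def _run(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def secondary_is_current(primary: str, secondary: str,
                         run: Callable[[List[str]], int] = _run) -> bool:
    """True when rclone reports no missing or differing files (exit code 0)."""
    return run(build_check_command(primary, secondary)) == 0
```

Running this after each sync, and alerting when it returns `False`, catches a silently failing replication job long before a DR event does.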

The DR Runbook

A DR plan without a tested runbook is not a DR plan — it is a hope. The runbook is a step-by-step document that any on-call engineer can follow to execute recovery without specialised knowledge of the system internals.

A minimal runbook for a web application with database and object storage:

Step 1 — Assess the failure. Identify which component has failed (database server, application server, network, storage provider). Determine whether this is a recoverable incident (restart the failed component) or a DR scenario (the component cannot be recovered quickly enough to meet RTO).

Step 2 — Provision replacement infrastructure. Spin up a new server instance. Apply the saved server configuration from the IBEE configuration backup bucket. Install required packages using the recorded package list.

Step 3 — Restore the database. Identify the most recent backup in IBEE that satisfies the RPO. Run the restore command (pgBackRest, Restic restore, or equivalent). Verify database integrity after restore.

Step 4 — Restore object storage access. Update the application configuration with storage credentials. Verify read and write access to the primary or DR storage bucket.

Step 5 — Deploy the application. Run the documented deployment script. Verify the application starts correctly and passes health checks.

Step 6 — Update DNS. Point the application domain to the new server IP. Verify end-to-end user access.

Step 7 — Validate. Run through the critical user flows manually. Check error logs for unexpected failures.
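Step 3 asks the engineer to find the most recent backup that satisfies the RPO. That choice can be sketched as a small function, assuming the backup timestamps have already been listed from the bucket (how they are listed depends on the backup tool and is not shown). The function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def pick_restore_point(backup_times: Iterable[datetime], failed_at: datetime,
                       rpo: timedelta) -> Optional[datetime]:
    """Newest backup taken at or before the failure, or None if it is older than the RPO allows."""
    candidates = [taken for taken in backup_times if taken <= failed_at]
    if not candidates:
        return None
    newest = max(candidates)
    return newest if failed_at - newest <= rpo else None
```

A `None` result means no backup meets the RPO. The restore should still proceed from the newest available backup, but the gap is an incident finding to record and escalate.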

Total estimated time for this runbook with a tested team and pre-built server images: 1–3 hours. This is a comfortable RTO for most Indian businesses outside of financial services and critical infrastructure.
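An estimate like this is easier to defend when it is built from per-step figures. The per-step minutes below are invented for illustration and are not measurements; replace them with timings from your own drills. The sketch sums the low and high estimates for the seven steps and checks the pessimistic total against an RTO.

```python
from datetime import timedelta
from typing import Dict, Tuple

# Hypothetical (low, high) estimates in minutes for each runbook step.
STEP_ESTIMATES: Dict[str, Tuple[int, int]] = {
    "assess": (5, 15),
    "provision": (10, 30),
    "restore_database": (20, 60),
    "restore_storage_access": (5, 10),
    "deploy": (10, 25),
    "update_dns": (5, 20),
    "validate": (10, 20),
}


def recovery_window(estimates: Dict[str, Tuple[int, int]]) -> Tuple[timedelta, timedelta]:
    """Best-case and worst-case total recovery time, assuming steps run one after another."""
    low = sum(lo for lo, _ in estimates.values())
    high = sum(hi for _, hi in estimates.values())
    return timedelta(minutes=low), timedelta(minutes=high)


def fits_rto(estimates: Dict[str, Tuple[int, int]], rto: timedelta) -> bool:
    """True when even the worst-case total stays within the RTO."""
    return recovery_window(estimates)[1] <= rto
```

With these placeholder figures the window runs from 1 hour 5 minutes to 3 hours, which fits a 4-hour RTO but not a 2-hour one. DNS propagation depends on record TTLs, so the `update_dns` figure in particular deserves a real measurement.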

DR Testing

A DR plan that has never been tested has unknown reliability. Test the complete runbook quarterly — not just the backup integrity check, but the complete recovery from a cold start.

The test procedure: provision a fresh server, follow the runbook exactly from step 1, and measure the time to full recovery. Any step that depends on tribal knowledge, undocumented tools, or manual intervention not written in the runbook is a gap to fix before the next test.
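Measuring a drill is more reliable when the timing is recorded by a script rather than by someone watching a clock. The sketch below is one possible harness, not a prescribed tool: each runbook step is represented by a hypothetical Python callable, steps run in order, and the run stops at the first step that raises an exception.

```python
import time
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_drill(steps: List[Step],
              clock: Callable[[], float] = time.monotonic) -> List[Tuple[str, float, bool]]:
    """Run steps in order and return (name, seconds taken, succeeded) for each one attempted."""
    results = []
    for name, action in steps:
        started = clock()
        try:
            action()
            succeeded = True
        except Exception:
            succeeded = False
        results.append((name, clock() - started, succeeded))
        if not succeeded:
            break  # later steps depend on this one, so stop here
    return results


def total_seconds(results: List[Tuple[str, float, bool]]) -> float:
    """Elapsed time across all attempted steps."""
    return sum(seconds for _, seconds, _ in results)
```

Comparing `total_seconds` across quarterly drills shows whether recovery is getting faster or drifting away from the RTO, and a failed step points directly at the part of the runbook that needs work.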

IBEE's Tier 4 uptime of 99.995% means real incidents are rare, so scheduled drills are the main way the runbook gets exercised. Test in a staging environment to avoid any production impact.
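To put the uptime figure in perspective, it can be converted into an annual downtime allowance. This is plain arithmetic on the 99.995% figure quoted above, assuming a 365-day year; the function name is made up for the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent: float,
                             period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Minutes of downtime permitted in the period at the given uptime percentage."""
    return period_minutes * (100.0 - uptime_percent) / 100.0


tier4_minutes = allowed_downtime_minutes(99.995)  # roughly 26 minutes per year
```

At about 26 minutes of expected downtime per year, a team could go years without a real outage to practise on, which is the case for scheduled drills.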
