Goodworld's Disaster Recovery Plan

Learn the details of Goodworld's AWS multi-region disaster recovery plan

Written by Richie Kendall
Updated this week

Executive Summary

This document describes Goodworld's disaster recovery (DR) strategy, which is designed to maintain business continuity if an AWS region fails. Our architecture is deployed across multiple AWS regions, with automated failover between them and cross-region replication that limits potential data loss.

Recovery Objectives

  • Recovery Time Objective (RTO): 1 hour

  • Recovery Point Objective (RPO): 15 minutes

These objectives reflect our commitment to maintaining business continuity while minimizing potential data loss in disaster scenarios.
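
As an illustration of how these two objectives could be checked after an incident, here is a minimal Python sketch. The function name `meets_objectives` and its three timestamp parameters are hypothetical and not part of Goodworld's tooling; only the 1 hour and 15 minute values come from this plan.

```python
from datetime import datetime, timedelta

RTO = timedelta(hours=1)     # Recovery Time Objective stated in this plan
RPO = timedelta(minutes=15)  # Recovery Point Objective stated in this plan


def meets_objectives(outage_start, service_restored, last_replicated_write):
    """Return (rto_met, rpo_met) for a single incident.

    All three arguments are datetimes. The names are illustrative only.
    """
    downtime = service_restored - outage_start
    data_loss_window = outage_start - last_replicated_write
    return downtime <= RTO, data_loss_window <= RPO
```

For example, an outage that begins at 12:00, is resolved at 12:45, and follows a last replicated write at 11:50 would meet both objectives (45 minutes of downtime, 10 minutes of potential data loss).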

Infrastructure Overview

Geographic Distribution

Our infrastructure is strategically distributed across multiple AWS regions:

  • Primary Region: US-East-1 (Virginia)

  • DR Region: US-West-2 (Oregon)

  • Content Delivery: Global CloudFront distribution network

  • Data Replication: Cross-region synchronization between primary and DR regions

Architecture Components

Database Infrastructure

  1. MongoDB Atlas Global Clusters

    • Active-active configuration across regions

    • Automated failover capability

    • Maximum data loss window: 15 minutes

    • Continuous replication between regions

  2. Neo4j Deployment

    • Active-passive configuration

    • Synchronized replica maintained in DR region

    • Automated promotion of DR instance during failover
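
One way to keep replication inside the 15 minute data loss window is to alert well before the limit is reached. The sketch below is an assumption about how such a check might look, not a description of Goodworld's monitoring: the `classify_lag` function and the 50% warning fraction are hypothetical, and the lag value would have to come from whatever replication metrics the database platform exposes.

```python
from datetime import timedelta

MAX_DATA_LOSS = timedelta(minutes=15)  # maximum data loss window from this plan
WARN_FRACTION = 0.5                    # assumption: warn at half of the window


def classify_lag(replication_lag):
    """Classify a cross-region replication lag as 'ok', 'warning', or 'breach'."""
    if replication_lag > MAX_DATA_LOSS:
        return "breach"
    if replication_lag > MAX_DATA_LOSS * WARN_FRACTION:
        return "warning"
    return "ok"
```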

Application Infrastructure

  1. Container Orchestration

    • ECS clusters maintained in both regions

    • Blue-green deployment capability

    • Automated health checks and failover

    • Container images replicated to both regions

  2. Content Delivery

    • CloudFront origins in both regions

    • Automatic failover configuration

    • Global edge location utilization

    • DNS-based routing with health checks
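
The routing decision behind DNS-based failover can be pictured as a simple preference order over region health. The following sketch only models that decision in plain Python; it does not call any AWS API, and the `resolve_region` function and the shape of the `health` mapping are assumptions made for illustration.

```python
PRIMARY_REGION = "us-east-1"  # primary region named in this plan
DR_REGION = "us-west-2"       # DR region named in this plan


def resolve_region(health):
    """Pick the region that should receive traffic.

    `health` maps a region name to True (healthy) or False (unhealthy).
    A region missing from the mapping is treated as unhealthy.
    Returns None when neither region is healthy.
    """
    for region in (PRIMARY_REGION, DR_REGION):
        if health.get(region, False):
            return region
    return None
```

Preferring the primary region whenever it is healthy matches the recovery process in this plan, where traffic is restored to the primary region once it is available again.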

Failover Strategy

Failover Triggers

  • Regional AWS health check failures

  • Application performance degradation beyond thresholds

  • Manual activation by authorized personnel
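
The triggers listed above could be evaluated together along the following lines. This is a sketch under stated assumptions: the `RegionSignals` fields, the threshold of three consecutive failures, and the 2000 ms latency limit are hypothetical placeholders, since this plan does not publish its actual thresholds.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 3        # assumption: consecutive failed health checks
LATENCY_THRESHOLD_MS = 2000  # assumption: p95 latency limit in milliseconds


@dataclass
class RegionSignals:
    consecutive_health_failures: int
    p95_latency_ms: float
    manual_override: bool = False


def failover_reasons(signals):
    """Return the list of triggers that currently call for failover."""
    reasons = []
    if signals.consecutive_health_failures >= FAILURE_THRESHOLD:
        reasons.append("health_check_failures")
    if signals.p95_latency_ms > LATENCY_THRESHOLD_MS:
        reasons.append("performance_degradation")
    if signals.manual_override:
        reasons.append("manual_activation")
    return reasons
```

An empty list means no failover is needed. Returning the reasons rather than a bare boolean keeps a record of why a failover started, which is useful for the post-incident review.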

Failover Process

  1. Health check failure detection

  2. DNS routing update to DR region

  3. Promotion of DR database instances

  4. ECS service activation in DR region

  5. CloudFront origin update

  6. Traffic routing to DR infrastructure
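
The six steps above run in a fixed order, and a later step should not start if an earlier one fails. A minimal sketch of that sequencing follows. The step identifiers and the `run_failover` function are illustrative names, and each handler stands in for real automation that this document does not describe.

```python
FAILOVER_STEPS = [
    "detect_health_check_failure",
    "update_dns_to_dr",
    "promote_dr_databases",
    "activate_ecs_services_in_dr",
    "update_cloudfront_origin",
    "route_traffic_to_dr",
]


def run_failover(handlers):
    """Run each step in order and stop at the first failure.

    `handlers` maps a step name to a zero-argument callable that returns
    True on success. Returns (completed_steps, failed_step), where
    failed_step is None if every step succeeded.
    """
    completed = []
    for step in FAILOVER_STEPS:
        if not handlers[step]():
            return completed, step
        completed.append(step)
    return completed, None
```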

Recovery Process

  1. Assessment of primary region status

  2. Data integrity verification

  3. Replication catch-up

  4. Traffic restoration to primary region

  5. Verification of normal operations
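
Steps 1 to 3 act as preconditions for restoring traffic in step 4. The sketch below expresses that gate as a function that lists whatever still blocks failback. The function name and parameters are hypothetical, and treating "replication catch-up" as zero remaining lag is an assumption made for this example.

```python
from datetime import timedelta


def failback_blockers(primary_healthy, integrity_verified, replication_lag):
    """Return the reasons traffic should not yet return to the primary region."""
    blockers = []
    if not primary_healthy:
        blockers.append("primary region not healthy")
    if not integrity_verified:
        blockers.append("data integrity not verified")
    if replication_lag > timedelta(0):
        blockers.append("replication has not caught up")
    return blockers
```

Traffic would be restored only when the returned list is empty.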

Testing and Maintenance

Regular Testing Schedule

  • Quarterly failover drills

  • Monthly backup restoration tests

  • Continuous monitoring and alerting verification

Documentation and Updates

  • DR plan review every quarter

  • Update after major infrastructure changes

  • Incident post-mortem incorporation

  • Team training and procedure updates

Communication Plan

Notification Protocol

  1. Initial incident detection and assessment

  2. Stakeholder notification

  3. Regular status updates

  4. Resolution confirmation

  5. Post-incident review

Contact Matrix

  • Primary DR Coordinator

  • Technical Team Leads

  • Executive Management

  • External Dependencies (AWS Support)

Recovery Verification

Success Criteria

  • All critical services operational

  • Data integrity verified

  • Performance metrics within baseline

  • External connectivity confirmed

  • Security controls validated
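
The criterion "performance metrics within baseline" needs a tolerance to be checkable in practice. Here is one possible sketch, assuming metrics where lower is better (such as latency or error rate) and a hypothetical 10% tolerance. Neither the tolerance nor the metric names used in the example come from this plan.

```python
TOLERANCE = 0.10  # assumption: allow up to 10% above the baseline value


def metrics_outside_baseline(baseline, current, tolerance=TOLERANCE):
    """Return the sorted names of metrics that exceed baseline by more than tolerance.

    Both arguments map a metric name to a number where lower is better.
    A metric missing from `current` is reported as outside the baseline.
    """
    return sorted(
        name
        for name, base in baseline.items()
        if current.get(name, float("inf")) > base * (1 + tolerance)
    )
```

For example, with a baseline of 200 ms p95 latency and a 1% error rate, a post-recovery reading of 210 ms and 2% would flag only the error rate.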

Post-Recovery Tasks

  1. System health verification

  2. Data consistency checks

  3. Performance baseline comparison

  4. Security audit

  5. Documentation update

Plan Maintenance

This plan is maintained under version control and updated quarterly or upon significant infrastructure changes. All updates are reviewed and approved by the Technical Operations team and Executive Management.

Version History

  • Current Version: 1.0

  • Last Updated: January 2025

  • Next Review: April 2025

Appendix

Critical Dependencies

  • AWS Infrastructure

  • MongoDB Atlas

  • Neo4j Enterprise

  • CloudFront CDN

  • Internal monitoring systems

Reference Documentation

  • AWS Regional Failover Guide

  • MongoDB Atlas Disaster Recovery Documentation

  • Neo4j High Availability Configuration Guide

  • Internal Runbooks and Procedures
