{"id":934,"date":"2026-04-17T04:45:55","date_gmt":"2026-04-17T04:45:55","guid":{"rendered":"https:\/\/www.devopsschool.com\/tutorials\/oracle-cloud-full-stack-disaster-recovery-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-migration-and-disaster-recovery\/"},"modified":"2026-04-17T04:45:55","modified_gmt":"2026-04-17T04:45:55","slug":"oracle-cloud-full-stack-disaster-recovery-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-migration-and-disaster-recovery","status":"publish","type":"post","link":"https:\/\/www.devopsschool.com\/tutorials\/oracle-cloud-full-stack-disaster-recovery-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-migration-and-disaster-recovery\/","title":{"rendered":"Oracle Cloud Full Stack Disaster Recovery Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Migration and Disaster Recovery"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Category<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Migration and Disaster Recovery<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. 
Introduction<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What this service is<\/strong><br\/>\nOracle Cloud <strong>Full Stack Disaster Recovery<\/strong> is an Oracle Cloud Infrastructure (OCI) service designed to help you plan, automate, and orchestrate disaster recovery (DR) for an entire application stack across OCI regions (and, in some designs, across availability domains within a region\u2014verify support in official docs for your exact target architecture).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Simple explanation (one paragraph)<\/strong><br\/>\nInstead of writing large, error-prone runbooks for failover and then hoping they work during an outage, Full Stack Disaster Recovery lets you define DR \u201cprotection groups\u201d and \u201cplans\u201d so you can perform repeatable DR drills, controlled switchovers, and emergency failovers with consistent steps and clear execution tracking.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Technical explanation (one paragraph)<\/strong><br\/>\nFull Stack Disaster Recovery provides a control plane for DR orchestration: you model your application as a set of protected OCI resources (for example, compute, storage, database replication relationships, networking constructs\u2014exact supported resource types vary; verify in official docs). You then define DR plans that coordinate dependencies and execution order (stop\/start, detach\/attach, promote standby, re-IP\/re-route, etc.) and execute those plans in a controlled manner with status visibility, logging, and repeatability. 
The data plane replication is still performed by underlying OCI services (for example, database replication technologies, block volume replication, object replication), while Full Stack Disaster Recovery coordinates the overall sequence.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What problem it solves<\/strong><br\/>\nIt reduces recovery time and recovery errors by providing a structured, testable way to fail over complete stacks\u2014not just individual components\u2014so your DR posture is operationally realistic, auditable, and easier to practice.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">2. What is Full Stack Disaster Recovery?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Official purpose<\/strong><br\/>\nFull Stack Disaster Recovery is intended to orchestrate disaster recovery for multi-tier workloads running on <strong>Oracle Cloud (OCI)<\/strong> by coordinating application dependencies and DR operations between a <strong>primary<\/strong> site and a <strong>standby\/DR<\/strong> site.<\/p>\n\n\n\n<blockquote>\n<p>If Oracle has adjusted terminology or added\/removed supported member types since your last implementation, <strong>verify in official docs<\/strong> before building production runbooks.<\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Core capabilities<\/strong>\n&#8211; Define <strong>DR protection groups<\/strong> to represent the \u201capplication stack\u201d scope you want to protect.\n&#8211; Define <strong>DR plans<\/strong> (for example, DR drill, switchover, failover\u2014exact plan types\/names may vary; verify in official docs) to automate recovery actions in a consistent sequence.\n&#8211; Execute plans and track progress with a centralized orchestration layer rather than ad-hoc scripts.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Major components (conceptual model)<\/strong>\n&#8211; <strong>Protection group<\/strong>: A logical grouping of 
resources and\/or replication relationships that collectively represent an application environment.\n&#8211; <strong>Primary protection group<\/strong> and <strong>standby protection group<\/strong>: Paired groups representing the source and target sites.\n&#8211; <strong>DR plan<\/strong>: An orchestrated sequence of actions to test or perform recovery.\n&#8211; <strong>Work request \/ execution tracking<\/strong>: OCI-style asynchronous operations tracking (common in OCI services; exact naming may vary).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Service type<\/strong>\n&#8211; A managed OCI service (control plane) focused on orchestration.<br\/>\n&#8211; It generally relies on underlying OCI services for replication and infrastructure (data plane).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Scope (regional\/global\/account\/project)<\/strong>\n&#8211; In OCI terms, Full Stack Disaster Recovery is typically <strong>regional<\/strong>: you create and manage DR resources in a region, then pair primary\/standby across regions.<br\/>\n&#8211; Resource visibility and permissions are governed by <strong>tenancy<\/strong>, <strong>compartments<\/strong>, and <strong>IAM policies<\/strong>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>How it fits into the Oracle Cloud ecosystem<\/strong>\n&#8211; Works alongside OCI\u2019s foundational services:\n  &#8211; <strong>Identity and Access Management (IAM)<\/strong> for authorization\n  &#8211; <strong>Networking (VCN, subnets, routing, DNS, Load Balancing)<\/strong> for traffic steering in DR\n  &#8211; <strong>Compute &amp; Storage<\/strong> and\/or <strong>Database replication<\/strong> technologies for workload continuity\n  &#8211; <strong>Observability (Logging, Monitoring, Events, Notifications, Audit)<\/strong> for execution visibility and governance<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">3. 
Why use Full Stack Disaster Recovery?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Business reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Reduce downtime risk<\/strong>: DR becomes a practiced, repeatable process rather than a \u201cone-time document.\u201d<\/li>\n<li><strong>Improve auditability<\/strong>: You can demonstrate DR readiness through consistent drills and documented execution outcomes.<\/li>\n<li><strong>Protect revenue and trust<\/strong>: Faster and more reliable recovery reduces business impact.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Technical reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Orchestrate dependencies<\/strong>: Real DR is not just \u201crestore a database.\u201d It\u2019s DNS, routing, app tiers, storage, security rules, and sequencing.<\/li>\n<li><strong>Standardize runbooks<\/strong>: Convert tribal knowledge into repeatable DR plans.<\/li>\n<li><strong>Lower human error<\/strong>: Automation reduces missed steps during high-stress outages.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operational reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Run DR drills safely<\/strong>: Regular testing reveals gaps early.<\/li>\n<li><strong>Centralized tracking<\/strong>: Execution status is visible and reviewable.<\/li>\n<li><strong>Repeatable operations<\/strong>: Supports consistent processes across multiple application stacks.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/compliance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Separation of duties<\/strong>: DR operators can be granted narrow permissions for DR actions.<\/li>\n<li><strong>Change control alignment<\/strong>: DR plans can be reviewed and updated as code-adjacent operational artifacts (often alongside Terraform for infrastructure).<\/li>\n<li><strong>Audit trail<\/strong>: OCI Audit plus service logs support compliance evidence.<\/li>\n<\/ul>\n\n\n\n<h3 
class=\"wp-block-heading\">Scalability\/performance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Scale across multiple applications<\/strong>: You can model multiple protection groups for multiple stacks.<\/li>\n<li><strong>Design for RTO\/RPO<\/strong>: The orchestration layer supports fast execution; actual RTO\/RPO depends heavily on replication methods and capacity planning.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should choose it<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Choose Full Stack Disaster Recovery when:\n&#8211; Your application spans multiple OCI services and you need <strong>coordinated recovery<\/strong>.\n&#8211; You want <strong>regular DR drills<\/strong> with consistent outcomes.\n&#8211; You require <strong>region-to-region<\/strong> recovery planning.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should not choose it<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">It may not be the right fit when:\n&#8211; You only need <strong>backup\/restore<\/strong> (not orchestrated DR).\n&#8211; Your workload is small enough that a simple scripted procedure is sufficient and already tested.\n&#8211; You need <strong>cross-cloud<\/strong> DR orchestration (Full Stack Disaster Recovery is OCI-focused).\n&#8211; You cannot meet prerequisites for replication (networking, data replication, standby capacity).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">4. 
Where is Full Stack Disaster Recovery used?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Industries<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Financial services and fintech (availability and audit requirements)<\/li>\n<li>Healthcare (continuity and compliance)<\/li>\n<li>E-commerce and retail (revenue impact of downtime)<\/li>\n<li>SaaS providers (customer SLAs)<\/li>\n<li>Manufacturing and logistics (operational uptime)<\/li>\n<li>Government\/public sector (resilience mandates)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Team types<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Platform engineering teams running shared OCI landing zones<\/li>\n<li>SRE\/operations teams responsible for uptime and incident response<\/li>\n<li>DevOps teams owning release and environment automation<\/li>\n<li>Security and compliance teams validating DR readiness<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Workloads<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Multi-tier web applications (LB + app tier + DB tier)<\/li>\n<li>Data pipelines and analytics platforms (where rehydration is expensive)<\/li>\n<li>Enterprise applications running on OCI compute and database services<\/li>\n<li>Internal mission-critical systems (ERP integrations, order processing, identity services)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Architectures<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Active\/passive regional DR (most common)<\/li>\n<li>Warm standby architectures (pre-provisioned, scaled down)<\/li>\n<li>Pilot light designs (minimal standby components; promote during failover)<\/li>\n<li>Multi-region network designs with DNS and routing failover<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Real-world deployment contexts<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Primary region in one geography and DR region in another<\/li>\n<li>Regulatory \u201cdata residency\u201d constraints requiring region choice<\/li>\n<li>Hybrid connectivity (FastConnect or VPN) 
where DR must consider on-prem dependencies<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Production vs dev\/test usage<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Production<\/strong>: Use for real DR plans with strict IAM, change control, and regular drills.<\/li>\n<li><strong>Dev\/test<\/strong>: Use for validating DR plan logic, dependency sequencing, and runbook correctness with smaller footprints.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">5. Top Use Cases and Scenarios<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Below are realistic scenarios where <strong>Oracle Cloud Full Stack Disaster Recovery<\/strong> is commonly applied. For each, the key is orchestration across components, not just one replicated system.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Regional outage recovery for a 3-tier app<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Primary region becomes unavailable; manual failover is slow and error-prone.<\/li>\n<li><strong>Why this fits<\/strong>: Full Stack Disaster Recovery coordinates networking, compute, and database promotion steps.<\/li>\n<li><strong>Example<\/strong>: A retail checkout app fails over from Region A to Region B with a defined failover plan.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2) Planned datacenter maintenance with minimal downtime (switchover)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: You need controlled maintenance in the primary region without a hard outage.<\/li>\n<li><strong>Why this fits<\/strong>: A switchover plan sequences actions more safely than emergency failover.<\/li>\n<li><strong>Example<\/strong>: Quarterly maintenance triggers a planned switchover to DR region and back.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3) DR drills for compliance evidence<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Auditors require proof that DR is 
tested regularly.<\/li>\n<li><strong>Why this fits<\/strong>: DR drill plans produce consistent execution history and outcomes.<\/li>\n<li><strong>Example<\/strong>: A healthcare provider runs monthly DR drills and stores reports for audits.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4) Reducing recovery errors during incident response<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: During outages, teams skip steps (DNS, routes, security rules).<\/li>\n<li><strong>Why this fits<\/strong>: Automated plan steps reduce reliance on memory.<\/li>\n<li><strong>Example<\/strong>: A SaaS team executes a failover plan and validates health checks automatically.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Coordinated recovery of app + database replication<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: The database is replicated, but the application and network cutover are not coordinated with it.<\/li>\n<li><strong>Why this fits<\/strong>: Full Stack Disaster Recovery orchestrates \u201cpromote DB,\u201d then \u201cstart app,\u201d then \u201cshift traffic.\u201d<\/li>\n<li><strong>Example<\/strong>: A billing system promotes a standby database and then starts app nodes in the DR region.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">6) DR for microservices with shared ingress<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Many services depend on shared ingress, policies, and routing.<\/li>\n<li><strong>Why this fits<\/strong>: Protection groups and DR plans can represent the whole environment\u2019s dependencies.<\/li>\n<li><strong>Example<\/strong>: An API platform shifts ingress endpoints and restarts critical services in the correct order.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">7) DR for batch processing with strict data consistency<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Batch jobs must not run simultaneously in two regions.<\/li>\n<li><strong>Why this 
fits<\/strong>: DR plans can include controlled stop\/start sequencing.<\/li>\n<li><strong>Example<\/strong>: Overnight jobs are disabled in primary before enabling in DR.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">8) Multi-tenant SaaS isolation by compartment<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Multiple tenants\/apps need isolated DR operations.<\/li>\n<li><strong>Why this fits<\/strong>: OCI compartments + protection groups support operational separation.<\/li>\n<li><strong>Example<\/strong>: Each customer environment has its own protection group and DR plan.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">9) DR runbook standardization across teams<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Each app team has a different DR procedure.<\/li>\n<li><strong>Why this fits<\/strong>: Central platform team can standardize plan patterns.<\/li>\n<li><strong>Example<\/strong>: Standard DR plan templates for \u201cweb + DB\u201d workloads.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">10) Accelerating DR onboarding for new applications<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: New apps take months to get a tested DR process.<\/li>\n<li><strong>Why this fits<\/strong>: A repeatable protection-group approach speeds onboarding.<\/li>\n<li><strong>Example<\/strong>: A new internal app is DR-enabled using an existing plan pattern and policies.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">6. Core Features<\/h2>\n\n\n\n<blockquote>\n<p>Feature availability can depend on region, resource type, and underlying services. 
<strong>Verify supported member types and plan actions in official docs<\/strong> before designing production workflows.<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 1: DR protection groups (application-stack modeling)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Groups related OCI resources\/relationships into a single DR scope.<\/li>\n<li><strong>Why it matters<\/strong>: DR is about coordinated recovery, not isolated components.<\/li>\n<li><strong>Practical benefit<\/strong>: Clear boundaries: \u201cThis protection group is the payroll stack.\u201d<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Supported members vary. Don\u2019t assume every OCI service can be orchestrated directly.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 2: Pairing primary and standby environments<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Associates a primary protection group with a standby protection group (often cross-region).<\/li>\n<li><strong>Why it matters<\/strong>: Enables consistent execution targeting the correct DR site.<\/li>\n<li><strong>Practical benefit<\/strong>: Reduces configuration drift by establishing explicit primary\/standby relationships.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Requires you to design consistent networking, IAM, and capacity in both sites.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 3: DR plans (orchestration and runbooks)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Encodes the sequence of operations for drills, switchovers, and failovers.<\/li>\n<li><strong>Why it matters<\/strong>: Order matters (e.g., stop app \u2192 finalize replication \u2192 promote DB \u2192 start app \u2192 route traffic).<\/li>\n<li><strong>Practical benefit<\/strong>: Repeatable execution with reduced human error.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Some steps may require preconditions 
(replication healthy, capacity available).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 4: DR drills (test without a real disaster)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Runs a test plan to validate recovery procedures.<\/li>\n<li><strong>Why it matters<\/strong>: DR plans that are not tested are not trustworthy.<\/li>\n<li><strong>Practical benefit<\/strong>: Find missing IAM permissions, incorrect routes, DNS issues, or replication lag early.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Drills can still incur costs (running compute in DR, replication, traffic).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 5: Switchover (planned) vs failover (unplanned) semantics<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Supports different operational modes depending on whether the primary site is still reachable.<\/li>\n<li><strong>Why it matters<\/strong>: Planned switchovers can preserve data consistency better than emergency failovers.<\/li>\n<li><strong>Practical benefit<\/strong>: Lower RPO risk for planned events.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Unplanned failover may accept some data loss depending on replication method.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 6: Execution tracking and status visibility<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Provides execution status (step progress, success\/failure) and operational history.<\/li>\n<li><strong>Why it matters<\/strong>: Operators need confidence and clear troubleshooting signals during incidents.<\/li>\n<li><strong>Practical benefit<\/strong>: Faster incident response and post-mortem clarity.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Deep troubleshooting still requires checking underlying services (DB replication, network, compute logs).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 7: Integration with OCI IAM 
and compartments<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Uses OCI\u2019s standard identity model and compartmentalization.<\/li>\n<li><strong>Why it matters<\/strong>: DR is sensitive; least privilege matters.<\/li>\n<li><strong>Practical benefit<\/strong>: You can scope DR operators to only what they need.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: IAM complexity can cause plan failures if permissions are incomplete.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 8: Alignment with OCI observability<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Works with OCI Audit and often integrates with Logging\/Events patterns for operational alerts (verify specifics).<\/li>\n<li><strong>Why it matters<\/strong>: DR execution should trigger notifications and leave audit trails.<\/li>\n<li><strong>Practical benefit<\/strong>: Integrate with on-call workflows.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: You must configure Notifications\/Events separately in most OCI designs.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">7. 
Architecture and How It Works<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">High-level service architecture<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Control plane<\/strong>: Full Stack Disaster Recovery defines protection groups and DR plans, then orchestrates actions.<\/li>\n<li><strong>Data plane<\/strong>: Actual data replication is handled by the underlying OCI services you configure (database replication, volume replication, object replication, etc.\u2014verify supported integrations for your stack).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Request\/data\/control flow (typical)<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Operator triggers a DR plan execution (drill\/switchover\/failover).<\/li>\n<li>Full Stack Disaster Recovery validates prerequisites (permissions, pairing, plan configuration).<\/li>\n<li>The service orchestrates steps across components (network cutover, promotion actions, start\/stop operations).<\/li>\n<li>Underlying services perform replication\/promotions (for example, database role transitions).<\/li>\n<li>Traffic is shifted (DNS, load balancer, routing patterns\u2014implementation-specific).<\/li>\n<li>Operator validates application health and completes the operation.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Integrations with related services (common patterns)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Networking<\/strong>: VCN, subnets, route tables, security lists\/NSGs, Load Balancer, DNS (OCI DNS or external), and potentially Traffic Management Steering Policies (verify product fit).<\/li>\n<li><strong>Compute &amp; Storage<\/strong>: Instances, boot volumes, block volumes, and backups\/replication services.<\/li>\n<li><strong>Database<\/strong>: Oracle database replication technologies (for example, Data Guard for Oracle databases; verify service compatibility).<\/li>\n<li><strong>Observability<\/strong>: OCI Audit for governance; Monitoring\/Logging for operational visibility; 
Notifications for alerting.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Dependency services<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OCI IAM<\/li>\n<li>OCI networking<\/li>\n<li>Replication technology for your stateful layers<\/li>\n<li>A consistent landing zone (compartments, tags, policies) for primary and standby<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/authentication model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Uses OCI IAM users\/groups or federated identities.<\/li>\n<li>Permissions are managed via IAM policies at tenancy or compartment scope.<\/li>\n<li>Best practice: dedicate a <strong>DR Operators<\/strong> group with least privileges.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Networking model (what to plan)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>IP addressing strategy<\/strong>: Decide whether you will reuse CIDRs across regions (often easier for lift-and-shift) or use different CIDRs and rely on DNS\/service discovery.<\/li>\n<li><strong>Ingress failover<\/strong>: DNS TTLs, health checks, and routing strategies strongly influence effective RTO.<\/li>\n<li><strong>Egress dependencies<\/strong>: Third-party allowlists (payment gateways, partner APIs) must be updated during DR or designed to be region-agnostic.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Monitoring\/logging\/governance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ensure <strong>OCI Audit<\/strong> is enabled and retained per policy.<\/li>\n<li>Route DR execution events to a central alerting channel (OCI Notifications + email\/SMS\/webhook, or integrate into external systems).<\/li>\n<li>Use tags to track DR-related resources for cost and ownership.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Simple architecture diagram (conceptual)<\/h3>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart LR\n  U[Users] --&gt; DNS[DNS \/ Traffic Steering]\n  
DNS --&gt; LB1[&quot;Load Balancer (Primary)&quot;]\n  DNS --&gt; LB2[&quot;Load Balancer (Standby)&quot;]\n\n  subgraph R1[&quot;Region A (Primary)&quot;]\n    LB1 --&gt; APP1[App Tier]\n    APP1 --&gt; DB1[(Primary DB)]\n    APP1 --&gt; ST1[Storage]\n  end\n\n  subgraph R2[&quot;Region B (Standby)&quot;]\n    LB2 --&gt; APP2[&quot;App Tier (Standby)&quot;]\n    APP2 --&gt; DB2[(Standby DB)]\n    APP2 --&gt; ST2[&quot;Storage (Replicated)&quot;]\n  end\n\n  FSDR[&quot;Full Stack Disaster Recovery\\n(Orchestration Control Plane)&quot;]\n  FSDR --- R1\n  FSDR --- R2\n\n  DB1 &lt;-.replication.-&gt; DB2\n  ST1 &lt;-.replication.-&gt; ST2\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Production-style architecture diagram (more detailed)<\/h3>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart TB\n  subgraph Id[Identity &amp; Governance]\n    IAM[IAM Policies &amp; Groups]\n    AUD[OCI Audit]\n    TAG[Tagging \/ Cost Tracking]\n  end\n\n  subgraph Obs[Observability]\n    MON[Monitoring]\n    LOG[Logging]\n    EVT[Events]\n    NOTIF[Notifications]\n  end\n\n  subgraph R1[Primary Region]\n    direction TB\n    WAF1[&quot;WAF \/ Edge Controls (optional)&quot;]\n    DNS1[DNS Records \/ Steering]\n    LB1[OCI Load Balancer]\n    OKE1[Compute \/ Kubernetes \/ App Tier]\n    DB1[(Database Primary)]\n    OBJ1[Object Storage Bucket]\n    BLK1[Block Volumes \/ Boot Volumes]\n    NET1[VCN + Subnets + NSGs + Routes]\n  end\n\n  subgraph R2[Standby Region]\n    direction TB\n    WAF2[&quot;WAF \/ Edge Controls (optional)&quot;]\n    DNS2[DNS Records \/ Steering]\n    LB2[OCI Load Balancer]\n    OKE2[Compute \/ Kubernetes \/ App Tier Standby]\n    DB2[(Database Standby)]\n    OBJ2[&quot;Object Storage Bucket (Replicated)&quot;]\n    BLK2[&quot;Block Volumes (Replicated)&quot;]\n    NET2[VCN + Subnets + NSGs + Routes]\n  end\n\n  FSDR[Full Stack Disaster Recovery\\nProtection Groups + DR Plans]\n\n  IAM --&gt; FSDR\n  FSDR --&gt; NET1\n  FSDR --&gt; LB1\n  FSDR --&gt; OKE1\n  FSDR --&gt; DB1\n\n  FSDR --&gt; NET2\n  FSDR --&gt; LB2\n  FSDR --&gt; OKE2\n  FSDR --&gt; DB2\n\n  DB1 &lt;-.DB 
replication.-&gt; DB2\n  OBJ1 &lt;-.bucket replication.-&gt; OBJ2\n  BLK1 &lt;-.volume replication.-&gt; BLK2\n\n  FSDR --&gt; EVT --&gt; NOTIF\n  FSDR --&gt; LOG\n  AUD --&gt; LOG\n  TAG --&gt; R1\n  TAG --&gt; R2\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">8. Prerequisites<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Tenancy \/ account requirements<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>An active <strong>Oracle Cloud (OCI) tenancy<\/strong> with permission to use <strong>Full Stack Disaster Recovery<\/strong>.<\/li>\n<li>Access to at least <strong>two OCI regions<\/strong> if you want regional DR (primary + standby).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Permissions \/ IAM roles<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You need IAM permissions to:\n&#8211; Manage Full Stack Disaster Recovery resources (protection groups, plans, executions).\n&#8211; Manage\/inspect underlying resources included in the protection group (networking, compute, storage, databases).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Because OCI IAM policies are precise and service-specific, <strong>verify the exact policy statements in official docs<\/strong> for Full Stack Disaster Recovery. 
A common operational model is:\n&#8211; A platform admin sets up compartments, networks, and baseline policies.\n&#8211; A DR operator group can execute DR plans but cannot create unrelated infrastructure.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Billing requirements<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A paid tenancy (or credits) is typically required for cross-region DR because replication, standby storage, and standby compute incur cost.<\/li>\n<li>Even if Full Stack Disaster Recovery itself has no separate line-item charge (verify in pricing), the underlying services do.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Tools (CLI\/SDK)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OCI Console access (recommended for beginners).<\/li>\n<li>Optional:\n<ul class=\"wp-block-list\">\n<li>OCI CLI: https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/API\/SDKDocs\/cliinstall.htm<\/li>\n<li>Terraform \/ OCI Resource Manager (commonly used to standardize primary\/standby infrastructure). Verify the current recommended approach in OCI docs.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Region availability<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Full Stack Disaster Recovery is not guaranteed to be available in every region or for every account type.<\/li>\n<li><strong>Verify region availability<\/strong> in OCI docs and your tenancy\u2019s region subscriptions.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Quotas \/ limits<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Service limits for compute, load balancers, block volumes, databases, and replication features vary by region and tenancy.<\/li>\n<li>Request limit increases in advance for production DR.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisite services<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">At minimum, you should already have:\n&#8211; A primary environment and a standby environment design (networking, compartments).\n&#8211; A replication strategy for stateful components (database, 
storage).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">9. Pricing \/ Cost<\/h2>\n\n\n\n<blockquote>\n<p>Pricing changes, regional differences, and negotiated enterprise pricing are common. Do not rely on static blog numbers for DR. Use official sources.<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">Current pricing model (how to think about it)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">For Full Stack Disaster Recovery, your cost model usually has two layers:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>Orchestration layer (Full Stack Disaster Recovery service)<\/strong>\n&#8211; In some cloud services, orchestration may be free or low-cost compared to the infrastructure it controls.\n&#8211; <strong>Verify Full Stack Disaster Recovery pricing<\/strong> in the official OCI pricing pages; if it is not explicitly listed, it may be included at no additional cost and billed indirectly through dependent services.<\/p>\n<\/li>\n<li>\n<p><strong>Underlying infrastructure and replication costs (the real drivers)<\/strong>\nCommon cost dimensions include:\n&#8211; <strong>Standby region storage<\/strong> (replicated block volumes, object storage replication, backups).\n&#8211; <strong>Standby compute<\/strong> (if warm standby keeps instances running).\n&#8211; <strong>Network egress<\/strong> (cross-region replication traffic and DR cutover traffic).\n&#8211; <strong>Database replication<\/strong> (depends on DB service type, licensing model, and replication method).\n&#8211; <strong>Load balancers<\/strong> and <strong>public IPs<\/strong> in both regions.\n&#8211; <strong>Logging\/Monitoring retention<\/strong> and <strong>archival<\/strong> for DR audit evidence.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Free tier<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">OCI Free Tier usually covers limited always-free resources, but <strong>cross-region DR<\/strong> and replication often exceed 
always-free limits. Check:\n&#8211; OCI Free Tier overview: https:\/\/www.oracle.com\/cloud\/free\/<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Hidden or indirect costs to plan for<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Cross-region data transfer<\/strong> for replication.<\/li>\n<li><strong>Standby capacity reservation<\/strong> (if you need guaranteed capacity during a regional event; OCI offers capacity reservations, so verify the best option for your design).<\/li>\n<li><strong>DR drills<\/strong> that temporarily run production-sized compute in DR.<\/li>\n<li><strong>DNS and traffic management<\/strong> (if using external DNS providers).<\/li>\n<li><strong>Third-party licensing<\/strong> (commercial software running in DR).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network\/data transfer implications<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Replication across regions generally consumes network bandwidth and may incur charges depending on OCI\u2019s current inter-region transfer pricing rules.<\/li>\n<li>Your application\u2019s user traffic during DR may shift geography, impacting latency and egress.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">How to optimize cost (practical)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Choose <strong>pilot light<\/strong> or <strong>warm standby<\/strong> based on RTO:\n<ul class=\"wp-block-list\">\n<li>Pilot light: minimal DR compute running; scale up on failover.<\/li>\n<li>Warm standby: reduced capacity running; faster failover.<\/li>\n<\/ul>\n<\/li>\n<li>Use <strong>object lifecycle policies<\/strong> and <strong>log retention policies<\/strong> to limit storage growth.<\/li>\n<li>Right-size standby databases and compute (where supported).<\/li>\n<li>Schedule DR drills and automatically tear down temporary resources after validation.<\/li>\n<li>Tag DR resources and use OCI cost analysis tools (verify the latest OCI cost management features in docs).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Example low-cost starter estimate (no fabricated 
prices)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A realistic \u201cstarter\u201d DR lab cost typically includes:\n&#8211; A small compute instance in primary\n&#8211; A small standby footprint (or stopped instance) in DR region\n&#8211; Some replicated storage (block volume replication and\/or object replication)\n&#8211; Minimal logging\/monitoring retention<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Because actual charges vary by region and SKU, build your estimate using:\n&#8211; OCI Pricing: https:\/\/www.oracle.com\/cloud\/price-list\/\n&#8211; OCI Cost Estimator: https:\/\/www.oracle.com\/cloud\/costestimator.html<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Example production cost considerations (what usually dominates)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Duplicated <strong>database capacity<\/strong> and licensing model (BYOL vs license-included where applicable).<\/li>\n<li><strong>Standby compute fleet<\/strong> if warm standby is required.<\/li>\n<li><strong>Replication bandwidth<\/strong> and storage growth.<\/li>\n<li><strong>Load balancers<\/strong> and <strong>WAF\/edge<\/strong> duplicated across regions.<\/li>\n<li>Operational overhead: monitoring, alerts, log retention, and DR drills.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">10. Step-by-Step Hands-On Tutorial<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">This lab is designed to be <strong>beginner-friendly<\/strong> and oriented around realistic DR setup tasks. 
Because supported resource types and exact console workflows can change, you will use the console and verify the exact fields against current OCI documentation.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Objective<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Create a minimal cross-region DR setup and run a <strong>DR drill<\/strong> using <strong>Oracle Cloud Full Stack Disaster Recovery<\/strong>, validating that your DR plan can be executed and tracked.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Lab Overview<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You will:\n1. Choose a <strong>primary<\/strong> and <strong>standby<\/strong> OCI region.\n2. Prepare compartments, networking, and a simple workload.\n3. Configure replication for stateful data (choose the simplest replication your workload supports).\n4. Create <strong>protection groups<\/strong> in Full Stack Disaster Recovery.\n5. Create and execute a <strong>DR drill plan<\/strong>.\n6. Validate results, troubleshoot common issues, and clean up.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Important lab notes<\/strong>\n&#8211; This lab intentionally keeps the workload small.<br\/>\n&#8211; You will incur some cost if you enable replication and create resources in two regions.\n&#8211; If your tenancy\/region does not yet have Full Stack Disaster Recovery enabled, you will not be able to complete the orchestration steps.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Select regions and establish naming<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions<\/strong>\n1. Decide your <strong>Primary Region<\/strong> (Region A) and <strong>Standby Region<\/strong> (Region B).\n2. 
Choose a consistent naming pattern:\n   &#8211; <code>app-prod-r1<\/code> \/ <code>app-dr-r2<\/code>\n   &#8211; <code>fsdr-pg-primary<\/code> \/ <code>fsdr-pg-standby<\/code>\n   &#8211; Tags: <code>Environment=Lab<\/code>, <code>Service=FSDR<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; You have two regions selected and a naming\/tagging standard you\u2019ll reuse consistently.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; In OCI Console, ensure both regions are subscribed\/available in your tenancy.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Create compartments and tags<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions<\/strong>\n1. Create (or reuse) a compartment for the lab, e.g. <code>DR-Lab<\/code>.\n2. Create a tag namespace and tags (optional but recommended):\n   &#8211; Namespace: <code>dr_lab<\/code>\n   &#8211; Tags: <code>owner<\/code>, <code>costcenter<\/code>, <code>environment<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; A dedicated compartment that isolates lab resources and helps cleanup.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; Confirm you can switch to the compartment and see it in the compartment selector.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Configure IAM access for DR operations<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions<\/strong>\n1. Create an IAM group, e.g. <code>DR-Lab-Operators<\/code>.\n2. Add your user to this group.\n3. 
Create IAM policies to allow managing:\n   &#8211; Full Stack Disaster Recovery resources\n   &#8211; Underlying services you will include (networking, compute, storage, database services)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Your user can create protection groups and DR plans and can manage the dependent resources in the lab.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; Try opening the Full Stack Disaster Recovery service in the console while scoped to your compartment.\n&#8211; If access is denied, review policy scope and service permissions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Common issue<\/strong>\n&#8211; Missing permissions on networking or compute causes plan execution to fail later.<\/p>\n\n\n\n<blockquote>\n<p><strong>Verify in official docs<\/strong> the exact IAM policy syntax for Full Stack Disaster Recovery, as service policy names and \u201c-family\u201d groupings can change.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Build minimal networking in both regions<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">DR works best when the standby region has networking ready.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions (Region A \/ Primary)<\/strong>\n1. Create a VCN (e.g. <code>vcn-dr-lab-r1<\/code>) with:\n   &#8211; A public subnet (for a simple test VM) or private subnet (preferred in real designs)\n   &#8211; An Internet Gateway (if using a public subnet for SSH)\n   &#8211; Route table and security rules to allow SSH from your IP (lab-only)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions (Region B \/ Standby)<\/strong>\n1. Create a VCN (e.g. 
<code>vcn-dr-lab-r2<\/code>) with similar structure.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Both regions have a VCN and subnet(s) ready to host the application.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; Confirm the VCNs and subnets exist in both regions.\n&#8211; Confirm route tables and security rules allow required lab access.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Design tip<\/strong>\n&#8211; For real production DR, prefer <strong>private subnets<\/strong>, bastions, and strict NSGs. Public SSH is not recommended.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Create a small workload in the primary region<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions<\/strong>\n1. In Region A, create a small compute instance (e.g., a basic Linux VM).\n2. Install a simple web service:\n   &#8211; Use an OCI-provided image and cloud-init, or SSH and install NGINX\/Apache.\n3. 
Place a test page in the web root that proves \u201cthis is primary\u201d:\n   &#8211; <code>\/usr\/share\/nginx\/html\/index.html<\/code> (Oracle Linux\/RHEL-family NGINX packages) or <code>\/var\/www\/html\/index.html<\/code> (Debian\/Ubuntu) with \u201cHello from PRIMARY\u201d.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Example cloud-init (optional)<\/strong><\/p>\n\n\n\n<pre><code class=\"language-bash\">#!\/bin\/bash\n# Lab-only bootstrap: install NGINX and publish a page that identifies this node as PRIMARY.\nset -euo pipefail\n# Try yum first (Oracle Linux\/RHEL family), then fall back to apt-get (Debian\/Ubuntu).\nyum -y install nginx || (apt-get update &amp;&amp; apt-get -y install nginx)\n# The default web root differs by distribution, so write to both and ignore a missing path.\necho \"Hello from PRIMARY\" &gt; \/usr\/share\/nginx\/html\/index.html || true\necho \"Hello from PRIMARY\" &gt; \/var\/www\/html\/index.html || true\n# Start NGINX now and on every boot.\nsystemctl enable nginx || true\nsystemctl restart nginx || true\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; A reachable test workload in the primary region.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; Access the instance (SSH) and verify the service is running.\n&#8211; If you assigned a public IP and opened port 80 (in the subnet security rules and, where one is enabled, the instance\u2019s host firewall), curl it:<\/p>\n\n\n\n<pre><code class=\"language-bash\">curl -I http:\/\/&lt;PUBLIC_IP&gt;\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 6: Establish replication for the stateful layer (choose one)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Full Stack Disaster Recovery orchestrates recovery, but <strong>replication must be configured<\/strong> by the underlying service(s).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Choose the simplest option you can support in your tenancy:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Option A (commonly used): Replicate block volumes \/ boot volumes<\/strong><br\/>\n&#8211; Use OCI\u2019s block volume replication features if available in your regions and supported for your workload.\n&#8211; This is often used to recreate an instance in DR using replicated volumes (exact mechanics depend on OCI features and your design).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Option B: Replicate object storage<\/strong><br\/>\n&#8211; If your 
\u201cstate\u201d can be stored in Object Storage, use bucket replication across regions.\n&#8211; This is simpler for stateless apps that fetch state at runtime.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Option C: Database replication (for real apps)<\/strong>\n&#8211; For Oracle databases, use the appropriate replication technology (for example, Data Guard patterns where applicable).\n&#8211; This is usually the most important and most complex part.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; You have at least one replicated component that makes DR meaningful.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; Confirm replication status is healthy in the source service\u2019s console page.<\/p>\n\n\n\n<blockquote>\n<p>Because replication configuration is service-specific and changes over time, <strong>verify step-by-step replication setup in official OCI docs<\/strong> for the exact service you use (Block Volume, Object Storage, Database service).<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 7: Create Full Stack Disaster Recovery protection groups<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Now you model your stack in Full Stack Disaster Recovery.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions<\/strong>\n1. In Region A, open <strong>Full Stack Disaster Recovery<\/strong> in the OCI Console.\n2. Create a <strong>primary protection group<\/strong> (e.g. <code>fsdr-pg-primary<\/code>) in your compartment.\n3. In Region B, create a <strong>standby protection group<\/strong> (e.g. <code>fsdr-pg-standby<\/code>).\n4. 
Associate\/pair the protection groups as primary \u2194 standby (exact pairing workflow may vary).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Add members\/resources<\/strong>\n&#8211; Add the resources your protection group should cover (networking elements, compute elements, storage replication relationships, database replication relationships, etc., depending on what Full Stack Disaster Recovery currently supports).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Two protection groups exist and are paired, representing primary and standby.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; Protection groups show an \u201cactive\/paired\/ready\u201d style status (exact wording varies).\n&#8211; Members show valid references and no errors.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Common errors<\/strong>\n&#8211; Adding an unsupported resource type.\n&#8211; Missing IAM permission to read\/modify a referenced resource.\n&#8211; Standby resources not present (for designs that require pre-created standby network\/LB).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 8: Create a DR drill plan<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions<\/strong>\n1. In Full Stack Disaster Recovery, create a <strong>DR drill plan<\/strong> for the protection group pair.\n2. Review the plan steps (the service may generate baseline steps from the protection group members, while custom steps may need to be defined manually; verify the behavior in the docs).\n3. 
Ensure the plan includes appropriate safeguards (for example, \u201cdo not affect production traffic\u201d for drills).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; A DR drill plan exists and is associated with your protection groups.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; The plan is in a \u201cready\u201d state and passes validation checks (if the service provides them).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 9: Execute the DR drill plan<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Actions<\/strong>\n1. Start the DR drill plan execution.\n2. Watch the execution progress and step status.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; The plan execution completes successfully, or fails with clear errors that you can troubleshoot.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; In the execution details, confirm:\n  &#8211; Start\/end times\n  &#8211; Step success\/failure\n  &#8211; References to resources acted upon\n&#8211; Validate standby environment behavior:\n  &#8211; Depending on your design, this might include created\/started compute, attached volumes, promoted database, or updated routing <strong>within a drill-safe scope<\/strong>.<\/p>\n\n\n\n<blockquote>\n<p>If your drill is designed not to expose DR publicly, validation may require internal checks (private IP access, bastion, or system logs).<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Validation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Perform at least these validation checks:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>Replication health<\/strong>\n&#8211; Confirm replicated resources still show healthy replication (or the expected paused\/promoted state depending on drill 
semantics).<\/p>\n<\/li>\n<li>\n<p><strong>Standby workload checks<\/strong>\n&#8211; If a standby instance is started\/created:\n  &#8211; SSH access works (if allowed).\n  &#8211; Web service responds (if exposed).\n&#8211; If the drill uses private networking only:\n  &#8211; Validate from a bastion or from an instance in the same VCN.<\/p>\n<\/li>\n<li>\n<p><strong>Plan execution record<\/strong>\n&#8211; Capture execution ID and status for audit purposes.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Troubleshooting<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Common issues and fixes:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>IAM permission denied<\/strong>\n&#8211; Symptom: Plan fails early with authorization errors.\n&#8211; Fix:\n  &#8211; Review IAM policies for Full Stack Disaster Recovery and all referenced services.\n  &#8211; Ensure policies are in the correct compartment scope.<\/p>\n<\/li>\n<li>\n<p><strong>Unsupported member\/resource type<\/strong>\n&#8211; Symptom: You cannot add a resource, or plan fails when acting on it.\n&#8211; Fix:\n  &#8211; Check the official support matrix for Full Stack Disaster Recovery member types.\n  &#8211; Model unsupported items as external prerequisites and document manual steps if needed.<\/p>\n<\/li>\n<li>\n<p><strong>Networking mismatch<\/strong>\n&#8211; Symptom: Standby services start but are unreachable.\n&#8211; Fix:\n  &#8211; Confirm route tables, security lists\/NSGs, and DNS settings in the standby region.\n  &#8211; Validate that the standby has required gateways (NAT\/Internet\/Service Gateway) if needed.<\/p>\n<\/li>\n<li>\n<p><strong>Replication lag or unhealthy replication<\/strong>\n&#8211; Symptom: Plan blocks because replication is not synchronized.\n&#8211; Fix:\n  &#8211; Resolve replication health in the underlying service first.\n  &#8211; For database replication, confirm apply lag and role transition 
readiness.<\/p>\n<\/li>\n<li>\n<p><strong>Quota or capacity issues in standby region<\/strong>\n&#8211; Symptom: Compute cannot start or volumes cannot be created\/attached.\n&#8211; Fix:\n  &#8211; Request service limit increases.\n  &#8211; Pre-provision warm standby capacity if your RTO is strict.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Cleanup<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To avoid ongoing cost, clean up in reverse order:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Delete\/terminate any DR drill-created compute resources (if not automatically removed).<\/li>\n<li>Delete DR plan executions (if applicable) and DR plans (if you won\u2019t reuse them).<\/li>\n<li>Delete protection groups (standby then primary, if required by dependencies).<\/li>\n<li>Disable\/delete replication configurations (block\/object\/database replication) if this is only a lab.<\/li>\n<li>Terminate compute instances and delete volumes (including replicated volumes).<\/li>\n<li>Delete load balancers and public IPs (if created).<\/li>\n<li>Delete VCNs\/subnets\/gateways in both regions.<\/li>\n<li>Delete tags\/compartment if dedicated to this lab.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; Use the compartment resource view to confirm all resources are deleted in both regions.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">11. 
Best Practices<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Architecture best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Design DR around measurable targets:\n<ul class=\"wp-block-list\">\n<li><strong>RTO<\/strong> (time to restore service)<\/li>\n<li><strong>RPO<\/strong> (acceptable data loss)<\/li>\n<\/ul>\n<\/li>\n<li>Separate your stack into layers:\n<ul class=\"wp-block-list\">\n<li>Stateless tiers (easy to redeploy)<\/li>\n<li>Stateful tiers (replication and consistency dominate DR)<\/li>\n<\/ul>\n<\/li>\n<li>Use a consistent infrastructure baseline in both regions:\n<ul class=\"wp-block-list\">\n<li>Networking layout, IAM, logging, and tagging<\/li>\n<\/ul>\n<\/li>\n<li>Prefer automation for infrastructure provisioning (Terraform\/Resource Manager) and use Full Stack Disaster Recovery for orchestration of recovery actions.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">IAM\/security best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use least privilege:\n<ul class=\"wp-block-list\">\n<li>DR operators can execute plans but cannot modify IAM or unrelated infrastructure.<\/li>\n<\/ul>\n<\/li>\n<li>Separate roles:\n<ul class=\"wp-block-list\">\n<li>DR plan authors vs DR plan executors<\/li>\n<\/ul>\n<\/li>\n<li>Use compartments per environment and per application, with clear ownership.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Cost best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Choose standby mode based on business needs:\n<ul class=\"wp-block-list\">\n<li>Pilot light for cost savings<\/li>\n<li>Warm standby for faster RTO<\/li>\n<\/ul>\n<\/li>\n<li>Tag all DR resources and enforce tagging via governance where possible.<\/li>\n<li>Run DR drills on a schedule but keep them right-sized and time-bound.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Performance best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Validate that the standby region can meet performance needs:\n<ul class=\"wp-block-list\">\n<li>Compute shapes available<\/li>\n<li>Storage performance<\/li>\n<li>Network paths and latency<\/li>\n<\/ul>\n<\/li>\n<li>Keep DNS TTLs aligned with RTO goals (lower TTL can speed cutover but increases DNS query volume).<\/li>\n<\/ul>\n\n\n\n<h3 
class=\"wp-block-heading\">Reliability best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Regular DR drills are non-negotiable; untested DR is not DR.<\/li>\n<li>Validate failure modes:\n<ul class=\"wp-block-list\">\n<li>Primary region unreachable<\/li>\n<li>Partial failures (DB available but app tier not)<\/li>\n<\/ul>\n<\/li>\n<li>Document manual fallbacks for unsupported or external dependencies (third-party SaaS, on-prem systems).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operations best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Integrate DR plan executions with on-call:\n<ul class=\"wp-block-list\">\n<li>Alerts on plan failure<\/li>\n<li>Runbooks and escalation paths<\/li>\n<\/ul>\n<\/li>\n<li>Maintain an application dependency map (DB, secrets, DNS, external APIs).<\/li>\n<li>Create post-drill checklists and store results.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Governance\/tagging\/naming best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Standardize names:\n<ul class=\"wp-block-list\">\n<li>Include app, environment, region, and role (primary\/standby).<\/li>\n<\/ul>\n<\/li>\n<li>Use tags to track:\n<ul class=\"wp-block-list\">\n<li><code>dr_role=primary|standby<\/code><\/li>\n<li><code>application=&lt;name&gt;<\/code><\/li>\n<li><code>owner=&lt;team&gt;<\/code><\/li>\n<li><code>environment=prod|stage|lab<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">12. 
Security Considerations<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Identity and access model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use OCI IAM for:\n<ul class=\"wp-block-list\">\n<li>Authentication (users, federation)<\/li>\n<li>Authorization (policies)<\/li>\n<\/ul>\n<\/li>\n<li>Restrict who can:\n<ul class=\"wp-block-list\">\n<li>Modify protection groups and plans<\/li>\n<li>Execute failover vs drills (failover is high-impact)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Encryption<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ensure encryption at rest is enabled for storage and database services (OCI supports encryption by default for many services, but confirm your configuration).<\/li>\n<li>Ensure encryption in transit:\n<ul class=\"wp-block-list\">\n<li>TLS for application endpoints<\/li>\n<li>Secure replication channels (managed by OCI services, but verify)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network exposure<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Avoid public IPs for DR admin access; use:\n<ul class=\"wp-block-list\">\n<li>Bastion patterns<\/li>\n<li>VPN\/FastConnect<\/li>\n<li>Private endpoints where applicable<\/li>\n<\/ul>\n<\/li>\n<li>Keep DR network security rules symmetrical across regions but tailored to minimum needed access.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secrets handling<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Store secrets in OCI Vault (recommended) rather than embedding them in scripts.<\/li>\n<li>Ensure both regions can access secrets:\n<ul class=\"wp-block-list\">\n<li>Replication\/backup strategy for secrets if required (verify OCI Vault cross-region strategy).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Audit\/logging<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use OCI Audit for tracking:\n<ul class=\"wp-block-list\">\n<li>DR plan changes<\/li>\n<li>DR plan executions<\/li>\n<li>IAM policy changes<\/li>\n<\/ul>\n<\/li>\n<li>Centralize logs with defined retention policies.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Compliance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data residency: confirm DR region meets regulatory 
needs.<\/li>\n<li>Evidence: store drill execution results and validation outputs.<\/li>\n<li>Access reviews: periodically review DR operator privileges.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Common security mistakes<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Over-permissive DR roles (\u201cmanage all-resources in tenancy\u201d).<\/li>\n<li>Exposing DR admin ports publicly.<\/li>\n<li>Forgetting to rotate credentials in standby or validate access in the DR region.<\/li>\n<li>Not testing failover of identity dependencies (IdP, DNS provider).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secure deployment recommendations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use least privilege IAM and compartment boundaries.<\/li>\n<li>Use private networking and controlled ingress.<\/li>\n<li>Treat DR plans as change-controlled artifacts.<\/li>\n<li>Validate your incident response process includes DR execution steps and clear go\/no-go criteria.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">13. Limitations and Gotchas<\/h2>\n\n\n\n<blockquote>\n<p>Treat this section as a checklist to validate in your own OCI tenancy and target regions.<\/p>\n<\/blockquote>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Supported resource types vary<\/strong>: Not every OCI service\/resource can be directly orchestrated as a member. 
Always check the current support matrix in official docs.<\/li>\n<li><strong>Replication is not automatic<\/strong>: You must configure replication for stateful layers separately.<\/li>\n<li><strong>Capacity is not guaranteed unless planned<\/strong>: During a regional incident, standby capacity may be constrained if not reserved\/pre-provisioned.<\/li>\n<li><strong>DNS behavior can dominate RTO<\/strong>: Low TTL helps, but client caching and resolver behavior can still delay cutover.<\/li>\n<li><strong>DR drills can incur real cost<\/strong>: Spinning up standby compute, running load balancers, and replicating data are not free.<\/li>\n<li><strong>External dependencies are often the real blocker<\/strong>:\n<ul class=\"wp-block-list\">\n<li>SaaS allowlists<\/li>\n<li>On-prem integrations<\/li>\n<li>Certificate authorities, SSO providers<\/li>\n<\/ul>\n<\/li>\n<li><strong>Network CIDR mismatches complicate DR<\/strong>: If your standby uses different CIDRs, application config and security rules must be DR-aware.<\/li>\n<li><strong>IAM policies frequently cause failures<\/strong>: A missing permission on one dependent service can stop plan execution midway.<\/li>\n<li><strong>Operational ownership must be explicit<\/strong>: Who is allowed to execute failover at 2 a.m.?<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">14. Comparison with Alternatives<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Full Stack Disaster Recovery is an orchestration layer. 
Alternatives include using individual replication services, building custom runbooks, or using third-party DR tools.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Comparison table<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Option<\/th>\n<th>Best For<\/th>\n<th>Strengths<\/th>\n<th>Weaknesses<\/th>\n<th>When to Choose<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>OCI Full Stack Disaster Recovery<\/strong><\/td>\n<td>Orchestrated DR across an OCI application stack<\/td>\n<td>Centralized DR plans, drills, and coordinated execution across components<\/td>\n<td>Requires supported member types and correct prerequisite replication; learning curve<\/td>\n<td>You need repeatable, auditable DR with coordinated sequencing<\/td>\n<\/tr>\n<tr>\n<td><strong>OCI-native replication only (e.g., DB replication + volume\/object replication)<\/strong><\/td>\n<td>Simple apps or teams with strong automation maturity<\/td>\n<td>Direct control of each layer; can be cheaper\/simpler<\/td>\n<td>No unified orchestration; higher human-error risk<\/td>\n<td>You have small scope or already have reliable runbooks tested<\/td>\n<\/tr>\n<tr>\n<td><strong>Backup\/restore approach (no hot replication)<\/strong><\/td>\n<td>Cost-sensitive workloads with high RTO\/RPO tolerance<\/td>\n<td>Lowest standby cost<\/td>\n<td>Slow recovery, higher data loss risk<\/td>\n<td>Dev\/test, low-criticality workloads<\/td>\n<\/tr>\n<tr>\n<td><strong>Terraform + scripts + runbooks (self-managed orchestration)<\/strong><\/td>\n<td>Teams needing highly custom logic<\/td>\n<td>Full flexibility<\/td>\n<td>High engineering effort, testing burden, drift risk<\/td>\n<td>You need complex logic not supported by managed orchestration<\/td>\n<\/tr>\n<tr>\n<td><strong>AWS Elastic Disaster Recovery (other cloud)<\/strong><\/td>\n<td>DR for AWS-based workloads<\/td>\n<td>Managed DR workflows for AWS<\/td>\n<td>Cross-cloud mismatch; not OCI-native<\/td>\n<td>Your workloads are primarily on 
AWS<\/td>\n<\/tr>\n<tr>\n<td><strong>Azure Site Recovery (other cloud)<\/strong><\/td>\n<td>DR for Azure-centric workloads<\/td>\n<td>Strong Azure integration<\/td>\n<td>Not OCI-native<\/td>\n<td>Your workloads are primarily on Azure<\/td>\n<\/tr>\n<tr>\n<td><strong>Third-party DR tools (e.g., Zerto\/Veeam, depending on stack)<\/strong><\/td>\n<td>Heterogeneous environments (VMware, multi-cloud)<\/td>\n<td>Mature DR tooling for specific ecosystems<\/td>\n<td>Licensing cost, operational complexity, integration constraints<\/td>\n<td>You need cross-platform DR beyond OCI-native scope<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">15. Real-World Example<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Enterprise example (regulated industry)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Problem<\/strong><br\/>\nA financial services company runs a customer portal on OCI. Regulators require regular DR tests and evidence. The portal includes load balancers, multiple app tiers, and an Oracle database replication setup across regions. 
Manual DR runbooks have failed in past tests due to missed sequencing steps and DNS errors.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Proposed architecture<\/strong>\n&#8211; Region A: primary application stack\n&#8211; Region B: standby stack with:\n  &#8211; Pre-created networking and load balancer\n  &#8211; Database standby configured via the appropriate replication technology\n  &#8211; Replicated storage for required state\n&#8211; Full Stack Disaster Recovery:\n  &#8211; Protection groups representing the portal stack\n  &#8211; DR drill plan executed monthly\n  &#8211; Switchover plan for planned maintenance\n  &#8211; Failover plan for outages\n&#8211; Observability:\n  &#8211; Events\/Notifications for plan status\n  &#8211; Audit retention for compliance<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Why this service was chosen<\/strong>\n&#8211; Provides standardized DR plan execution with clear tracking.\n&#8211; Supports repeatable drills for compliance evidence.\n&#8211; Reduces human error in complex sequencing.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcomes<\/strong>\n&#8211; Lower RTO variability due to standardized orchestration.\n&#8211; Documented, repeatable DR evidence for audits.\n&#8211; Faster incident response with clearer operational steps.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Startup \/ small-team example<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Problem<\/strong><br\/>\nA SaaS startup runs a small multi-tier app on OCI. They can tolerate some downtime but want a reliable process for region-level failures. 
They don\u2019t have time to maintain complex runbooks.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Proposed architecture<\/strong>\n&#8211; Pilot light DR:\n  &#8211; Minimal standby infrastructure in Region B\n  &#8211; Replication configured for critical data\n&#8211; Full Stack Disaster Recovery:\n  &#8211; A simple protection group covering only essential components\n  &#8211; A DR drill plan executed quarterly\n&#8211; Cost controls:\n  &#8211; Standby compute off or minimal where possible\n  &#8211; Tight log retention policies<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Why this service was chosen<\/strong>\n&#8211; Lets a small team operationalize DR without building an internal orchestration platform.\n&#8211; Encourages regular drills with less operational overhead.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcomes<\/strong>\n&#8211; A repeatable \u201cknown-good\u201d DR process.\n&#8211; Better readiness without major headcount increase.\n&#8211; Controlled DR cost aligned to business stage.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">16. FAQ<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">1) <strong>Is Full Stack Disaster Recovery the same as backups?<\/strong><br\/>\nNo. Backups are point-in-time recovery. Full Stack Disaster Recovery focuses on orchestrating the steps to recover an application stack, typically using replication plus coordinated cutover actions. Backups can still be part of the strategy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) <strong>Does Full Stack Disaster Recovery replicate my data automatically?<\/strong><br\/>\nUsually no. Replication is performed by underlying OCI services (database replication, volume replication, object replication, etc.). 
Full Stack Disaster Recovery orchestrates the overall recovery workflow.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) <strong>Is Full Stack Disaster Recovery cross-region only?<\/strong><br\/>\nMost DR designs are cross-region, but some organizations also design intra-region recovery. Verify current capabilities and recommended patterns in official docs.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) <strong>What RTO\/RPO can I achieve?<\/strong><br\/>\nIt depends on replication method, standby capacity, DNS strategy, and application startup time. The orchestration layer helps reduce human delay, but infrastructure and data replication dominate.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">5) <strong>Can I run DR drills without impacting production?<\/strong><br\/>\nThat is the goal of a drill plan, but drill safety depends on how you design networking, DNS, and plan steps. Validate drill behavior carefully.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">6) <strong>Do I need identical infrastructure in both regions?<\/strong><br\/>\nNot always, but the more symmetrical your environments, the simpler DR becomes. Differences increase the chance of plan failures and configuration drift.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">7) <strong>How do I handle DNS during failover?<\/strong><br\/>\nCommon approaches include DNS steering, low TTL records, and health checks. The exact integration depends on your DNS provider and traffic management design.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">8) <strong>Does Full Stack Disaster Recovery support my specific OCI service?<\/strong><br\/>\nSupport varies by resource type and evolves over time. Check the official supported member list for your service and region.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">9) <strong>What\u2019s the difference between switchover and failover?<\/strong><br\/>\nSwitchover is planned and typically aims for minimal data loss by coordinating a clean transition. 
Failover is for emergencies when the primary may be unreachable and may accept some data loss depending on replication.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">10) <strong>How often should I run DR drills?<\/strong><br\/>\nCommon cadences are monthly or quarterly depending on compliance needs and change velocity. Run a drill after major architecture changes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">11) <strong>What are the most common reasons DR plans fail?<\/strong><br\/>\nIAM permission gaps, unhealthy replication, missing standby capacity\/quotas, and networking\/DNS misconfiguration.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">12) <strong>How do I keep standby costs low?<\/strong><br\/>\nUse pilot light designs, minimize always-on compute, optimize storage replication scope, and control logging retention. Balance savings against RTO.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">13) <strong>Can I automate infrastructure creation for DR?<\/strong><br\/>\nYes. Many teams use Terraform\/OCI Resource Manager to keep primary and standby environments consistent and reduce drift.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">14) <strong>How do I prove DR readiness to auditors?<\/strong><br\/>\nRun scheduled drills, capture execution IDs and results, store validation outputs, and retain OCI Audit logs per policy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">15) <strong>Is Full Stack Disaster Recovery suitable for multi-cloud DR?<\/strong><br\/>\nIt is designed for OCI-centric orchestration. For multi-cloud, you may need third-party tooling or custom orchestration.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">17. 
Top Online Resources to Learn Full Stack Disaster Recovery<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Use official sources first, especially for supported member types, IAM policies, and plan workflows.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Resource Type<\/th>\n<th>Name<\/th>\n<th>Why It Is Useful<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Official documentation (search)<\/td>\n<td>OCI Docs search for \u201cFull Stack Disaster Recovery\u201d \u2014 https:\/\/docs.oracle.com\/search\/<\/td>\n<td>Fastest way to find the current docs landing page, IAM policies, and supported resources<\/td>\n<\/tr>\n<tr>\n<td>Official OCI documentation home<\/td>\n<td>OCI Documentation \u2014 https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/home.htm<\/td>\n<td>Starting point to navigate to Disaster Recovery and related services<\/td>\n<\/tr>\n<tr>\n<td>Pricing<\/td>\n<td>Oracle Cloud price list \u2014 https:\/\/www.oracle.com\/cloud\/price-list\/<\/td>\n<td>Official pricing reference for underlying services that drive DR cost<\/td>\n<\/tr>\n<tr>\n<td>Pricing calculator<\/td>\n<td>Oracle Cloud Cost Estimator \u2014 https:\/\/www.oracle.com\/cloud\/costestimator.html<\/td>\n<td>Build region-specific estimates for standby compute\/storage and replication<\/td>\n<\/tr>\n<tr>\n<td>Architecture Center<\/td>\n<td>Oracle Architecture Center \u2014 https:\/\/docs.oracle.com\/en\/solutions\/<\/td>\n<td>Reference architectures and DR patterns (verify DR-specific solutions)<\/td>\n<\/tr>\n<tr>\n<td>Free Tier<\/td>\n<td>Oracle Cloud Free Tier \u2014 https:\/\/www.oracle.com\/cloud\/free\/<\/td>\n<td>Understand always-free limits and where DR likely exceeds free tier<\/td>\n<\/tr>\n<tr>\n<td>CLI docs<\/td>\n<td>OCI CLI install and use \u2014 https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/API\/SDKDocs\/cliinstall.htm<\/td>\n<td>Helpful for scripting validations and operational automation<\/td>\n<\/tr>\n<tr>\n<td>Observability docs<\/td>\n<td>OCI Observability 
&amp; Management docs (navigate from docs home) \u2014 https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/home.htm<\/td>\n<td>Learn how to wire Monitoring\/Logging\/Events\/Notifications for DR operations<\/td>\n<\/tr>\n<tr>\n<td>Community learning<\/td>\n<td>Oracle Cloud community forums \u2014 https:\/\/community.oracle.com\/<\/td>\n<td>Practical experiences and troubleshooting (validate against official docs)<\/td>\n<\/tr>\n<tr>\n<td>Video learning<\/td>\n<td>Oracle Cloud Infrastructure YouTube \u2014 https:\/\/www.youtube.com\/user\/OracleCloudInfrastructure<\/td>\n<td>Product walkthroughs and best-practice sessions (verify recency)<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">18. Training and Certification Providers<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The following institutes are listed as training providers. Verify current course titles, syllabi, delivery modes, and accreditation directly on their websites.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Institute<\/th>\n<th>Suitable Audience<\/th>\n<th>Likely Learning Focus<\/th>\n<th>Mode<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>DevOps engineers, SREs, platform teams<\/td>\n<td>DevOps, cloud operations, automation fundamentals that support DR<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>ScmGalaxy.com<\/td>\n<td>Beginners to intermediate IT professionals<\/td>\n<td>SCM\/DevOps practices; may complement DR automation and runbooks<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.scmgalaxy.com\/<\/td>\n<\/tr>\n<tr>\n<td>CLoudOpsNow.in<\/td>\n<td>Cloud ops practitioners<\/td>\n<td>Cloud operations practices; may include DR and reliability topics<\/td>\n<td>Check website<\/td>\n<td>https:\/\/cloudopsnow.in\/<\/td>\n<\/tr>\n<tr>\n<td>SreSchool.com<\/td>\n<td>SREs, reliability 
engineers<\/td>\n<td>SRE principles: SLIs\/SLOs, incident response, DR testing<\/td>\n<td>Check website<\/td>\n<td>https:\/\/sreschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>AiOpsSchool.com<\/td>\n<td>Ops teams exploring AIOps<\/td>\n<td>Monitoring, event correlation, ops automation supporting DR ops<\/td>\n<td>Check website<\/td>\n<td>https:\/\/aiopsschool.com\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">19. Top Trainers<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">These are listed as trainer-related sites\/platforms. Verify trainer profiles, course coverage, and references directly.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Platform\/Site<\/th>\n<th>Likely Specialization<\/th>\n<th>Suitable Audience<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>RajeshKumar.xyz<\/td>\n<td>DevOps\/cloud guidance (verify offerings)<\/td>\n<td>Beginners to intermediate engineers<\/td>\n<td>https:\/\/rajeshkumar.xyz\/<\/td>\n<\/tr>\n<tr>\n<td>devopstrainer.in<\/td>\n<td>DevOps training (verify cloud\/OCI coverage)<\/td>\n<td>DevOps engineers and students<\/td>\n<td>https:\/\/devopstrainer.in\/<\/td>\n<\/tr>\n<tr>\n<td>devopsfreelancer.com<\/td>\n<td>Freelance DevOps services\/training platform (verify)<\/td>\n<td>Teams needing hands-on guidance<\/td>\n<td>https:\/\/devopsfreelancer.com\/<\/td>\n<\/tr>\n<tr>\n<td>devopssupport.in<\/td>\n<td>DevOps support\/training (verify)<\/td>\n<td>Operations teams and engineers<\/td>\n<td>https:\/\/devopssupport.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">20. Top Consulting Companies<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Listed neutrally as consulting resources. 
Verify service scope, case studies, and references directly with each company.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Company<\/th>\n<th>Likely Service Area<\/th>\n<th>Where They May Help<\/th>\n<th>Consulting Use Case Examples<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>cotocus.com<\/td>\n<td>Cloud\/DevOps consulting (verify exact offerings)<\/td>\n<td>DR planning, automation, operational readiness<\/td>\n<td>DR assessment, runbook design, drill execution process<\/td>\n<td>https:\/\/cotocus.com\/<\/td>\n<\/tr>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>DevOps consulting\/training services<\/td>\n<td>CI\/CD + operations automation that supports DR<\/td>\n<td>DR drill automation integration, IAM\/process setup<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>DEVOPSCONSULTING.IN<\/td>\n<td>DevOps consulting (verify service catalog)<\/td>\n<td>Platform operations, reliability practices<\/td>\n<td>Monitoring\/alerting for DR, infrastructure automation<\/td>\n<td>https:\/\/devopsconsulting.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">21. 
Career and Learning Roadmap<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn before this service<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To use Full Stack Disaster Recovery well, build fundamentals in:\n&#8211; OCI basics: compartments, VCN, subnets, routing, security lists\/NSGs\n&#8211; IAM policy writing and least privilege design\n&#8211; Compute and storage basics (instances, block volumes, object storage)\n&#8211; Database replication fundamentals (if your app is database-backed)\n&#8211; DR concepts: RTO, RPO, blast radius, failure domains<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn after this service<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Advanced multi-region traffic management: DNS steering, health checks, application-level failover<\/li>\n<li>Infrastructure as Code: Terraform modules for mirrored regions<\/li>\n<li>Observability for DR operations: SLOs, synthetic checks, alert routing<\/li>\n<li>Chaos engineering principles (carefully applied) to validate failure assumptions<\/li>\n<li>Security hardening for multi-region environments<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Job roles that use it<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cloud Solutions Architect (OCI)<\/li>\n<li>Site Reliability Engineer (SRE)<\/li>\n<li>Platform Engineer<\/li>\n<li>DevOps Engineer<\/li>\n<li>Cloud Operations Engineer<\/li>\n<li>Security Engineer (for governance and audit readiness)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Certification path (if available)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Oracle certification programs evolve frequently. 
For OCI certification pathways, start here and verify current tracks:\n&#8211; Oracle University \/ OCI training and certification: https:\/\/education.oracle.com\/<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project ideas for practice<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Build a two-region \u201chello app\u201d with DNS-based failover and run monthly drills.<\/li>\n<li>Add a database tier with replication and measure RTO\/RPO across drills.<\/li>\n<li>Implement a least-privilege DR operator role and validate plan execution under constrained permissions.<\/li>\n<li>Create a DR readiness dashboard (health checks + replication status + drill history).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">22. Glossary<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>DR (Disaster Recovery)<\/strong>: The process of restoring services after an outage or disaster.<\/li>\n<li><strong>RTO (Recovery Time Objective)<\/strong>: Target time to restore service after an outage.<\/li>\n<li><strong>RPO (Recovery Point Objective)<\/strong>: Maximum acceptable data loss measured in time.<\/li>\n<li><strong>Primary region\/site<\/strong>: The main production environment location.<\/li>\n<li><strong>Standby\/DR region\/site<\/strong>: The secondary environment used for recovery.<\/li>\n<li><strong>Protection group<\/strong>: A logical grouping in Full Stack Disaster Recovery representing the resources of an application stack to recover together.<\/li>\n<li><strong>DR plan<\/strong>: An orchestrated sequence of DR actions (drill\/switchover\/failover).<\/li>\n<li><strong>DR drill<\/strong>: A test execution of DR steps to validate readiness.<\/li>\n<li><strong>Switchover<\/strong>: Planned transition of services from primary to standby.<\/li>\n<li><strong>Failover<\/strong>: Unplanned emergency transition to standby during an outage.<\/li>\n<li><strong>Control plane<\/strong>: The management\/orchestration layer (APIs, console 
operations).<\/li>\n<li><strong>Data plane<\/strong>: The layer where actual data movement and application traffic occur.<\/li>\n<li><strong>Compartment (OCI)<\/strong>: A logical container for organizing and isolating OCI resources.<\/li>\n<li><strong>IAM policy (OCI)<\/strong>: Rules defining who can access which resources and how.<\/li>\n<li><strong>VCN (Virtual Cloud Network)<\/strong>: OCI\u2019s virtual network construct.<\/li>\n<li><strong>NSG (Network Security Group)<\/strong>: Virtual firewall rules (stateful by default, optionally stateless) applied to VNICs\/resources.<\/li>\n<li><strong>Replication lag<\/strong>: Delay between primary writes and standby receiving\/applying changes.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">23. Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Oracle Cloud Full Stack Disaster Recovery<\/strong> is OCI\u2019s orchestration-focused service for building, testing, and executing disaster recovery workflows for complete application stacks. It matters because real DR is a coordinated sequence across networking, compute, storage, and databases\u2014manual runbooks often fail under pressure.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In Oracle Cloud\u2019s <strong>Migration and Disaster Recovery<\/strong> toolbox, Full Stack Disaster Recovery fits as the <strong>automation and orchestration layer<\/strong>, while the underlying OCI services provide the actual replication and standby infrastructure. 
Cost is usually driven less by the orchestration feature and more by <strong>standby compute<\/strong>, <strong>replicated storage<\/strong>, <strong>database replication<\/strong>, and <strong>cross-region data transfer<\/strong>, so optimize by aligning standby design to your RTO\/RPO and practicing right-sized drills.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Security success depends on <strong>least-privilege IAM<\/strong>, controlled network exposure, strong audit\/log retention, and disciplined change management for DR plans.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Next step: open the official OCI documentation search for the latest Full Stack Disaster Recovery docs, confirm supported member types and IAM policies, then implement a small two-region lab and run your first drill:\n&#8211; https:\/\/docs.oracle.com\/search\/<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Migration and Disaster Recovery<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[72,62],"tags":[],"class_list":["post-934","post","type-post","status-publish","format-standard","hentry","category-migration-and-disaster-recovery","category-oracle-cloud"],"_links":{"self":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/934","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/comments?post=934"}],"version-history":[{"count":0,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/934\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.de
vopsschool.com\/tutorials\/wp-json\/wp\/v2\/media?parent=934"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/categories?post=934"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/tags?post=934"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}