{"id":709,"date":"2026-04-15T03:07:42","date_gmt":"2026-04-15T03:07:42","guid":{"rendered":"https:\/\/www.devopsschool.com\/tutorials\/google-cloud-dual-run-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-migration\/"},"modified":"2026-04-15T03:07:42","modified_gmt":"2026-04-15T03:07:42","slug":"google-cloud-dual-run-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-migration","status":"publish","type":"post","link":"https:\/\/www.devopsschool.com\/tutorials\/google-cloud-dual-run-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-migration\/","title":{"rendered":"Google Cloud Dual Run Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Migration"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Category<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Migration<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. Introduction<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What this service is<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Dual Run<\/strong> in Google Cloud Migration is a <strong>migration strategy\/pattern<\/strong> where you run the <strong>legacy (source) system and the new (target) system in parallel<\/strong>, compare outcomes, and then progressively shift production traffic and\/or data processing from old to new with a safe rollback path.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">One-paragraph simple explanation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">If you\u2019re migrating an application or data pipeline to Google Cloud and you\u2019re worried about outages, incorrect results, or missed edge cases, Dual Run lets you keep the old system working while the new system \u201cproves\u201d it can handle real workloads. 
You can start with a small percentage of traffic, validate behavior, and then increase gradually, without a risky big-bang cutover.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">One-paragraph technical explanation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Technically, Dual Run is implemented by running <strong>two production-capable stacks<\/strong> at the same time and controlling <strong>traffic distribution<\/strong>, <strong>data synchronization<\/strong>, and <strong>result validation<\/strong>. In Google Cloud, Dual Run commonly uses tools such as <strong>Cloud Run \/ GKE<\/strong>, <strong>Cloud Load Balancing<\/strong>, <strong>Cloud Logging &amp; Monitoring<\/strong>, <strong>Database Migration Service (DMS)<\/strong> for replication, <strong>Pub\/Sub<\/strong> for event duplication, <strong>Dataflow<\/strong> for parallel pipelines, and <strong>CI\/CD<\/strong> (Cloud Build \/ Cloud Deploy or your existing toolchain) for controlled rollouts. Dual Run is not a single managed \u201cproduct SKU\u201d; it\u2019s a disciplined approach built using Google Cloud services.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What problem it solves<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run solves the hardest part of migration: <strong>confidence at cutover<\/strong>. It reduces:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Downtime risk<\/strong> (move gradually instead of switching instantly)<\/li>\n<li><strong>Correctness risk<\/strong> (compare outputs under real load)<\/li>\n<li><strong>Operational risk<\/strong> (learn performance characteristics in production)<\/li>\n<li><strong>Rollback risk<\/strong> (fail back quickly by shifting traffic back)<\/li>\n<\/ul>\n\n\n\n<blockquote>\n<p>Naming note (important): At the time of writing, <strong>\u201cDual Run\u201d is not a standalone Google Cloud product with its own console page, API, or pricing SKU<\/strong>. It appears in migration guidance as a <strong>parallel-run strategy<\/strong> (sometimes also called <em>parallel run<\/em>). 
If your organization\u2019s migration program uses \u201cDual Run\u201d as an internal phase name or a Google Cloud reference architecture term, confirm the exact scope in the official Google Cloud migration documentation. When in doubt, <strong>verify in official docs<\/strong>.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">2. What is Dual Run?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Official purpose<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The purpose of Dual Run is to enable <strong>safe, measurable, and reversible<\/strong> migrations by operating <strong>old and new systems simultaneously<\/strong> until the new system meets agreed success criteria (SLOs, correctness checks, security controls, and cost\/performance targets).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Core capabilities (what Dual Run enables)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run enables you to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Run two versions of a workload at once<\/strong> (legacy and target)<\/li>\n<li><strong>Split, shape, or duplicate traffic<\/strong> to the new environment (gradual rollout, shadow tests, canary)<\/li>\n<li><strong>Validate results<\/strong> (functional outputs, data integrity, latency, error rate)<\/li>\n<li><strong>Roll back quickly<\/strong> (shift traffic back to legacy if needed)<\/li>\n<li><strong>Cut over safely<\/strong> when confidence thresholds are met<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Major components (typical building blocks on Google Cloud)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Because Dual Run is a pattern, components vary by workload. 
Common building blocks include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Compute runtime(s)<\/strong>: Cloud Run, Google Kubernetes Engine (GKE), Compute Engine, App Engine (legacy), or managed services<\/li>\n<li><strong>Traffic management<\/strong>: Cloud Run traffic splitting, Cloud Load Balancing, service mesh (Cloud Service Mesh \/ Istio) for advanced routing (verify capabilities per product version)<\/li>\n<li><strong>Data sync\/replication<\/strong>: Database Migration Service (DMS), Cloud SQL read replicas (where supported), storage replication patterns, streaming duplication via Pub\/Sub<\/li>\n<li><strong>Observability<\/strong>: Cloud Logging, Cloud Monitoring, Error Reporting, Trace (as applicable)<\/li>\n<li><strong>CI\/CD and release control<\/strong>: Cloud Build, Cloud Deploy, Artifact Registry, and policy guardrails<\/li>\n<li><strong>Security controls<\/strong>: IAM, Secret Manager, VPC, firewall policies, Cloud Audit Logs, organization policies<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Service type<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Type<\/strong>: Migration strategy \/ operating model (not a managed product)<\/li>\n<li><strong>Scope<\/strong>: Applies at workload level; implemented within one or more <strong>Google Cloud projects<\/strong> and environments (dev\/test\/prod)<\/li>\n<li><strong>Regional\/global considerations<\/strong>: Depends on chosen services (for example, Cloud Run is regional; Cloud Load Balancing can be global; databases are regional with replicas depending on product)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">How it fits into the Google Cloud ecosystem<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run is typically used as one phase of a broader Google Cloud migration program:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Assess and plan<\/strong> (inventory, dependency mapping, landing zone)<\/li>\n<li><strong>Build foundations<\/strong> (networking, IAM, logging, org policies)<\/li>\n<li><strong>Migrate 
and modernize<\/strong> (move workloads and data)<\/li>\n<li><strong>Dual Run<\/strong> (parallel operations + validation)<\/li>\n<li><strong>Cutover and decommission<\/strong> (final shift, retire legacy)<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Google Cloud\u2019s architecture guidance and migration best practices (see Architecture Center) commonly recommend progressive cutovers and validation mechanisms; Dual Run is the phase in which those controls are actually exercised.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">3. Why use Dual Run?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Business reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Reduce revenue risk<\/strong>: Avoid long outages or severe incidents during migration.<\/li>\n<li><strong>Protect customer trust<\/strong>: Roll out changes gradually, detect issues early.<\/li>\n<li><strong>Lower migration program risk<\/strong>: Replace \u201cone big date\u201d with measurable gates.<\/li>\n<li><strong>Enable stakeholder confidence<\/strong>: Provide objective proof (metrics, comparisons) before cutover.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Technical reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Correctness under real production data<\/strong>: Staging rarely matches production edge cases.<\/li>\n<li><strong>Performance profiling<\/strong>: Validate latency, throughput, and scaling behavior under actual load.<\/li>\n<li><strong>Compatibility and integration testing<\/strong>: Confirm upstream\/downstream systems behave correctly.<\/li>\n<li><strong>Safer data transitions<\/strong>: Validate replication, schema changes, and consistency.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operational reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Incremental rollout<\/strong>: Slowly increase traffic and watch SLOs.<\/li>\n<li><strong>Faster rollback<\/strong>: Shift traffic back without redeploying or restoring 
backups (in many patterns).<\/li>\n<li><strong>Operational learning<\/strong>: Build runbooks and on-call readiness while legacy remains a safety net.<\/li>\n<li><strong>Controlled decommissioning<\/strong>: Retire legacy components only after proof.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/compliance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Validate security controls<\/strong>: Ensure IAM, encryption, audit logging, and network policies work in production.<\/li>\n<li><strong>Evidence collection<\/strong>: Dual Run creates measurable validation artifacts (logs, dashboards, change records).<\/li>\n<li><strong>Regulated change management<\/strong>: Supports phased approvals and risk reduction strategies.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scalability\/performance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Progressively scale<\/strong>: Confirm autoscaling and capacity planning.<\/li>\n<li><strong>Reduce blast radius<\/strong>: Initial small traffic share limits impact if issues exist.<\/li>\n<li><strong>Tune caching and database performance<\/strong>: Identify bottlenecks before full cutover.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should choose it<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Choose Dual Run when:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You need a <strong>high-confidence<\/strong> migration and have limited risk tolerance.<\/li>\n<li>Downtime is expensive and <strong>rollback must be fast<\/strong>.<\/li>\n<li>The workload is complex or business-critical.<\/li>\n<li>You can afford the temporary cost of running both systems.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should not choose it<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Avoid or limit Dual Run when:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The workload is <strong>simple and low risk<\/strong> (a short maintenance window cutover is fine).<\/li>\n<li>Running two systems in parallel creates <strong>data consistency hazards<\/strong> you can\u2019t mitigate.<\/li>\n<li>
The budget cannot support parallel operations.<\/li>\n<li>The legacy system cannot be safely operated in parallel (licensing, capacity, or compliance constraints).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">4. Where is Dual Run used?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Industries<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run is common in environments where downtime or incorrect outcomes are costly:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Financial services (payments, trading, risk)<\/li>\n<li>Healthcare (clinical systems, patient portals)<\/li>\n<li>Retail\/e-commerce (checkout, inventory)<\/li>\n<li>Media\/streaming (content delivery, billing)<\/li>\n<li>SaaS and B2B platforms (multi-tenant workloads)<\/li>\n<li>Manufacturing\/logistics (ERP integration, IoT pipelines)<\/li>\n<li>Public sector (citizen services with strict change control)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Team types<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Platform engineering teams building migration factories<\/li>\n<li>DevOps\/SRE teams responsible for production stability<\/li>\n<li>Application teams modernizing services<\/li>\n<li>Data engineering teams migrating ETL\/ELT pipelines<\/li>\n<li>Security and compliance teams validating controls<\/li>\n<li>Enterprise architects coordinating multi-system cutovers<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Workloads<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>HTTP APIs and web apps<\/li>\n<li>Event-driven systems (Pub\/Sub, Kafka migrations)<\/li>\n<li>Batch and streaming data pipelines<\/li>\n<li>Databases (especially when changing engines or versions)<\/li>\n<li>Identity and authentication flows (careful: dual run can introduce subtle issues)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Architectures<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Microservices (dual run per service)<\/li>\n<li>Monolith-to-microservices (strangler patterns + dual run validation)<\/li>\n<li>Hybrid (on-prem + 
Google Cloud)<\/li>\n<li>Multi-region or active-active (advanced; requires careful data strategy)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Real-world deployment contexts<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Production Dual Run<\/strong>: the most valuable\u2014real load validates correctness.<\/li>\n<li><strong>Pre-production Dual Run<\/strong>: lower risk but less representative.<\/li>\n<li><strong>Dev\/test Dual Run<\/strong>: useful for tooling and automation rehearsal.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">5. Top Use Cases and Scenarios<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Below are realistic Dual Run scenarios on Google Cloud. Each example highlights the problem, why Dual Run fits, and a short scenario.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Web API migration with gradual traffic shift<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: A critical API must move to Cloud Run with minimal risk.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Split traffic between old and new, monitor errors\/latency, roll back instantly.<\/li>\n<li><strong>Example<\/strong>: Start at 1% traffic to the Cloud Run revision, then ramp to 10\/50\/100%.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2) Monolith decomposition using parallel validation<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: You\u2019re extracting a \u201cbilling\u201d module into a microservice, but outputs must match.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Run old module and new service concurrently and compare results.<\/li>\n<li><strong>Example<\/strong>: The monolith still produces invoices, while the new service produces invoices in parallel for comparison.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3) Database engine migration with read-only dual run<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Migrating 
from self-managed PostgreSQL to Cloud SQL; you want to ensure query performance and correctness.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Replicate data to Cloud SQL and direct read-only queries to the new database first.<\/li>\n<li><strong>Example<\/strong>: Move reporting dashboards to Cloud SQL reads while writes stay on legacy until stable.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4) Event streaming migration (Kafka to Pub\/Sub)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Changing event backbone without losing events or breaking consumers.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Duplicate publishing to both systems temporarily; migrate consumers gradually.<\/li>\n<li><strong>Example<\/strong>: Producers publish to Kafka and Pub\/Sub; consumers are moved one by one.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Data pipeline modernization (on-prem Spark to Dataflow)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: A nightly pipeline must produce identical aggregates.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Run both pipelines for days\/weeks, compare outputs before decommission.<\/li>\n<li><strong>Example<\/strong>: Dataflow writes to a parallel BigQuery dataset; results are compared with legacy outputs.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">6) Authentication service migration (with strict rollback)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Moving auth to a new identity provider integration.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Start with a small segment of users; roll back quickly if login issues appear.<\/li>\n<li><strong>Example<\/strong>: Route 5% of login traffic to new auth flow; monitor success rate and latency.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">7) Multi-region failover rehearsal during migration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Migrating while 
also improving resiliency.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Keep legacy as fallback while testing multi-region routing.<\/li>\n<li><strong>Example<\/strong>: New stack runs in two regions; legacy remains available for rollback.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">8) SaaS feature flag rollout during migration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: New platform changes behavior; customers should opt in gradually.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Parallel run combined with feature flags provides controlled exposure.<\/li>\n<li><strong>Example<\/strong>: Premium tenants are migrated first, with quick opt-out.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">9) Legacy queue migration (RabbitMQ to Pub\/Sub)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Message semantics differ; you need to confirm ordering\/duplication handling.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Run both queues; verify consumer idempotency.<\/li>\n<li><strong>Example<\/strong>: Consumers read from Pub\/Sub with dedup logic while RabbitMQ remains primary.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">10) Network perimeter migration (on-prem ingress to Cloud Load Balancing)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Changing edge routing can break clients, TLS, or headers.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Gradual DNS\/traffic shift reduces risk and reveals edge-case clients.<\/li>\n<li><strong>Example<\/strong>: Weighted routing moves 10% of traffic to Google Cloud Load Balancing, then increases.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">11) Storage migration with parallel reads<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Moving from on-prem object storage to Cloud Storage; applications must continue serving files.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Copy data and read from both; 
compare checksums and access patterns.<\/li>\n<li><strong>Example<\/strong>: App reads from Cloud Storage first, falls back to legacy if missing, until fully synchronized.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">12) FinOps validation during migration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: New architecture must meet cost constraints.<\/li>\n<li><strong>Why Dual Run fits<\/strong>: Run both stacks and measure real cost\/performance before committing.<\/li>\n<li><strong>Example<\/strong>: Compare Cloud Run cost vs GKE cost for same traffic profile before finalizing.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">6. Core Features<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Because Dual Run is a strategy, \u201cfeatures\u201d are best understood as <strong>capabilities<\/strong> you implement using Google Cloud services. The list below focuses on what matters most in real migrations.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Parallel production operation<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Runs legacy and target systems at the same time.<\/li>\n<li><strong>Why it matters<\/strong>: Enables real-world validation without stopping the old system.<\/li>\n<li><strong>Practical benefit<\/strong>: Reduced cutover risk and easier rollback.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Costs can double temporarily; requires careful operational discipline.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2) Controlled traffic shifting (progressive delivery)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Moves traffic in increments (1% \u2192 10% \u2192 50% \u2192 100%).<\/li>\n<li><strong>Why it matters<\/strong>: Limits blast radius and surfaces issues early.<\/li>\n<li><strong>Practical benefit<\/strong>: Safer than big-bang cutovers.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Requires 
routing control (Cloud Run traffic splitting, Cloud Load Balancing, service mesh, or DNS strategies).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3) Shadow testing \/ request duplication (when applicable)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Sends production requests to the new system without impacting user responses (the legacy response is still returned).<\/li>\n<li><strong>Why it matters<\/strong>: Validates correctness under real traffic before serving users.<\/li>\n<li><strong>Practical benefit<\/strong>: Finds subtle correctness\/performance issues.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Not always supported at the load balancer layer; often requires service mesh\/proxy patterns. <strong>Verify in official docs<\/strong> for your chosen routing layer.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4) Data replication and synchronization<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Keeps target data store in sync with legacy (e.g., DMS replication).<\/li>\n<li><strong>Why it matters<\/strong>: Prevents stale data and enables read traffic shift.<\/li>\n<li><strong>Practical benefit<\/strong>: Enables phased read\/write migration strategies.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Replication lag, schema drift, and compatibility differences can be significant.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Dual-write or write-forward patterns (advanced)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Writes to both old and new systems during a transition.<\/li>\n<li><strong>Why it matters<\/strong>: Supports zero-downtime write cutovers in some cases.<\/li>\n<li><strong>Practical benefit<\/strong>: Enables faster final cutover.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Risky if not idempotent; can create divergence if one write fails.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">6) Automated 
validation and diffing<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Compares outputs, database rows, aggregates, or API responses.<\/li>\n<li><strong>Why it matters<\/strong>: Correctness is the #1 migration risk.<\/li>\n<li><strong>Practical benefit<\/strong>: Objective go\/no-go criteria.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Requires defining \u201cequivalence\u201d (e.g., timestamps, ordering, floating point rounding).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">7) Observability and SLO gating<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Uses metrics\/logs\/traces to determine if the new system is healthy enough to scale traffic.<\/li>\n<li><strong>Why it matters<\/strong>: Prevents \u201chope-based\u201d cutovers.<\/li>\n<li><strong>Practical benefit<\/strong>: Faster detection and safer ramp-ups.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Logging can become expensive; dashboards must be designed intentionally.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">8) Rapid rollback<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Shifts traffic back to legacy quickly if issues are detected.<\/li>\n<li><strong>Why it matters<\/strong>: Reduces MTTR during migration.<\/li>\n<li><strong>Practical benefit<\/strong>: Avoids emergency redeploys and complex restores.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Rollback may be complicated if dual writes already occurred.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">9) Environment and configuration isolation<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Separates configs, secrets, and IAM so both stacks can run safely.<\/li>\n<li><strong>Why it matters<\/strong>: Prevents accidental cross-environment access.<\/li>\n<li><strong>Practical benefit<\/strong>: Cleaner governance and reduced security risk.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: 
Requires consistent naming\/tagging and policy controls.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">10) Release orchestration and approvals<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Uses CI\/CD to enforce repeatable deployments and approval gates.<\/li>\n<li><strong>Why it matters<\/strong>: Dual Run often lasts weeks; manual steps introduce risk.<\/li>\n<li><strong>Practical benefit<\/strong>: Repeatability and auditability.<\/li>\n<li><strong>Limitations\/caveats<\/strong>: Tooling integration takes effort; avoid over-automation without guardrails.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">7. Architecture and How It Works<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">High-level architecture<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run has two \u201clanes\u201d:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Legacy lane<\/strong>: current production system (on-prem or older platform)<\/li>\n<li><strong>Target lane<\/strong>: new system in Google Cloud<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Traffic and\/or data flows are controlled so you can:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Start with low risk (shadow or small percentage)<\/li>\n<li>Validate (correctness + SLOs)<\/li>\n<li>Ramp up<\/li>\n<li>Cut over and decommission legacy<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Request\/data\/control flow<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Typical flow for an API migration:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Clients send requests to an entry point (DNS, load balancer, API gateway).<\/li>\n<li>Routing splits traffic between the legacy and new systems.<\/li>\n<li>Observability captures request metrics and logs from both.<\/li>\n<li>Validation compares outcomes; errors trigger alerts.<\/li>\n<li>CI\/CD promotes a new target release or rolls back.<\/li>\n<li>Once success criteria are met, routing shifts fully to the new system.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">For data migrations:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Data replicates from the legacy DB to the target DB (e.g., via DMS).<\/li>\n<li>Read traffic shifts to the target first.<\/li>\n<li>
Write cutover happens when replication lag is acceptable and the application is ready.<\/li>\n<li>Legacy becomes read-only, then is decommissioned.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Integrations with related services<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Common Google Cloud integrations:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Cloud Run \/ GKE \/ Compute Engine<\/strong> for runtime<\/li>\n<li><strong>Cloud Load Balancing<\/strong> for ingress and traffic management (or Cloud Run native traffic splitting)<\/li>\n<li><strong>Cloud Logging \/ Cloud Monitoring<\/strong> for observability<\/li>\n<li><strong>Secret Manager<\/strong> for secrets<\/li>\n<li><strong>Cloud KMS<\/strong> for encryption keys (when needed)<\/li>\n<li><strong>Database Migration Service<\/strong> for DB replication<\/li>\n<li><strong>Pub\/Sub<\/strong> for event duplication and decoupling<\/li>\n<li><strong>Artifact Registry<\/strong> for container images<\/li>\n<li><strong>Cloud Build \/ Cloud Deploy<\/strong> for CI\/CD<\/li>\n<li><strong>VPC \/ Cloud VPN \/ Cloud Interconnect<\/strong> for hybrid connectivity<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Dependency services<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run depends on whichever services implement:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Routing\/traffic control<\/li>\n<li>Compute runtime(s)<\/li>\n<li>Data sync method(s)<\/li>\n<li>Logging\/metrics and alerting<\/li>\n<li>IAM, networking, secrets, and policy guardrails<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/authentication model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>IAM<\/strong> governs who can deploy, route traffic, view logs, and manage data replication.<\/li>\n<li>Workloads typically use <strong>service accounts<\/strong> with least privilege.<\/li>\n<li>Network controls (VPC, firewall policies, Private Service Connect, VPC connectors) reduce exposure.<\/li>\n<li><strong>Cloud Audit Logs<\/strong> record admin and data access events for many services (verify for each service).<\/li>\n<\/ul>\n\n\n\n<h3 
class=\"wp-block-heading\">Networking model<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run can be:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Internet-facing<\/strong> (external clients, public endpoints)<\/li>\n<li><strong>Private<\/strong> (internal clients, internal load balancing, private service access)<\/li>\n<li><strong>Hybrid<\/strong> (on-prem + Google Cloud over VPN\/Interconnect)<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">A key networking design decision is whether legacy and target can be reached via a <strong>single entry point<\/strong> (ideal for controlled routing) or require a DNS-based split (more limited control).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Monitoring\/logging\/governance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Define migration SLOs (error rate, p95 latency, throughput, correctness checks).<\/li>\n<li>Create dashboards per lane and per version\/revision.<\/li>\n<li>Decide log retention and sampling to manage cost.<\/li>\n<li>Use labels\/tags consistently for cost allocation and governance.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Simple architecture diagram (Mermaid)<\/h3>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart LR\n  U[Users\/Clients] --&gt; R[Traffic Split \/ Router]\n  R --&gt; L[Legacy Service]\n  R --&gt; N[New Service on Google Cloud]\n  L --&gt; O[Logs &amp; Metrics]\n  N --&gt; O[Logs &amp; Metrics]\n  O --&gt; G[&quot;Go\/No-Go Gates\\n(SLOs + Validation)&quot;]\n  G --&gt;|Increase traffic| R\n  G --&gt;|Rollback| R\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Production-style architecture diagram (Mermaid)<\/h3>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart TB\n  subgraph Clients\n    C1[Web\/Mobile Clients]\n    C2[Partner Systems]\n  end\n\n  subgraph Edge\n    DNS[Cloud DNS \/ External DNS]\n    LB[&quot;Cloud Load Balancing\\n(or Cloud Run Traffic Split)&quot;]\n  end\n\n  subgraph Legacy\n    LEGAPP[&quot;Legacy App\\n(on-prem \/ old platform)&quot;]\n    
LEGDB[Legacy DB]\n  end\n\n  subgraph GoogleCloud[&quot;Google Cloud Project(s)&quot;]\n    direction TB\n\n    subgraph Runtime\n      NEWAPP[New App\\nCloud Run or GKE]\n      AR[Artifact Registry]\n      CICD[Cloud Build \/ Cloud Deploy]\n    end\n\n    subgraph Data\n      DMS[&quot;Database Migration Service\\n(replication)&quot;]\n      NEWDB[&quot;Cloud SQL \/ AlloyDB \/ Spanner\\n(target)&quot;]\n      PS[&quot;Pub\/Sub\\n(optional event duplication)&quot;]\n    end\n\n    subgraph Observability\n      LOG[Cloud Logging]\n      MON[Cloud Monitoring]\n      ALERT[Alerting + SLOs]\n    end\n\n    subgraph Security\n      IAM[IAM + Service Accounts]\n      SM[Secret Manager]\n      KMS[&quot;Cloud KMS\\n(optional)&quot;]\n      VPC[VPC + Connectivity\\nVPN\/Interconnect]\n    end\n  end\n\n  C1 --&gt; DNS --&gt; LB\n  C2 --&gt; DNS --&gt; LB\n\n  LB --&gt;|x%| LEGAPP\n  LB --&gt;|y%| NEWAPP\n\n  LEGAPP --&gt; LEGDB\n  DMS --&gt; NEWDB\n  LEGDB --&gt; DMS\n\n  NEWAPP --&gt; NEWDB\n  NEWAPP --&gt; PS\n\n  LEGAPP --&gt; LOG\n  NEWAPP --&gt; LOG\n  NEWDB --&gt; LOG\n  LOG --&gt; MON --&gt; ALERT\n\n  CICD --&gt; AR --&gt; NEWAPP\n\n  IAM --- NEWAPP\n  SM --- NEWAPP\n  VPC --- NEWAPP\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">8. Prerequisites<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Because Dual Run is a pattern, prerequisites depend on the chosen implementation. 
For the hands-on lab in this tutorial (Dual Run using <strong>Cloud Run traffic splitting<\/strong>), you need:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Account\/project requirements<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A Google Cloud account with access to create or use a <strong>Google Cloud project<\/strong><\/li>\n<li><strong>Billing enabled<\/strong> on the project (Cloud Run and build operations require billing)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Permissions \/ IAM roles<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Minimum suggested roles for the lab (project-level unless noted):\n&#8211; <strong>Cloud Run Admin<\/strong> (<code>roles\/run.admin<\/code>)\n&#8211; <strong>Service Account User<\/strong> (<code>roles\/iam.serviceAccountUser<\/code>) on the runtime service account\n&#8211; <strong>Cloud Build Editor<\/strong> (<code>roles\/cloudbuild.builds.editor<\/code>) or permissions to run builds\n&#8211; <strong>Logs Viewer<\/strong> (<code>roles\/logging.viewer<\/code>) for validation in Cloud Logging<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In production, split these duties across separate personas and use least privilege.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">CLI\/SDK\/tools needed<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/cloud.google.com\/sdk\/docs\/install\">Google Cloud CLI (<code>gcloud<\/code>)<\/a><\/li>\n<li>A local terminal with <code>curl<\/code> and Python 3 (the lab pipes responses through <code>python<\/code> to format and parse JSON)<\/li>\n<li>Optional: <code>jq<\/code> (for JSON formatting) and a load-generation tool such as <code>hey<\/code><\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Region availability<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cloud Run is <strong>regional<\/strong>. 
Choose a region where Cloud Run is available.<\/li>\n<li>Verify current Cloud Run locations: https:\/\/cloud.google.com\/run\/docs\/locations<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Quotas\/limits<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cloud Run has quotas for services, revisions, requests, and CPU\/memory per region.<\/li>\n<li>Cloud Logging has ingestion and retention considerations that can affect cost.<\/li>\n<li>Always check quotas in <strong>IAM &amp; Admin \u2192 Quotas<\/strong> and service-specific quota docs. <strong>Verify in official docs<\/strong>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisite services\/APIs<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Enable these APIs in the project (the lab will do this via <code>gcloud<\/code>):\n&#8211; Cloud Run Admin API\n&#8211; Cloud Build API\n&#8211; Artifact Registry API (optional, depending on build approach)\n&#8211; Cloud Logging API (typically enabled by default)<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">9. Pricing \/ Cost<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing model (accurate framing)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Dual Run itself has no direct price<\/strong> because it is not a separately billed Google Cloud product. 
The cost comes from <strong>running two environments in parallel<\/strong> and from the services you use to route traffic, replicate data, and observe results.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing dimensions (what you pay for)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Common cost dimensions in a Dual Run migration include:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>Compute costs (duplicated during Dual Run)<\/strong>\n   &#8211; Cloud Run: billed by request, CPU\/memory time, and networking (see official pricing).\n   &#8211; GKE: cluster management + node compute + networking.\n   &#8211; Compute Engine: VM hours, disks, load balancers, etc.<\/p>\n<\/li>\n<li>\n<p><strong>Traffic management<\/strong>\n   &#8211; Cloud Load Balancing: typically billed by forwarding rules, data processed, and sometimes additional features (SKU-specific).<br\/>\n     Pricing: https:\/\/cloud.google.com\/load-balancing\/pricing<\/p>\n<\/li>\n<li>\n<p><strong>Data replication and storage<\/strong>\n   &#8211; Database Migration Service: pricing depends on source\/target and replication approach.<br\/>\n     DMS docs\/pricing: verify in official docs (start here: https:\/\/cloud.google.com\/database-migration)\n   &#8211; Cloud SQL\/AlloyDB\/Spanner: instance size, storage, I\/O, backups, replicas.<\/p>\n<\/li>\n<li>\n<p><strong>Observability<\/strong>\n   &#8211; Cloud Logging: ingestion volume, retention beyond included amounts, log-based metrics.<br\/>\n     Pricing: https:\/\/cloud.google.com\/logging\/pricing\n   &#8211; Cloud Monitoring: metrics volume, uptime checks, alerting policies (see pricing).<br\/>\n     Pricing: https:\/\/cloud.google.com\/monitoring\/pricing<\/p>\n<\/li>\n<li>\n<p><strong>Network egress<\/strong>\n   &#8211; Cross-region traffic, internet egress, and hybrid connectivity egress can be major cost drivers.\n   &#8211; Hybrid connectivity (Cloud VPN \/ Cloud Interconnect) has its own pricing 
model.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Free tier (if applicable)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Some services (notably Cloud Run and Cloud Logging) may include free usage tiers or included allocations depending on account and region. These details change over time\u2014<strong>verify in official pricing pages<\/strong>:\n&#8211; Cloud Run pricing: https:\/\/cloud.google.com\/run\/pricing\n&#8211; Cloud Logging pricing: https:\/\/cloud.google.com\/logging\/pricing<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Biggest cost drivers in Dual Run<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Running two stacks<\/strong> at \u201cproduction readiness\u201d capacity<\/li>\n<li><strong>Increased logging\/metrics<\/strong> due to parallel validation<\/li>\n<li><strong>Data replication<\/strong> (continuous replication, additional replicas)<\/li>\n<li><strong>Network egress<\/strong> between legacy and cloud or between regions<\/li>\n<li><strong>Duplicate third-party licensing<\/strong> (legacy software + new platform, if applicable)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Hidden\/indirect costs to plan for<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Extended Dual Run duration (weeks\/months) due to validation or org approvals<\/li>\n<li>Engineering time building validation harnesses and runbooks<\/li>\n<li>Incident response load (two systems to operate)<\/li>\n<li>Additional QA requirements in regulated environments<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">How to optimize cost<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Keep Dual Run duration as short as risk allows\u2014define success criteria early.<\/li>\n<li>Start with <strong>shadow tests or small traffic<\/strong>, then ramp up deliberately.<\/li>\n<li>Implement <strong>log sampling<\/strong> and structured logs to reduce ingestion.<\/li>\n<li>Use <strong>budgets and alerts<\/strong>; separate cost centers via labels.<\/li>\n<li>Use 
<strong>right-sized resources<\/strong> for the \u201cnew\u201d system early; avoid overprovisioning.<\/li>\n<li>If hybrid traffic is expensive, keep validation data local where feasible or use private connectivity efficiently.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Example low-cost starter estimate (qualitative)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A small lab-style Dual Run for a web service might cost primarily:\n&#8211; Cloud Run requests and compute time (often low for minimal load)\n&#8211; Cloud Build minutes for a few builds\n&#8211; Cloud Logging ingestion for test requests<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Exact cost depends on region and usage. Use:\n&#8211; Google Cloud Pricing Calculator: https:\/\/cloud.google.com\/products\/calculator<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Example production cost considerations<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In production Dual Run, expect:\n&#8211; Near <strong>2\u00d7 compute<\/strong> (legacy + new), at least for the migrated component\n&#8211; Additional database replicas or replication instances\n&#8211; Increased logging\/metrics volume during validation\n&#8211; Possible additional load balancing or connectivity costs<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A practical FinOps rule: before starting, estimate <strong>cost per day of Dual Run<\/strong> and multiply by an expected duration + buffer. This avoids surprise overruns.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">10. Step-by-Step Hands-On Tutorial<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">This lab demonstrates Dual Run using <strong>Cloud Run traffic splitting<\/strong>. 
It\u2019s a realistic, low-risk way to practice parallel operation and progressive cutover for a web service.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Objective<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Deploy a Cloud Run service with two revisions (v1 and v2), run them <strong>in parallel<\/strong>, split traffic (e.g., 90\/10), validate using responses and Cloud Logging, then cut over (100% to v2) and learn how to roll back.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Lab Overview<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You will:\n1. Set up a Google Cloud project and enable APIs.\n2. Build and deploy a simple containerized web service to Cloud Run (revision v1).\n3. Deploy an updated revision (v2) without sending traffic.\n4. Configure <strong>traffic splitting<\/strong> between v1 and v2 (Dual Run).\n5. Generate requests and verify distribution and logs.\n6. Cut over to v2 (and optionally roll back).\n7. Clean up resources.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Create\/select a project, set region, and enable APIs<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">1) In your terminal, authenticate and set a project:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud auth login\ngcloud config set project YOUR_PROJECT_ID\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">2) Choose a region (example: <code>us-central1<\/code>) and set it:<\/p>\n\n\n\n<pre><code class=\"language-bash\">export REGION=\"us-central1\"\ngcloud config set run\/region \"$REGION\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">3) Enable required APIs:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud services enable \\\n  run.googleapis.com \\\n  cloudbuild.googleapis.com \\\n  artifactregistry.googleapis.com \\\n  logging.googleapis.com\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; APIs enable successfully (may take a minute).\n&#8211; Your 
<code>gcloud<\/code> context points to the correct project and region.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Create a minimal Cloud Run app (containerized)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Create a new folder and add these files.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) <code>main.py<\/code><\/p>\n\n\n\n<pre><code class=\"language-python\">import os\nimport time\nfrom flask import Flask, request, jsonify\n\napp = Flask(__name__)\n\n@app.get(\"\/\")\ndef root():\n    version = os.environ.get(\"APP_VERSION\", \"unknown\")\n    # Correlation ID for tracing across systems (client can send one too)\n    rid = request.headers.get(\"X-Request-Id\", f\"auto-{int(time.time() * 1000)}\")\n    return jsonify({\n        \"service\": \"dualrun-demo\",\n        \"version\": version,\n        \"request_id\": rid\n    })\n\nif __name__ == \"__main__\":\n    app.run(host=\"0.0.0.0\", port=int(os.environ.get(\"PORT\", \"8080\")))\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">2) <code>requirements.txt<\/code><\/p>\n\n\n\n<pre><code class=\"language-txt\">Flask==3.0.3\ngunicorn==22.0.0\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">3) <code>Dockerfile<\/code><\/p>\n\n\n\n<pre><code class=\"language-dockerfile\">FROM python:3.12-slim\n\nWORKDIR \/app\nCOPY requirements.txt .\nRUN pip install --no-cache-dir -r requirements.txt\n\nCOPY main.py .\n\nENV PORT=8080\nCMD [\"gunicorn\", \"-b\", \":8080\", \"main:app\"]\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; You have a buildable container that returns JSON including a <code>version<\/code>.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Build the container image with Cloud Build<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Use Artifact Registry (recommended) for images.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) Create an Artifact Registry 
repository (one-time):<\/p>\n\n\n\n<pre><code class=\"language-bash\">export REPO=\"dualrun-repo\"\ngcloud artifacts repositories create \"$REPO\" \\\n  --repository-format=docker \\\n  --location=\"$REGION\" \\\n  --description=\"Images for Dual Run lab\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">2) Configure Docker auth for Artifact Registry:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud auth configure-docker \"$REGION-docker.pkg.dev\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">3) Build and push the image:<\/p>\n\n\n\n<pre><code class=\"language-bash\">export PROJECT_ID=\"$(gcloud config get-value project)\"\nexport IMAGE=\"$REGION-docker.pkg.dev\/$PROJECT_ID\/$REPO\/dualrun-demo:latest\"\n\ngcloud builds submit --tag \"$IMAGE\" .\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Cloud Build completes successfully.\n&#8211; An image named <code>dualrun-demo:latest<\/code> exists in Artifact Registry.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Deploy revision v1 (legacy) to Cloud Run<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Deploy the service and tag the revision as <code>v1<\/code>.<\/p>\n\n\n\n<pre><code class=\"language-bash\">export SERVICE=\"dualrun-demo\"\n\ngcloud run deploy \"$SERVICE\" \\\n  --image \"$IMAGE\" \\\n  --allow-unauthenticated \\\n  --set-env-vars \"APP_VERSION=v1\" \\\n  --tag \"v1\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Fetch the service URL:<\/p>\n\n\n\n<pre><code class=\"language-bash\">export URL=\"$(gcloud run services describe \"$SERVICE\" --format='value(status.url)')\"\necho \"$URL\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Test:<\/p>\n\n\n\n<pre><code class=\"language-bash\">curl -s \"$URL\" | python -m json.tool\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Response includes <code>\"version\": 
\"v1\"<\/code>.\n&#8211; Service is reachable via the Cloud Run URL.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Deploy revision v2 (new) with NO traffic (Dual Run preparation)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Deploy a second revision, tag it as <code>v2<\/code>, but do not send traffic yet:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud run deploy \"$SERVICE\" \\\n  --image \"$IMAGE\" \\\n  --allow-unauthenticated \\\n  --set-env-vars \"APP_VERSION=v2\" \\\n  --tag \"v2\" \\\n  --no-traffic\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">List revisions:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud run revisions list --service \"$SERVICE\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Two revisions exist for the same service.\n&#8211; v2 exists but receives 0% traffic.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 6: Start Dual Run by splitting traffic (e.g., 90% v1, 10% v2)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Update traffic:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud run services update-traffic \"$SERVICE\" \\\n  --to-tags \"v1=90,v2=10\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Confirm traffic:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud run services describe \"$SERVICE\" \\\n  --format=\"table(status.traffic[].tag,status.traffic[].percent,status.traffic[].revisionName)\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; The service routes ~90% of requests to v1 and ~10% to v2.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 7: Generate traffic and observe version distribution<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Send 50 requests and count versions:<\/p>\n\n\n\n<pre><code class=\"language-bash\">for i in $(seq 1 50); do\n  curl -s -H 
\"X-Request-Id: req-$i\" \"$URL\" | python -c \"import sys, json; print(json.load(sys.stdin)['version'])\"\ndone | sort | uniq -c\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Counts should be roughly 45 responses from v1 and 5 from v2 (not exact due to randomness and small sample size).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 8: Validate with Cloud Logging (revision-level visibility)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In Google Cloud Console:\n&#8211; Go to <strong>Logging \u2192 Logs Explorer<\/strong>\n&#8211; Use a query like:<\/p>\n\n\n\n<pre><code class=\"language-txt\">resource.type=\"cloud_run_revision\"\nresource.labels.service_name=\"dualrun-demo\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">To filter by revision tag, use the revision name shown in the traffic table output (Cloud Run logs label revisions). For example:<\/p>\n\n\n\n<pre><code class=\"language-txt\">resource.type=\"cloud_run_revision\"\nresource.labels.service_name=\"dualrun-demo\"\nresource.labels.revision_name=\"YOUR_REVISION_NAME\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; You can see request logs for both revisions.\n&#8211; You can correlate by <code>X-Request-Id<\/code> if included in logs (headers may or may not be logged by default). 
If you need deeper request correlation, implement structured logging in the app.<\/p>\n\n\n\n<blockquote>\n<p>Tip: For real Dual Run migrations, define explicit validation signals:\n&#8211; Error rate (5xx)\n&#8211; p95 latency\n&#8211; Business correctness checks (domain-specific)<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 9: Cut over to v2 (100% traffic) and keep a rollback option<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Cut over:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud run services update-traffic \"$SERVICE\" \\\n  --to-tags \"v2=100\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Verify:<\/p>\n\n\n\n<pre><code class=\"language-bash\">curl -s \"$URL\" | python -m json.tool\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Responses show <code>\"version\": \"v2\"<\/code> consistently.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Optional rollback practice<\/strong>\nIf v2 had issues, roll back by shifting all traffic to v1:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud run services update-traffic \"$SERVICE\" \\\n  --to-tags \"v1=100\"\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Validation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Use this checklist:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) <strong>Traffic split configured<\/strong>\n&#8211; Confirm with:\n  <code>gcloud run services describe \"$SERVICE\" --format=\"table(status.traffic[].tag,status.traffic[].percent)\"<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) <strong>Parallel runtime<\/strong>\n&#8211; Confirm two revisions exist:\n  <code>gcloud run revisions list --service \"$SERVICE\"<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) <strong>Functional validation<\/strong>\n&#8211; Confirm both versions respond:\n  &#8211; <code>curl \"$URL\"<\/code> repeatedly shows both versions during the 
90\/10 stage.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) <strong>Observability<\/strong>\n&#8211; Logs Explorer shows entries for both revisions.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Troubleshooting<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Issue: <code>PERMISSION_DENIED<\/code> when deploying<\/strong>\n&#8211; Fix: ensure you have <code>roles\/run.admin<\/code> on the project and <code>roles\/iam.serviceAccountUser<\/code> on the runtime service account.\n&#8211; Also confirm billing is enabled.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Issue: Cloud Build fails to push to Artifact Registry<\/strong>\n&#8211; Fix: ensure the Artifact Registry API is enabled and Docker auth is configured:\n  <code>gcloud services enable artifactregistry.googleapis.com<\/code>\n  <code>gcloud auth configure-docker \"$REGION-docker.pkg.dev\"<\/code>\n&#8211; Also confirm that the repository exists in <code>$REGION<\/code> and that the service account Cloud Build runs as can write to it (for example, via <code>roles\/artifactregistry.writer<\/code>).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Issue: 401\/403 errors accessing the service<\/strong>\n&#8211; If you removed <code>--allow-unauthenticated<\/code>, you must call with authentication.\n&#8211; For a public lab, inspect the IAM policy with\n  <code>gcloud run services get-iam-policy dualrun-demo<\/code>\n  and confirm that <code>allUsers<\/code> has <code>roles\/run.invoker<\/code> (public). For production, do <strong>not<\/strong> use public access unless required.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Issue: Traffic splitting doesn\u2019t seem to match percentages<\/strong>\n&#8211; Small sample sizes fluctuate. 
Increase request count.\n&#8211; Ensure your traffic update succeeded and that caches aren\u2019t masking results.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Cleanup<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To avoid ongoing charges, delete resources:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) Delete the Cloud Run service:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud run services delete \"$SERVICE\" --region \"$REGION\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">2) Delete the Artifact Registry repository (and images):<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud artifacts repositories delete \"$REPO\" --location \"$REGION\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">3) (Optional) If this was a dedicated lab project, delete the project (most thorough cleanup):\n&#8211; Console: <strong>IAM &amp; Admin \u2192 Manage resources \u2192 Delete project<\/strong>\n&#8211; Or via CLI (use with caution):\n  <code>gcloud projects delete \"$PROJECT_ID\"<\/code><\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">11. 
Best Practices<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Architecture best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Design for reversibility<\/strong>: Every Dual Run plan should include a clear rollback mechanism (traffic shift, feature flag, or DNS rollback).<\/li>\n<li><strong>Prefer progressive rollout<\/strong> over big-bang cutovers for critical services.<\/li>\n<li><strong>Decouple with events<\/strong> where feasible (Pub\/Sub) so consumers migrate independently.<\/li>\n<li><strong>Plan data strategy early<\/strong>: replication method, lag tolerance, schema evolution, and conflict handling.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">IAM\/security best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use <strong>least-privilege service accounts<\/strong> per component (runtime, CI\/CD, replication).<\/li>\n<li>Separate roles for deployers vs approvers (where required).<\/li>\n<li>Use <strong>Secret Manager<\/strong> for secrets; avoid embedding secrets in images or env vars without controls.<\/li>\n<li>Review <strong>Cloud Audit Logs<\/strong> for admin activity (Admin Activity logs are always on), and enable Data Access audit logs where you need them.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Cost best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Set a <strong>Dual Run timebox<\/strong> (e.g., 2 weeks) and define extension criteria.<\/li>\n<li>Implement <strong>log sampling<\/strong> to reduce ingestion, and use structured logging so the logs you keep remain easy to query.<\/li>\n<li>Use <strong>labels<\/strong> (e.g., <code>env=dualrun<\/code>, <code>app=...<\/code>, <code>migration-wave=...<\/code>) for cost allocation.<\/li>\n<li>Use <strong>budgets and alerts<\/strong> during Dual Run; costs often spike unexpectedly.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Performance best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Define SLOs per revision\/version and measure p95\/p99 latency, error rate, and saturation (CPU, memory, DB connections).<\/li>\n<li>Load 
test the new system before ramping traffic.<\/li>\n<li>Watch downstream bottlenecks (DB connection limits are common when traffic increases).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Reliability best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Automate rollback actions (runbooks + predefined commands).<\/li>\n<li>Use multi-zone\/regional designs where appropriate for the target stack.<\/li>\n<li>Make the application <strong>idempotent<\/strong> (especially important with retries and event duplication).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operations best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Maintain a <strong>single pane of glass dashboard<\/strong> comparing legacy vs new.<\/li>\n<li>Use consistent <strong>request correlation IDs<\/strong> across both systems for debugging.<\/li>\n<li>Keep an up-to-date <strong>migration runbook<\/strong> and on-call readiness checklist.<\/li>\n<li>Use change management gates: \u201cno traffic increase unless SLOs are green for N hours\u201d.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Governance\/tagging\/naming best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Standardize naming for versions and environments, for example <code>service-name<\/code>, <code>service-name-canary<\/code>, <code>service-name-shadow<\/code>.<\/li>\n<li>Use resource labels such as <code>owner<\/code>, <code>cost-center<\/code>, <code>env<\/code>, <code>migration-wave<\/code>, <code>data-classification<\/code>.<\/li>\n<li>Apply org policies (where applicable) to restrict risky configs (public access, weak TLS, etc.).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">12. 
Security Considerations<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Identity and access model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Humans<\/strong>: Use groups and roles; avoid direct user permissions where possible.<\/li>\n<li><strong>Workloads<\/strong>: Use <strong>service accounts<\/strong> with the minimum required roles.<\/li>\n<li><strong>CI\/CD<\/strong>: Ensure build\/deploy identities cannot access production data unless needed.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Encryption<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>In Google Cloud, data is encrypted at rest by default for many services, with options for customer-managed keys (Cloud KMS) depending on service.<\/li>\n<li>For regulated workloads:<\/li>\n<li>Consider <strong>CMEK<\/strong> (customer-managed encryption keys) where supported.<\/li>\n<li>Verify CMEK support for each service in official docs.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network exposure<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Common risks during Dual Run:\n&#8211; Accidentally exposing internal test endpoints publicly.\n&#8211; Running legacy and new with inconsistent TLS or header handling.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Recommendations:\n&#8211; Prefer private connectivity patterns when possible.\n&#8211; Use load balancers\/gateways to centralize TLS policy.\n&#8211; Restrict ingress via IAM (Cloud Run Invoker) or network controls.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Secrets handling<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Store secrets in <strong>Secret Manager<\/strong>.<\/li>\n<li>Rotate secrets during migration when feasible.<\/li>\n<li>Avoid dual-run configurations that require copying long-lived secrets broadly.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Audit\/logging<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ensure Cloud Audit Logs are enabled appropriately at org\/folder\/project levels.<\/li>\n<li>Ensure logs contain enough 
information for incident response without leaking sensitive data.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Compliance considerations<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run can help compliance by providing:\n&#8211; Evidence of staged rollout controls\n&#8211; Audit trails for approvals and changes\n&#8211; Validation results recorded over time<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But it can hurt compliance if:\n&#8211; Data is duplicated into environments without proper classification\/controls\n&#8211; Access expands broadly \u201ctemporarily\u201d and never gets tightened<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Common security mistakes<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Leaving canary\/shadow endpoints publicly accessible<\/li>\n<li>Over-permissioned service accounts \u201cfor speed\u201d<\/li>\n<li>Copying production secrets into dev\/test for dual-run testing<\/li>\n<li>Logging sensitive payloads while validating outputs<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secure deployment recommendations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use separate projects or clearly separated environments for prod vs non-prod.<\/li>\n<li>Use organization policies to enforce baseline constraints.<\/li>\n<li>Use VPC Service Controls (where appropriate) to reduce data exfiltration risk (verify applicability to your services).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">13. 
Limitations and Gotchas<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Known limitations (pattern-level)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Cost duplication<\/strong>: Dual Run often increases spend significantly.<\/li>\n<li><strong>Complexity<\/strong>: Two systems to operate means more moving parts and more operational overhead.<\/li>\n<li><strong>Data consistency<\/strong>: Dual writes and replication introduce divergence risk.<\/li>\n<li><strong>Behavioral differences<\/strong>: Timezones, locale, floating-point math, and ordering can break \u201cexact equality\u201d comparisons.<\/li>\n<li><strong>Third-party dependencies<\/strong>: External APIs can behave differently based on IP ranges, TLS stacks, or request headers.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Quotas and service constraints<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cloud Run revision\/service quotas can limit how many parallel versions you keep.<\/li>\n<li>Logging\/monitoring quotas and cost controls can constrain how much validation telemetry you can store.<\/li>\n<li>Database connection limits are a frequent bottleneck when traffic increases.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Always validate the current quotas for your chosen services. 
<strong>Verify in official docs<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Regional constraints<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Some services are regional; dual run across regions may add latency and egress.<\/li>\n<li>If legacy is on-prem, hybrid connectivity latency can affect comparisons.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing surprises<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Logging ingestion<\/strong> and long retention<\/li>\n<li><strong>Egress<\/strong> from on-prem to cloud during replication\/validation<\/li>\n<li><strong>Load balancing<\/strong> data processing at scale<\/li>\n<li><strong>Double database costs<\/strong> (replicas + target + backups)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Compatibility issues<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Schema differences during database migration<\/li>\n<li>Differences in retry policies and timeouts<\/li>\n<li>Event ordering differences between messaging systems<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operational gotchas<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Comparing results requires robust normalization (ignore fields that are expected to differ).<\/li>\n<li>Rollback may not be simple if writes have already moved.<\/li>\n<li>Teams often forget to decommission legacy, paying for it indefinitely.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Migration challenges<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Defining what \u201csuccess\u201d means (SLOs + correctness)<\/li>\n<li>Building validation harnesses that don\u2019t overload production systems<\/li>\n<li>Coordinating cutover across multiple dependent services<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Vendor-specific nuances (Google Cloud)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Traffic splitting is easy in some runtimes (e.g., Cloud Run revisions) but more complex across heterogeneous backends. 
Plan routing early.<\/li>\n<li>Observability is powerful but can be expensive at high volume; plan sampling and metrics strategy.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">14. Comparison with Alternatives<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run is one option among several migration cutover strategies. Here\u2019s how it compares.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Options to consider<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Big-bang cutover<\/strong>: switch everything at once during a maintenance window<\/li>\n<li><strong>Blue\/Green deployment<\/strong>: maintain two environments and switch traffic<\/li>\n<li><strong>Canary release<\/strong>: small percentage to new version, gradually increase<\/li>\n<li><strong>Shadow traffic<\/strong>: duplicate traffic to new version without serving responses<\/li>\n<li><strong>Strangler pattern<\/strong>: incrementally replace parts of a monolith behind routing rules<\/li>\n<li><strong>Active-active<\/strong>: run both as authoritative systems (hard; requires careful data strategy)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Comparison table<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Option<\/th>\n<th>Best For<\/th>\n<th>Strengths<\/th>\n<th>Weaknesses<\/th>\n<th>When to Choose<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Dual Run (parallel run)<\/td>\n<td>Critical migrations needing high confidence<\/td>\n<td>Real-world validation, safer cutover, fast rollback potential<\/td>\n<td>Higher cost, operational complexity, data consistency challenges<\/td>\n<td>When correctness and uptime matter more than temporary cost<\/td>\n<\/tr>\n<tr>\n<td>Big-bang cutover<\/td>\n<td>Small\/low-risk systems<\/td>\n<td>Simple plan, short overlap<\/td>\n<td>High risk, hard rollback, downtime<\/td>\n<td>When downtime is acceptable and system is simple<\/td>\n<\/tr>\n<tr>\n<td>Blue\/Green<\/td>\n<td>Web apps\/APIs with 
clear routing boundary<\/td>\n<td>Clear rollback (switch back), isolated envs<\/td>\n<td>Often requires duplicate infra; DB changes complicate rollback<\/td>\n<td>When you can keep DB compatible or use compatible migration steps<\/td>\n<\/tr>\n<tr>\n<td>Canary (progressive delivery)<\/td>\n<td>Services that can tolerate some user exposure<\/td>\n<td>Lower risk than big-bang, fast feedback<\/td>\n<td>Still exposes users; needs strong monitoring<\/td>\n<td>When you can handle small failures and have good SLOs<\/td>\n<\/tr>\n<tr>\n<td>Shadow traffic<\/td>\n<td>Validating correctness without user impact<\/td>\n<td>Very safe for users; strong validation<\/td>\n<td>Harder to implement; doubles load; output comparison complexity<\/td>\n<td>When you need correctness proof before serving users<\/td>\n<\/tr>\n<tr>\n<td>Strangler pattern<\/td>\n<td>Monolith modernization<\/td>\n<td>Incremental replacement, reduces scope of each change<\/td>\n<td>Requires routing layer and careful domain boundaries<\/td>\n<td>When refactoring is needed and you want continuous value delivery<\/td>\n<\/tr>\n<tr>\n<td>Active-active<\/td>\n<td>Global, high-availability systems<\/td>\n<td>High resilience, no single cutover<\/td>\n<td>Very complex, data conflict resolution<\/td>\n<td>When business requires multi-site active operation<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Nearest services in Google Cloud (how they relate)<\/strong>\n&#8211; <strong>Cloud Run traffic splitting<\/strong>: a practical way to implement Dual Run for HTTP services.\n&#8211; <strong>Cloud Load Balancing<\/strong>: implements traffic steering across backends and environments.\n&#8211; <strong>Cloud Deploy<\/strong>: release orchestration and approvals (may complement Dual Run).\n&#8211; <strong>Cloud Service Mesh<\/strong>: advanced routing and telemetry patterns (verify current features).\n&#8211; <strong>Database Migration Service<\/strong>: supports replication-based 
transitions for databases.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Nearest services in other clouds (conceptual parallels)<\/strong>\n&#8211; AWS: weighted routing with Application Load Balancer\/Route 53; CodeDeploy canary\/blue-green.\n&#8211; Azure: Front Door\/Traffic Manager weighted routing; deployment slots in App Service.\n(These are comparisons of approach, not identical products.)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Open-source\/self-managed alternatives<\/strong>\n&#8211; Kubernetes-based progressive delivery: Argo Rollouts, Flagger\n&#8211; Service mesh routing with Istio\/Envoy\n&#8211; Custom traffic splitting via NGINX\/HAProxy\nThese can implement Dual Run but require additional ops overhead.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">15. Real-World Example<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Enterprise example (regulated, mission-critical)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Problem<\/strong><br\/>\nA bank is migrating a payment authorization API from on-prem middleware to Google Cloud (Cloud Run + Cloud SQL). 
The API must meet strict uptime, auditability, and correctness requirements.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Proposed architecture<\/strong>\n&#8211; Cloud Load Balancing (or API gateway) as a controlled entry point\n&#8211; Legacy payment API remains primary initially\n&#8211; New Cloud Run service deployed in Google Cloud\n&#8211; Database Migration Service replicates on-prem DB to Cloud SQL (or a target database chosen for the workload)\n&#8211; Cloud Logging\/Monitoring dashboards compare:\n  &#8211; auth success rate\n  &#8211; p95 latency\n  &#8211; downstream error types\n&#8211; Change approvals via CI\/CD with gated promotions<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Why Dual Run was chosen<\/strong>\n&#8211; Payment correctness must be proven under real traffic.\n&#8211; The bank needs a rapid rollback path.\n&#8211; Compliance requires audit evidence of staged rollout and controls.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcomes<\/strong>\n&#8211; Reduced cutover risk and fewer high-severity incidents\n&#8211; Objective go\/no-go decisions based on SLOs\n&#8211; Cleaner decommissioning plan with audit trails<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Startup\/small-team example (fast-moving SaaS)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Problem<\/strong><br\/>\nA SaaS startup is moving from a single VM-based API to Cloud Run for autoscaling and simpler ops. 
They want to avoid extended downtime and reduce incident risk with a small team.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Proposed architecture<\/strong>\n&#8211; Cloud Run service with two revisions: v1 (legacy behavior) and v2 (optimized)\n&#8211; Cloud Run traffic splitting for 95\/5 \u2192 80\/20 \u2192 50\/50 \u2192 100 cutover\n&#8211; Logging-based validation and basic dashboards\n&#8211; Simple rollback runbook: shift traffic back to v1<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Why Dual Run was chosen<\/strong>\n&#8211; The team wants safer deployments without building complex infrastructure.\n&#8211; Cloud Run makes parallel revisions and traffic splitting straightforward.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcomes<\/strong>\n&#8211; Faster deployment cadence with lower risk\n&#8211; Autoscaling under spikes without pre-provisioning\n&#8211; Reduced maintenance compared to managing VMs<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">16. FAQ<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">1) <strong>Is Dual Run a Google Cloud product I can enable?<\/strong><br\/>\nNo. Dual Run is a <strong>migration strategy<\/strong> implemented using Google Cloud services (routing, compute, data replication, and observability).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) <strong>Is Dual Run the same as blue\/green?<\/strong><br\/>\nThey\u2019re related. Blue\/green is a deployment pattern with two environments and a switch. Dual Run emphasizes <strong>running both for validation<\/strong> and often includes comparison gates and longer parallel periods.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) <strong>How long should Dual Run last?<\/strong><br\/>\nAs short as possible while meeting confidence requirements. Many teams timebox it (days to weeks). 
Long Dual Runs increase cost and complexity.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) <strong>What\u2019s the safest Dual Run approach for APIs?<\/strong><br\/>\nOften: start with <strong>shadow traffic<\/strong> (if feasible), then small canary percentages, with strong SLO monitoring and rollback.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">5) <strong>How do I implement Dual Run quickly on Google Cloud?<\/strong><br\/>\nFor HTTP services, <strong>Cloud Run traffic splitting<\/strong> is one of the fastest practical methods.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">6) <strong>What about data\u2014how do I dual run databases safely?<\/strong><br\/>\nCommonly: replicate data to the target, shift <strong>read traffic first<\/strong>, then plan a controlled write cutover. Dual writes are possible but risky.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">7) <strong>Does Dual Run always require traffic splitting?<\/strong><br\/>\nNo. Some migrations dual-run <strong>batch jobs<\/strong> or <strong>pipelines<\/strong> by running both and comparing outputs, without splitting interactive traffic.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">8) <strong>How do I compare outputs between legacy and new systems?<\/strong><br\/>\nUse a validation harness: store outputs, normalize expected differences, and compare with thresholds. For APIs, capture structured responses and error codes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">9) <strong>What metrics should gate traffic increases?<\/strong><br\/>\nAt minimum: error rate, latency (p95\/p99), saturation (CPU\/memory\/DB connections), and domain correctness checks.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">10) <strong>What is the biggest risk during Dual Run?<\/strong><br\/>\nData inconsistency and operational confusion. 
Clear ownership, runbooks, and strong observability reduce risk.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">11) <strong>Will Dual Run double my cloud bill?<\/strong><br\/>\nNot always exactly double, but it often increases costs materially because you run two stacks plus additional observability and replication.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">12) <strong>How do I roll back safely if I\u2019ve already migrated writes?<\/strong><br\/>\nRollback is harder after write cutover. Plan for this by:\n&#8211; delaying write cutover until high confidence\n&#8211; using backups and point-in-time recovery\n&#8211; ensuring compatibility or a forward-fix plan<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">13) <strong>Can I use Dual Run for migrating to GKE?<\/strong><br\/>\nYes. You can run legacy and GKE services in parallel and use load balancing\/service mesh for routing. The exact routing method depends on your entry point and mesh strategy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">14) <strong>Is Dual Run useful for compliance audits?<\/strong><br\/>\nOften yes\u2014if you capture evidence: change records, dashboards, alerts, approvals, and validation results.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">15) <strong>What\u2019s the simplest rollback mechanism in Google Cloud?<\/strong><br\/>\nIf you\u2019re using Cloud Run revisions, rollback can be as simple as shifting traffic back to the previous revision.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">16) <strong>Does Dual Run require two projects?<\/strong><br\/>\nNot required, but common in larger orgs (separate projects\/environments). For smaller teams, one project with strict separation can work.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">17. 
Top Online Resources to Learn Dual Run<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Since Dual Run is a strategy, the best resources are a combination of migration guidance and the specific services used to implement it.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Resource Type<\/th>\n<th>Name<\/th>\n<th>Why It Is Useful<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Official architecture guidance<\/td>\n<td>Migration to Google Cloud (Architecture Center) \u2013 https:\/\/cloud.google.com\/architecture\/migration-to-google-cloud<\/td>\n<td>Foundational migration concepts, patterns, and phases that often include parallel-run ideas<\/td>\n<\/tr>\n<tr>\n<td>Official architecture center<\/td>\n<td>Google Cloud Architecture Center \u2013 https:\/\/cloud.google.com\/architecture<\/td>\n<td>Reference architectures and best practices for implementing migration patterns<\/td>\n<\/tr>\n<tr>\n<td>Official service docs<\/td>\n<td>Cloud Run documentation \u2013 https:\/\/cloud.google.com\/run\/docs<\/td>\n<td>Practical traffic splitting via revisions; operational guidance<\/td>\n<\/tr>\n<tr>\n<td>Official pricing<\/td>\n<td>Cloud Run pricing \u2013 https:\/\/cloud.google.com\/run\/pricing<\/td>\n<td>Understand request\/compute billing when running multiple revisions<\/td>\n<\/tr>\n<tr>\n<td>Official service docs<\/td>\n<td>Cloud Load Balancing documentation \u2013 https:\/\/cloud.google.com\/load-balancing\/docs<\/td>\n<td>Traffic management patterns for multi-backend or hybrid dual run<\/td>\n<\/tr>\n<tr>\n<td>Official pricing<\/td>\n<td>Cloud Load Balancing pricing \u2013 https:\/\/cloud.google.com\/load-balancing\/pricing<\/td>\n<td>Cost model for routing\/edge services during migration<\/td>\n<\/tr>\n<tr>\n<td>Official observability docs<\/td>\n<td>Cloud Logging documentation \u2013 https:\/\/cloud.google.com\/logging\/docs<\/td>\n<td>Logging queries and strategies for validating parallel systems<\/td>\n<\/tr>\n<tr>\n<td>Official 
pricing<\/td>\n<td>Cloud Logging pricing \u2013 https:\/\/cloud.google.com\/logging\/pricing<\/td>\n<td>Manage log ingestion\/retention costs during Dual Run<\/td>\n<\/tr>\n<tr>\n<td>Official observability docs<\/td>\n<td>Cloud Monitoring documentation \u2013 https:\/\/cloud.google.com\/monitoring\/docs<\/td>\n<td>SLOs, alerting, and dashboards for go\/no-go gates<\/td>\n<\/tr>\n<tr>\n<td>Official pricing<\/td>\n<td>Cloud Monitoring pricing \u2013 https:\/\/cloud.google.com\/monitoring\/pricing<\/td>\n<td>Understand metrics\/monitoring cost factors<\/td>\n<\/tr>\n<tr>\n<td>Official migration service<\/td>\n<td>Database Migration Service \u2013 https:\/\/cloud.google.com\/database-migration<\/td>\n<td>Common building block for dual-run database replication patterns<\/td>\n<\/tr>\n<tr>\n<td>Official CI\/CD docs<\/td>\n<td>Cloud Build documentation \u2013 https:\/\/cloud.google.com\/build\/docs<\/td>\n<td>Build automation for repeated deployments during Dual Run<\/td>\n<\/tr>\n<tr>\n<td>Official CI\/CD docs<\/td>\n<td>Cloud Deploy documentation \u2013 https:\/\/cloud.google.com\/deploy\/docs<\/td>\n<td>Release orchestration and approval flows (complements Dual Run)<\/td>\n<\/tr>\n<tr>\n<td>Official calculator<\/td>\n<td>Google Cloud Pricing Calculator \u2013 https:\/\/cloud.google.com\/products\/calculator<\/td>\n<td>Estimate costs for parallel run periods<\/td>\n<\/tr>\n<tr>\n<td>Video (official)<\/td>\n<td>Google Cloud Tech YouTube \u2013 https:\/\/www.youtube.com\/googlecloudtech<\/td>\n<td>Architecture and operations content; search for migration and progressive delivery topics<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">18. 
Training and Certification Providers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Institute<\/th>\n<th>Suitable Audience<\/th>\n<th>Likely Learning Focus<\/th>\n<th>Mode<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>DevOps engineers, SREs, platform teams, architects<\/td>\n<td>DevOps, CI\/CD, cloud operations, migration practices<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>ScmGalaxy.com<\/td>\n<td>Beginners to intermediate DevOps practitioners<\/td>\n<td>SCM, DevOps foundations, tooling and processes<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.scmgalaxy.com\/<\/td>\n<\/tr>\n<tr>\n<td>CLoudOpsNow.in<\/td>\n<td>Cloud engineers, operations teams<\/td>\n<td>Cloud operations, monitoring, reliability, cost awareness<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.cloudopsnow.in\/<\/td>\n<\/tr>\n<tr>\n<td>SreSchool.com<\/td>\n<td>SREs, production engineers, incident responders<\/td>\n<td>SRE practices, SLOs, observability, reliability engineering<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.sreschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>AiOpsSchool.com<\/td>\n<td>Ops teams exploring automation<\/td>\n<td>AIOps concepts, automation, monitoring analytics<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.aiopsschool.com\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">19. 
Top Trainers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Platform\/Site<\/th>\n<th>Likely Specialization<\/th>\n<th>Suitable Audience<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>RajeshKumar.xyz<\/td>\n<td>DevOps\/cloud training and guidance (verify offerings)<\/td>\n<td>Beginners to working engineers<\/td>\n<td>https:\/\/www.rajeshkumar.xyz\/<\/td>\n<\/tr>\n<tr>\n<td>devopstrainer.in<\/td>\n<td>DevOps tooling, CI\/CD, cloud practices (verify offerings)<\/td>\n<td>DevOps engineers, students<\/td>\n<td>https:\/\/www.devopstrainer.in\/<\/td>\n<\/tr>\n<tr>\n<td>devopsfreelancer.com<\/td>\n<td>Freelance DevOps support\/training (verify offerings)<\/td>\n<td>Small teams needing hands-on help<\/td>\n<td>https:\/\/www.devopsfreelancer.com\/<\/td>\n<\/tr>\n<tr>\n<td>devopssupport.in<\/td>\n<td>DevOps support and training resources (verify offerings)<\/td>\n<td>Ops\/DevOps teams<\/td>\n<td>https:\/\/www.devopssupport.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">20. 
Top Consulting Companies<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Company<\/th>\n<th>Likely Service Area<\/th>\n<th>Where They May Help<\/th>\n<th>Consulting Use Case Examples<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>cotocus.com<\/td>\n<td>Cloud\/DevOps consulting (verify exact scope)<\/td>\n<td>Migration planning, CI\/CD, cloud operations<\/td>\n<td>Dual Run rollout planning, observability dashboards, cost controls<\/td>\n<td>https:\/\/cotocus.com\/<\/td>\n<\/tr>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>DevOps consulting and enablement (verify exact scope)<\/td>\n<td>DevOps transformation, training + implementation<\/td>\n<td>Migration factory setup, CI\/CD pipelines, deployment guardrails<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>DEVOPSCONSULTING.IN<\/td>\n<td>DevOps consulting services (verify exact scope)<\/td>\n<td>Delivery automation, reliability practices<\/td>\n<td>Progressive delivery, monitoring\/alerting, runbook automation<\/td>\n<td>https:\/\/www.devopsconsulting.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">21. 
Career and Learning Roadmap<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn before Dual Run<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To execute Dual Run well on Google Cloud, build fundamentals in:\n&#8211; Google Cloud projects, IAM, service accounts\n&#8211; VPC networking basics (subnets, routing, firewall rules)\n&#8211; Cloud Logging and Cloud Monitoring basics\n&#8211; Containers and CI\/CD (Cloud Build, Artifact Registry)\n&#8211; Basic SRE concepts: SLOs, SLIs, error budgets, incident response<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn after Dual Run<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Once comfortable, level up with:\n&#8211; Advanced traffic management (Cloud Load Balancing, service mesh patterns)\n&#8211; Deeper database migration skills (DMS, replication, cutover planning)\n&#8211; Event-driven migrations (Pub\/Sub patterns, idempotent consumers)\n&#8211; Policy-as-code and governance (organization policies, IAM conditions)\n&#8211; FinOps practices for migration programs (budgets, cost attribution, optimization)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Job roles that use it<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cloud Solutions Architect<\/li>\n<li>Cloud\/Platform Engineer<\/li>\n<li>DevOps Engineer<\/li>\n<li>Site Reliability Engineer (SRE)<\/li>\n<li>Migration Lead \/ Technical Program Lead<\/li>\n<li>Data Engineer (for pipeline dual runs)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Certification path (if available)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run is a strategy, not a certification topic by itself, but it maps strongly to:\n&#8211; Google Cloud Professional Cloud Architect\n&#8211; Google Cloud Professional Cloud DevOps Engineer\n&#8211; Google Cloud Associate Cloud Engineer<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Verify current certification paths: https:\/\/cloud.google.com\/learn\/certification<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project ideas for 
practice<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Implement Cloud Run dual run with traffic splitting + SLO-based gating.<\/li>\n<li>Migrate a sample database using replication, shift reads first, then plan write cutover.<\/li>\n<li>Build a shadow traffic harness with a proxy (advanced; verify feasibility with your stack).<\/li>\n<li>Create a \u201cmigration dashboard\u201d comparing legacy vs new error rates and latency.<\/li>\n<li>Implement idempotent event consumers and dual publish to two topics for a migration period.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">22. Glossary<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Dual Run<\/strong>: Running legacy and new systems in parallel during Migration to validate and reduce cutover risk.<\/li>\n<li><strong>Parallel Run<\/strong>: Another term commonly used for Dual Run.<\/li>\n<li><strong>Cutover<\/strong>: The act of switching production traffic and\/or writes to the new system.<\/li>\n<li><strong>Rollback<\/strong>: Switching back to the legacy system after issues are detected.<\/li>\n<li><strong>Canary<\/strong>: Releasing to a small subset of traffic\/users to reduce risk.<\/li>\n<li><strong>Blue\/Green<\/strong>: Two environments; one serves production while the other is staged, then traffic switches.<\/li>\n<li><strong>Shadow traffic<\/strong>: Duplicating production traffic to a new system without returning its response to users.<\/li>\n<li><strong>SLO\/SLI<\/strong>: Service Level Objective \/ Indicator; quantitative reliability targets and measurements.<\/li>\n<li><strong>Idempotency<\/strong>: Repeating an operation produces the same effect; crucial for retries and duplicated events.<\/li>\n<li><strong>Replication lag<\/strong>: Delay between source and target data stores during replication.<\/li>\n<li><strong>Revision (Cloud Run)<\/strong>: An immutable version of a Cloud Run service created per deployment.<\/li>\n<li><strong>Traffic 
splitting (Cloud Run)<\/strong>: Assigning percentage-based traffic to revisions\/tags.<\/li>\n<li><strong>Artifact Registry<\/strong>: Google Cloud service to store container images and artifacts.<\/li>\n<li><strong>Cloud Logging<\/strong>: Central log ingestion, query, and retention service.<\/li>\n<li><strong>Cloud Monitoring<\/strong>: Metrics, dashboards, alerting, and SLO tooling.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">23. Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Dual Run (Google Cloud Migration) is a <strong>parallel-run migration strategy<\/strong> where you operate the <strong>legacy<\/strong> and <strong>new<\/strong> systems simultaneously, validate correctness and SLOs under real conditions, and then <strong>progressively cut over<\/strong> with a rollback option.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It matters because Migration failures are usually caused by <strong>unknown production behaviors<\/strong>\u2014Dual Run turns those unknowns into measurable signals before you fully commit. In Google Cloud, Dual Run is commonly implemented using <strong>Cloud Run\/GKE\/Compute Engine<\/strong>, <strong>traffic management<\/strong> (Cloud Run splitting or Cloud Load Balancing), <strong>data replication<\/strong> (often via DMS), and <strong>observability<\/strong> (Cloud Logging and Monitoring).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key cost\/security points:\n&#8211; Cost often increases temporarily due to <strong>duplicate compute<\/strong>, additional telemetry, and replication\/egress.\n&#8211; Security requires tight <strong>IAM<\/strong>, careful endpoint exposure, and disciplined secrets handling.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Use Dual Run when the workload is business-critical, downtime is expensive, and you need high confidence. 
The best next step is to practice the lab using <strong>Cloud Run traffic splitting<\/strong>, then extend the pattern to your real workload with production-grade validation gates and data migration planning.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Migration<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[51,46],"tags":[],"class_list":["post-709","post","type-post","status-publish","format-standard","hentry","category-google-cloud","category-migration"],"_links":{"self":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/709","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/comments?post=709"}],"version-history":[{"count":0,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/709\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/media?parent=709"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/categories?post=709"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/tags?post=709"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}