{"id":623,"date":"2026-04-14T19:00:09","date_gmt":"2026-04-14T19:00:09","guid":{"rendered":"https:\/\/www.devopsschool.com\/tutorials\/google-cloud-capacity-planner-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-compute\/"},"modified":"2026-04-14T19:00:09","modified_gmt":"2026-04-14T19:00:09","slug":"google-cloud-capacity-planner-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-compute","status":"publish","type":"post","link":"https:\/\/www.devopsschool.com\/tutorials\/google-cloud-capacity-planner-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-compute\/","title":{"rendered":"Google Cloud Capacity Planner Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Compute"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Category<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Compute<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. Introduction<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Capacity planning is one of the least glamorous but most important parts of running reliable systems. In Google Cloud Compute, <strong>Capacity Planner<\/strong> is best understood as the <strong>capacity planning workflow and tooling around Compute Engine capacity management<\/strong>\u2014primarily <strong>Compute Engine reservations<\/strong> (and, where applicable, commitments\/discount programs and recommendations).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In simple terms: <strong>Capacity Planner helps you make sure the compute capacity you need will be available when you need it<\/strong>\u2014especially in specific zones, for specific machine families, and for predictable workloads. It is most relevant when you cannot rely solely on \u201cbest effort\u201d capacity allocation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Technically, Capacity Planner is not typically a separate, billable \u201cstandalone product\u201d with its own runtime. 
Instead, it is an <strong>operational approach implemented through Google Cloud\u2019s Compute Engine control plane<\/strong>, using features such as:\n&#8211; <strong>Zonal reservations<\/strong> (to guarantee capacity for VM instances)\n&#8211; <strong>Quota awareness<\/strong> and fleet planning\n&#8211; <strong>Observability and usage analysis<\/strong> (Cloud Monitoring\/Logging + billing\/asset data)\n&#8211; <strong>Automation<\/strong> (gcloud\/Terraform\/CI pipelines)\n&#8211; <strong>(Optional) purchase\/discount planning<\/strong> such as committed use discounts for predictable usage (verify the latest discount programs and how they interact with your environment in official docs)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">What problem it solves: Without deliberate capacity planning, teams can hit <strong>allocation failures<\/strong>, experience <strong>launch delays during peak demand<\/strong>, or build fragile systems that fail during regional events or sudden growth. Capacity Planner mitigates these issues by making capacity needs explicit, reserving the required resources, and operationalizing governance, cost control, and reliability.<\/p>\n\n\n\n<blockquote>\n<p>Naming note (important): If you are expecting a dedicated \u201cCapacity Planner\u201d product page\/API, verify in official Google Cloud documentation whether your organization is referring to a <strong>console experience<\/strong> or an internal program name. In practice, the concrete, official Compute feature most closely associated with \u201ccapacity planning\u201d is <strong>Compute Engine Reservations<\/strong>. Start here: https:\/\/cloud.google.com\/compute\/docs\/instances\/reserving-zonal-resources<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">2. 
What is Capacity Planner?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Official purpose (practical interpretation in Google Cloud Compute):<\/strong><br\/>\nCapacity Planner is the practice and associated Google Cloud tooling used to <strong>forecast, allocate, and guarantee Compute Engine capacity<\/strong> so workloads can reliably scale and launch without capacity-related failures.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Because \u201cCapacity Planner\u201d is often used as a capability label rather than a single API surface, the most concrete \u201cmajor components\u201d in Google Cloud are:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Core capabilities (what you can do)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Reserve VM capacity<\/strong> in a specific zone for a machine type (and related attributes) so that capacity is available when you create VMs.<\/li>\n<li><strong>Control which workloads consume reserved capacity<\/strong> using reservation affinity (specific vs any).<\/li>\n<li><strong>Plan for predictable workloads<\/strong> by combining reservations with disciplined sizing, automation, and (optionally) commitment\/discount planning.<\/li>\n<li><strong>Operationalize capacity<\/strong> with monitoring, alerting, quota management, and change management.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Major components<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Compute Engine Reservations (zonal)<\/strong>: The core mechanism to guarantee VM capacity in a zone.<\/li>\n<li><strong>Compute Engine VM provisioning<\/strong>: Instances and\/or Managed Instance Groups (MIGs) that consume the reservation.<\/li>\n<li><strong>IAM &amp; policy controls<\/strong>: Who can create\/modify reservations and who can consume them.<\/li>\n<li><strong>Monitoring &amp; logging<\/strong>: Track reservation utilization and provisioning errors; audit changes.<\/li>\n<li><strong>Infrastructure as Code (IaC)<\/strong>: Terraform or CI 
pipelines for repeatable reservation and VM configuration.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Service type<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Control-plane feature<\/strong> in <strong>Google Cloud Compute (Compute Engine)<\/strong>.<\/li>\n<li>Backed by Google Cloud APIs (Compute Engine API).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scope (regional\/global\/zonal\/project-scoped)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Reservations are zonal resources<\/strong> (created in a specific zone).<\/li>\n<li>They are typically <strong>project-scoped<\/strong> resources (created and managed within a Google Cloud project).<br\/>\n  Some organizations also use cross-project patterns (for example, Shared VPC or reservation sharing). Availability and configuration details should be verified in official docs for your org\u2019s structure and policies.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">How it fits into the Google Cloud ecosystem<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Capacity Planner connects the \u201cbusiness requirement\u201d (reliable scale and predictable launch) to the \u201cplatform primitives\u201d:\n&#8211; <strong>Compute Engine<\/strong> for VM-based workloads\n&#8211; <strong>GKE<\/strong> and other platforms that may indirectly depend on VM capacity (for node pools, where applicable)\n&#8211; <strong>Cloud Monitoring\/Logging<\/strong> for operational visibility\n&#8211; <strong>Cloud Billing<\/strong> for cost governance and forecasting\n&#8211; <strong>Cloud Asset Inventory<\/strong> for inventory and governance visibility\n&#8211; <strong>IAM<\/strong> and <strong>Organization Policy<\/strong> for control and compliance<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">3. 
Why use Capacity Planner?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Business reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Avoid revenue-impacting outages<\/strong> caused by capacity shortages during launches or scaling events.<\/li>\n<li><strong>Meet customer commitments<\/strong> (SLAs, delivery timelines, seasonal peaks).<\/li>\n<li><strong>Improve predictability<\/strong> for product launches, migrations, and batch windows.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Technical reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Guaranteed capacity in a specific zone<\/strong> for a specific VM shape (subject to the reservation\u2019s definition).<\/li>\n<li><strong>Reduced \u201cinsufficient capacity\u201d provisioning failures<\/strong>.<\/li>\n<li><strong>More deterministic scaling<\/strong> behavior for autoscalers and orchestration systems.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operational reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Clear ownership of \u201ccapacity as an SLO\u201d: you can measure, audit, and improve it.<\/li>\n<li>Better change management: reservations can be versioned and controlled via IaC.<\/li>\n<li>Better incident response: capacity-related incidents become diagnosable (quota vs capacity vs config).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/compliance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Segregation of duties: separate who can <strong>reserve capacity<\/strong> from who can <strong>consume it<\/strong>.<\/li>\n<li>Auditability: reservation changes are visible in audit logs (verify exact audit log events in your environment).<\/li>\n<li>Governance alignment: labels\/tags, org policies, and approval workflows can be applied.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scalability\/performance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>More reliable horizontal scaling for stateless services.<\/li>\n<li>Better 
planning for latency-sensitive deployments that require \u201cclose-to-users\u201d zones.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should choose Capacity Planner<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You run <strong>production workloads<\/strong> where failure to scale is unacceptable.<\/li>\n<li>You have <strong>predictable baseline usage<\/strong> and known growth patterns.<\/li>\n<li>You have <strong>strict zonal requirements<\/strong> (data locality, latency, compliance).<\/li>\n<li>You operate <strong>large fleets<\/strong> where \u201cbest effort\u201d capacity introduces unacceptable variance.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should not choose it<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Your workloads are small, non-critical, or highly flexible on where\/when they run.<\/li>\n<li>You can tolerate occasional provisioning delays and prefer operational simplicity.<\/li>\n<li>Your architecture can use alternatives (e.g., multi-zone designs that shift load) rather than guaranteeing capacity in one zone.<\/li>\n<li>You have not yet implemented basic hygiene (quotas, autoscaling, monitoring); reservations alone won\u2019t fix foundational gaps.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">4. 
Where is Capacity Planner used?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Industries<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Retail\/e-commerce (seasonal traffic spikes)<\/li>\n<li>Media\/streaming (event-driven demand)<\/li>\n<li>Financial services (batch windows, trading peaks, regulated locality)<\/li>\n<li>Gaming (launch events, regional latency)<\/li>\n<li>Healthcare (regulated workloads, strict uptime)<\/li>\n<li>Manufacturing\/IoT (fleet ingestion + analytics batch cycles)<\/li>\n<li>SaaS platforms (multi-tenant steady baseline with bursts)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Team types<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>SRE and Platform Engineering teams responsible for availability<\/li>\n<li>DevOps teams managing production release pipelines<\/li>\n<li>Cloud Center of Excellence (CCoE) teams enforcing governance<\/li>\n<li>FinOps teams collaborating on commitments and utilization<\/li>\n<li>Security teams ensuring access control and auditability<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Workloads<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>VM-based microservices, API backends, and web tiers<\/li>\n<li>Managed Instance Groups (MIGs) behind load balancers<\/li>\n<li>Stateful VM workloads that must live in specific zones (with careful design)<\/li>\n<li>Build farms \/ CI runners<\/li>\n<li>Batch processing fleets (when timing is strict)<\/li>\n<li>Migration cutovers and replatforming where timing is fixed<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Architectures<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Multi-zone active-active with per-zone baseline capacity<\/li>\n<li>Hub-and-spoke Shared VPC environments (central network + project-level workloads)<\/li>\n<li>Hybrid systems where on-prem capacity is supplemented by reserved cloud capacity<\/li>\n<li>Regulated deployments requiring zonal locality<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Production vs dev\/test usage<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><strong>Production:<\/strong> common and valuable, especially with known scaling floors.<\/li>\n<li><strong>Dev\/test:<\/strong> usually unnecessary unless teams frequently hit capacity limits or need deterministic performance for performance tests.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">5. Top Use Cases and Scenarios<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Below are realistic scenarios where Capacity Planner (implemented using Compute Engine reservations and related controls) is a good fit.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Baseline capacity for a regional API tier<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Your API must always maintain at least N instances per zone. Autoscaling sometimes fails due to temporary zonal capacity constraints.<\/li>\n<li><strong>Why this fits:<\/strong> Reservations guarantee baseline VM capacity in each zone.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve 20 <code>n2-standard-4<\/code> VMs in <code>us-central1-a<\/code> and <code>us-central1-b<\/code> for a MIG that scales between 20\u2013200.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2) Launch-day capacity for a new product<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> You anticipate a one-time surge and cannot risk instance provisioning failures.<\/li>\n<li><strong>Why this fits:<\/strong> Create reservations ahead of launch to ensure initial scale-out succeeds.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve capacity for 500 VMs for 48 hours, then scale down and remove reservations (verify operational best practice and timing policies).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3) Guarantee capacity for latency-sensitive workloads in a specific zone<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Your service must run close to a specific exchange, customer base, 
or data source.<\/li>\n<li><strong>Why this fits:<\/strong> Zonal reservations provide deterministic availability in that zone.<\/li>\n<li><strong>Example scenario:<\/strong> A trading analytics tier must run in a particular zone; reserve the exact VM shapes required.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4) CI\/CD runner fleet with predictable daytime utilization<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Build runners must be available during business hours; failure to allocate runners blocks developers.<\/li>\n<li><strong>Why this fits:<\/strong> Reserve capacity for a fixed baseline of runner VMs.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve 100 VMs from 08:00\u201318:00 weekdays and automate scaling outside this window (reservation scheduling may require custom automation; verify if \u201cfuture reservations\u201d or scheduling features fit your needs).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Batch processing window with strict deadlines<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Nightly batch must finish by 06:00; delays have downstream impacts.<\/li>\n<li><strong>Why this fits:<\/strong> Reservations ensure the batch fleet can start on time.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve capacity for 2,000 cores in one zone during batch start, then release after completion.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">6) Regulated workloads requiring strict locality<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Policy dictates workloads remain in a specific geography\/zone.<\/li>\n<li><strong>Why this fits:<\/strong> Reservations help ensure locality constraints don\u2019t cause provisioning failures.<\/li>\n<li><strong>Example scenario:<\/strong> Healthcare analytics must run in a specific zone; reserve baseline compute.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">7) Stateful legacy VM workloads during 
migration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> You are migrating a legacy VM stack and need deterministic provisioning during cutover.<\/li>\n<li><strong>Why this fits:<\/strong> Reservations reduce risk of cutover failure due to capacity issues.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve a set of VMs matching the legacy footprint for cutover weekend.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">8) Dedicated capacity for an internal platform team<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Shared projects lead to noisy-neighbor capacity competition.<\/li>\n<li><strong>Why this fits:<\/strong> Reservations can isolate capacity for priority workloads.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve capacity for \u201cplatform-core\u201d workloads; restrict consumption through reservation affinity and IAM processes.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">9) GPU or specialized VM capacity planning (where supported)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Accelerator capacity can be constrained; provisioning fails at critical times.<\/li>\n<li><strong>Why this fits:<\/strong> Use reservations\/future reservations when available for the accelerator\/machine type.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve GPU-capable VM capacity for a training window (verify official support and requirements for GPU reservations in your regions).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">10) Disaster recovery rehearsal capacity in a secondary zone<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> DR tests fail because you can\u2019t scale in the secondary zone when you need to test.<\/li>\n<li><strong>Why this fits:<\/strong> Reserve minimal DR test capacity so rehearsals are reliable.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve enough capacity for a reduced \u201cDR mode\u201d 
footprint.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">11) Multi-tenant SaaS with per-tenant capacity guarantees<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Premium tenants require guaranteed performance even during spikes.<\/li>\n<li><strong>Why this fits:<\/strong> Reserve a baseline pool and map premium workloads to it.<\/li>\n<li><strong>Example scenario:<\/strong> Premium-tier MIGs consume reserved capacity; standard tier uses best effort.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">12) Controlled rollout environments (blue\/green capacity)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Blue\/green deployment doubles capacity briefly; best-effort provisioning is risky.<\/li>\n<li><strong>Why this fits:<\/strong> Reserve temporary capacity to ensure the \u201cgreen\u201d environment can come up.<\/li>\n<li><strong>Example scenario:<\/strong> Reserve 1:1 additional capacity for a cutover window, then delete reservations afterward.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">6. Core Features<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Because \u201cCapacity Planner\u201d is best implemented via Compute Engine reservations and operational tooling, the features below focus on what you can do today with official Compute primitives. 
Verify the latest capabilities in official docs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 1: Zonal capacity reservations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Reserves a specified number of VM \u201cslots\u201d (based on machine type and attributes) in a particular zone.<\/li>\n<li><strong>Why it matters:<\/strong> You can reliably create VMs even when the zone is under capacity pressure.<\/li>\n<li><strong>Practical benefit:<\/strong> Fewer failed scale-outs and fewer launch delays.<\/li>\n<li><strong>Limitations\/caveats:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Zonal: a reservation in one zone does not guarantee capacity in another.<\/li>\n<li>The reservation\u2019s definition must match the VM\u2019s requirements (machine family\/type and other attributes).<\/li>\n<li>Whether you can create a reservation depends on your quotas and on reservation support for the machine type (verify exact matching rules in docs).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 2: Reservation affinity (control who consumes the reservation)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Lets a VM specify whether it must consume a specific reservation, can consume any reservation, or should not use reservations.<\/li>\n<li><strong>Why it matters:<\/strong> Prevents unintended workloads from using reserved capacity.<\/li>\n<li><strong>Practical benefit:<\/strong> Isolation of priority capacity pools.<\/li>\n<li><strong>Limitations\/caveats:<\/strong> Misconfiguration can lead to \u201creservation not found\/mismatch\u201d provisioning errors.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 3: Observability for capacity and provisioning outcomes<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Uses Cloud Monitoring\/Logging to track VM provisioning failures, utilization signals, and fleet behavior.<\/li>\n<li><strong>Why it matters:<\/strong> Capacity planning is only reliable if you measure utilization and 
failures.<\/li>\n<li><strong>Practical benefit:<\/strong> Proactive alerts before shortages become incidents.<\/li>\n<li><strong>Limitations\/caveats:<\/strong> You may need to define custom SLOs and dashboards; metrics availability varies\u2014verify in product metrics documentation.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 4: Quota and limit awareness as part of planning<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Ensures you have sufficient quotas (CPUs, instances, GPUs, etc.) to back your plan.<\/li>\n<li><strong>Why it matters:<\/strong> Many \u201ccapacity issues\u201d are actually <strong>quota<\/strong> issues.<\/li>\n<li><strong>Practical benefit:<\/strong> Faster provisioning and fewer surprises during launches.<\/li>\n<li><strong>Limitations\/caveats:<\/strong> Quota increases can require approvals and time; plan ahead.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 5: Labels\/tags and governance integration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Attach labels to reservations and VMs and use org policies where appropriate.<\/li>\n<li><strong>Why it matters:<\/strong> Enables chargeback\/showback and policy controls.<\/li>\n<li><strong>Practical benefit:<\/strong> Better FinOps reporting and operational ownership.<\/li>\n<li><strong>Limitations\/caveats:<\/strong> Governance is only effective if naming and labeling are consistent.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 6: Automation via gcloud, Terraform, and CI<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Treat reservations as code and deploy them consistently across environments.<\/li>\n<li><strong>Why it matters:<\/strong> Manual capacity changes are error-prone.<\/li>\n<li><strong>Practical benefit:<\/strong> Repeatable scaling floors per environment and per zone.<\/li>\n<li><strong>Limitations\/caveats:<\/strong> You must manage rollout 
sequencing (create reservation before scaling up consumers).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 7: Integration with fleet patterns (MIGs and load balancing)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Reservations can back instance groups, allowing scalable services to have guaranteed baseline capacity.<\/li>\n<li><strong>Why it matters:<\/strong> Most production Compute workloads use MIGs for resilience.<\/li>\n<li><strong>Practical benefit:<\/strong> Baseline capacity per zone + elastic burst.<\/li>\n<li><strong>Limitations\/caveats:<\/strong> Ensure distribution policies (multi-zone) and reservations align; otherwise you can \u201cguarantee\u201d in the wrong place.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 8: Auditability and change tracking<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> IAM + audit logs enable tracking who changed capacity-related resources.<\/li>\n<li><strong>Why it matters:<\/strong> Capacity changes can cause outages or cost spikes.<\/li>\n<li><strong>Practical benefit:<\/strong> Faster incident investigations and compliance evidence.<\/li>\n<li><strong>Limitations\/caveats:<\/strong> Audit log retention and routing may require configuration (Cloud Logging sinks).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Feature 9: Cost planning via predictable usage programs (optional)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> For predictable workloads, teams may combine capacity planning with discount mechanisms (for example, committed use discounts).<\/li>\n<li><strong>Why it matters:<\/strong> Baseline capacity often maps to baseline spend.<\/li>\n<li><strong>Practical benefit:<\/strong> Lower unit costs for predictable usage.<\/li>\n<li><strong>Limitations\/caveats:<\/strong> Commitments have terms and constraints; verify current discount programs and applicability.<\/li>\n<\/ul>\n\n\n\n<hr 
class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">7. Architecture and How It Works<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">High-level service architecture<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">At a high level, \u201cCapacity Planner\u201d (capacity planning for Compute) is a control-plane workflow:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Plan<\/strong>: Determine baseline VM needs per zone and machine type from historical usage, SLOs, and growth forecasts.<\/li>\n<li><strong>Prepare<\/strong>: Ensure quotas are sufficient; align IAM and governance.<\/li>\n<li><strong>Reserve<\/strong>: Create Compute Engine reservations in target zones for target VM shapes.<\/li>\n<li><strong>Consume<\/strong>: Configure workloads (instances\/MIGs) with reservation affinity so they use the reservation appropriately.<\/li>\n<li><strong>Operate<\/strong>: Monitor reservation utilization, provisioning errors, and costs; iterate.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Request\/data\/control flow<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Control plane<\/strong>: Admin actions create\/modify reservations via the Cloud Console, gcloud, Terraform, or Compute Engine API.<\/li>\n<li><strong>Provisioning<\/strong>: When a VM is created, the Compute Engine scheduler checks:\n<ul class=\"wp-block-list\">\n<li>Is there a matching reservation in the zone?<\/li>\n<li>Does the VM have affinity settings that allow\/require a reservation?<\/li>\n<li>Is quota available?<\/li>\n<li>Can the VM be placed on available physical capacity?<\/li>\n<\/ul>\n<\/li>\n<li><strong>Telemetry<\/strong>: Logs and metrics are emitted for provisioning actions and errors.<\/li>\n<li><strong>Governance<\/strong>: IAM governs who can act; audit logs record administrative actions.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Integrations with related services<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Cloud Monitoring<\/strong>: dashboards\/alerts for instance counts, error 
rates, and capacity signals.<\/li>\n<li><strong>Cloud Logging<\/strong>: audit and troubleshooting.<\/li>\n<li><strong>Cloud Billing<\/strong>: cost analysis and forecasting.<\/li>\n<li><strong>Cloud Asset Inventory<\/strong>: inventory and governance reporting.<\/li>\n<li><strong>Organization Policy Service<\/strong>: constraints (for example, allowed resource locations, external IP constraints) that can affect provisioning.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Dependency services<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Compute Engine API<\/strong> is the primary dependency.<\/li>\n<li><strong>IAM<\/strong> for access control.<\/li>\n<li><strong>Cloud Resource Manager<\/strong> for project\/folder\/org context.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/authentication model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>IAM roles<\/strong> determine who can manage reservations and instances.<\/li>\n<li><strong>Service accounts<\/strong> are used by automation pipelines to apply IaC changes.<\/li>\n<li><strong>Audit logs<\/strong> record administrative changes (Admin Activity audit logs are always written and cannot be disabled; confirm that their retention and routing meet your requirements).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Networking model<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Reservations are not \u201cnetwork resources\u201d; they are compute placement capacity in a zone. 
Networking considerations still matter because:\n&#8211; Your architecture may require multi-zone load balancing (e.g., Cloud Load Balancing) with per-zone MIGs.\n&#8211; Firewall rules, VPC design, and NAT can affect VM startup and operational readiness (though not the reservation itself).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Monitoring\/logging\/governance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Track: provisioning failures, MIG health, autoscaler events, instance creation latency, and reservation utilization (where exposed).<\/li>\n<li>Add alerts for \u201cinsufficient quota\u201d and recurring \u201cinsufficient capacity\u201d errors.<\/li>\n<li>Enforce labels\/tags for ownership, environment, and cost center.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Simple architecture diagram (Mermaid)<\/h3>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart LR\n  U[Ops\/Platform Engineer] --&gt;|Plan &amp; Reserve| C[Google Cloud Console \/ gcloud \/ Terraform]\n  C --&gt;|Create\/Update| R[\"Compute Engine Reservation (Zone)\"]\n  A[\"App Deployment (MIG\/VM)\"] --&gt;|Create VM with reservation affinity| CE[Compute Engine]\n  CE --&gt;|Consume capacity| R\n  CE --&gt; L[Cloud Logging]\n  CE --&gt; M[Cloud Monitoring]\n  B[FinOps] --&gt;|Cost analysis| CB[Cloud Billing]\n  CB --&gt; U\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Production-style architecture diagram (Mermaid)<\/h3>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart TB\n  subgraph Org[\"Organization \/ Governance\"]\n    IAM[IAM &amp; Org Policies]\n    CAI[Cloud Asset Inventory]\n    LOGSINK[Logging Sinks \/ SIEM Export]\n  end\n\n  subgraph Project[\"Prod Project\"]\n    subgraph Net[\"Shared VPC \/ VPC\"]\n      LB[Cloud Load Balancing]\n      FW[Firewall Policies]\n      NAT[\"Cloud NAT (optional)\"]\n    end\n\n    subgraph ZoneA[\"Zone A\"]\n      RESA[Reservation A]\n      MIGA[Managed Instance Group A]\n    end\n\n    subgraph ZoneB[\"Zone B\"]\n      
RESB[Reservation B]\n      MIGB[Managed Instance Group B]\n    end\n\n    MON[Cloud Monitoring &amp; Alerting]\n    LOG[Cloud Logging]\n    BILL[Cloud Billing \/ Budgets]\n    CICD[CI\/CD + Terraform]\n  end\n\n  Users[End Users] --&gt; LB\n  LB --&gt; MIGA\n  LB --&gt; MIGB\n\n  CICD --&gt;|Apply IaC| RESA\n  CICD --&gt;|Apply IaC| RESB\n  CICD --&gt;|Deploy\/Scale| MIGA\n  CICD --&gt;|Deploy\/Scale| MIGB\n\n  IAM --&gt; CICD\n  IAM --&gt; Project\n\n  MIGA --&gt;|Consume| RESA\n  MIGB --&gt;|Consume| RESB\n\n  MIGA --&gt; LOG\n  MIGB --&gt; LOG\n  LOG --&gt; LOGSINK\n  LOG --&gt; MON\n  MON --&gt; OnCall[On-call \/ SRE]\n\n  BILL --&gt; FinOps[FinOps Team]\n  CAI --&gt; SecOps[Security\/Compliance]\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">8. Prerequisites<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Account\/project requirements<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A <strong>Google Cloud project<\/strong> with <strong>billing enabled<\/strong>.<\/li>\n<li>Compute Engine API enabled in the project.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Permissions \/ IAM roles (typical)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Exact least-privilege depends on your org and tooling; verify in official IAM docs and your security policies. 
Common patterns:\n&#8211; For creating\/managing reservations: a role with Compute admin permissions (often <strong><code>roles\/compute.admin<\/code><\/strong>).\n&#8211; For creating\/managing VM instances: <strong><code>roles\/compute.instanceAdmin.v1<\/code><\/strong> (commonly used).\n&#8211; For viewing: <strong><code>roles\/compute.viewer<\/code><\/strong>.\n&#8211; For IaC automation: a dedicated CI service account with only required permissions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">IAM docs: https:\/\/cloud.google.com\/iam\/docs<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Billing requirements<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Billing must be enabled to run VM instances and related resources.<\/li>\n<li>Reservations are billable: reserved resources are charged at the same rates as <strong>running VMs and attached resources<\/strong> from the time the reservation is created until it is deleted, whether or not any VM consumes them. Discounts and current billing rules can change, so <strong>verify in official docs<\/strong>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">CLI\/SDK\/tools needed<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Optional but recommended: <strong>gcloud CLI<\/strong><br\/>\n  Install: https:\/\/cloud.google.com\/sdk\/docs\/install<\/li>\n<li>Optional: <strong>Terraform<\/strong> (if you prefer IaC)<br\/>\n  Provider docs: https:\/\/registry.terraform.io\/providers\/hashicorp\/google\/latest\/docs<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Region availability<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Compute Engine is global, but <strong>reservations are zonal<\/strong> and availability varies by zone and machine family.<\/li>\n<li>For specialized machine types (GPUs, very large shapes), availability constraints can be tighter.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Quotas\/limits<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Compute quotas (vCPU, instances, GPUs, etc.) 
can block both reservations and VM provisioning.<\/li>\n<li>Review quotas: https:\/\/cloud.google.com\/compute\/quotas<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisite services<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Compute Engine API<\/strong><\/li>\n<li><strong>Cloud Logging<\/strong> and <strong>Cloud Monitoring<\/strong> (generally available by default in projects, but ensure access)<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">9. Pricing \/ Cost<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Capacity Planner, as described here (capacity planning using Compute Engine reservations and operational tooling), usually does not introduce a separate \u201cCapacity Planner SKU.\u201d Costs are driven by the resources you run or reserve and the operational footprint you add.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Official pricing references<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Compute Engine pricing: https:\/\/cloud.google.com\/compute\/pricing  <\/li>\n<li>Google Cloud Pricing Calculator: https:\/\/cloud.google.com\/products\/calculator  <\/li>\n<li>Cloud Billing docs: https:\/\/cloud.google.com\/billing\/docs<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing dimensions (what you pay for)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You generally pay for:\n&#8211; <strong>VM instance runtime<\/strong> (vCPU, memory) by machine type and region\/zone.\n&#8211; <strong>Reserved capacity<\/strong>: reserved vCPU, memory, and any reserved GPUs or local SSDs are billed like running VMs, even while unused.\n&#8211; <strong>Disks<\/strong> (Persistent Disk \/ Hyperdisk where applicable), snapshots, images.\n&#8211; <strong>Network egress<\/strong> (internet egress, inter-region, and some inter-zone patterns; verify current network pricing).\n&#8211; <strong>Load balancing<\/strong> (if used) and <strong>public IP<\/strong> (if applicable).\n&#8211; <strong>Cloud Logging<\/strong> ingestion\/retention beyond free allotments.\n&#8211; <strong>Cloud Monitoring<\/strong> metrics beyond free allotments (varies by metric 
volume).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Is there a free tier?<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Google Cloud provides a general Free Tier for certain products. For Compute Engine, a small always-free VM exists in limited regions under specific conditions (verify the current Free Tier details). Reservations are not part of the Free Tier; reserved capacity is billed like running VMs whether or not it is used.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Free Tier overview: https:\/\/cloud.google.com\/free<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cost drivers specific to capacity planning<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Over-reserving baseline<\/strong>: Reserved capacity is billed whether or not VMs consume it, so a baseline set too high means paying for idle capacity in addition to any VMs you run beyond real need.<\/li>\n<li><strong>Under-utilization of committed spend programs<\/strong>: If you buy commitments\/discounts for baseline but workload drops, you can pay for unused commitment value (verify commitment program rules).<\/li>\n<li><strong>Multi-zone redundancy<\/strong>: Reliability often means duplicating baseline across zones (worth it, but costs more).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Hidden\/indirect costs<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Operational tooling costs<\/strong>: SIEM export, long log retention, custom dashboards.<\/li>\n<li><strong>Data transfer<\/strong>: Multi-zone designs can increase cross-zone traffic.<\/li>\n<li><strong>Pipeline and artifact storage<\/strong>: If you automate heavily, build artifacts and logs can add up.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network\/data transfer implications<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If your architecture spreads across zones\/regions for resilience, evaluate:<\/li>\n<li>Cross-zone service calls<\/li>\n<li>Cross-region database 
replication<\/li>\n<li>Egress to the internet<\/li>\n<li>Always validate with the official Network pricing pages (pricing can vary and change).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">How to optimize cost (practical)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reserve only the <strong>true baseline<\/strong> you need for SLOs.<\/li>\n<li>Use autoscaling for burst above baseline.<\/li>\n<li>Use rightsizing and delete idle resources.<\/li>\n<li>Use labels to drive chargeback\/showback.<\/li>\n<li>Set budgets and alerts in Cloud Billing.<\/li>\n<li>If your baseline is stable, evaluate committed spend programs (verify current Compute discount offerings and constraints in official docs).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Example low-cost starter estimate (no fabricated numbers)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A low-cost lab can be done with:\n&#8211; 1 small VM instance for a short period\n&#8211; Standard persistent disk\n&#8211; Minimal logging\nUse the Pricing Calculator to estimate for your region\/zone and runtime duration:\nhttps:\/\/cloud.google.com\/products\/calculator<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Example production cost considerations<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">For a production API service:\n&#8211; Baseline of N VMs per zone across 2\u20133 zones (high availability)\n&#8211; Load balancer, NAT (if private instances), monitoring, logs, CI automation\n&#8211; Possible committed spend alignment for baseline\nCost is driven primarily by baseline + peak headroom and network patterns. Use the calculator and export billing to BigQuery for ongoing analysis (verify billing export setup docs for your org).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">10. 
Step-by-Step Hands-On Tutorial<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">This lab uses <strong>Compute Engine reservations<\/strong> to demonstrate a practical \u201cCapacity Planner\u201d workflow: reserve zonal capacity and create a VM configured to use it.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Objective<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Enable Compute Engine<\/li>\n<li>Create a <strong>zonal reservation<\/strong> for a specific machine type<\/li>\n<li>Provision a VM that consumes the reservation (or validate reservation readiness, depending on your environment\u2019s options)<\/li>\n<li>Verify behavior<\/li>\n<li>Clean up resources to avoid ongoing charges<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Lab Overview<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You will:\n1. Prepare a project and enable APIs\n2. Choose a zone and machine type suitable for a low-cost test\n3. Create a reservation in that zone\n4. Create a VM configured with reservation affinity\n5. Validate reservation usage and troubleshoot common failures\n6. Delete resources<\/p>\n\n\n\n<blockquote>\n<p>Note: The Cloud Console UI and gcloud flags can evolve. Where you see differences, rely on the authoritative help output (<code>gcloud ... --help<\/code>) and official docs. 
Reservations doc: https:\/\/cloud.google.com\/compute\/docs\/instances\/reserving-zonal-resources<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Set your project and enable Compute Engine API<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">Option A: Cloud Console<\/h4>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open the Cloud Console: https:\/\/console.cloud.google.com\/<\/li>\n<li>Select (or create) a project.<\/li>\n<li>Go to <strong>APIs &amp; Services \u2192 Library<\/strong>.<\/li>\n<li>Search for <strong>Compute Engine API<\/strong> and click <strong>Enable<\/strong>.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> Compute Engine API is enabled for the project.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Option B: gcloud<\/h4>\n\n\n\n<pre><code class=\"language-bash\">gcloud auth login\ngcloud config set project PROJECT_ID\ngcloud services enable compute.googleapis.com\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> Command completes successfully.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification:<\/strong><\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud services list --enabled --filter=\"name:compute.googleapis.com\"\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Pick a zone and machine type<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Choose a zone where you are allowed to run VMs (quota, policy) and a common machine type.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Pick a region\/zone (example): <code>us-central1-a<\/code><\/li>\n<li>Pick a machine type (example): <code>e2-medium<\/code> or <code>n2-standard-2<\/code> (choose based on what\u2019s available and affordable in your region)<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> You have a chosen <code>(zone, machine type, count)<\/code> for the reservation.<\/p>\n\n\n\n<p 
class=\"wp-block-paragraph\"><strong>Verification (optional):<\/strong><\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute zones describe us-central1-a\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Create a zonal reservation<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">Option A: Cloud Console<\/h4>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Go to <strong>Compute Engine \u2192 Reservations<\/strong> (in the Cloud Console navigation).<\/li>\n<li>Click <strong>Create reservation<\/strong>.<\/li>\n<li>Configure:\n   &#8211; <strong>Name:<\/strong> <code>lab-reservation-1<\/code>\n   &#8211; <strong>Zone:<\/strong> your selected zone (e.g., <code>us-central1-a<\/code>)\n   &#8211; <strong>Machine type:<\/strong> your selected type (e.g., <code>e2-medium<\/code>)\n   &#8211; <strong>VM count:<\/strong> <code>1<\/code>\n   &#8211; <strong>Use with VM instance:<\/strong> choose the specific-reservation option if you will target this reservation by name in Step 4 (UI labels may vary; verify in your console)\n   &#8211; (Optional) <strong>Labels:<\/strong> <code>env=lab<\/code>, <code>owner=YOUR_NAME<\/code><\/li>\n<li>Create the reservation.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> The reservation appears in the list in the selected zone.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Option B: gcloud (verify flags with <code>--help<\/code>)<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Run the following. The <code>--require-specific-reservation<\/code> flag restricts the reservation to VMs that target it by name, which is the pattern Step 4 uses; omit it if you plan to rely on automatic (\u201cany\u201d) affinity instead:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute reservations create lab-reservation-1 \\\n  --zone=us-central1-a \\\n  --machine-type=e2-medium \\\n  --vm-count=1 \\\n  --require-specific-reservation\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">If your gcloud version uses different flags, run:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute reservations create --help\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> Reservation is created.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification:<\/strong><\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute reservations list 
--zones=us-central1-a\ngcloud compute reservations describe lab-reservation-1 --zone=us-central1-a\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Create a VM that uses the reservation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You have two common patterns:\n&#8211; <strong>Specific reservation affinity<\/strong>: VM must consume <code>lab-reservation-1<\/code>\n&#8211; <strong>Any reservation affinity<\/strong>: VM can consume any matching reservation in the zone<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Option A: Cloud Console (recommended for beginners)<\/h4>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Go to <strong>Compute Engine \u2192 VM instances \u2192 Create instance<\/strong>.<\/li>\n<li>Set:\n   &#8211; <strong>Name:<\/strong> <code>lab-vm-1<\/code>\n   &#8211; <strong>Region\/Zone:<\/strong> same zone as reservation (e.g., <code>us-central1-a<\/code>)\n   &#8211; <strong>Machine type:<\/strong> must match the reservation (e.g., <code>e2-medium<\/code>)<\/li>\n<li>Expand <strong>Advanced options<\/strong> (or similar) and locate <strong>Reservation<\/strong> \/ <strong>Capacity<\/strong> settings.<\/li>\n<li>Choose:\n   &#8211; <strong>Consume a specific reservation<\/strong> and select <code>lab-reservation-1<\/code><br\/>\n     (UI labels may vary; verify in your console)<\/li>\n<li>Create the instance.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> VM is created successfully and should consume the reserved capacity.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Option B: gcloud (verify exact flags)<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Because reservation affinity flags can change across gcloud versions, use help:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute instances create --help | grep -i reservation -n\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Then create the instance using the reservation-affinity flags shown in 
your help output. The intent is:\n&#8211; same zone (<code>--zone=us-central1-a<\/code>)\n&#8211; same machine type (<code>--machine-type=e2-medium<\/code>)\n&#8211; reservation affinity set to <em>specific<\/em> reservation <code>lab-reservation-1<\/code> (at the time of writing, <code>--reservation-affinity=specific<\/code> together with <code>--reservation=lab-reservation-1<\/code>; confirm against your help output)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> VM is running.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification:<\/strong><\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute instances list --filter=\"name=lab-vm-1\"\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Observe reservation utilization and instance placement behavior<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">In Cloud Console<\/h4>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Go to <strong>Compute Engine \u2192 Reservations<\/strong><\/li>\n<li>Click <code>lab-reservation-1<\/code><\/li>\n<li>Check utilization\/consumption indicators (exact fields vary).<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> Reservation shows reduced available capacity or indicates it is consumed by <code>lab-vm-1<\/code>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">With gcloud<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Describe the reservation and look for fields indicating:\n&#8211; allocated count (typically <code>specificReservation.count<\/code>)\n&#8211; consumed count (typically <code>specificReservation.inUseCount<\/code>)\n&#8211; specific consumers (if shown)<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute reservations describe lab-reservation-1 --zone=us-central1-a\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome:<\/strong> You can confirm the reservation exists and see its configured capacity. 
If consumption fields are not obvious, verify the reservation and instance settings in the console and official docs (field names can vary).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Validation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You have successfully completed the lab if:\n&#8211; A reservation exists in the same zone and machine type as your VM\n&#8211; A VM instance is running and is configured to consume the reservation (specific or any affinity)\n&#8211; The reservation indicates consumption (or, at minimum, VM provisioning succeeds when pinned to the reservation)<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Troubleshooting<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Common issues and fixes:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>VM creation fails with \u201cquota exceeded\u201d<\/strong>\n   &#8211; <strong>Cause:<\/strong> Project quota (vCPU, instances, etc.) is insufficient.\n   &#8211; <strong>Fix:<\/strong> Request quota increase or reduce machine size\/count.<br\/>\n     Quotas: https:\/\/cloud.google.com\/compute\/quotas<\/p>\n<\/li>\n<li>\n<p><strong>VM creation fails with \u201cno matching reservation found\u201d \/ \u201creservation mismatch\u201d<\/strong>\n   &#8211; <strong>Cause:<\/strong> Reservation and VM do not match (zone, machine type, attributes).\n   &#8211; <strong>Fix:<\/strong> Ensure <strong>same zone<\/strong> and <strong>same machine type<\/strong>, and that reservation affinity points to the correct reservation.<\/p>\n<\/li>\n<li>\n<p><strong>VM still fails with capacity error even with reservation<\/strong>\n   &#8211; <strong>Cause:<\/strong> The VM isn\u2019t actually configured to consume the reservation, or reservation is exhausted, or there are additional constraints (e.g., GPUs, local SSD) not included in the reservation.\n   &#8211; <strong>Fix:<\/strong> Confirm affinity settings; confirm reservation count; 
confirm VM attributes.<\/p>\n<\/li>\n<li>\n<p><strong>Can\u2019t find Reservations in the Console<\/strong>\n   &#8211; <strong>Cause:<\/strong> UI navigation differences or permissions.\n   &#8211; <strong>Fix:<\/strong> Ensure you have Compute permissions; try searching \u201cReservations\u201d in the console search bar.<\/p>\n<\/li>\n<li>\n<p><strong>gcloud flags don\u2019t match this tutorial<\/strong>\n   &#8211; <strong>Cause:<\/strong> CLI version differences.\n   &#8211; <strong>Fix:<\/strong> Use <code>--help<\/code> output as the source of truth. Keep the conceptual requirements: same zone, matching machine type, correct affinity.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Cleanup<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To avoid ongoing charges, delete the VM, the reservation, and any other billable resources you created. Reserved capacity keeps accruing charges until the reservation itself is deleted, even after the VM is gone.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Delete the VM<\/h4>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute instances delete lab-vm-1 --zone=us-central1-a\n<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Delete the reservation<\/h4>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute reservations delete lab-reservation-1 --zone=us-central1-a\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Final verification:<\/strong><\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud compute instances list --filter=\"name=lab-vm-1\"\ngcloud compute reservations list --zones=us-central1-a\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">11. Best Practices<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Architecture best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Design for multi-zone<\/strong>: Reservations are zonal. 
For high availability, reserve baseline capacity in at least two zones and use load balancing + multi-zone MIGs.<\/li>\n<li><strong>Separate baseline vs burst<\/strong>: Reserve baseline; use autoscaling for burst above baseline.<\/li>\n<li><strong>Use failure domains intentionally<\/strong>: Align reservations to your failover plan (zone-level or region-level).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">IAM\/security best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Separate duties<\/strong>:<\/li>\n<li>Capacity admins (can create\/modify reservations)<\/li>\n<li>Workload deployers (can create instances but not change reservation pools)<\/li>\n<li>Use <strong>service accounts<\/strong> for automation with minimal permissions.<\/li>\n<li>Apply consistent labels and ownership metadata.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Cost best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reserve only what your SLO truly requires (baseline).<\/li>\n<li>Periodically re-evaluate baseline as product usage changes.<\/li>\n<li>Use budgets and alerts; label resources for cost attribution.<\/li>\n<li>If using commitment\/discount programs, tie them to observed baseline usage and re-check regularly.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Performance best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ensure VM shapes match real workload requirements (CPU\/memory\/IO).<\/li>\n<li>Avoid pinning to overly constrained zones unless required.<\/li>\n<li>Validate that network and disk performance match your scaling goals.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Reliability best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Treat capacity as an SRE concern: set SLOs around successful scale-out and provisioning latency.<\/li>\n<li>Use health checks + autohealing on MIGs; reservations don\u2019t fix unhealthy instances.<\/li>\n<li>Drill failover: ensure secondary zones have adequate reserved 
baseline.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operations best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Manage reservations using IaC and change control.<\/li>\n<li>Create dashboards for:<\/li>\n<li>Instance counts per zone<\/li>\n<li>Provisioning failure rates<\/li>\n<li>Autoscaler events<\/li>\n<li>Reservation utilization (where available)<\/li>\n<li>Automate cleanup of temporary reservations used for launches or tests.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Governance\/tagging\/naming best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Standardize naming:<\/li>\n<li><code>resv-&lt;app&gt;-&lt;env&gt;-&lt;zone&gt;-&lt;shape&gt;<\/code><\/li>\n<li>Standardize labels:<\/li>\n<li><code>env<\/code>, <code>app<\/code>, <code>owner<\/code>, <code>cost_center<\/code>, <code>lifecycle<\/code><\/li>\n<li>Track reservations in asset inventory exports and compliance reporting.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">12. 
Security Considerations<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Identity and access model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>IAM governs all reservation and instance operations.<\/strong><\/li>\n<li>Use least privilege:<\/li>\n<li>Create\/manage reservations: restricted to a small admin group or automation SA<\/li>\n<li>Consume reservations: instance creation rights can be broader, but ensure affinity is controlled<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Encryption<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Compute Engine encrypts data at rest by default for persistent disks (verify current encryption behavior and options such as CMEK in official docs).<\/li>\n<li>Encryption in transit is your responsibility at the application layer and via TLS termination patterns.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network exposure<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Reservations do not expose endpoints; your VMs do. Apply:\n&#8211; Private VMs where possible (no external IPs)\n&#8211; Cloud NAT for outbound internet if needed\n&#8211; Firewall rules or hierarchical firewall policies\n&#8211; Load balancers for controlled ingress<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Secrets handling<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Do not store secrets in VM metadata or images.<\/li>\n<li>Use Secret Manager (recommended) and IAM-controlled access.\n  Secret Manager docs: https:\/\/cloud.google.com\/secret-manager\/docs<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Audit\/logging<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ensure Admin Activity audit logs are retained per policy.<\/li>\n<li>Export logs to SIEM if required.<\/li>\n<li>Monitor for unexpected reservation changes.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Compliance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data locality: reservations can help keep capacity within required zones, but compliance requires broader 
controls (org policies, data storage location, access control).<\/li>\n<li>Change control: treat reservation changes as production-impacting.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Common security mistakes<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Letting broad developer roles create\/modify reservations without review.<\/li>\n<li>Not labeling reservations (no ownership, harder incident response).<\/li>\n<li>Relying on reservations as a substitute for multi-zone reliability.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secure deployment recommendations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use a dedicated \u201ccapacity-admin\u201d pipeline with approval gates.<\/li>\n<li>Apply org policy constraints for allowed regions\/zones if required.<\/li>\n<li>Use separate projects for prod vs non-prod; apply consistent patterns.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">13. Limitations and Gotchas<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Zonal scope<\/strong>: A reservation in <code>zone A<\/code> does nothing for <code>zone B<\/code>.<\/li>\n<li><strong>Matching rules matter<\/strong>: VM must match reservation requirements (machine type and other attributes). 
Mismatches are a common source of failures.<\/li>\n<li><strong>Quotas are separate from capacity<\/strong>: Even with a reservation, insufficient quota can block VM creation.<\/li>\n<li><strong>Operational drift<\/strong>: Manual edits can diverge from IaC; enforce policies and periodic reconciliation.<\/li>\n<li><strong>Misuse risk<\/strong>: If a reservation allows automatic consumption, any matching VM with open affinity, including non-critical workloads, can consume the reserved capacity.<\/li>\n<li><strong>Cost surprises<\/strong>: Reserved capacity is billed at the same rates as running VMs whether or not it is consumed, so unused or forgotten reservations cost money until they are deleted (verify current billing rules in official docs).<\/li>\n<li><strong>Specialized capacity<\/strong> (large shapes, GPUs) can be constrained; reservation availability and rules may differ, so verify official docs for your machine family.<\/li>\n<li><strong>Console\/CLI evolution<\/strong>: UI labels and gcloud flags can change; rely on official docs and <code>--help<\/code>.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">14. Comparison with Alternatives<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Capacity planning can be done with different approaches depending on your workload and tolerance for risk.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Options to compare<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Google Cloud Compute Engine Reservations<\/strong> (the core of Capacity Planner)<\/li>\n<li><strong>Autoscaling without reservations<\/strong> (best effort)<\/li>\n<li><strong>Committed use discounts (CUDs)<\/strong> for cost (not capacity) planning<\/li>\n<li><strong>Multi-cloud capacity management<\/strong> (AWS EC2 Capacity Reservations; Azure Reserved VM Instances\/Capacity)<\/li>\n<li><strong>Self-managed schedulers<\/strong> (Kubernetes cluster autoscaler + node pools; HashiCorp Nomad)<\/li>\n<\/ul>\n\n\n\n<blockquote>\n<p>Note: CUDs primarily address <strong>cost<\/strong>, not guaranteed <strong>capacity<\/strong>. 
Reservations primarily address <strong>capacity availability<\/strong> in a zone.<\/p>\n<\/blockquote>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Option<\/th>\n<th>Best For<\/th>\n<th>Strengths<\/th>\n<th>Weaknesses<\/th>\n<th>When to Choose<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Google Cloud Capacity Planner (via Compute Engine Reservations)<\/strong><\/td>\n<td>Workloads needing guaranteed zonal VM availability<\/td>\n<td>Deterministic VM launch capacity, better reliability<\/td>\n<td>Zonal complexity, requires governance; doesn\u2019t replace HA design<\/td>\n<td>When you must reduce provisioning failures and guarantee baseline capacity<\/td>\n<\/tr>\n<tr>\n<td>Autoscaling (no reservations)<\/td>\n<td>Flexible workloads tolerant of occasional provisioning delays<\/td>\n<td>Simple operations, no reservation management<\/td>\n<td>Can fail during capacity pressure; unpredictable<\/td>\n<td>Early-stage apps, dev\/test, or globally flexible services<\/td>\n<\/tr>\n<tr>\n<td>Committed Use Discounts (Compute)<\/td>\n<td>Predictable baseline usage cost optimization<\/td>\n<td>Lower unit cost for steady-state workloads<\/td>\n<td>Commitment risk; not a capacity guarantee<\/td>\n<td>When cost is the primary goal and capacity is acceptable best effort<\/td>\n<\/tr>\n<tr>\n<td>AWS EC2 Capacity Reservations<\/td>\n<td>Organizations standardizing on AWS needing capacity guarantees<\/td>\n<td>Mature capacity reservation constructs; integrates with AWS ecosystem<\/td>\n<td>Different cloud; migration and ops overhead<\/td>\n<td>If you\u2019re on AWS and need guaranteed capacity in AZs<\/td>\n<\/tr>\n<tr>\n<td>Azure capacity\/reservations equivalents<\/td>\n<td>Azure-first enterprises<\/td>\n<td>Integrated with Azure governance<\/td>\n<td>Different cloud; migration and ops overhead<\/td>\n<td>If you\u2019re on Azure and require capacity planning there<\/td>\n<\/tr>\n<tr>\n<td>Self-managed schedulers (K8s\/Nomad)<\/td>\n<td>Platform teams 
with sophisticated scheduling needs<\/td>\n<td>Fine-grained placement control, multi-tenant scheduling<\/td>\n<td>Still depends on underlying capacity; complex<\/td>\n<td>When you need advanced scheduling plus you still manage baseline capacity<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">15. Real-World Example<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Enterprise example: Multi-zone payments platform<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> A payments platform must maintain strict latency and uptime. During seasonal peaks, VM scale-outs occasionally fail in a preferred zone, causing elevated error rates.<\/li>\n<li><strong>Proposed architecture:<\/strong><\/li>\n<li>Multi-zone MIGs behind a regional load balancer<\/li>\n<li>Baseline reservations per zone for the core API tier<\/li>\n<li>Autoscaling above baseline<\/li>\n<li>Cloud Monitoring SLOs for provisioning success and request latency<\/li>\n<li>Strict IAM separation: capacity-admin vs app deployers<\/li>\n<li><strong>Why Capacity Planner was chosen:<\/strong> The enterprise needed a deterministic baseline in each zone to prevent capacity-related incidents.<\/li>\n<li><strong>Expected outcomes:<\/strong><\/li>\n<li>Fewer scale-out failures<\/li>\n<li>More predictable incident response<\/li>\n<li>Better governance and auditability of capacity changes<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Startup\/small-team example: SaaS CI runner pool<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> A small SaaS team relies on VM-based CI runners. 
Occasionally the runner pool can\u2019t expand quickly, delaying releases.<\/li>\n<li><strong>Proposed architecture:<\/strong><\/li>\n<li>A small baseline reservation for runner VMs in one zone<\/li>\n<li>Simple autoscaling for extra runners<\/li>\n<li>Budget alerts to avoid runaway costs<\/li>\n<li><strong>Why Capacity Planner was chosen:<\/strong> The team needed reliable runner availability during working hours without building a complex platform.<\/li>\n<li><strong>Expected outcomes:<\/strong><\/li>\n<li>Reduced developer wait time<\/li>\n<li>Predictable baseline costs<\/li>\n<li>Minimal operational overhead compared to more complex scheduling solutions<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">16. FAQ<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>Is \u201cCapacity Planner\u201d a separate Google Cloud product?<\/strong><br\/>\n   Often, \u201cCapacity Planner\u201d refers to capacity planning workflows rather than a standalone product. For Compute, the most concrete official feature is <strong>Compute Engine Reservations<\/strong>. Verify your org\u2019s terminology and check official docs.<\/p>\n<\/li>\n<li>\n<p><strong>What does a Compute Engine reservation guarantee?<\/strong><br\/>\n   It is intended to guarantee the ability to provision matching VMs in a specific zone by reserving capacity. Exact guarantees and matching rules should be verified in official documentation for your machine family and zone.<\/p>\n<\/li>\n<li>\n<p><strong>Are reservations regional or zonal?<\/strong><br\/>\n   Reservations are typically <strong>zonal<\/strong> resources in Compute Engine.<\/p>\n<\/li>\n<li>\n<p><strong>Do reservations cost money by themselves?<\/strong><br\/>\n   Yes. A reservation is billed for its <strong>reserved resources<\/strong> at the same on-demand rates as running VMs, from creation until deletion, whether or not any VM consumes it. 
VMs that consume the reservation are not billed a second time for those resources, and applicable discounts such as committed use discounts still apply. Verify current terms in the official docs and pricing pages.<\/p>\n<\/li>\n<li>\n<p><strong>What\u2019s the difference between reservations and committed use discounts (CUDs)?<\/strong><br\/>\n   Reservations focus on <strong>capacity availability<\/strong>; CUDs focus on <strong>cost reduction<\/strong> for predictable usage. They solve different problems.<\/p>\n<\/li>\n<li>\n<p><strong>Can I use reservations with Managed Instance Groups (MIGs)?<\/strong><br\/>\n   Yes, commonly by ensuring the MIG instances are created in the zone(s) with reservations and configured with appropriate reservation affinity (verify the best practice for your specific MIG configuration).<\/p>\n<\/li>\n<li>\n<p><strong>How do I stop non-critical workloads from consuming reserved capacity?<\/strong><br\/>\n   Create a specifically targeted reservation, which only VMs whose reservation affinity names that reservation can consume, and have critical workloads target it explicitly. Use IAM governance to restrict who can create or modify reservations.<\/p>\n<\/li>\n<li>\n<p><strong>What if I reserve capacity in the wrong zone?<\/strong><br\/>\n   The reservation won\u2019t help workloads in other zones. You may need to create additional reservations or adjust your architecture.<\/p>\n<\/li>\n<li>\n<p><strong>How do quotas relate to reservations?<\/strong><br\/>\n   Quotas are separate limits. Even if you have a reservation, insufficient quota can still prevent instance creation.<\/p>\n<\/li>\n<li>\n<p><strong>How do I measure reservation utilization?<\/strong><br\/>\n   Use the Compute Engine console reservation details and relevant APIs\/fields. For broader insight, correlate instance inventory (Asset Inventory) and deployment metrics. Verify current utilization metrics availability in docs.<\/p>\n<\/li>\n<li>\n<p><strong>Can I reserve capacity for GPUs?<\/strong><br\/>\n   In some cases and regions, yes, but specialized capacity has additional constraints. 
Verify GPU reservation support for your selected region, zone, and machine type.<\/p>\n<\/li>\n<li>\n<p><strong>What is reservation affinity?<\/strong><br\/>\n   A VM setting that controls whether the VM must use a particular reservation, can use any matching reservation, or should not use reservations.<\/p>\n<\/li>\n<li>\n<p><strong>Does reserving capacity improve performance?<\/strong><br\/>\n   Reservations primarily improve <strong>availability to provision<\/strong>, not runtime performance. Performance depends on machine type, disk, network, and application design.<\/p>\n<\/li>\n<li>\n<p><strong>Is capacity planning only for large enterprises?<\/strong><br\/>\n   No. Any team that experiences provisioning failures during critical moments (releases, batch windows) can benefit.<\/p>\n<\/li>\n<li>\n<p><strong>What\u2019s the first step to adopt Capacity Planner?<\/strong><br\/>\n   Start by measuring your baseline usage per zone and machine type, confirm quotas, then create a small reservation for a critical workload and validate consumption.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">17. 
Top Online Resources to Learn Capacity Planner<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The most reliable resources are Compute Engine reservation docs and related operational documentation.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Resource Type<\/th>\n<th>Name<\/th>\n<th>Why It Is Useful<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Official documentation<\/td>\n<td>Compute Engine Reservations<\/td>\n<td>Primary reference for reserving zonal resources and configuration details: https:\/\/cloud.google.com\/compute\/docs\/instances\/reserving-zonal-resources<\/td>\n<\/tr>\n<tr>\n<td>Official documentation<\/td>\n<td>Compute Engine Quotas<\/td>\n<td>Helps distinguish quota failures from capacity failures: https:\/\/cloud.google.com\/compute\/quotas<\/td>\n<\/tr>\n<tr>\n<td>Official pricing page<\/td>\n<td>Compute Engine Pricing<\/td>\n<td>Understand VM, disk, and related cost drivers: https:\/\/cloud.google.com\/compute\/pricing<\/td>\n<\/tr>\n<tr>\n<td>Official tool<\/td>\n<td>Google Cloud Pricing Calculator<\/td>\n<td>Build region-specific estimates without guessing: https:\/\/cloud.google.com\/products\/calculator<\/td>\n<\/tr>\n<tr>\n<td>Official documentation<\/td>\n<td>Cloud Billing<\/td>\n<td>Budgets, exports, and governance: https:\/\/cloud.google.com\/billing\/docs<\/td>\n<\/tr>\n<tr>\n<td>Official documentation<\/td>\n<td>Cloud Monitoring<\/td>\n<td>Operational dashboards and alerting: https:\/\/cloud.google.com\/monitoring\/docs<\/td>\n<\/tr>\n<tr>\n<td>Official documentation<\/td>\n<td>Cloud Logging<\/td>\n<td>Troubleshooting and audit trails: https:\/\/cloud.google.com\/logging\/docs<\/td>\n<\/tr>\n<tr>\n<td>Official documentation<\/td>\n<td>IAM<\/td>\n<td>Least privilege and access governance: https:\/\/cloud.google.com\/iam\/docs<\/td>\n<\/tr>\n<tr>\n<td>Official documentation<\/td>\n<td>Recommender<\/td>\n<td>Useful for cost\/rightsizing recommendations (capacity planning adjacent): 
https:\/\/cloud.google.com\/recommender\/docs<\/td>\n<\/tr>\n<tr>\n<td>Official videos<\/td>\n<td>Google Cloud Tech YouTube<\/td>\n<td>Search for Compute Engine reservations\/capacity planning content: https:\/\/www.youtube.com\/@googlecloudtech<\/td>\n<\/tr>\n<tr>\n<td>Trusted hands-on labs<\/td>\n<td>Google Cloud Skills Boost<\/td>\n<td>Search for Compute Engine labs that include capacity and operations topics: https:\/\/www.cloudskillsboost.google\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">18. Training and Certification Providers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Institute<\/th>\n<th>Suitable Audience<\/th>\n<th>Likely Learning Focus<\/th>\n<th>Mode<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>DevOps engineers, SREs, cloud engineers<\/td>\n<td>DevOps, cloud operations, automation, IaC foundations<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>ScmGalaxy.com<\/td>\n<td>Beginners to intermediate engineers<\/td>\n<td>SCM, DevOps practices, CI\/CD and tooling<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.scmgalaxy.com\/<\/td>\n<\/tr>\n<tr>\n<td>CloudOpsNow.in<\/td>\n<td>Cloud operations teams<\/td>\n<td>Cloud ops, monitoring, reliability practices<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.cloudopsnow.in\/<\/td>\n<\/tr>\n<tr>\n<td>SreSchool.com<\/td>\n<td>SREs, platform teams<\/td>\n<td>SRE principles, observability, reliability engineering<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.sreschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>AiOpsSchool.com<\/td>\n<td>Operations + data\/automation practitioners<\/td>\n<td>AIOps concepts, automation, operational analytics<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.aiopsschool.com\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 
class=\"wp-block-heading\">19. Top Trainers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Platform\/Site<\/th>\n<th>Likely Specialization<\/th>\n<th>Suitable Audience<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>RajeshKumar.xyz<\/td>\n<td>DevOps\/cloud training and guidance (verify offerings)<\/td>\n<td>Beginners to intermediate engineers<\/td>\n<td>https:\/\/rajeshkumar.xyz\/<\/td>\n<\/tr>\n<tr>\n<td>devopstrainer.in<\/td>\n<td>DevOps tooling, CI\/CD, cloud operations (verify offerings)<\/td>\n<td>DevOps engineers, SREs<\/td>\n<td>https:\/\/www.devopstrainer.in\/<\/td>\n<\/tr>\n<tr>\n<td>devopsfreelancer.com<\/td>\n<td>Freelance DevOps support\/training resources (verify offerings)<\/td>\n<td>Teams needing short-term expertise<\/td>\n<td>https:\/\/www.devopsfreelancer.com\/<\/td>\n<\/tr>\n<tr>\n<td>devopssupport.in<\/td>\n<td>DevOps support and training resources (verify offerings)<\/td>\n<td>Ops teams and engineers<\/td>\n<td>https:\/\/www.devopssupport.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">20. 
Top Consulting Companies<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Company Name<\/th>\n<th>Likely Service Area<\/th>\n<th>Where They May Help<\/th>\n<th>Consulting Use Case Examples<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>cotocus.com<\/td>\n<td>Cloud\/DevOps consulting (verify service catalog)<\/td>\n<td>Architecture, DevOps enablement, migrations<\/td>\n<td>Capacity planning strategy, IaC adoption, monitoring\/alerting design<\/td>\n<td>https:\/\/cotocus.com\/<\/td>\n<\/tr>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>DevOps\/cloud consulting and training (verify service catalog)<\/td>\n<td>Platform enablement, CI\/CD, automation<\/td>\n<td>Implement reservation\/IaC workflows, governance and operational runbooks<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>DEVOPSCONSULTING.IN<\/td>\n<td>DevOps consulting (verify service catalog)<\/td>\n<td>DevOps transformation and operations<\/td>\n<td>Build deployment pipelines, implement monitoring and cost governance<\/td>\n<td>https:\/\/www.devopsconsulting.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">21. 
Career and Learning Roadmap<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn before Capacity Planner<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Compute Engine fundamentals: instances, images, disks, networking<\/li>\n<li>IAM basics: roles, service accounts, least privilege<\/li>\n<li>Basic networking: VPCs, subnets, firewall rules, NAT<\/li>\n<li>Observability basics: logs vs metrics, alerting<\/li>\n<li>FinOps basics: budgets, labels, pricing calculator<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn after Capacity Planner<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Advanced MIG design: multi-zone deployments, autoscaling policies, rollout strategies<\/li>\n<li>Reliability engineering: SLOs\/SLIs, incident response, capacity error budgets<\/li>\n<li>IaC maturity: Terraform modules, policy-as-code, approval workflows<\/li>\n<li>Cost optimization: rightsizing, discount program strategy (verify current offerings)<\/li>\n<li>Governance: organization policies, hierarchical firewalls, centralized logging<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Job roles that use it<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Site Reliability Engineer (SRE)<\/li>\n<li>Platform Engineer<\/li>\n<li>Cloud Infrastructure Engineer<\/li>\n<li>DevOps Engineer<\/li>\n<li>Cloud Solutions Architect<\/li>\n<li>FinOps Analyst (capacity-cost alignment)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Certification path (if available)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Google Cloud certifications do not typically certify \u201cCapacity Planner\u201d specifically, but relevant certifications include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Associate Cloud Engineer<\/li>\n<li>Professional Cloud Architect<\/li>\n<li>Professional Cloud DevOps Engineer<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Verify the latest certification paths here: https:\/\/cloud.google.com\/learn\/certification<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project ideas for practice<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li>Build a multi-zone web service with MIGs and baseline reservations per zone.<\/li>\n<li>Create a \u201ccapacity runbook\u201d that includes quota checks, reservation validation, and rollback steps.<\/li>\n<li>Implement a Terraform module for reservations + MIG configuration and integrate it with CI approvals.<\/li>\n<li>Create dashboards for provisioning failures and autoscaler events; set on-call alerts.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">22. Glossary<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Capacity planning<\/strong>: Estimating and preparing compute resources needed to meet performance and reliability goals.<\/li>\n<li><strong>Compute Engine<\/strong>: Google Cloud\u2019s Infrastructure-as-a-Service VM platform.<\/li>\n<li><strong>Reservation (Compute Engine)<\/strong>: A zonal resource that reserves capacity for VM instances with matching requirements.<\/li>\n<li><strong>Zone<\/strong>: An isolated location within a region where resources run.<\/li>\n<li><strong>Region<\/strong>: A geographic area containing multiple zones.<\/li>\n<li><strong>MIG (Managed Instance Group)<\/strong>: A group of identical VMs managed as a single entity with autoscaling and autohealing.<\/li>\n<li><strong>Reservation affinity<\/strong>: VM setting controlling whether a VM must use a reservation, can use one, or avoids reservations.<\/li>\n<li><strong>Quota<\/strong>: A project-level limit on resources (vCPU, GPUs, instances, etc.).<\/li>\n<li><strong>SLO\/SLI<\/strong>: Service Level Objective\/Indicator\u2014reliability targets and their measurements.<\/li>\n<li><strong>IaC (Infrastructure as Code)<\/strong>: Managing infrastructure via declarative code (e.g., Terraform).<\/li>\n<li><strong>FinOps<\/strong>: Practice of managing cloud spend with engineering, finance, and business collaboration.<\/li>\n<li><strong>Cloud Monitoring<\/strong>: Google Cloud\u2019s metrics, 
dashboards, and alerting service.<\/li>\n<li><strong>Cloud Logging<\/strong>: Google Cloud\u2019s centralized logging service and audit log platform.<\/li>\n<li><strong>Org Policy<\/strong>: Organization-level constraints that govern allowed configurations.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">23. Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Capacity Planner in <strong>Google Cloud Compute<\/strong> is best implemented as a <strong>disciplined capacity planning workflow<\/strong> centered on <strong>Compute Engine reservations<\/strong>. It helps you ensure that the VM capacity your workloads require, especially in specific zones and with specific machine types, will be available when you need it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It matters because it reduces provisioning failures, improves production reliability, and makes scaling behavior more deterministic. The key cost and security considerations are these: reserved capacity is billed whether or not VMs consume it, so size baseline reservations from measured usage; govern who can change reservations; label everything for ownership; and monitor both quota and provisioning outcomes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Use Capacity Planner (via reservations) when you have <strong>critical workloads with predictable baselines<\/strong> and low tolerance for capacity-related failures. 
Your next learning step is to go deeper into <strong>Compute Engine Reservations<\/strong> documentation and practice deploying a <strong>multi-zone MIG<\/strong> with a reserved baseline in each zone, backed by monitoring and budgets:\nhttps:\/\/cloud.google.com\/compute\/docs\/instances\/reserving-zonal-resources<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Compute<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[26,51],"tags":[],"class_list":["post-623","post","type-post","status-publish","format-standard","hentry","category-compute","category-google-cloud"],"_links":{"self":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/623","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/comments?post=623"}],"version-history":[{"count":0,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/623\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/media?parent=623"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/categories?post=623"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/tags?post=623"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}