Terraform on Google Cloud Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Infrastructure as Code

Category

Infrastructure as Code

1. Introduction

Terraform on Google Cloud refers to using HashiCorp Terraform (an Infrastructure as Code tool) to provision and manage Google Cloud resources through Google Cloud APIs.

In simple terms: you write human-readable configuration files that describe the cloud infrastructure you want (networks, storage buckets, IAM bindings, compute resources), and Terraform creates and updates those resources in Google Cloud in a repeatable way.

Technically: Terraform uses the Google Cloud provider (hashicorp/google and hashicorp/google-beta) to translate your desired state (HCL configuration) into API calls against Google Cloud services. Terraform tracks what it created in a state file (stored locally or in a remote backend like Cloud Storage) and computes safe changes using plans before applying them.

Terraform on Google Cloud solves common Infrastructure as Code problems:

  • Consistency: environments can be recreated reliably (dev/test/prod).
  • Change control: “plan” shows the impact before changing production.
  • Standardization: modules and policies encode approved patterns.
  • Scalability: manage large fleets of resources across projects and teams.
  • Automation: integrate with CI/CD and GitOps workflows.

Service naming note (important): “Terraform on Google Cloud” is not a single standalone Google Cloud product SKU. Terraform is a HashiCorp tool that integrates with Google Cloud via providers and APIs. Google Cloud also offers a managed Terraform execution service called Infrastructure Manager (Terraform-based). This tutorial focuses on Terraform workflows on and for Google Cloud, and calls out Infrastructure Manager where relevant. Verify the latest product positioning in official docs when making platform decisions.


2. What is Terraform on Google Cloud?

Official purpose

Terraform on Google Cloud is the practice of using Terraform to define, provision, and manage Google Cloud infrastructure via code. The integration is enabled primarily through:

  • The Google Cloud Terraform providers: hashicorp/google and hashicorp/google-beta (for newer/preview API features)
  • Google-supported modules and reference architectures (commonly through the Cloud Foundation Toolkit)

Core capabilities

  • Provision Google Cloud resources (VPC, IAM, GKE, Cloud Run, Cloud Storage, BigQuery, etc.) by declaring them in Terraform configuration.
  • Manage lifecycle: create, update, and delete infrastructure safely.
  • Maintain state of deployed resources to compute changes (drift detection and updates).
  • Compose reusable modules and deploy multiple environments.
  • Integrate with CI/CD (Cloud Build, GitHub Actions, Jenkins, etc.).
  • Enforce standards through policy (e.g., Sentinel in Terraform Enterprise/Cloud, OPA/Conftest, or organization policies on Google Cloud).

Major components

  • Terraform CLI: runs init, plan, apply, destroy, import, etc.
  • Terraform configuration (HCL): .tf files describing resources, variables, outputs, providers, modules.
  • Providers: plugins that talk to Google Cloud APIs.
  • State: .tfstate (local or remote), often stored in Cloud Storage for teams.
  • Backend: remote state storage and (in some backends) state locking.
  • Execution environment: developer laptop, Cloud Shell, CI runners, or a managed execution service (e.g., Terraform Cloud/Enterprise; Google Cloud Infrastructure Manager—verify capabilities in official docs).

Service type (how to think about it)

Terraform on Google Cloud is not a single “regional service.” It’s a tool-driven workflow that interacts with Google Cloud’s:

  • Global control plane APIs (IAM, Resource Manager, service enablement)
  • Regional/zonal resources (Compute Engine, GKE, Cloud SQL)
  • Global resources (Cloud Storage buckets are globally named but have location constraints; IAM is global at project/org scope)

Scope (project/org/account)

Terraform can manage resources across:

  • A single project
  • Multiple projects under a folder
  • An organization (org policies, folder IAM, shared VPC)

The scope is determined by:

  • Which Google Cloud credentials Terraform uses
  • What IAM roles those credentials have
  • What project/folder/org IDs your configuration targets

How it fits into the Google Cloud ecosystem

Terraform on Google Cloud commonly sits at the center of a platform toolchain:

  • IAM and org policy establish guardrails
  • Terraform provisions foundational infrastructure (networks, projects, service accounts)
  • Application teams deploy workloads (GKE/Cloud Run/Compute) using modules
  • CI/CD applies changes with approvals and auditing
  • Cloud Logging/Monitoring observe what was provisioned and how it behaves


3. Why use Terraform on Google Cloud?

Business reasons

  • Faster delivery with fewer mistakes: repeatable infra reduces manual console changes.
  • Auditability: infrastructure is reviewed in pull requests and tracked in version control.
  • Standardization at scale: modules implement “company approved” patterns.
  • Reduced operational risk: plans and automated checks reduce surprises.

Technical reasons

  • Declarative desired state: you describe what you want; Terraform figures out changes.
  • Broad Google Cloud coverage: the provider supports a wide range of resources (coverage varies by service and API maturity).
  • Dependency management: Terraform builds a resource graph and orders operations.
  • Modules and composition: reuse patterns across projects and environments.

Operational reasons

  • Drift detection: terraform plan highlights manual changes (and can reconcile them).
  • Repeatable environments: dev/test/prod can share the same code with different variables.
  • Team workflows: remote state and CI reduce “works on my machine” issues.

Security/compliance reasons

  • Least privilege via service accounts: CI runners can use narrowly scoped credentials.
  • Separation of duties: approvals in Git; execution controlled by CI.
  • Policy enforcement: combine Terraform code review with org policies and policy-as-code tools.
  • Audit logs: Google Cloud admin activity logs + CI logs + VCS history.

Scalability/performance reasons

  • Scales to large infrastructures when structured into modules and multiple states/projects.
  • Parallelism: Terraform can apply independent resource changes concurrently (within limits).

When teams should choose it

  • You want Infrastructure as Code for Google Cloud with a large ecosystem and mature workflows.
  • You need consistent provisioning across multiple projects/environments.
  • You need a strong module story and CI/CD integration.
  • You want a tool that’s cloud-agnostic in principle, but can be used deeply with Google Cloud.

When teams should not choose it

  • You need imperative configuration management (then consider Ansible or similar).
  • You require a fully managed “click-to-run Terraform” service and don’t want to manage runners (then evaluate Google Cloud Infrastructure Manager or Terraform Cloud—verify fit).
  • Your team standard is Kubernetes-native configuration (then consider Config Connector / Crossplane).
  • You have extremely high change frequency and want a different workflow model (e.g., GitOps controllers continuously reconciling).

4. Where is Terraform on Google Cloud used?

Industries

  • SaaS and software product companies
  • Financial services and regulated industries (with strict change control)
  • Healthcare and life sciences (compliance + audit requirements)
  • Retail and media (elastic workloads, multi-environment deployments)
  • Public sector (governed landing zones; controlled project vending)

Team types

  • Platform engineering teams building landing zones and shared services
  • DevOps/SRE teams standardizing deployments
  • Security engineering teams enforcing guardrails (IAM/org policies)
  • Application teams provisioning app-specific infrastructure
  • Data engineering teams provisioning BigQuery, Composer, Dataflow dependencies

Workloads

  • Web apps (Cloud Run, GKE, Compute Engine)
  • Data platforms (BigQuery datasets, buckets, service accounts, IAM)
  • Network foundations (VPCs, subnets, firewall rules, Cloud DNS)
  • Shared platform services (Artifact Registry, Cloud Build triggers, KMS keys)

Architectures

  • Single project simple apps
  • Multi-project shared VPC architectures
  • Hub-and-spoke networks
  • Multi-environment (dev/stage/prod) with consistent modules
  • Organization-scale landing zones (folders, projects, policies, billing)

Real-world deployment contexts

  • Production: typically uses remote state, CI/CD, approvals, and strict IAM.
  • Dev/test: may use local runs or lightweight CI, smaller state, sandbox projects.

5. Top Use Cases and Scenarios

Below are realistic, commonly implemented scenarios for Terraform on Google Cloud.

1) Project “vending” (automated project creation)

  • Problem: Creating projects manually is slow and inconsistent (billing, APIs, IAM, labels).
  • Why Terraform fits: Declarative project factories and repeatable guardrails.
  • Example: A platform team uses Terraform to create proj-foo-dev, proj-foo-prod, attach billing, enable APIs, apply labels, and create baseline IAM bindings.

2) Landing zone / foundation setup

  • Problem: Organizations need consistent folders, org policies, logging sinks, shared VPC.
  • Why Terraform fits: Encodes organization standards and provides repeatable rollout.
  • Example: Terraform provisions folders for dev, prod, shared networking projects, org policies, and central logging.

3) Shared VPC network provisioning

  • Problem: Multi-team networks are complex; manual changes cause outages.
  • Why Terraform fits: Versioned network definitions and controlled changes.
  • Example: A central network team manages VPCs, subnets, routes, and firewall rules with Terraform; app teams attach service projects.

4) IAM standardization and least privilege

  • Problem: IAM grows organically; over-permissioning becomes common.
  • Why Terraform fits: IAM bindings are code-reviewed and standardized.
  • Example: Terraform manages group-based IAM roles, service accounts, and workload identities per environment.
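
A minimal sketch of group-based IAM in HCL (the project ID, group email, and account ID below are placeholders, not values from this tutorial):

```hcl
# Grant a role to a Google group rather than to individual users.
resource "google_project_iam_member" "data_viewers" {
  project = "my-app-dev" # placeholder project ID
  role    = "roles/bigquery.dataViewer"
  member  = "group:data-viewers@example.com" # placeholder group
}

# A dedicated service account per workload instead of shared credentials.
resource "google_service_account" "app" {
  project      = "my-app-dev"
  account_id   = "app-runtime"
  display_name = "Application runtime service account"
}
```

Binding roles to groups keeps membership changes out of Terraform: adding a person to the group grants access without a new plan/apply cycle.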

5) Cloud Storage governance (buckets, lifecycle, retention)

  • Problem: Buckets created ad hoc without lifecycle rules, causing uncontrolled costs.
  • Why Terraform fits: Enforces uniform bucket-level access, retention, lifecycle.
  • Example: Terraform creates buckets with CMEK, object lifecycle transitions, and logging.
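
A sketch of a governed bucket (the bucket name and thresholds are illustrative; CMEK and logging blocks are omitted for brevity):

```hcl
resource "google_storage_bucket" "logs" {
  name                        = "example-logs-bucket" # placeholder; must be globally unique
  location                    = "US-CENTRAL1"
  uniform_bucket_level_access = true

  versioning {
    enabled = true
  }

  # Transition older objects to cheaper storage, then delete after a year.
  lifecycle_rule {
    condition {
      age = 30
    }
    action {
      type          = "SetStorageClass"
      storage_class = "NEARLINE"
    }
  }

  lifecycle_rule {
    condition {
      age = 365
    }
    action {
      type = "Delete"
    }
  }
}
```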

6) GKE cluster provisioning with standard add-ons

  • Problem: Clusters differ across teams; upgrades and security settings drift.
  • Why Terraform fits: Module-based cluster creation with consistent settings.
  • Example: A module provisions private GKE clusters, node pools, authorized networks, and logging/monitoring settings.

7) Cloud Run + load balancing baseline

  • Problem: Teams need a secure, repeatable pattern for serverless deployments.
  • Why Terraform fits: Declarative configuration for service, IAM invoker, and networking.
  • Example: Terraform provisions Cloud Run service, service account, and IAM, plus supporting resources (Artifact Registry, Secret Manager).
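
A sketch of the Cloud Run pattern using the v2 resources (the service name, group, and sample image are placeholders; Artifact Registry and Secret Manager resources are omitted):

```hcl
resource "google_service_account" "run_sa" {
  account_id   = "cloud-run-app"
  display_name = "Cloud Run runtime"
}

resource "google_cloud_run_v2_service" "app" {
  name     = "example-app"
  location = "us-central1"

  template {
    service_account = google_service_account.run_sa.email
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello" # public sample image
    }
  }
}

# Grant invocation to a specific group instead of allUsers.
resource "google_cloud_run_v2_service_iam_member" "invoker" {
  name     = google_cloud_run_v2_service.app.name
  location = google_cloud_run_v2_service.app.location
  role     = "roles/run.invoker"
  member   = "group:app-callers@example.com" # placeholder group
}
```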

8) BigQuery dataset provisioning and IAM

  • Problem: Data platform needs consistent dataset creation and access controls.
  • Why Terraform fits: Dataset definitions, tables, and IAM are code-reviewed.
  • Example: Terraform creates datasets per domain, sets access for groups, and configures audit logging sinks.
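
A sketch of a per-domain dataset with group access (dataset ID, labels, and group are placeholders; logging sinks are omitted):

```hcl
resource "google_bigquery_dataset" "sales" {
  dataset_id  = "sales_analytics" # placeholder
  location    = "US"
  description = "Sales domain dataset (example)"

  labels = {
    env   = "dev"
    owner = "data-platform"
  }
}

resource "google_bigquery_dataset_iam_member" "readers" {
  dataset_id = google_bigquery_dataset.sales.dataset_id
  role       = "roles/bigquery.dataViewer"
  member     = "group:sales-analysts@example.com" # placeholder group
}
```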

9) CI/CD infrastructure provisioning (build triggers, Artifact Registry)

  • Problem: CI/CD configurations drift across repositories and teams.
  • Why Terraform fits: Centralized, auditable CI/CD infrastructure definitions.
  • Example: Terraform creates Artifact Registry repos, Cloud Build triggers, and service accounts with minimal permissions.

10) Multi-environment deployments with workspaces or separate states

  • Problem: Teams want consistent infra across dev/stage/prod with different sizes and policies.
  • Why Terraform fits: Variable-driven configuration + environment-specific state separation.
  • Example: Same Terraform code deploys dev and prod VPCs into different projects, using separate state buckets/prefixes.
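
One common way to separate state per environment is a partial backend configuration, with the prefix supplied at init time (the bucket name and prefixes here are placeholders):

```hcl
terraform {
  backend "gcs" {
    bucket = "example-org-tfstate" # placeholder state bucket
    # prefix intentionally omitted; supply it per environment, e.g.:
    #   terraform init -backend-config="prefix=env/dev"
    #   terraform init -backend-config="prefix=env/prod"
  }
}
```

Each environment then writes its own state object under the same bucket, so a mistake while working in dev cannot touch the prod state.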

11) Disaster recovery preparedness (infrastructure reproducibility)

  • Problem: Recreating infrastructure after a major incident is risky if it’s manual.
  • Why Terraform fits: Infra can be recreated from code (with careful state/backup planning).
  • Example: A DR runbook references Terraform modules to recreate networking and baseline services in a recovery project.

12) Migrations from manual to code-managed infrastructure

  • Problem: Existing resources exist, but no code describes them; changes are risky.
  • Why Terraform fits: Import and progressive adoption allow “bring under management.”
  • Example: Terraform imports existing VPCs/buckets; then teams apply incremental improvements (labels, lifecycle rules, IAM).
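
With Terraform 1.5+ this adoption can be expressed declaratively with an import block (the bucket name below is a placeholder):

```hcl
# Adopt an existing bucket into state on the next apply.
import {
  to = google_storage_bucket.legacy
  id = "existing-bucket-name" # placeholder: the real bucket's name
}

resource "google_storage_bucket" "legacy" {
  name     = "existing-bucket-name"
  location = "US"
  # Fill in remaining attributes to match the real bucket before applying,
  # otherwise the first plan will propose changes.
}
```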

6. Core Features

1) Declarative infrastructure with HCL

  • What it does: Lets you describe desired infrastructure state in .tf files.
  • Why it matters: Config is readable, reviewable, and repeatable.
  • Practical benefit: Consistent deployments across environments.
  • Caveats: Some behaviors require understanding of resource lifecycle (create-before-destroy, replacement on immutable changes).
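
The classic lifecycle example is an immutable resource such as an instance template, where create-before-destroy keeps a valid object available during replacement (names and sizes below are illustrative):

```hcl
resource "google_compute_instance_template" "app" {
  name_prefix  = "app-template-" # name_prefix avoids name collisions during replacement
  machine_type = "e2-small"

  disk {
    source_image = "debian-cloud/debian-12"
  }

  network_interface {
    network = "default"
  }

  # Instance templates are immutable: any change forces replacement.
  # create_before_destroy creates the new template before deleting the old one.
  lifecycle {
    create_before_destroy = true
  }
}
```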

2) Google Cloud providers (google and google-beta)

  • What it does: Implements resources and data sources that map to Google Cloud APIs.
  • Why it matters: Enables Terraform to manage Google Cloud services.
  • Practical benefit: Broad service coverage with ongoing updates.
  • Caveats: Newer features may appear first in google-beta; provider versions can introduce breaking changes—pin versions and test upgrades.

Official provider docs:

  • https://registry.terraform.io/providers/hashicorp/google/latest/docs
  • https://registry.terraform.io/providers/hashicorp/google-beta/latest/docs

3) State management (local or remote)

  • What it does: Tracks what Terraform created and current known resource attributes.
  • Why it matters: Enables accurate diffs (plan) and safe updates.
  • Practical benefit: Collaboration and drift detection when state is centralized.
  • Caveats: State can contain sensitive values; secure it and control access.

4) Remote backend with Google Cloud Storage (GCS)

  • What it does: Stores state in a Cloud Storage bucket and supports team workflows.
  • Why it matters: Centralizes state and reduces “multiple people ran apply” issues.
  • Practical benefit: Easier collaboration and CI/CD runs.
  • Caveats: Configure bucket security carefully; enable versioning; understand locking semantics and failure recovery.

Backend docs: https://developer.hashicorp.com/terraform/language/backend/gcs

5) Planning and safe execution

  • What it does: terraform plan shows exactly what will change before you apply.
  • Why it matters: Reduces unexpected production changes.
  • Practical benefit: Enables approval gates in CI/CD (plan → review → apply).
  • Caveats: Plans can become stale if the environment changes before apply; prefer short-lived plans and controlled pipelines.

6) Modules for reuse and standardization

  • What it does: Packages Terraform code into reusable building blocks.
  • Why it matters: Scales patterns across teams and environments.
  • Practical benefit: Less duplication; faster onboarding; consistent controls.
  • Caveats: Module design requires discipline (versioning, inputs/outputs, backwards compatibility).
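
As a sketch, the network module built later in this tutorial could be consumed like this (the names and CIDR are illustrative):

```hcl
module "network" {
  source = "./modules/network"

  project_id   = var.project_id
  network_name = "app-vpc"
  subnet_name  = "app-subnet"
  region       = var.region
  subnet_cidr  = "10.0.0.0/24"
}
```

For shared modules, prefer versioned sources (a registry or Git tag) over relative paths so consumers can upgrade deliberately.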

Google Cloud Foundation Toolkit (modules and blueprints): https://cloud.google.com/foundation-toolkit (verify current landing page and module links)

7) Workspaces and environment separation (optional pattern)

  • What it does: Allows multiple states from the same configuration.
  • Why it matters: Can simplify multi-env deployments.
  • Practical benefit: Quick dev/test/prod separation in small setups.
  • Caveats: Many teams prefer separate state per environment/project instead of workspaces to reduce blast radius.
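
When workspaces are used, the built-in terraform.workspace value can parameterize names, as in this sketch (the bucket name prefix is a placeholder):

```hcl
# One configuration, one bucket per workspace (dev, stage, prod, ...).
resource "google_storage_bucket" "env_bucket" {
  name     = "example-app-${terraform.workspace}" # placeholder prefix
  location = "US"
}
```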

8) Data sources (read existing infrastructure)

  • What it does: Fetches info about existing resources (networks, subnets, projects).
  • Why it matters: Integrates with pre-existing infrastructure.
  • Practical benefit: Avoids hardcoding; builds on shared resources.
  • Caveats: Data source reads can fail due to permissions or API enablement.
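
A sketch of building on a shared network via a data source (the network and project names are placeholders):

```hcl
# Look up an existing shared VPC instead of hardcoding its self link.
data "google_compute_network" "shared" {
  name    = "shared-vpc"       # placeholder network name
  project = "host-project-id"  # placeholder host project
}

resource "google_compute_subnetwork" "team_subnet" {
  name          = "team-a-subnet"
  region        = "us-central1"
  network       = data.google_compute_network.shared.id
  ip_cidr_range = "10.50.0.0/24"
}
```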

9) Import (adopt existing resources)

  • What it does: Brings existing Google Cloud resources into Terraform state.
  • Why it matters: Enables incremental migration to IaC.
  • Practical benefit: Avoids rebuild; reduces migration risk.
  • Caveats: Import does not generate configuration automatically (Terraform has experimental/config-generation features—verify current status in Terraform docs).

10) CI/CD integration patterns

  • What it does: Automates plan/apply on merges with controlled credentials.
  • Why it matters: Standardizes change management.
  • Practical benefit: Repeatable, auditable deployments.
  • Caveats: Requires secure credential management, state access, and guardrails.

11) Policy and guardrails (multi-layer)

  • What it does: Prevents unsafe patterns (public buckets, overly broad IAM, no labels).
  • Why it matters: Infrastructure changes should comply with security and cost standards.
  • Practical benefit: Fewer incidents and audit findings.
  • Caveats: Terraform itself doesn’t enforce all policies; pair with:
  • Google Cloud Organization Policy Service
  • CI checks (OPA/Conftest, tfsec, checkov)
  • Terraform Cloud/Enterprise policy features (if used)

7. Architecture and How It Works

High-level architecture

Terraform on Google Cloud typically has these parts:

  1. Authoring: Developers/platform engineers write Terraform code in Git.
  2. Execution: Terraform CLI runs in a workstation, Cloud Shell, or CI runner.
  3. Authentication: Terraform uses Google credentials (ADC, service account impersonation, or Workload Identity Federation).
  4. Provider/API calls: The Google provider calls Google Cloud APIs to create/update resources.
  5. State: Terraform stores state locally or in a remote backend (commonly GCS) and uses it to compute future changes.
  6. Observability and governance:
     • Google Cloud audit logs record API calls.
     • CI logs record Terraform runs.
     • Optional policy checks gate changes.

Request/data/control flow

  • Control flow: Terraform evaluates configuration → builds dependency graph → reads current state and remote resource attributes → computes plan → applies changes via API calls.
  • Data flow:
     • State is written to the backend (e.g., a Cloud Storage object).
     • Provider reads and writes resource attributes via Google Cloud APIs.
  • Locking:
     • Remote backends often implement state locking to prevent concurrent writes (behavior varies by backend; for GCS, review official backend docs).

Integrations with related Google Cloud services

Common integrations:

  • Cloud Storage: remote state backend
  • Cloud Build / Cloud Deploy: CI/CD runner patterns (Cloud Build is common for Terraform pipelines)
  • Secret Manager / Cloud KMS: secret and key management (do not store secrets in plain-text variables)
  • Cloud Logging / Cloud Monitoring: operational observability for the resources created
  • Cloud Asset Inventory: inventory and change history
  • IAM / Resource Manager: projects, folders, org-level governance

Dependency services

Terraform relies on:

  • Google Cloud APIs for the target services (e.g., Compute Engine API for VPC resources)
  • IAM for authentication and authorization
  • A backend for shared state (recommended for teams)

Security/authentication model

Terraform can authenticate to Google Cloud using:

  • Application Default Credentials (ADC): common for local development (gcloud auth application-default login)
  • Service account keys: works, but increases key management risk; many orgs discourage long-lived keys
  • Service account impersonation: a user or CI principal authenticates, then impersonates a service account for Terraform actions
  • Workload Identity Federation (WIF): recommended for CI systems to avoid long-lived keys (e.g., GitHub Actions → Google Cloud)

Verify auth options in the Google provider docs: https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference
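
The impersonation pattern can be expressed directly in the provider block, as in this sketch (the project and service account email are placeholders; the caller needs roles/iam.serviceAccountTokenCreator on the target account):

```hcl
provider "google" {
  project                     = "my-project"   # placeholder
  region                      = "us-central1"
  impersonate_service_account = "terraform-deployer@my-project.iam.gserviceaccount.com" # placeholder
}
```

With this, Terraform runs as the deployment service account while humans and CI keep only short-lived, auditable credentials.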

Networking model

Terraform does not “sit in” your VPC. It calls Google Cloud APIs over the internet (or via controlled egress in enterprise networks). Networking concerns typically relate to:

  • Where Terraform runs (developer machine, CI runner, private build worker)
  • Whether outbound access to Google APIs is allowed
  • Private connectivity patterns (e.g., Private Google Access) if Terraform runs in restricted environments

Monitoring/logging/governance considerations

  • Enable and review Cloud Audit Logs (Admin Activity logs are on by default for many services; Data Access logs can be configured—verify per service).
  • Store CI logs securely and retain them for audit requirements.
  • Use labels/tags and folder/project structure for governance.
  • Consider organization policies to prevent unsafe configurations even if Terraform code changes.

Simple architecture diagram (conceptual)

flowchart LR
  Dev[Engineer / CI Runner] -->|terraform plan/apply| TF[Terraform CLI]
  TF -->|Google Provider| API[Google Cloud APIs]
  TF <--> State[(Cloud Storage\nRemote State)]
  API --> Res["Google Cloud Resources\n(VPC, IAM, Storage, etc.)"]

Production-style architecture diagram (typical enterprise)

flowchart TB
  subgraph VCS[Version Control]
    PR[Pull Request]
    Main[Main Branch]
  end

  subgraph CI[CI/CD]
    Plan[Plan Job]
    Approve[Approval Gate]
    Apply[Apply Job]
  end

  subgraph Sec[Security & Governance]
    OPA["Policy Checks\n(OPA/Conftest, tfsec, checkov)"]
    OrgPol[Org Policies]
    Audit[Cloud Audit Logs]
  end

  subgraph State[State & Secrets]
    GCS[(Cloud Storage Backend)]
    SM[Secret Manager]
    KMS[Cloud KMS]
  end

  subgraph GCP[Google Cloud]
    APIs[Service APIs]
    Net[VPC / Subnets / FW]
    IAM[IAM / Service Accounts]
    App["Workloads\n(GKE/Cloud Run/VMs)"]
  end

  PR --> Plan
  Plan --> OPA
  OPA --> Approve
  Approve --> Apply
  Apply -->|terraform apply| APIs
  Apply <--> GCS
  Apply -->|read secrets| SM
  SM --> KMS
  APIs --> Net
  APIs --> IAM
  APIs --> App
  APIs --> Audit
  OrgPol --> APIs

8. Prerequisites

Account/project requirements

  • A Google Cloud account with access to a billing-enabled project.
  • Ability to create or select a project for the lab.

Permissions / IAM roles

Minimum permissions depend on what you create. For this tutorial’s lab (network + storage), you typically need:

  • Permissions to enable APIs (often roles/serviceusage.serviceUsageAdmin)
  • Permissions to create storage buckets (e.g., roles/storage.admin or bucket-scoped permissions)
  • Permissions to create VPC resources (e.g., roles/compute.networkAdmin)

For simplicity in a sandbox, many people use Project Owner, but that is not recommended for production.

Billing requirements

  • A billing account linked to the project.
  • Some resources (even small ones) can incur charges. This lab is designed to be low-cost and focuses mainly on low/no-cost resources (VPC + Cloud Storage), but always verify in your environment.

CLI/SDK/tools needed

Choose one execution environment:

Option A: Cloud Shell (recommended for beginners)

  • Includes gcloud and usually includes Terraform (version may vary). Verify with terraform version.

Option B: Local machine

  • Install the Google Cloud CLI: https://cloud.google.com/sdk/docs/install
  • Install the Terraform CLI: https://developer.hashicorp.com/terraform/install

Region availability

  • Terraform itself is not regional.
  • Google Cloud resources are regional/zonal depending on resource type.
  • Cloud Storage buckets require choosing a location type (region/dual-region/multi-region).

Quotas/limits

  • Google Cloud has project quotas per API (Compute, Storage, etc.).
  • Terraform has no hard quota, but large plans can hit API rate limits.
  • If you see 429 or quota errors, request quota increases or reduce parallelism.

Prerequisite services

You will enable:

  • Compute Engine API (for VPC network resources)
  • Cloud Storage (generally available without explicit API enablement, but access is controlled by IAM; behavior can vary—verify if your org requires explicit enabling)


9. Pricing / Cost

Terraform on Google Cloud has two cost layers:

1) Terraform tooling cost

  • Terraform CLI: free (open-source).
  • Terraform Cloud / Terraform Enterprise: paid tiers may apply (pricing is from HashiCorp). Official pricing: https://www.hashicorp.com/products/terraform/pricing

2) Google Cloud resource cost You pay for the Google Cloud resources Terraform creates and for certain supporting services (state storage, CI, logging).

Pricing dimensions (Google Cloud side)

Common cost drivers when running Terraform on Google Cloud:

  • Cloud Storage for remote state:
     • Storage (GB-month), operations (Class A/B requests), egress (if applicable)
     • Versioning increases stored data over time
     • Official pricing: https://cloud.google.com/storage/pricing
  • CI runners (if using Cloud Build or other CI):
     • Cloud Build pricing (build minutes, workers): https://cloud.google.com/build/pricing
  • Resources you provision (examples):
     • Compute Engine VMs: vCPU/RAM hours, disk, external IPs, egress
     • GKE: cluster management fees (Standard) + nodes
     • Cloud SQL: instance hours, storage, backups
     • Load balancers: forwarding rules, data processing
     • Logging: ingestion and retention beyond free allocations

Free tier (if applicable)

  • Google Cloud has an overall Free Tier for some products and always-free usage for some services. Eligibility and limits vary by region and product—verify here:
  • https://cloud.google.com/free
  • Cloud Storage has limited free usage in some cases; verify current free tier terms and whether it applies to your region and bucket class.

Hidden or indirect costs to watch

  • State bucket versioning: great for recovery, but can grow over time.
  • Egress: if CI runners or operators are outside Google Cloud, downloading providers and interacting with APIs can cause egress (usually minimal for API calls, but artifact downloads can matter).
  • Logging/Monitoring ingestion: provisioning lots of resources can increase logs/metrics.
  • NAT / Load Balancing: networking services can incur baseline and per-GB charges.
  • Artifact Registry: storing providers/modules internally can incur storage/egress.

Network/data transfer implications

  • Terraform’s API calls are typically small.
  • Major network cost is usually from the resources you create (VM egress, load balancer traffic, Cloud NAT egress).

How to optimize cost

  • Use a single remote state bucket per environment and lifecycle old versions (keep enough versions for recovery).
  • Avoid provisioning costly services in sandboxes (NAT gateways, load balancers, large clusters).
  • Use labels to enable cost allocation (env, owner, cost_center).
  • Prefer ephemeral test projects and terraform destroy after labs.
  • Use the Google Cloud Pricing Calculator to estimate infrastructure:
  • https://cloud.google.com/products/calculator

Example low-cost starter estimate (no exact numbers)

A typical beginner Terraform setup can be near-zero cost if it only creates:

  • One Cloud Storage bucket for state (small storage footprint)
  • One VPC + subnet + firewall rules (generally no direct hourly cost)

Costs are mostly a few cents per month for bucket storage and operations (varies by class/region/requests—verify in pricing docs).

Example production cost considerations (no fabricated numbers)

In production, Terraform is not the cost center—the infrastructure is. Expect costs from:

  • Multiple environments (dev/stage/prod) multiplied across projects
  • Logging and monitoring at scale
  • Network services (load balancing, NAT, inter-region traffic)
  • Compute and managed services (GKE, Cloud SQL, data services)
  • CI/CD build minutes and artifact storage


10. Step-by-Step Hands-On Tutorial

Objective

Use Terraform on Google Cloud to provision:

  • A remote Terraform state backend in Cloud Storage
  • A simple VPC network and subnet
  • A second application bucket with labels and versioning

You will also learn how to:

  • Authenticate Terraform to Google Cloud
  • Run init, plan, apply
  • Validate results in Google Cloud
  • Clean up safely with destroy

Lab Overview

  • Runtime: 30–60 minutes
  • Cost: low (mainly Cloud Storage; VPC resources typically do not incur hourly charges)
  • Execution environment: Google Cloud Shell (recommended)

Step 1: Create/select a Google Cloud project and set your defaults

  1. Open Google Cloud Console and pick or create a project.
  2. In Cloud Shell (or your local terminal), set the project:
gcloud config set project YOUR_PROJECT_ID
  3. Confirm:
gcloud config get-value project

Expected outcome – Your CLI is targeting the correct project.


Step 2: Enable required APIs

Enable the Compute Engine API (required for VPC resources):

gcloud services enable compute.googleapis.com

(Optional but often useful in real environments):

gcloud services enable iam.googleapis.com
gcloud services enable cloudresourcemanager.googleapis.com

Expected outcome – APIs are enabled and ready for Terraform provider operations.

Verification

gcloud services list --enabled --format="value(config.name)" | grep compute.googleapis.com

Step 3: Verify Terraform and gcloud authentication

3.1 Check Terraform installation

terraform version

If Terraform is not installed (local machine), install it from: https://developer.hashicorp.com/terraform/install

Cloud Shell typically includes Terraform, but the version can vary—verify and pin provider versions accordingly.

3.2 Authenticate for Terraform using Application Default Credentials (ADC)

For a beginner-friendly lab, use ADC:

gcloud auth application-default login

Follow the browser flow.

Expected outcome – Terraform can use Google credentials via ADC.

Verification

gcloud auth application-default print-access-token >/dev/null && echo "ADC is working"

Production note: For CI/CD, prefer Workload Identity Federation or service account impersonation rather than user credentials or long-lived keys. See the provider authentication guide: https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference


Step 4: Create a Cloud Storage bucket for remote Terraform state

Choose a globally unique bucket name. Bucket names are global across all Google Cloud customers.

Set variables:

export PROJECT_ID="$(gcloud config get-value project)"
export TF_STATE_BUCKET="${PROJECT_ID}-tfstate-$(date +%Y%m%d%H%M%S)"
export REGION="us-central1"

Create the bucket (regional example):

gcloud storage buckets create "gs://${TF_STATE_BUCKET}" \
  --project="${PROJECT_ID}" \
  --location="${REGION}"

Enable versioning (recommended for state recovery):

gcloud storage buckets update "gs://${TF_STATE_BUCKET}" --versioning

Expected outcome – A GCS bucket exists and has versioning enabled.

Verification

gcloud storage buckets describe "gs://${TF_STATE_BUCKET}" --format="yaml(name,location,versioning)"

Step 5: Create a Terraform project structure

Create a working directory:

mkdir -p terraform-gcp-lab/modules/network
cd terraform-gcp-lab

Create these files:

  • versions.tf
  • providers.tf
  • backend.tf
  • variables.tf
  • main.tf
  • outputs.tf
  • modules/network/main.tf
  • modules/network/variables.tf
  • modules/network/outputs.tf


Step 6: Add Terraform configuration (providers, backend, module)

6.1 versions.tf

Pin Terraform and provider versions. Exact versions change frequently—choose a version compatible with your environment and test upgrades.

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
  }
}

If ~> 6.0 is not available in your environment, adjust accordingly after checking provider releases: https://registry.terraform.io/providers/hashicorp/google/latest

6.2 backend.tf

Configure the GCS backend. Note: backend blocks cannot use variables or expressions; every value must be a literal (or supplied at init time via -backend-config). Keep it simple.

terraform {
  backend "gcs" {
    bucket = "REPLACE_WITH_YOUR_TF_STATE_BUCKET"
    prefix = "terraform/state"
  }
}

Replace REPLACE_WITH_YOUR_TF_STATE_BUCKET with your bucket name from Step 4.

6.3 providers.tf

provider "google" {
  project = var.project_id
  region  = var.region
}

6.4 variables.tf

variable "project_id" {
  type        = string
  description = "Google Cloud project ID to deploy into."
}

variable "region" {
  type        = string
  description = "Default region for regional resources."
  default     = "us-central1"
}

variable "environment" {
  type        = string
  description = "Environment label (e.g., dev, test, prod)."
  default     = "dev"
}
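Optionally, input variables can enforce allowed values with a validation block, which turns typos into plan-time errors instead of oddly named resources. A sketch that could replace the plain environment declaration above:

```hcl
variable "environment" {
  type        = string
  description = "Environment label (e.g., dev, test, prod)."
  default     = "dev"

  # Reject anything outside the known environment set at plan time.
  validation {
    condition     = contains(["dev", "test", "prod"], var.environment)
    error_message = "environment must be one of: dev, test, prod."
  }
}
```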

6.5 Network module: modules/network/variables.tf

variable "project_id" {
  type        = string
  description = "Project ID."
}

variable "network_name" {
  type        = string
  description = "VPC name."
}

variable "subnet_name" {
  type        = string
  description = "Subnet name."
}

variable "region" {
  type        = string
  description = "Subnet region."
}

variable "subnet_cidr" {
  type        = string
  description = "Subnet CIDR range."
}

6.6 Network module: modules/network/main.tf

resource "google_compute_network" "vpc" {
  name                    = var.network_name
  project                 = var.project_id
  auto_create_subnetworks = false
  routing_mode            = "REGIONAL"
}

resource "google_compute_subnetwork" "subnet" {
  name                     = var.subnet_name
  project                  = var.project_id
  region                   = var.region
  network                  = google_compute_network.vpc.id
  ip_cidr_range            = var.subnet_cidr
  private_ip_google_access = true
}
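The module creates no firewall rules, so a custom-mode VPC starts with only the implied defaults (deny ingress, allow egress). If you extend the module later, a minimal internal-allow rule might look like this (a sketch scoped to the subnet CIDR; adjust protocols and ranges to your needs):

```hcl
# Optional extension: allow traffic between instances inside the subnet only.
resource "google_compute_firewall" "allow_internal" {
  name    = "${var.network_name}-allow-internal"
  project = var.project_id
  network = google_compute_network.vpc.id

  allow {
    protocol = "tcp"
    ports    = ["0-65535"]
  }
  allow {
    protocol = "udp"
    ports    = ["0-65535"]
  }
  allow {
    protocol = "icmp"
  }

  source_ranges = [var.subnet_cidr]
}
```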

6.7 Network module: modules/network/outputs.tf

output "network_id" {
  value       = google_compute_network.vpc.id
  description = "ID of the VPC network."
}

output "subnet_id" {
  value       = google_compute_subnetwork.subnet.id
  description = "ID of the subnet."
}

6.8 Root module: main.tf

This creates:

  • VPC + subnet (via module)
  • An application bucket with labels and versioning

module "network" {
  source       = "./modules/network"
  project_id   = var.project_id
  network_name = "lab-vpc-${var.environment}"
  subnet_name  = "lab-subnet-${var.environment}"
  region       = var.region
  subnet_cidr  = "10.10.0.0/24"
}

resource "google_storage_bucket" "app_bucket" {
  name     = "${var.project_id}-app-bucket-${var.environment}"
  project  = var.project_id
  location = var.region

  uniform_bucket_level_access = true

  versioning {
    enabled = true
  }

  labels = {
    env   = var.environment
    owner = "terraform"
  }
}

Bucket names must be globally unique. If the apply fails due to naming, change the bucket name to include more uniqueness (for example, add a random suffix). For deterministic labs, you can also add a variable or use random_id (requires the random provider).
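One way to add that uniqueness in code is the hashicorp/random provider. A sketch (you would add random to required_providers in versions.tf, then reference the suffix in the existing bucket name argument):

```hcl
# Generates a stable random suffix stored in state, so the bucket
# name stays the same across subsequent plans/applies.
resource "random_id" "bucket_suffix" {
  byte_length = 4 # yields 8 hex characters
}

# Then, in google_storage_bucket.app_bucket:
# name = "${var.project_id}-app-bucket-${var.environment}-${random_id.bucket_suffix.hex}"
```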

6.9 outputs.tf

Note: Terraform does not expose backend settings as expressions (the terraform symbol only provides terraform.workspace), so you cannot output the state bucket name from the backend block. If you need it as an output, pass it in as a variable; otherwise record it outside Terraform.

output "vpc_id" {
  value       = module.network.network_id
  description = "VPC ID."
}

output "subnet_id" {
  value       = module.network.subnet_id
  description = "Subnet ID."
}

output "app_bucket_name" {
  value       = google_storage_bucket.app_bucket.name
  description = "Application bucket name."
}

Step 7: Initialize Terraform (downloads provider + configures backend)

From the terraform-gcp-lab directory:

terraform fmt -recursive
terraform init

Expected outcome – Terraform initializes successfully. – It configures the GCS backend and downloads the Google provider.

Verification – You should see messages indicating the backend is configured and the provider is installed.


Step 8: Plan the deployment

Create a terraform.tfvars file:

cat > terraform.tfvars <<EOF
project_id  = "${PROJECT_ID}"
region      = "${REGION}"
environment = "dev"
EOF

Run plan:

terraform plan

Expected outcome – Terraform shows a plan including:

  • Creating a VPC network
  • Creating a subnetwork
  • Creating a Cloud Storage bucket

Review:

  • Names
  • Region
  • Labels


Step 9: Apply the deployment

terraform apply

Type yes when prompted.

Expected outcome – Terraform creates the resources and prints outputs.


Step 10: Validate in Google Cloud

10.1 Validate via Terraform outputs

terraform output

10.2 Validate via gcloud (VPC and subnet)

gcloud compute networks list --filter="name~lab-vpc"
gcloud compute networks subnets list --filter="name~lab-subnet"

10.3 Validate Cloud Storage bucket

gcloud storage buckets list --filter="name~app-bucket"

10.4 Confirm remote state exists in the backend bucket

List objects in the state bucket prefix:

gcloud storage ls "gs://${TF_STATE_BUCKET}/terraform/state/"

Expected outcome – You see a state file object in the bucket (and possibly a lock file during operations).


Step 11: Make a small controlled change and re-apply

Edit main.tf and add another label to the bucket:

labels = {
  env         = var.environment
  owner       = "terraform"
  cost_center = "lab"
}

Then:

terraform plan
terraform apply

Expected outcome – Terraform updates only the bucket labels (a small in-place change), showing how incremental changes work.


Validation

You have successfully used Terraform on Google Cloud to:

  • Store remote Terraform state in a GCS bucket
  • Create a VPC and subnet
  • Create and manage a storage bucket with versioning and labels
  • Apply and update changes predictably using plan/apply


Troubleshooting

Common issues and realistic fixes:

1) 403 Permission denied
  • Symptom: Error 403: The caller does not have permission
  • Causes:
    • Account lacks required IAM roles in the project
    • Organization policy blocks the action
  • Fix:
    • Confirm project: gcloud config get-value project
    • Verify IAM: ask a project admin to grant needed roles
    • Check Org Policies (if applicable)

2) API not enabled
  • Symptom: errors mentioning compute.googleapis.com not enabled
  • Fix:

gcloud services enable compute.googleapis.com

3) Bucket name already exists
  • Symptom: Error 409: ... already exists
  • Fix: change the bucket name to be globally unique (add more suffix entropy).

4) State lock issues / concurrent runs
  • Symptom: Terraform cannot acquire lock
  • Fix:
    • Ensure only one apply is running at a time
    • If a lock is stuck after a crashed run, follow Terraform backend guidance (do not delete objects without understanding what they are). For GCS backend behavior, verify the official backend docs: https://developer.hashicorp.com/terraform/language/backend/gcs

5) Provider version conflicts
  • Symptom: init fails due to provider constraints
  • Fix:
    • Adjust the provider version in versions.tf
    • Run terraform init -upgrade


Cleanup

Always clean up to avoid ongoing cost.

1) Destroy resources managed by Terraform:

terraform destroy

2) Delete the remote state bucket (only after destroy):

gcloud storage rm -r "gs://${TF_STATE_BUCKET}"

Expected outcome – All lab resources are removed and billing stops (except any retained logs per policy).


11. Best Practices

Architecture best practices

  • Separate states by environment and/or project to reduce blast radius.
  • Use modules for standardized patterns (networking, IAM, logging).
  • Use multiple projects for isolation (common Google Cloud best practice): shared services, networking, prod workloads separated.

IAM/security best practices

  • Prefer service account impersonation or Workload Identity Federation for CI/CD.
  • Use least privilege:
    • Create dedicated Terraform service accounts per environment.
    • Grant only necessary roles (network admin, storage admin, etc.).
  • Avoid long-lived service account keys when possible.
  • Restrict who can read/write Terraform state (state can contain sensitive values).

Cost best practices

  • Enforce labels: env, owner, team, cost_center.
  • Implement lifecycle rules for state buckets (but keep enough versions for rollback).
  • Use sandbox projects with automatic cleanup for experiments.
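The lifecycle-rule advice above can be expressed in Terraform itself if the state bucket is managed as code. A sketch that prunes old noncurrent state versions while keeping enough for rollback (the bucket name and retention count are illustrative):

```hcl
resource "google_storage_bucket" "tf_state" {
  name     = "example-tfstate-bucket" # hypothetical
  location = "us-central1"

  uniform_bucket_level_access = true

  versioning {
    enabled = true
  }

  # Delete noncurrent state versions once 10 newer versions exist.
  lifecycle_rule {
    action {
      type = "Delete"
    }
    condition {
      num_newer_versions = 10
    }
  }
}
```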

Performance best practices

  • Keep plans small by splitting infrastructure into stacks (multiple states).
  • Reduce graph complexity: avoid huge monolithic configurations.
  • Use -parallelism cautiously when you hit API rate limits.

Reliability best practices

  • Enable GCS bucket versioning for remote state.
  • Store Terraform code in Git with protected branches and required reviews.
  • Use CI to run terraform fmt, validate, and policy checks.

Operations best practices

  • Use a consistent pipeline:
    • fmt → validate → plan → policy checks → approval → apply
  • Record plan artifacts for audit (store in CI artifacts if appropriate).
  • Use Cloud Audit Logs and CI logs for traceability.

Governance/tagging/naming best practices

  • Adopt naming conventions:
    • app-env-region-resource
  • Use labels for cost allocation and inventory.
  • Combine Terraform with Google Cloud Org Policies to enforce non-negotiable constraints (e.g., block public IPs, restrict locations)—verify policy availability per requirement:
    • https://cloud.google.com/resource-manager/docs/organization-policy/overview

12. Security Considerations

Identity and access model

  • Terraform actions are authorized by Google Cloud IAM.
  • Best practice is to run Terraform using a dedicated principal:
    • CI: WIF principal → impersonate Terraform service account
    • Humans: user identity → impersonate Terraform service account (reduces direct privileges)

Encryption

  • Google Cloud encrypts data at rest by default for many services.
  • For higher assurance:
    • Use Customer-Managed Encryption Keys (CMEK) where supported (e.g., some storage resources).
    • For the state bucket, evaluate CMEK for Cloud Storage (verify current Cloud Storage CMEK docs).

Network exposure

  • Terraform itself doesn’t open network ports; it provisions resources that might.
  • Guardrails:
    • Org policy to restrict external IPs
    • Firewall rules and load balancer config reviewed in PRs
    • Default-deny patterns for VPC firewall rules in sensitive environments

Secrets handling

Common mistakes:

  • Putting secrets in .tfvars committed to Git
  • Storing secrets in Terraform state unintentionally

Recommendations:

  • Use Secret Manager for application secrets.
  • Use CI secret stores for pipeline secrets.
  • Avoid outputting sensitive values.
  • Mark variables as sensitive = true where appropriate (they can still appear in state; sensitivity affects display).

Secret Manager docs: – https://cloud.google.com/secret-manager/docs
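As a hedged illustration of keeping secrets out of .tfvars, an application secret can be read from Secret Manager at plan time (the secret ID below is hypothetical; note the value still lands in Terraform state, so state access controls remain essential):

```hcl
# Reads the latest version of an existing Secret Manager secret.
data "google_secret_manager_secret_version" "db_password" {
  secret  = "app-db-password" # hypothetical secret ID
  version = "latest"
}

# Reference data.google_secret_manager_secret_version.db_password.secret_data
# inside a resource argument; avoid echoing it through outputs.
```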

Audit/logging

  • Ensure Cloud Audit Logs are retained according to policy.
  • Track who applied what:
  • VCS commit history
  • CI logs
  • Google Cloud audit logs for API calls

Compliance considerations

  • Use org policies for location restrictions and service constraints.
  • Apply least privilege and separation of duties.
  • Consider Infrastructure Manager or Terraform Cloud if you need a managed execution control plane (evaluate carefully; verify features and compliance support in official docs).

Secure deployment recommendations

  • Remote state: private bucket, least privilege IAM, versioning, retention policies where appropriate.
  • CI: WIF, no keys; short-lived credentials; approvals on apply.
  • Code: mandatory reviews; static analysis (terraform validate, policy tools).

13. Limitations and Gotchas

  • Terraform state contains data: even if variables are marked sensitive, state can store resource attributes. Protect state storage.
  • Provider coverage varies: some new Google Cloud features appear later in the provider or first in google-beta.
  • Breaking changes: provider upgrades can change defaults/behavior. Pin versions and test upgrades in non-prod.
  • Eventual consistency: some Google Cloud APIs may take time to propagate changes; Terraform may need retries.
  • IAM complexity: IAM bindings can be tricky:
    • Resource-level IAM vs project-level IAM
    • Additive vs authoritative bindings
    • Accidental overwrites if you mix IAM resource types incorrectly (review Terraform IAM patterns for Google resources—verify in provider docs).
  • Global uniqueness of bucket names: Cloud Storage bucket names are global across Google Cloud.
  • Quotas and rate limits: large applies can hit API quotas; reduce parallelism or request quota increases.
  • Manual changes cause drift: console changes can produce unexpected plan output; manage changes through code.
  • State locking behavior: depends on backend; understand how to recover from interrupted runs (verify backend docs).
  • Multi-project and org scope: requires careful IAM design and often separate service accounts and states.

14. Comparison with Alternatives

Terraform on Google Cloud is one of several ways to implement Infrastructure as Code and provisioning workflows.

Options to consider

  • Google Cloud Infrastructure Manager (managed Terraform execution; verify latest features and pricing)
  • Google Cloud Deployment Manager (legacy/older IaC for Google Cloud; verify current status in docs)
  • Pulumi (IaC using general-purpose languages)
  • Crossplane (Kubernetes-native control plane to provision cloud resources)
  • Config Connector (Kubernetes resources to manage Google Cloud; part of Google Cloud’s Kubernetes ecosystem—verify current docs)
  • Ansible (procedural automation, not primarily state-based IaC)
  • Other-cloud IaC services (AWS CloudFormation, Azure Bicep) are not native to Google Cloud but matter in multi-cloud comparisons.
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Terraform on Google Cloud (Terraform CLI + Google provider) | Most teams implementing IaC on Google Cloud | Mature workflows, modules, large ecosystem, plan/apply, multi-cloud skill portability | State management complexity; requires runner/CI design | Default choice for many Google Cloud IaC implementations |
| Google Cloud Infrastructure Manager | Teams wanting managed Terraform execution on Google Cloud | Managed service model; integrates with Google Cloud IAM and APIs | Feature set and pricing can evolve; verify parity with Terraform CLI workflows | When you want Google-managed Terraform execution and governance |
| Google Cloud Deployment Manager (legacy) | Existing deployments already using it | Google-native templates | Less commonly used today; verify current roadmap/status | Only if you must maintain existing Deployment Manager stacks |
| Pulumi | Teams preferring TypeScript/Python/Go/C# IaC | Strong developer ergonomics; real languages | Different state model; learning curve; ecosystem differs | When app teams want IaC in general-purpose languages |
| Crossplane | Kubernetes-centric platform teams | Continuous reconciliation; GitOps-friendly | Requires Kubernetes expertise; operational overhead | When you want Kubernetes as the control plane for cloud resources |
| Config Connector | GKE users managing Google Cloud resources via Kubernetes CRDs | Google Cloud integration; Kubernetes-native | Tied to Kubernetes model; not ideal for all org-wide provisioning | When your platform standard is Kubernetes-native provisioning |
| Ansible | Ops automation and configuration management | Great for procedural tasks and OS config | Not ideal for declarative infra lifecycle and drift management | When you need config management and orchestration more than IaC |

15. Real-World Example

Enterprise example: regulated financial services landing zone

  • Problem
    • Multiple teams need projects quickly, but compliance requires strict IAM, logging, and network controls.
    • Manual provisioning leads to inconsistent controls and audit findings.
  • Proposed architecture
    • Org/folder structure with guarded environments (prod vs non-prod)
    • Shared VPC in a central networking project
    • Central logging project with sinks
    • Terraform modules for:
      • Project factory (billing, APIs, labels)
      • Baseline IAM (groups, break-glass accounts)
      • Network baselines (subnets, firewall patterns)
    • CI/CD executes Terraform using WIF + service account impersonation
    • Org policies enforce non-negotiables (location restrictions, public IP constraints—verify exact policies required)
  • Why Terraform on Google Cloud
    • Reviewable, auditable plan/apply workflow
    • Strong module reuse across many projects
    • Aligns with separation of duties and change management
  • Expected outcomes
    • Faster project delivery with consistent controls
    • Reduced audit exceptions
    • Lower risk of accidental public exposure
    • Standardized cost allocation via labels

Startup/small-team example: repeatable dev/stage/prod for a SaaS

  • Problem
    • A small team ships fast but needs repeatable environments to avoid “snowflake” infrastructure.
    • They want to keep costs low and reduce operational overhead.
  • Proposed architecture
    • One Google Cloud project per environment (or at minimum separate state and naming)
    • Terraform provisions:
      • VPC (if needed), buckets, service accounts, IAM bindings
      • Cloud Run services and supporting resources (Artifact Registry, Secret Manager) as the platform grows
    • GitHub Actions runs plan on PRs and apply on merges using WIF
  • Why Terraform on Google Cloud
    • Lightweight tooling, strong community, predictable change workflow
    • Easy to add modules over time
  • Expected outcomes
    • Reproducible environments
    • Fewer outages from manual console changes
    • Clear rollback/forward path using code + state history

16. FAQ

1) Is “Terraform on Google Cloud” a Google Cloud product?
It’s primarily a workflow: using HashiCorp Terraform to manage Google Cloud resources through the Google provider and Google APIs. Google Cloud also offers Terraform-based managed tooling (Infrastructure Manager). Confirm the latest product scope in official docs.

2) Do I need Terraform Cloud to use Terraform on Google Cloud?
No. You can run Terraform CLI locally, in Cloud Shell, or in any CI system. Terraform Cloud/Enterprise is optional for managed runs, collaboration, and policy features.

3) Where should I store Terraform state for Google Cloud?
For teams, use a remote backend. A common pattern is a Cloud Storage bucket with versioning and strict IAM. Avoid sharing local state files.

4) Does Terraform state include secrets?
It can. State stores resource attributes and sometimes sensitive values. Treat state as sensitive data: restrict access, enable versioning, consider retention controls.

5) Should I use service account keys for CI?
Many organizations avoid long-lived keys. Prefer Workload Identity Federation (CI to Google) or service account impersonation. Verify your organization’s security standards.

6) What’s the difference between google and google-beta providers?
google-beta often exposes newer or preview fields/resources earlier. Use it selectively and pin versions; beta features can change.

7) How do I avoid destroying production resources accidentally?
Use separate projects/states, protected branches, approvals for apply, and consider Terraform lifecycle rules (like prevent_destroy for critical resources) where appropriate.
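The prevent_destroy guard mentioned here looks like this on a critical resource (a sketch with a hypothetical bucket name; any plan that would destroy the resource then fails with an error instead of proceeding):

```hcl
resource "google_storage_bucket" "critical_bucket" {
  name     = "example-critical-bucket" # hypothetical
  location = "us-central1"

  # Terraform refuses to plan a destroy/replace of this resource.
  lifecycle {
    prevent_destroy = true
  }
}
```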

8) Can Terraform manage resources across multiple projects?
Yes, if credentials have permission. Many platform setups use multiple providers or explicit project IDs and separate states.
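The multiple-providers pattern uses provider aliases. A hedged sketch with hypothetical project IDs:

```hcl
# Default provider: central networking project.
provider "google" {
  project = "shared-network-project" # hypothetical
  region  = "us-central1"
}

# Aliased provider: workload project.
provider "google" {
  alias   = "workload"
  project = "workload-project" # hypothetical
  region  = "us-central1"
}

# Resources select a non-default provider explicitly.
resource "google_storage_bucket" "workload_bucket" {
  provider = google.workload
  name     = "workload-project-example-bucket" # hypothetical
  location = "us-central1"
}
```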

9) Why does Terraform want to replace a resource instead of updating it?
Some attributes are immutable in the underlying API; changing them forces replacement. Always review plan carefully.

10) How do I detect drift (manual console changes)?
Run terraform plan regularly (or in CI). It compares current state and reads actual resource attributes to detect differences.

11) What’s the best way to structure Terraform repositories for Google Cloud?
Common patterns:
  • Separate repos per platform domain (networking, projects, workloads)
  • A mono-repo with clearly separated stacks and state backends
Choose based on team size, blast radius, and change frequency.

12) How do I manage IAM safely with Terraform?
Use the correct IAM resource types (binding vs member vs policy) consistently to avoid overwriting. Review Google provider IAM guidance carefully (verify in provider docs).
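A hedged sketch of the distinction (project ID, role, and principals are illustrative; do not mix both resource types for the same role):

```hcl
# Additive: grants one principal without touching the role's other members.
resource "google_project_iam_member" "ci_network_admin" {
  project = "my-project-id" # hypothetical
  role    = "roles/compute.networkAdmin"
  member  = "serviceAccount:terraform@my-project-id.iam.gserviceaccount.com"
}

# Authoritative for this role: any member NOT listed here is removed.
resource "google_project_iam_binding" "network_admins" {
  project = "my-project-id" # hypothetical
  role    = "roles/compute.networkAdmin"
  members = [
    "serviceAccount:terraform@my-project-id.iam.gserviceaccount.com",
  ]
}
```

These are shown side by side only for contrast; in practice pick one model per role and apply it consistently.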

13) Do I need to enable APIs manually if Terraform creates resources?
Terraform typically cannot create resources if the required API is disabled. Many teams enable baseline APIs as part of project provisioning.

14) How do I estimate cost for what Terraform will create?
Terraform plan doesn’t calculate cost by default. Use the Google Cloud Pricing Calculator and cost labels; consider third-party cost estimation tools if required.

15) Can I run Terraform from inside a private network?
Yes, but the runner must reach Google APIs and the remote state backend. In restricted enterprises, you may need controlled egress and proxy/VPC Service Controls design—verify architecture with your network/security team.

16) Is Google Cloud Deployment Manager still recommended?
Deployment Manager is generally considered legacy compared to Terraform-based approaches, but product status can change. Verify current guidance in official Google Cloud docs before investing.


17. Top Online Resources to Learn Terraform on Google Cloud

| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official Terraform Docs | Terraform Documentation (HashiCorp Developer) — https://developer.hashicorp.com/terraform/docs | Authoritative Terraform language, state, modules, workflows |
| Official Provider Docs | Google Provider (Terraform Registry) — https://registry.terraform.io/providers/hashicorp/google/latest/docs | Resource/data source reference and Google auth configuration |
| Official Backend Docs | GCS Backend — https://developer.hashicorp.com/terraform/language/backend/gcs | How to store Terraform state in Cloud Storage |
| Official Google Cloud Docs | Google Cloud IAM Overview — https://cloud.google.com/iam/docs/overview | Understanding roles, service accounts, and least privilege |
| Official Google Cloud Docs | Organization Policy Overview — https://cloud.google.com/resource-manager/docs/organization-policy/overview | Enforcing guardrails beyond Terraform code review |
| Official Pricing | Cloud Storage Pricing — https://cloud.google.com/storage/pricing | Key for remote state bucket cost modeling |
| Official Pricing | Cloud Build Pricing — https://cloud.google.com/build/pricing | Common CI runner choice for Terraform pipelines on Google Cloud |
| Official Cost Tooling | Google Cloud Pricing Calculator — https://cloud.google.com/products/calculator | Estimate cost of the infrastructure Terraform provisions |
| Google Cloud Modules/Blueprints | Cloud Foundation Toolkit — https://cloud.google.com/foundation-toolkit | Google-aligned modules and reference foundations (verify current module catalog) |
| Managed Terraform Option | Infrastructure Manager docs — https://cloud.google.com/infrastructure-manager/docs | Managed Terraform execution on Google Cloud (verify latest features and pricing) |
| Hands-on Labs | Google Cloud Skills Boost — https://www.cloudskillsboost.google/ | Official hands-on labs; search for Terraform and IaC content |
| Videos | Google Cloud Tech YouTube — https://www.youtube.com/@googlecloudtech | Architecture and product walkthroughs; look for IaC/Terraform sessions |
| Community Learning | Terraform Learn tutorials — https://developer.hashicorp.com/terraform/tutorials | Guided examples (many can be adapted to Google Cloud) |
| Security Scanning (community) | tfsec — https://github.com/aquasecurity/tfsec | Static checks for Terraform security misconfigurations (verify maintenance status and alternatives) |
| Security/Compliance (community) | Checkov — https://github.com/bridgecrewio/checkov | Policy checks for Terraform configurations |

18. Training and Certification Providers

  1. DevOpsSchool.com
    Suitable audience: DevOps engineers, SREs, platform engineers, beginners to intermediate
    Likely learning focus: Terraform fundamentals, CI/CD integration, cloud DevOps workflows (Google Cloud may be included depending on offering)
    Mode: check website
    Website: https://www.devopsschool.com/

  2. ScmGalaxy.com
    Suitable audience: DevOps learners, engineers building SDLC and automation skills
    Likely learning focus: SCM/CI/CD foundations, automation, DevOps toolchains (may include Terraform topics)
    Mode: check website
    Website: https://www.scmgalaxy.com/

  3. CloudOpsNow.in
    Suitable audience: Cloud operations and DevOps practitioners
    Likely learning focus: Cloud operations practices, automation, DevOps/cloud workflows (verify Terraform/Google Cloud coverage on site)
    Mode: check website
    Website: https://www.cloudopsnow.in/

  4. SreSchool.com
    Suitable audience: SREs, reliability-focused engineers, platform teams
    Likely learning focus: Reliability engineering practices, automation, operational excellence (Terraform may be part of platform tooling)
    Mode: check website
    Website: https://www.sreschool.com/

  5. AiOpsSchool.com
    Suitable audience: Operations teams exploring automation and AIOps
    Likely learning focus: Monitoring/operations automation, AIOps concepts; Terraform may appear in broader automation curricula
    Mode: check website
    Website: https://www.aiopsschool.com/


19. Top Trainers

  1. RajeshKumar.xyz
    Likely specialization: DevOps/cloud tooling and automation (verify current offerings on site)
    Suitable audience: Individuals and teams seeking practical DevOps guidance
    Website: https://rajeshkumar.xyz/

  2. devopstrainer.in
    Likely specialization: DevOps tools training (CI/CD, IaC, cloud workflows)
    Suitable audience: Beginners to working professionals
    Website: https://www.devopstrainer.in/

  3. devopsfreelancer.com
    Likely specialization: Freelance DevOps services and training resources (verify offerings)
    Suitable audience: Teams needing hands-on assistance or mentorship
    Website: https://www.devopsfreelancer.com/

  4. devopssupport.in
    Likely specialization: DevOps support and enablement services (may include Terraform guidance)
    Suitable audience: Organizations needing operational support and coaching
    Website: https://www.devopssupport.in/


20. Top Consulting Companies

  1. cotocus.com
    Likely service area: Cloud/DevOps consulting, automation, platform engineering (verify specific offerings)
    Where they may help: IaC adoption, CI/CD pipeline design, cloud migrations, governance patterns
    Consulting use case examples: landing zone rollout, Terraform module standardization, CI-based apply workflows
    Website: https://www.cotocus.com/

  2. DevOpsSchool.com
    Likely service area: DevOps consulting and corporate training
    Where they may help: Terraform on Google Cloud enablement, toolchain integration, team upskilling
    Consulting use case examples: creating IaC standards, building reusable Terraform modules, setting up secure pipelines
    Website: https://www.devopsschool.com/

  3. DEVOPSCONSULTING.IN
    Likely service area: DevOps consulting services (verify specific offerings)
    Where they may help: Infrastructure automation, CI/CD, operational practices
    Consulting use case examples: Terraform migration from manual provisioning, remote state and IAM design, policy checks in CI
    Website: https://www.devopsconsulting.in/


21. Career and Learning Roadmap

What to learn before Terraform on Google Cloud

  • Google Cloud basics:
    • Projects, billing, IAM, service accounts
    • VPC fundamentals (subnets, firewall rules, routes)
  • CLI basics:
    • gcloud usage and authentication
  • Infrastructure concepts:
    • Idempotency, environments, change management
  • Git fundamentals:
    • branches, pull requests, code review

What to learn after Terraform on Google Cloud

  • Advanced Terraform:
    • module design, testing patterns, state refactoring, import strategies
  • CI/CD for IaC:
    • plan/apply pipelines, approvals, artifact retention
  • Security:
    • Workload Identity Federation, org policy guardrails, secret management
  • Google Cloud architecture:
    • shared VPC, multi-project design, landing zones
  • Policy-as-code:
    • OPA/Conftest, checkov/tfsec, organization policy automation

Job roles that use it

  • Cloud Engineer
  • DevOps Engineer
  • Site Reliability Engineer (SRE)
  • Platform Engineer
  • Cloud Security Engineer (for guardrails and IAM automation)
  • Solutions Architect (for standardized provisioning patterns)

Certification path (common options)

  • HashiCorp: Terraform Associate (verify current certification name and requirements)
  • Google Cloud certifications (role-dependent):
    • Associate Cloud Engineer
    • Professional Cloud Architect
    • Professional Cloud DevOps Engineer
  • Verify current certification catalog:
    • https://cloud.google.com/learn/certification

Project ideas for practice

  • Build a “project factory” that:
    • creates a project, enables APIs, sets labels, creates baseline IAM
  • Build a shared VPC module with:
    • subnets, firewall rules, and standard logging
  • Create a CI pipeline:
    • PR → plan → policy checks → merge → apply with WIF
  • Implement governance:
    • enforce labels and restrict public access via org policies and CI checks
  • Migrate a manually created bucket/network into Terraform using import
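For the import practice idea, Terraform 1.5+ supports declarative import blocks alongside the older terraform import command. A sketch with a hypothetical pre-existing bucket:

```hcl
# Brings an existing, manually created bucket under Terraform management
# on the next apply, without recreating it.
import {
  to = google_storage_bucket.legacy
  id = "my-manually-created-bucket" # hypothetical existing bucket name
}

resource "google_storage_bucket" "legacy" {
  name     = "my-manually-created-bucket" # must match the real bucket
  location = "us-central1"
}
```

On recent Terraform versions, terraform plan -generate-config-out=generated.tf can draft the resource block for you; review and trim the generated configuration before applying.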

22. Glossary

  • Infrastructure as code (IaC): Managing infrastructure through code and automation instead of manual configuration.
  • Terraform: HashiCorp IaC tool that uses providers to manage infrastructure.
  • HCL: HashiCorp Configuration Language used to write Terraform configuration.
  • Provider: Terraform plugin that talks to an API (e.g., Google Cloud provider).
  • Resource: A managed object in Terraform (e.g., google_compute_network).
  • Data source: A read-only lookup for existing infrastructure in Terraform.
  • State: Terraform’s record of managed resources and their attributes.
  • Backend: Where Terraform state is stored (local, GCS, Terraform Cloud, etc.).
  • Remote state: Shared state stored outside the local filesystem (e.g., Cloud Storage).
  • State locking: Preventing concurrent modifications to the same state.
  • Plan: Terraform’s computed set of changes it will apply.
  • Apply: Executing the plan and making changes in the target cloud.
  • Drift: Differences between declared config/state and real infrastructure due to manual changes or external processes.
  • Module: Reusable Terraform code packaged behind inputs/outputs.
  • Workload Identity Federation (WIF): A Google Cloud auth method that lets external identity providers issue short-lived credentials without service account keys.
  • Service account: Non-human identity used by applications and automation in Google Cloud.
  • Org policy: Organization-level constraints that restrict what can be created/configured in Google Cloud.

23. Summary

Terraform on Google Cloud is the practical way to implement Infrastructure as code for Google Cloud using HashiCorp Terraform and the Google provider. It matters because it turns infrastructure into a repeatable, reviewable, auditable workflow—critical for scaling teams, improving reliability, and enforcing security controls.

Architecturally, Terraform operates outside your workloads and uses Google Cloud APIs to provision resources, tracking them in a state file (ideally stored remotely in Cloud Storage). Cost-wise, Terraform CLI is free, but you pay for the Google Cloud resources you create and supporting services like state storage and CI runners. Security-wise, protect Terraform state, use least privilege, and prefer Workload Identity Federation or impersonation over long-lived keys.

Use Terraform on Google Cloud when you want predictable provisioning, strong module reuse, and CI/CD-friendly change control. For the next learning step, build a small CI pipeline that runs plan on pull requests and apply on approved merges using short-lived Google Cloud credentials, and expand from basic resources into a standardized module library aligned with your organization’s policies.