Google Cloud Pub/Sub Lite Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Data analytics and pipelines

Category

Data analytics and pipelines

1. Introduction

Pub/Sub Lite is Google Cloud’s low-cost, high-throughput event streaming and messaging service designed for predictable workloads where you can pre-provision capacity. It is part of the Cloud Pub/Sub family, but it is a distinct service with a different pricing and scaling model than “Pub/Sub” (often called Pub/Sub Standard in practice).

In simple terms: Pub/Sub Lite lets producers write streams of events into a topic that is split into partitions, and lets consumers read those events back reliably and in order per partition. Unlike Pub/Sub Standard (which auto-scales), Pub/Sub Lite requires you to provision publish and subscribe throughput capacity and choose topic partitioning up front.

Technically: Pub/Sub Lite provides regional topics with partitioned storage and cursor-based consumption. Producers publish messages to partitions (optionally using a message key for partition assignment). Subscribers read messages sequentially from partitions and acknowledge them; their progress is tracked by cursors. The service is optimized for data analytics and pipelines where workloads are steady, throughput is known, and cost control matters.
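Conceptually, key-based partition assignment works like hashing the key modulo the partition count. A minimal sketch (the hash shown here is illustrative; the actual client/service routing function may differ):

```python
import hashlib

def assign_partition(key: bytes, num_partitions: int) -> int:
    """Illustrative only: route a message key to a partition by hashing.
    The real Pub/Sub Lite routing may use a different function."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# All messages with the same key land in the same partition,
# which is what preserves per-key ordering.
p1 = assign_partition(b"user-42", num_partitions=4)
p2 = assign_partition(b"user-42", num_partitions=4)
assert p1 == p2
```

The important property is not which hash is used, but that equal keys always map to the same partition, so a single consumer of that partition sees those messages in publish order.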

What problem it solves: streaming ingest and delivery at scale with stronger cost predictability than auto-scaling messaging—especially for pipelines such as logs/telemetry, clickstream, IoT data, or ETL where you know your steady-state throughput and want lower unit cost.

Service status note: Google has announced the deprecation of Pub/Sub Lite. Always confirm the current product status, deprecation timeline, and recommended migration path in the official docs: https://cloud.google.com/pubsub/lite/docs


2. What is Pub/Sub Lite?

Official purpose

Pub/Sub Lite is a managed messaging and event streaming service that provides partitioned topics with provisioned throughput for predictable, high-volume streaming pipelines on Google Cloud.

Core capabilities

  • Publish/subscribe messaging with durable storage and replay (within retention limits)
  • Partitioned topics for horizontal scale and per-partition ordering
  • Provisioned throughput (you set publish and subscribe capacity)
  • Retention and storage controls (retain messages for a configured duration and/or storage limits)
  • Subscriber cursor management to track read progress and enable replay via seeks
  • Integration with data pipeline services (notably Dataflow) and standard Google Cloud security/IAM patterns

Major components

  • Lite Topic
    – A named stream of messages in a Google Cloud region
    – Split into partitions
    – Configured with retention and capacity
  • Partitions
    – Ordered sequences of messages
    – Scaling unit for parallelism and throughput
  • Lite Subscription
    – Attaches to a topic
    – Tracks consumption progress via cursors
  • Cursors
    – Positions within partitions (offsets) that represent subscriber progress
    – Support seek operations (replay from earlier offsets) within retention
  • Publishers / Subscribers
    – Applications (or pipeline jobs) that write/read messages using client libraries
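The relationships between these components can be sketched as a toy in-memory model: each partition is an append-only log, a subscription holds one cursor per partition, acknowledging advances the cursor, and seek rewinds it. The names and structure below are illustrative, not the real API:

```python
from collections import defaultdict

class LiteTopicSketch:
    """Toy model of a partitioned topic: each partition is an append-only log."""
    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, partition: int, data: bytes) -> int:
        self.partitions[partition].append(data)
        return len(self.partitions[partition]) - 1  # offset of the new message

class LiteSubscriptionSketch:
    """Toy model of a subscription: one cursor (offset) per partition."""
    def __init__(self, topic: LiteTopicSketch):
        self.topic = topic
        self.cursors = defaultdict(int)

    def pull(self, partition: int):
        offset = self.cursors[partition]
        log = self.topic.partitions[partition]
        return (offset, log[offset]) if offset < len(log) else None

    def ack(self, partition: int):
        self.cursors[partition] += 1  # acknowledging advances the cursor

    def seek(self, partition: int, offset: int):
        self.cursors[partition] = offset  # replay from an earlier offset

topic = LiteTopicSketch(num_partitions=2)
sub = LiteSubscriptionSketch(topic)
topic.publish(0, b"a"); topic.publish(0, b"b")
assert sub.pull(0) == (0, b"a")
sub.ack(0)
assert sub.pull(0) == (1, b"b")
sub.seek(0, 0)                      # replay from the beginning
assert sub.pull(0) == (0, b"a")
```

Note that messages are not deleted when acknowledged; they remain in the log (within retention), which is why seek-based replay is possible.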

Service type

  • Fully managed Google Cloud service (you do not manage brokers)
  • Conceptually similar to an event streaming platform (like Kafka), but with Google-managed operations and Google Cloud IAM integration.

Resource scope (important for architecture)

Pub/Sub Lite resources are project-scoped and location-scoped:

  • Topics and subscriptions are created in a specific Google Cloud location (region).
  • Pub/Sub Lite is not a global service like Pub/Sub Standard.
  • Data locality matters for latency, compliance, and cost.
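Assuming the usual Google Cloud resource-name pattern, Lite resource paths embed the project number and the location. A small sketch of the format (verify the exact naming rules in the docs):

```python
def lite_topic_path(project_number: int, location: str, topic_id: str) -> str:
    # Pub/Sub Lite resource names use the project *number* and a location.
    return f"projects/{project_number}/locations/{location}/topics/{topic_id}"

def lite_subscription_path(project_number: int, location: str, sub_id: str) -> str:
    return f"projects/{project_number}/locations/{location}/subscriptions/{sub_id}"

print(lite_topic_path(123456789, "us-central1", "lite-demo-topic"))
# projects/123456789/locations/us-central1/topics/lite-demo-topic
```

The project number (rather than the project ID) appearing in these paths matters again in the hands-on tutorial below.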

Verify the latest resource model and naming rules in the official docs: https://cloud.google.com/pubsub/lite/docs

How it fits into the Google Cloud ecosystem

Pub/Sub Lite commonly sits in Data analytics and pipelines architectures such as:

  • Ingest layer for Dataflow streaming pipelines (e.g., to BigQuery)
  • Event bus for microservices where regional locality and predictable throughput matter
  • Buffered ingestion layer in front of storage/analytics systems (BigQuery, Cloud Storage, Bigtable, Spanner via custom pipelines)

It complements (not replaces) Pub/Sub Standard:

  • Pub/Sub Standard: global, auto-scaling, event-driven integrations and triggers
  • Pub/Sub Lite: provisioned, partitioned, cost-optimized regional streaming


3. Why use Pub/Sub Lite?

Business reasons

  • Lower cost for steady high throughput: When you have predictable load, provisioned capacity can be more cost-effective than auto-scaling services.
  • Cost predictability: You choose throughput capacity and retention; bills are easier to forecast than purely usage-elastic models.
  • Data locality: Regional design can help meet data residency needs (verify specific compliance requirements with your policies and Google Cloud documentation).

Technical reasons

  • Partitioning and ordering: Strong fit for stream processing that benefits from partition parallelism and per-partition ordering semantics.
  • Replay: Subscribers can seek and replay messages (within retention), useful for backfills, debugging, and reprocessing.
  • High throughput streaming design: Intended for sustained high-volume pipelines.

Operational reasons

  • Managed service: No broker provisioning, patching, or cluster ops as with self-managed Kafka.
  • Explicit capacity controls: Provisioned throughput forces explicit planning; reduces “surprise scaling” and can simplify SLO management.

Security/compliance reasons

  • Google Cloud IAM for access control
  • Audit logging for admin operations through Cloud Audit Logs (verify Data Access audit logs coverage in your environment)
  • Default encryption at rest and in transit using Google-managed mechanisms (confirm CMEK options in docs if required)

Scalability/performance reasons

  • Scale by:
    – Increasing partition count (parallelism)
    – Increasing publish/subscribe throughput capacity (provisioned)
  • Suitable for sustained throughput where you can plan capacity.

When teams should choose Pub/Sub Lite

Choose Pub/Sub Lite when:

  • You have predictable throughput (steady-state or well-modeled peak patterns).
  • You want lower cost per byte for sustained volume.
  • Your pipeline is regional and you can keep producers/consumers in the same region.
  • You need partition-based parallelism and replay.

When teams should not choose Pub/Sub Lite

Avoid or reconsider Pub/Sub Lite when:

  • You need global endpoints, cross-region replication, or globally managed multi-region behavior (Pub/Sub Standard is typically the default choice here).
  • Your traffic is spiky and unpredictable, and you want auto-scaling with minimal capacity planning.
  • You rely on event-driven triggers that are available for Pub/Sub Standard (for example, some serverless triggers and integrations are built around Pub/Sub Standard; verify current integration support).
  • You want a simpler mental model (Lite introduces partitions, capacity planning, and cursor concepts).


4. Where is Pub/Sub Lite used?

Industries

  • Digital advertising and marketing (clickstream, conversion events)
  • Financial services (market data feeds, fraud signals—subject to strict governance)
  • Gaming (telemetry, matchmaking events, real-time analytics)
  • Retail/e-commerce (browse/purchase events, inventory updates)
  • Manufacturing and IoT (sensor readings, device telemetry)
  • Media/streaming platforms (playback events, QoE telemetry)

Team types

  • Data engineering teams building streaming ETL/ELT
  • Platform and infrastructure teams running shared ingestion platforms
  • SRE/DevOps teams operating multi-tenant pipeline infrastructure
  • Backend teams building regional event-driven systems
  • Security/operations teams centralizing audit and telemetry

Workloads

  • High-volume telemetry ingestion (logs, metrics, traces—though evaluate if Cloud Logging/Monitoring fit better)
  • Stream processing with Dataflow (windowing, enrichment, aggregation)
  • Event sourcing where per-partition ordering is useful
  • Buffering/decoupling between producers and consumers

Architectures

  • Ingestion bus → stream processing (Dataflow) → analytics sink (BigQuery)
  • Ingestion bus → multiple consumers (fraud scoring, personalization, storage)
  • Regional microservice event bus with partitioned ordering
  • Replayable ingestion layer for backfills and reprocessing

Real-world deployment contexts

  • Production: steady ingest rates with well-defined capacity and SLOs
  • Dev/test: functional testing of stream processing (be mindful that provisioned capacity can still incur costs even with low message volume)

5. Top Use Cases and Scenarios

Below are realistic Pub/Sub Lite use cases. For each, the key reason is typically predictable throughput + cost control + partitioned scale.

1) Clickstream ingestion for analytics

  • Problem: Collect pageview and user interaction events at sustained high volume.
  • Why Pub/Sub Lite fits: Predictable throughput, partitioning for parallel consumers, replay for reprocessing.
  • Scenario: Web/mobile apps publish events keyed by user ID; Dataflow aggregates sessions and writes to BigQuery.

2) IoT sensor telemetry pipeline

  • Problem: Continuous device telemetry ingestion with stable device fleet throughput.
  • Why it fits: Provision capacity for steady throughput; regional locality reduces latency and cost.
  • Scenario: Devices publish temperature/pressure every second; consumers compute alerts and store raw data in Cloud Storage.

3) Centralized application telemetry bus (custom)

  • Problem: Multiple services emit structured telemetry/events needing multiple downstream consumers.
  • Why it fits: Partitioned topic enables scalable fan-out via multiple subscriptions; replay supports incident forensics.
  • Scenario: Microservices publish “order-state-changed” events; one consumer updates a read model, another triggers analytics.

4) Streaming ETL to BigQuery via Dataflow

  • Problem: Transform, enrich, and load events into BigQuery with consistent throughput.
  • Why it fits: Designed for streaming pipelines; Dataflow supports Pub/Sub Lite connectors (verify current templates/connectors).
  • Scenario: Pub/Sub Lite → Dataflow (windowed aggregation) → BigQuery partitioned tables.

5) Payment event pipeline (regional)

  • Problem: Ingest payment authorization events for downstream risk scoring and reporting.
  • Why it fits: Predictable volume, need for replay, per-partition ordering by merchant/customer.
  • Scenario: Publish events keyed by merchant ID; risk engine consumes and writes scores to operational store.

6) CDN/playback QoE event stream

  • Problem: Large volume of playback metrics (startup time, buffering) requiring near-real-time processing.
  • Why it fits: Sustained throughput; partitioning by device region or customer.
  • Scenario: Player apps publish QoE events; stream processor computes rolling KPIs.

7) Security event stream for custom detection

  • Problem: Ingest and process security signals (application/audit events) for custom detection logic.
  • Why it fits: Replay and partitioning; steady ingest; cost controls.
  • Scenario: Security events published; consumers run correlation and store outputs for dashboards.

8) Data replication / CDC event buffering

  • Problem: Buffer change data capture events before applying to downstream systems.
  • Why it fits: Partition ordering by primary key; replay on downstream outages.
  • Scenario: CDC tool publishes row-change events keyed by record ID; subscribers apply changes to cache/search index.

9) Batch-to-stream bridge for legacy systems

  • Problem: Legacy system outputs hourly batch files; you want streaming-like processing.
  • Why it fits: Publish parsed records steadily; subscribers process continuously.
  • Scenario: Cloud Run job parses hourly exports and publishes events; Dataflow enriches and stores.

10) Multi-tenant ingestion platform (internal)

  • Problem: Provide teams with a shared, cost-controlled ingestion backbone.
  • Why it fits: Provision capacity per topic; enforce quotas and naming; predictable internal chargeback.
  • Scenario: Platform team creates one topic per tenant with defined capacity; tenants run consumers independently.

11) Event replay for ML feature generation

  • Problem: Recompute features from recent event history for model retraining.
  • Why it fits: Seek/replay within retention; partitioned reads for parallel processing.
  • Scenario: Nightly job seeks to offsets and reprocesses last N hours.

12) Regional event bus for latency-sensitive microservices

  • Problem: Services need low-latency regional messaging with ordering by entity.
  • Why it fits: Regional placement and partition keys reduce latency; partitions preserve per-entity ordering.
  • Scenario: Inventory updates keyed by SKU; multiple consumers update caches and trigger workflows.

6. Core Features

Pub/Sub Lite is a specialized service. The “core features” are tightly coupled to partitions, capacity provisioning, and cursor-based consumption.

1) Regional, location-scoped resources

  • What it does: Topics and subscriptions exist in a specific region.
  • Why it matters: Latency and cost are strongly affected by region placement.
  • Practical benefit: Keep producers, stream processors, and sinks co-located for low latency and minimal egress.
  • Caveat: Not global; cross-region architectures require deliberate design (and may incur network costs).

2) Partitioned topics

  • What it does: A topic is divided into partitions—each partition is an ordered log.
  • Why it matters: Partitions provide parallelism and ordering.
  • Practical benefit: You can scale consumers horizontally and maintain per-key ordering (by routing keys to partitions).
  • Caveat: Changing partition count later may have operational implications; plan partitioning early (verify current constraints in docs).

3) Provisioned throughput capacity (publish and subscribe)

  • What it does: You configure throughput capacity explicitly rather than relying on auto-scaling.
  • Why it matters: Drives both performance and cost.
  • Practical benefit: Predictable throughput and predictable baseline cost.
  • Caveat: Under-provisioning causes throttling; over-provisioning wastes money. Capacity planning is required.
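Capacity planning often starts as simple arithmetic: divide your peak throughput by the per-partition cap. A hedged sketch (the 16 MiB/s per-partition cap in the example is an assumption; check the current per-partition limits in the docs):

```python
import math

def required_partitions(peak_mib_s: float, per_partition_mib_s: float) -> int:
    """Back-of-envelope partition count for a target publish rate.
    per_partition_mib_s is whatever per-partition cap applies to your
    configuration -- take the real limit from the Pub/Sub Lite docs."""
    return max(1, math.ceil(peak_mib_s / per_partition_mib_s))

# Example: 50 MiB/s peak publish with an assumed 16 MiB/s per-partition cap.
print(required_partitions(50, 16))  # 4
```

In practice you would also add headroom above the measured peak, since under-provisioning shows up as publish throttling.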

4) Retention controls

  • What it does: Retain messages for a configured duration and/or storage limit.
  • Why it matters: Enables replay and recovery from downstream outages.
  • Practical benefit: Reprocess recent history during incident recovery.
  • Caveat: Longer retention increases storage consumption (and cost). Retention is not a substitute for long-term archival.
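A rough way to size retention storage is throughput multiplied by the retention window. For example (pure arithmetic, not an official sizing formula):

```python
def retained_gib(avg_mib_per_sec: float, retention_hours: float) -> float:
    """Approximate steady-state storage: throughput x retention window."""
    mib = avg_mib_per_sec * retention_hours * 3600
    return mib / 1024  # convert MiB to GiB

# 2 MiB/s retained for 24 hours is roughly 168.75 GiB of stored messages.
print(round(retained_gib(2, 24), 2))  # 168.75
```

Remember that a stalled consumer lets backlog accumulate toward this retention-bounded maximum, so storage sizing should assume outages, not just steady state.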

5) Cursor-based consumption and acknowledgements

  • What it does: Subscriber progress is tracked via cursors/offsets per partition; acknowledgements advance the cursor.
  • Why it matters: Supports reliable processing and replay.
  • Practical benefit: Strong operational control (seek to re-read).
  • Caveat: You must design idempotent consumers (at-least-once delivery patterns are common in messaging systems; verify exact delivery semantics in the Lite docs).
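Idempotency is commonly achieved by recording processed message IDs and skipping duplicates. A minimal sketch (the handle function and in-memory set are hypothetical; a production version would use a durable store):

```python
processed_ids = set()  # in production: a durable store (e.g., database keys)

def handle(message_id: str, data: bytes, side_effects: list) -> None:
    """At-least-once delivery means redeliveries happen; make handling
    idempotent by recording which message IDs were already applied."""
    if message_id in processed_ids:
        return  # duplicate delivery: skip the side effect
    side_effects.append(data)  # stand-in for the real side effect
    processed_ids.add(message_id)

out = []
handle("m1", b"order-created", out)
handle("m1", b"order-created", out)  # simulated redelivery
assert out == [b"order-created"]    # the side effect ran only once
```

The same pattern protects you during seek-based replay, where reprocessing old offsets is expected rather than exceptional.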

6) Seek/replay (within retention)

  • What it does: Reset subscription cursor to reprocess from an earlier offset or timestamp (depending on supported seek modes).
  • Why it matters: Enables backfill and debugging.
  • Practical benefit: Re-run pipeline logic after bug fixes.
  • Caveat: Replay increases downstream compute cost and can cause duplicates unless consumers are idempotent.

7) IAM integration (publisher/subscriber/admin roles)

  • What it does: Use Google Cloud IAM to grant least-privilege access.
  • Why it matters: Central access management, auditability.
  • Practical benefit: Separate producer and consumer permissions cleanly.
  • Caveat: IAM is project-scoped; consider organization policies and guardrails for multi-team setups.

8) Observability via Cloud Monitoring and Cloud Logging

  • What it does: Exposes metrics and audit logs for operations.
  • Why it matters: You need visibility into backlog, throughput, errors, and throttling.
  • Practical benefit: Alert on consumer lag or publish throttling before incidents occur.
  • Caveat: Metrics are only useful if you define SLOs and alerting policies.
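Backlog/lag for a partition is essentially the head offset minus the subscription's cursor offset. A small sketch of that computation (illustrative; the actual metric names and semantics come from Cloud Monitoring):

```python
def partition_lag(head_offset: int, cursor_offset: int) -> int:
    """Backlog for one partition: messages appended but not yet acknowledged."""
    return max(0, head_offset - cursor_offset)

def total_backlog(heads: dict, cursors: dict) -> int:
    """Sum lag across partitions; missing cursors are treated as offset 0."""
    return sum(partition_lag(heads[p], cursors.get(p, 0)) for p in heads)

heads = {0: 1200, 1: 900}     # latest offsets per partition
cursors = {0: 1150, 1: 900}   # subscriber progress per partition
assert total_backlog(heads, cursors) == 50
```

An alerting policy on this kind of aggregate (or on its growth rate) is what turns the metric into an SLO signal.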

9) Client libraries and pipeline integrations

  • What it does: Supports publishing/consuming via Google Cloud client libraries; integrates with Dataflow for streaming ETL (verify the latest connectors and templates).
  • Why it matters: Reduces custom work and supports enterprise patterns.
  • Practical benefit: Faster pipeline delivery and standardized operations.
  • Caveat: Some “Pub/Sub Standard” ecosystem integrations (triggers, managed connectors) may not apply to Lite—validate before choosing Lite.

7. Architecture and How It Works

High-level service architecture

At a high level, Pub/Sub Lite behaves like a managed, partitioned log:

  • Producers send messages to a topic.
  • Messages are stored durably in partitions.
  • Subscribers read messages from partitions, acknowledge them, and their cursors advance.
  • Retention settings control how long messages remain available for replay.

Request/data/control flow

Data plane

  1. The producer authenticates to Google Cloud and publishes messages to a topic in a region.
  2. Pub/Sub Lite assigns each message to a partition, usually based on a message key (hash) or other routing behavior supported by the client library.
  3. Messages are appended to the partition log.
  4. The subscriber reads messages from the subscription.
  5. The subscriber acknowledges processed messages; cursors advance.

Control plane

Admin operations create/update topics and subscriptions, manage IAM, and configure retention/capacity.

Integrations with related Google Cloud services

Common integrations in Data analytics and pipelines include:

  • Dataflow: streaming pipelines reading from Pub/Sub Lite and writing to BigQuery/Cloud Storage (verify latest IO connectors).
  • BigQuery: typically via Dataflow rather than direct subscription ingestion.
  • Cloud Storage: archival via Dataflow or custom consumers.
  • Cloud Run / GKE: custom consumers/publishers running containerized apps.
  • Cloud Monitoring / Logging: metrics and audit logs for visibility and governance.

Dependency services

  • IAM for authentication/authorization
  • Cloud Audit Logs for administrative audit trail
  • Cloud Monitoring for metrics
  • Cloud Billing for usage charges

Security/authentication model

  • Uses Google Cloud IAM identities:
    – User accounts, service accounts, and workload identity (GKE) are common.
  • Client libraries use Application Default Credentials (ADC).
  • Use least privilege: keep producer and consumer roles separated.

Networking model

  • Pub/Sub Lite is a Google-managed service endpoint.
  • Typical best practice is to keep producers/consumers in the same region as the Lite topic to minimize latency and network costs.
  • For private networking patterns (e.g., avoiding public IPs), evaluate Private Google Access and organization controls. Some restrictions (like VPC Service Controls support) must be verified in official docs.

Monitoring/logging/governance considerations

  • Monitor:
    – Publish throughput vs provisioned capacity
    – Subscribe throughput vs provisioned capacity
    – Subscription backlog/lag
    – Error rates and throttling signals
  • Log/audit:
    – Admin actions (topic/subscription creation, IAM changes)
  • Governance:
    – Naming conventions, labels/tags, and capacity policy per environment (dev/test/prod)

Simple architecture diagram (conceptual)

flowchart LR
  P[Producer App] -->|publish| T[(Pub/Sub Lite Topic)]
  T -->|partitioned log| Part[Partitions]
  Part -->|stream| S[Subscriber App]
  S -->|ack advances cursor| T

Production-style architecture diagram (data analytics pipeline)

flowchart TB
  subgraph Region["Google Cloud Region (same as Pub/Sub Lite)"]
    Producers[Producers<br/>Cloud Run / GKE / VMs] -->|publish| LiteTopic[(Pub/Sub Lite Topic<br/>N partitions)]
    LiteTopic -->|subscription| LiteSub[(Pub/Sub Lite Subscription)]
    LiteSub -->|read| Dataflow[Dataflow Streaming Job]
    Dataflow -->|write| BQ[(BigQuery)]
    Dataflow -->|archive| GCS[(Cloud Storage)]
  end

  Ops[Cloud Monitoring<br/>Dashboards & Alerts] -.metrics.-> LiteTopic
  Logs[Cloud Audit Logs] -.admin logs.-> LiteTopic
  IAM[IAM Policies & Service Accounts] -.authz.-> Producers
  IAM -.authz.-> Dataflow

8. Prerequisites

Account/project requirements

  • A Google Cloud project with billing enabled
  • Access to a supported Google Cloud region where Pub/Sub Lite is available (verify region list in docs)

Permissions / IAM roles

You need permissions to:

  • Create and manage Pub/Sub Lite topics/subscriptions
  • Publish and subscribe

Common roles (verify exact role names and what they include in the docs):

  • roles/pubsublite.admin (admin)
  • roles/pubsublite.publisher (publish)
  • roles/pubsublite.subscriber (subscribe)
  • roles/pubsublite.viewer (read-only)

You also need:

  • roles/serviceusage.serviceUsageAdmin or equivalent to enable APIs (or a project owner/admin can do it)

Billing requirements

  • Pub/Sub Lite is billed based on provisioned capacity, storage, and network egress (details in the pricing section). You should set a budget and alerts before running production-like loads.

CLI/SDK/tools needed

  • Cloud Shell (recommended) or a local installation of:
    – gcloud CLI
    – Python 3.10+ (for the hands-on tutorial)
  • Optional: Terraform if you want IaC (not required for this lab)

APIs

  • Pub/Sub Lite API (enable in your project; the console may prompt you automatically)
  • IAM APIs (usually already enabled)

Quotas/limits

Expect limits around:

  • Topics/subscriptions per project
  • Partitions per topic
  • Throughput capacity
  • Retention/storage limits

Always confirm current quotas and request increases if needed:

  • Check the Quotas page in the Google Cloud console and the Pub/Sub Lite docs.

Prerequisite services

  • Cloud Monitoring and Cloud Logging are typically enabled by default in most projects.

9. Pricing / Cost

Pub/Sub Lite pricing is not the same as Pub/Sub Standard pricing. Pub/Sub Lite is designed around provisioned capacity, which is a key cost driver even if you publish very few messages.

Official pricing page (start here and select Pub/Sub Lite): https://cloud.google.com/pubsub/lite/pricing
If the URL changes, use the product navigation from https://cloud.google.com/pubsub

Pricing calculator: https://cloud.google.com/products/calculator

Pricing dimensions (how you are charged)

Pub/Sub Lite typically charges along these dimensions (verify exact SKUs and units per region on the pricing page):

  1. Publish throughput capacity (provisioned)
  2. Subscribe throughput capacity (provisioned)
  3. Storage used (messages retained, often billed in GiB-month)
  4. Network egress (data leaving the region/service boundary, depending on destination)

Free tier

Pub/Sub Lite free tier availability (if any) can change and can be region/SKU dependent. Verify in the official pricing page.

Cost drivers (what most impacts your bill)

  • Provisioned throughput kept running 24/7
    – Even low message volume can cost money if you keep capacity provisioned continuously.
  • Partition count
    – Partitions influence how you scale and may influence minimum viable provisioning patterns.
  • Retention duration + message volume
    – More retained data increases storage charges.
  • Cross-region consumption
    – Reading from a different region (or writing to sinks in other regions) can add egress charges and latency.

Hidden/indirect costs

  • Dataflow costs (if used): worker compute, streaming engine, shuffle, etc.
  • BigQuery ingestion and storage: streaming inserts, storage, and query costs.
  • Operations overhead: building idempotency, replay mechanisms, and monitoring/alerting.

Network/data transfer implications

  • Keep producers and consumers in-region.
  • Sending data to sinks in other regions can incur:
    – Inter-region egress
    – Higher latency and possible throughput constraints

How to optimize cost (practical actions)

  • Right-size throughput:
    – Start with the minimum capacity that meets SLOs, then adjust.
    – Use load testing to find steady-state and peak needs.
  • Align retention with business need:
    – Retention for replay/debugging (hours/days) vs archival (weeks/months) should be handled differently.
  • Reduce unnecessary replay:
    – Replay is powerful but can multiply downstream compute/storage costs.
  • Co-locate pipeline components:
    – Use the same region for the Pub/Sub Lite topic, Dataflow job, and primary sink where possible.
  • Use budgets and alerts:
    – Create Cloud Billing budgets for the project, or use label-based reporting.

Example low-cost starter estimate (no fabricated prices)

A minimal lab setup might be:

  • 1 topic, 1 partition
  • Publish throughput capacity set to the minimum allowed
  • Subscribe throughput capacity set to the minimum allowed
  • Short retention (for example, 1 day)
  • Low message volume

Your monthly cost will be roughly:

  (Publish capacity × hours × publish capacity rate)
+ (Subscribe capacity × hours × subscribe capacity rate)
+ (Avg stored GiB × storage rate)
+ (Egress GiB × egress rate)

Because capacity is provisioned, the “hours” component can dominate even when message volume is tiny.
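The formula above can be turned into a small estimator. The rates used in the example are placeholders, not real prices; take actual per-unit rates from the official pricing page:

```python
def monthly_cost(pub_capacity_units: float, sub_capacity_units: float,
                 pub_rate: float, sub_rate: float,
                 avg_stored_gib: float, storage_rate: float,
                 egress_gib: float, egress_rate: float,
                 hours: float = 730) -> float:
    """Mirror of the formula above; all rates are placeholder inputs."""
    return (pub_capacity_units * hours * pub_rate
            + sub_capacity_units * hours * sub_rate
            + avg_stored_gib * storage_rate
            + egress_gib * egress_rate)

# With zero traffic but capacity provisioned, the hours terms still accrue
# (placeholder rates: $0.10 and $0.05 per capacity-unit-hour):
idle = monthly_cost(1, 1, pub_rate=0.10, sub_rate=0.05,
                    avg_stored_gib=0, storage_rate=0.04,
                    egress_gib=0, egress_rate=0.08)
print(round(idle, 2))  # 109.5
```

Plugging in zero message volume and still getting a non-zero result is exactly the "provisioned capacity dominates" effect described above.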

Example production cost considerations

For production pipelines:

  • Model throughput per partition and overall.
  • Consider separate topics for different workloads so you don’t over-provision a shared topic.
  • Budget for:
    – Peak throughput headroom
    – Storage growth during consumer outages (backlog retention)
    – Replay/backfill events after incidents or schema changes
  • Factor in Dataflow/BigQuery/Storage costs, as the pipeline usually dominates total cost beyond messaging.


10. Step-by-Step Hands-On Tutorial

This lab creates a Pub/Sub Lite topic and subscription, publishes messages, subscribes to read them, validates results, and cleans up. It is designed to be safe and low-cost, but remember: Pub/Sub Lite charges for provisioned capacity while resources exist, even if message volume is low.

Objective

  • Create a Pub/Sub Lite topic and Lite subscription
  • Publish a small set of test messages
  • Read messages with a subscriber
  • Understand how regional resources, partitioning, and project number affect Lite clients
  • Clean up to stop ongoing charges

Lab Overview

You will:

  1. Select a region and create a Pub/Sub Lite topic with 1 partition and minimal throughput capacity.
  2. Create a Lite subscription.
  3. Use Python client libraries in Cloud Shell to publish messages.
  4. Use Python to subscribe and print messages.
  5. Validate delivery and basic ordering per partition.
  6. Troubleshoot common issues.
  7. Delete resources.

Note on exact UI fields and CLI flags: Google Cloud may update console and gcloud surfaces. The workflow is stable, but verify the latest flags/fields in the official Pub/Sub Lite docs if you see differences: https://cloud.google.com/pubsub/lite/docs


Step 1: Set project, region, and enable APIs

1) Open Cloud Shell in the Google Cloud console.

2) Set environment variables:

export PROJECT_ID="$(gcloud config get-value project)"
export REGION="us-central1"   # Choose a region where Pub/Sub Lite is available
export TOPIC_ID="lite-demo-topic"
export SUB_ID="lite-demo-sub"

3) Confirm the project:

gcloud config list --format="value(core.project)"
echo "Project: $PROJECT_ID"

Expected outcome: Both commands print your active project ID.

4) Enable the Pub/Sub Lite API (if not already enabled):

gcloud services enable pubsublite.googleapis.com

Expected outcome: Command completes without error.


Step 2: Create a Pub/Sub Lite topic

You can create the topic in the console (most beginner-friendly) or with gcloud.

Option A (recommended for beginners): Google Cloud Console

  1. Go to Pub/Sub Lite in the console: https://console.cloud.google.com/cloudpubsub/lite
  2. Create a Lite topic:
     – Location/region: REGION (for example, us-central1)
     – Partitions: 1
     – Retention: a short retention (for example, 1 day) for the lab
     – Publish throughput capacity: the minimum allowed
     – Subscribe throughput capacity: the minimum allowed
  3. Name it lite-demo-topic (or use your TOPIC_ID).

Expected outcome: Topic is created and visible in the Pub/Sub Lite topics list.

Option B: gcloud CLI (verify flags with --help)

Run:

gcloud pubsub lite-topics create "$TOPIC_ID" \
  --location="$REGION" \
  --partitions=1 \
  --per-partition-publish-mib=4 \
  --per-partition-subscribe-mib=8 \
  --per-partition-bytes=30GiB \
  --message-retention-period=1d

If the command fails due to flags, run:

gcloud pubsub lite-topics create --help

…and adjust flags to match the current CLI.

Expected outcome: The topic is created.


Step 3: Create a Pub/Sub Lite subscription

Create a Lite subscription to the topic.

Console method

  1. In Pub/Sub Lite, open Subscriptions.
  2. Create a subscription:
     – Name: lite-demo-sub
     – Topic: select lite-demo-topic
     – Location: same region as the topic (REGION)

Expected outcome: Subscription is created and attached to the topic.

CLI method (verify with --help)

gcloud pubsub lite-subscriptions create "$SUB_ID" \
  --location="$REGION" \
  --topic="$TOPIC_ID"

If needed:

gcloud pubsub lite-subscriptions create --help

Step 4: Get your project number (required for many Lite client paths)

Pub/Sub Lite resource paths commonly use the project number (not the project ID). Retrieve it:

export PROJECT_NUMBER="$(gcloud projects describe "$PROJECT_ID" --format='value(projectNumber)')"
echo "Project number: $PROJECT_NUMBER"

Expected outcome: Prints a numeric project number.


Step 5: Install Python dependencies in Cloud Shell

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install google-cloud-pubsublite google-cloud-pubsub

Expected outcome: Packages install successfully.

If you encounter compilation/network issues, retry. Cloud Shell images change; verify the latest official samples if dependency names change: https://cloud.google.com/pubsub/lite/docs/samples


Step 6: Publish test messages to Pub/Sub Lite

Create publish_lite.py:

# publish_lite.py
import time
from google.cloud.pubsublite.cloudpubsub import PublisherClient
from google.cloud.pubsublite.types import CloudRegion, TopicPath

PROJECT_NUMBER = int(input("Project number: ").strip())
REGION = input("Region (e.g., us-central1): ").strip()
TOPIC_ID = input("Topic ID: ").strip()

topic_path = TopicPath(PROJECT_NUMBER, CloudRegion(REGION), TOPIC_ID)

print(f"Publishing to {topic_path} ...")

# PublisherClient is used as a context manager; leaving the block
# flushes outstanding messages and shuts the client down.
with PublisherClient() as publisher:
    futures = []
    for i in range(10):
        data = f"hello-lite-{i}".encode("utf-8")
        # publish() also accepts an ordering_key for partition routing;
        # verify the latest sample code in the official docs.
        futures.append(publisher.publish(topic_path, data))
        time.sleep(0.05)

    # Wait for publish confirmations; each future resolves to a message ID.
    for f in futures:
        f.result(timeout=30)

print("Done.")

Run it:

python publish_lite.py

When prompted, provide:

  • Project number: your PROJECT_NUMBER
  • Region: your REGION
  • Topic ID: lite-demo-topic

Expected outcome: Script prints “Done.” with no errors.


Step 7: Subscribe and read messages

Create subscribe_lite.py:

# subscribe_lite.py
from concurrent.futures import TimeoutError
from google.cloud.pubsublite.cloudpubsub import SubscriberClient
from google.cloud.pubsublite.types import (
    CloudRegion,
    FlowControlSettings,
    SubscriptionPath,
)

PROJECT_NUMBER = int(input("Project number: ").strip())
REGION = input("Region (e.g., us-central1): ").strip()
SUB_ID = input("Subscription ID: ").strip()

sub_path = SubscriptionPath(PROJECT_NUMBER, CloudRegion(REGION), SUB_ID)

# Per-partition flow control caps how much data the client buffers.
flow_control = FlowControlSettings(
    messages_outstanding=1000,
    bytes_outstanding=10 * 1024 * 1024,  # 10 MiB
)

print(f"Subscribing from {sub_path} ... (will run ~20 seconds)")

received = []

def callback(message):
    # message is a Pub/Sub-style message with ack()
    text = message.data.decode("utf-8")
    print(f"Got: {text}")
    received.append(text)
    message.ack()

# SubscriberClient is used as a context manager; leaving the block
# shuts the client down.
with SubscriberClient() as subscriber:
    future = subscriber.subscribe(
        sub_path,
        callback=callback,
        per_partition_flow_control_settings=flow_control,
    )
    try:
        future.result(timeout=20)
    except TimeoutError:
        future.cancel()

print(f"Received {len(received)} messages.")

Run it:

python subscribe_lite.py

Expected outcome: You see Got: hello-lite-0 through Got: hello-lite-9 (order may depend on partitioning and publish timing; with 1 partition it is typically sequential).


Validation

Use these checks to validate the setup.

1) Confirm topic and subscription exist (CLI):

gcloud pubsub lite-topics list --location="$REGION"
gcloud pubsub lite-subscriptions list --location="$REGION"

2) Confirm publish/subscribe worked:
  • Publisher had no exceptions.
  • Subscriber printed the messages and “Received 10 messages.”

3) Monitoring validation (optional):
  • Open Cloud Monitoring and look for Pub/Sub Lite metrics for throughput/backlog (metric names can vary; search for “pubsublite” in Metrics Explorer).


Troubleshooting

Common issues and fixes:

1) Permission denied / 403
  • Cause: Your user/service account lacks Pub/Sub Lite permissions.
  • Fix:
  • Ensure you have a role like roles/pubsublite.admin for resource creation.
  • For publishing/subscribing, assign roles/pubsublite.publisher and roles/pubsublite.subscriber as appropriate.
  • Verify the active identity in Cloud Shell with: gcloud auth list

2) Project number vs project ID confusion
  • Symptom: “Not found” errors when using client libraries.
  • Fix: Ensure your client paths use the project number, not the project ID. You can look it up with: gcloud projects describe PROJECT_ID --format="value(projectNumber)"

3) Region mismatch
  • Symptom: “Not found” errors or subscription attach failures.
  • Fix: Topic and subscription must be in the same region/location. Ensure REGION matches the creation region.

4) Throttling / resource exhausted
  • Cause: Provisioned throughput too low for bursty publishing/consuming.
  • Fix:
  • Reduce the publish rate for the lab.
  • Increase publish/subscribe throughput capacity in topic settings (this increases cost).

5) Subscriber receives 0 messages
  • Causes:
  • Published to a different topic or region.
  • Subscription created after publishing and configured to start at “latest” (behavior can depend on subscription settings; verify).
  • Fix:
  • Republish messages after the subscription is created.
  • Use seek/replay if supported by your subscription settings (verify seek modes in the docs).


Cleanup

Important: Delete the topic and subscription to stop provisioned-capacity charges.

1) Delete subscription:

gcloud pubsub lite-subscriptions delete "$SUB_ID" --location="$REGION"

2) Delete topic:

gcloud pubsub lite-topics delete "$TOPIC_ID" --location="$REGION"

3) Confirm deletion:

gcloud pubsub lite-subscriptions list --location="$REGION"
gcloud pubsub lite-topics list --location="$REGION"

11. Best Practices

Architecture best practices

  • Co-locate everything in the same region
  • Topic, subscribers, Dataflow jobs, and primary sinks should be regional-aligned.
  • Partition thoughtfully
  • Use partitions to scale consumer parallelism.
  • Choose a partitioning strategy (often via message keys) aligned to your processing needs (entity-based ordering, sharding).
  • Design for replay
  • Decide operationally when replay is allowed and how you prevent duplicate side effects.
  • Separate workloads
  • Don’t mix unrelated workloads with different throughput/retention needs into one topic; it forces over-provisioning.
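As a sketch of the partitioning idea above: key-based routing behaves like a stable hash of the message key modulo the partition count. The hash below is purely illustrative (the service’s real assignment function is internal), but it shows why all events for one entity land on one partition and therefore stay ordered:

```python
import hashlib

def assign_partition(key: bytes, num_partitions: int) -> int:
    """Conceptual stand-in for key-based partition routing.

    All messages sharing a key map to one partition, which is what
    gives per-entity ordering. The real service's hash differs.
    """
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Every event for the same user maps to the same partition...
p1 = assign_partition(b"user-42", 4)
p2 = assign_partition(b"user-42", 4)
assert p1 == p2

# ...while different keys spread across partitions.
partitions = {assign_partition(f"user-{i}".encode(), 4) for i in range(100)}
print(sorted(partitions))  # with 100 keys, typically all 4 partitions appear
```

The practical consequence: one very hot key concentrates load on a single partition, so choose keys with enough cardinality to spread traffic.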

IAM/security best practices

  • Use dedicated service accounts per workload (publisher vs subscriber).
  • Apply least privilege:
  • Publishers should not be able to delete topics.
  • Subscribers should not have publish permissions.
  • Restrict who can change throughput/retention, since those changes affect cost and risk.
  • Use org policies and centralized review for production IAM changes.

Cost best practices

  • Treat provisioned throughput as a baseline “always-on” cost while the topic exists.
  • Start with conservative capacity and scale up based on measured utilization.
  • Keep lab/test topics in a separate project with budgets and automated cleanup.
  • Keep retention minimal for non-production, unless replay is explicitly required.
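To make the “always-on baseline” concrete, here is a back-of-envelope estimator. The unit prices are placeholders invented for illustration, not real Pub/Sub Lite SKUs; plug in current rates from the official pricing page:

```python
# Back-of-envelope baseline cost for provisioned capacity. Unit prices
# below are PLACEHOLDERS, not real Pub/Sub Lite SKUs; always pull current
# rates from https://cloud.google.com/pubsub/lite/pricing.
HOURS_PER_MONTH = 730

def monthly_baseline(publish_mibps, subscribe_mibps, storage_gib,
                     price_pub_per_mibps_hr, price_sub_per_mibps_hr,
                     price_storage_per_gib_month):
    # Capacity is billed per hour while configured, regardless of traffic.
    capacity = (publish_mibps * price_pub_per_mibps_hr
                + subscribe_mibps * price_sub_per_mibps_hr) * HOURS_PER_MONTH
    return capacity + storage_gib * price_storage_per_gib_month

# Example: 4 MiB/s publish, 8 MiB/s subscribe, 500 GiB retained,
# with hypothetical unit prices.
cost = monthly_baseline(4, 8, 500, 0.05, 0.05, 0.04)
print(f"~${cost:,.2f}/month while the topic exists, even if idle")
```

The point of the exercise: the capacity term dominates and accrues even at zero message volume, which is why cleanup of idle topics matters.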

Performance best practices

  • Match publisher batching and concurrency to your provisioned publish capacity.
  • Run multiple subscriber workers for parallel partitions; ensure each partition can be read at required throughput.
  • Prefer stable, long-lived subscriber processes for streaming rather than frequent restarts.
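One way to keep publisher concurrency within provisioned capacity is a client-side token bucket in front of publish() calls. This is a generic throttling sketch, not an API of the Pub/Sub Lite client library:

```python
import time

class TokenBucket:
    """Cap publish bytes/sec at roughly the provisioned capacity.

    Generic client-side throttle; the Pub/Sub Lite client library does
    not expose this knob itself.
    """
    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def acquire(self, nbytes: int) -> None:
        # Refill tokens based on elapsed time, then spend or wait.
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return
            time.sleep((nbytes - self.tokens) / self.rate)

# Throttle to 1 MiB/s with a 256 KiB burst; call acquire(len(data))
# before each publish so bursts drain instead of hitting throttling.
bucket = TokenBucket(1 * 1024 * 1024, 256 * 1024)
bucket.acquire(64 * 1024)
```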

Reliability best practices

  • Implement idempotent processing downstream (e.g., deduplicate by message ID or event key in storage).
  • Plan for consumer restarts:
  • Ensure offsets/cursors are committed via acknowledgements.
  • Define backpressure behavior:
  • If downstream is slow, backlog grows; ensure retention and storage are sized accordingly.
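The idempotent-processing pattern above can be sketched as a keyed guard around side effects. This in-memory version is only illustrative; production deduplication needs a durable, shared store (Redis, Bigtable, or a MERGE in the sink), and the names here are hypothetical:

```python
# Minimal in-process dedup sketch. In production the "seen" set must be
# durable and shared across workers; in-memory state only illustrates
# the idempotency pattern.
processed_keys = set()

def handle_event(event_key: str, payload: dict) -> bool:
    """Process an event at most once per key; returns True if applied."""
    if event_key in processed_keys:
        return False  # duplicate from a retry or replay: skip side effects
    # ... apply side effects here (write row, send notification, etc.) ...
    processed_keys.add(event_key)
    return True

assert handle_event("order-123", {"amount": 10}) is True
assert handle_event("order-123", {"amount": 10}) is False  # replay is a no-op
```

With this guard in place, replays and redeliveries become safe operational tools rather than incident triggers.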

Operations best practices

  • Use dashboards:
  • Throughput utilization
  • Backlog/lag per subscription
  • Error rates
  • Alerting:
  • Sustained backlog growth
  • Throttling indicators
  • Subscriber error/restart loops
  • Document runbooks:
  • Increasing capacity safely
  • Performing a replay/seek
  • Handling poison-pill messages (bad payloads)

Governance/tagging/naming best practices

  • Naming:
  • env.region.domain.topic-purpose patterns help: prod.us-central1.analytics.clickstream
  • Labels/tags:
  • Owner team
  • Cost center
  • Data classification
  • Separate projects for:
  • dev/test vs prod
  • regulated vs non-regulated datasets

12. Security Considerations

Identity and access model

  • Pub/Sub Lite uses IAM for:
  • Admin actions (create/update/delete topics/subscriptions)
  • Data actions (publish/subscribe)
  • Prefer service accounts for workloads:
  • Cloud Run, GKE, Compute Engine, Dataflow workers

Encryption

  • Data in transit uses TLS (standard Google Cloud API behavior).
  • Data at rest is encrypted by default using Google-managed encryption.
  • If you require customer-managed encryption keys (CMEK), verify Pub/Sub Lite CMEK support and configuration steps in official docs before committing to Lite.

Network exposure

  • Pub/Sub Lite is accessed via Google-managed endpoints.
  • Reduce exposure by:
  • Running clients without public IPs and using Private Google Access where applicable
  • Restricting egress with firewall rules and organization policies
  • If you rely on VPC Service Controls for exfiltration controls, verify whether Pub/Sub Lite is supported in your perimeter design (support can differ by service and evolves over time).

Secrets handling

  • Avoid embedding service account keys.
  • Prefer:
  • Workload identity (GKE)
  • Attached service accounts (Compute Engine)
  • Cloud Run service identity
  • If you must use keys (not recommended), store them in Secret Manager and rotate regularly.

Audit/logging

  • Use Cloud Audit Logs to track:
  • Topic/subscription creation and deletion
  • IAM policy changes
  • Ensure logs are routed to a secured central logging project if required.

Compliance considerations

  • Regional resources can support data residency requirements, but compliance depends on:
  • Region selection
  • Organization policy
  • Data classification and handling
  • Key management requirements
    Always confirm with your compliance team and official Google Cloud compliance documentation.

Common security mistakes

  • Granting admin broadly to developer groups in production.
  • Sharing a single service account across many apps (hard to audit and rotate).
  • Not separating dev/test/prod projects, leading to accidental data exposure.
  • Ignoring replay implications (replayed messages might re-trigger side effects).

Secure deployment recommendations

  • Implement a least-privilege IAM baseline:
  • Separate roles for topic management vs publish vs subscribe
  • Use environment separation with budgets and policy guardrails
  • Add a data classification label to every topic
  • Use centralized monitoring and audit log export for security operations

13. Limitations and Gotchas

This section highlights common design mistakes with Pub/Sub Lite in real systems.

Regional constraints

  • Pub/Sub Lite topics/subscriptions are regional. Cross-region producers/consumers can cause:
  • Higher latency
  • Egress charges
  • Operational complexity

Capacity planning is mandatory

  • Under-provisioned capacity can throttle your workload.
  • Over-provisioned capacity costs money even when idle.

Partition management complexity

  • Partitions drive parallelism and ordering:
  • Too few partitions limits scalability.
  • Too many partitions can complicate consumer design and operations.
  • Changing partitioning later can be non-trivial; plan for growth.

Delivery semantics and duplicates

  • Messaging systems commonly require consumers to be idempotent. Verify Pub/Sub Lite delivery guarantees in official docs and always design downstream processing to handle duplicates or retries.

Ecosystem integration differences vs Pub/Sub Standard

  • Some “turnkey” integrations and triggers built for Pub/Sub Standard may not exist for Lite.
  • Expect to use:
  • Dataflow
  • Custom consumers (Cloud Run/GKE/VMs)

Pricing surprises

  • Provisioned throughput is billed while configured, not only when used.
  • Leaving lab topics running can create unexpected baseline cost.

Replay can amplify downstream costs

  • Seeking/replaying events is useful but can:
  • Multiply Dataflow compute
  • Increase BigQuery ingestion
  • Re-trigger side effects if consumers aren’t idempotent

Quotas

  • Quotas exist for:
  • Topics/subscriptions
  • Partitions
  • Throughput capacities
    Check quotas in the console and docs before production rollout.

Migration challenges

  • Migrating from Pub/Sub Standard or Kafka requires changes:
  • Partition model and message key strategy
  • Capacity planning
  • Different operational playbooks (cursor/seek, capacity adjustments)

14. Comparison with Alternatives

Pub/Sub Lite is best compared with Pub/Sub Standard and managed streaming platforms.

| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Pub/Sub Lite (Google Cloud) | Predictable, steady high-throughput regional streaming | Lower cost potential for sustained throughput; partitioned ordering; replay via cursors; explicit capacity control | Requires capacity planning; regional; fewer plug-and-play integrations than Pub/Sub Standard | You can model throughput and want cost control + partitioned streaming |
| Pub/Sub Standard (Google Cloud Pub/Sub) | Event-driven apps and pipelines with spiky or unpredictable traffic | Auto-scaling; global service; rich integrations and triggers; simple ops model | Can be more expensive for sustained high volume; less explicit control of capacity | Default choice for most event-driven architectures and serverless triggers |
| Dataflow + Pub/Sub Lite | Managed stream processing pipelines | Strong integration for analytics; handles transformation/windowing | Adds compute cost and operational surface | You need streaming ETL to BigQuery/Storage and want managed processing |
| Kafka (self-managed on GKE/Compute Engine) | Maximum control/customization, Kafka ecosystem tools | Full Kafka API; wide ecosystem; control over configs | Significant ops burden; patching/upgrades; capacity planning; reliability engineering | You need Kafka-specific features/ecosystem and accept operational cost |
| Confluent Cloud (managed Kafka) | Kafka API with less ops burden | Kafka ecosystem + managed offering | Cost; vendor account management; networking complexity | You require Kafka compatibility and managed operations |
| AWS Kinesis / MSK | AWS-native streaming | Tight AWS integrations | Multi-cloud complexity if on Google Cloud | Choose if your platform is primarily AWS |
| Azure Event Hubs | Azure-native event ingestion | Tight Azure integrations | Multi-cloud complexity if on Google Cloud | Choose if your platform is primarily Azure |

15. Real-World Example

Enterprise example: Retail telemetry and near-real-time analytics

  • Problem: A retailer needs to ingest steady clickstream and app events across regions, process them in near real time, and load curated datasets into BigQuery for dashboards and experimentation.
  • Proposed architecture:
  • Regional producers (apps/services) publish to Pub/Sub Lite topics in-region
  • Dataflow streaming pipelines read subscriptions, validate/enrich, and write:
    • curated events to BigQuery
    • raw archives to Cloud Storage
  • Central monitoring dashboards track backlog, throughput, and pipeline health
  • Why Pub/Sub Lite was chosen:
  • Throughput is steady and predictable (marketing peaks are known and planned)
  • Cost optimization compared to purely elastic messaging for sustained ingest
  • Partitioning supports parallel processing and per-user/session ordering
  • Expected outcomes:
  • Predictable baseline cost from provisioned capacity
  • Low-latency analytics updates
  • Replay capability for reprocessing last N hours after pipeline changes

Startup/small-team example: IoT monitoring for a regional fleet

  • Problem: A small team collects telemetry from a predictable number of devices in one primary region and needs a durable buffer before writing into storage and running alerts.
  • Proposed architecture:
  • Devices → API service (Cloud Run) → Pub/Sub Lite topic (single region)
  • Consumer service (Cloud Run or GKE) reads subscription and:
    • writes recent metrics to a time-series store (or BigQuery for analytics)
    • triggers alerts through a separate notification system
  • Why Pub/Sub Lite was chosen:
  • Device fleet size makes throughput predictable
  • Regional-only footprint simplifies compliance and cost
  • Replay helps recover from consumer bugs without data loss
  • Expected outcomes:
  • Simple, cost-controlled pipeline that scales as device count grows
  • Strong operational control with clear capacity planning

16. FAQ

1) Is Pub/Sub Lite the same as Pub/Sub?
No. Pub/Sub Lite is a separate service in the Cloud Pub/Sub family with partitioned topics and provisioned throughput. Pub/Sub Standard is generally auto-scaling and global.

2) Do I need to pre-provision throughput?
Yes. Pub/Sub Lite is designed around provisioned publish and subscribe throughput capacity. That provisioning is a key cost/performance control.

3) Is Pub/Sub Lite global or regional?
Pub/Sub Lite resources are regional/location-scoped. Plan for regional architectures and co-locate components.

4) Does Pub/Sub Lite support ordering?
Pub/Sub Lite provides ordering per partition. Global ordering across partitions is not a standard streaming pattern; design around partition-level ordering.

5) How does partitioning work?
A topic is split into N partitions. Messages are assigned to partitions (often via a key). Consumers can read partitions in parallel.

6) Can I replay messages?
Yes, within retention limits, using cursor/seek capabilities. Verify supported seek modes (by offset/time) in the official docs.

7) What happens if my consumer is down?
Messages remain available until retention limits are reached. Backlog increases and storage usage may increase.

8) Will Pub/Sub Lite trigger Cloud Functions or Eventarc directly?
Many event-driven triggers are built around Pub/Sub Standard. Verify current trigger/integration support for Lite; often you use Dataflow or custom consumers instead.

9) What’s the biggest cost surprise with Pub/Sub Lite?
Provisioned throughput is billed while provisioned, even if message volume is low. Always clean up test resources.

10) How do I choose the number of partitions?
Base it on required parallelism, throughput, and ordering requirements. A common approach is to start with a number that matches expected consumer parallelism and scale up carefully (verify how partition changes are handled in the service).
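A rough sizing heuristic for the approach described above, with an illustrative per-partition throughput cap (verify the real per-partition limits in the current quotas documentation before relying on the number):

```python
import math

# ILLUSTRATIVE per-partition publish cap; check the current Pub/Sub Lite
# quotas/limits page for real per-partition throughput figures.
PER_PARTITION_PUBLISH_MIBPS = 4

def partitions_needed(peak_publish_mibps: float, consumer_parallelism: int) -> int:
    # Enough partitions to carry peak publish throughput...
    for_throughput = math.ceil(peak_publish_mibps / PER_PARTITION_PUBLISH_MIBPS)
    # ...and at least one partition per parallel consumer worker, with
    # headroom for growth, since repartitioning later is non-trivial.
    return max(for_throughput, consumer_parallelism)

print(partitions_needed(10, 4))   # throughput needs ceil(10/4)=3; workers need 4 -> 4
print(partitions_needed(40, 4))   # throughput dominates: ceil(40/4)=10
```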

11) Do Lite topics support schemas?
Schema support and integration may differ from Pub/Sub Standard. Verify schema features for Pub/Sub Lite in the official docs.

12) Can I use Terraform for Pub/Sub Lite?
Often yes via Google provider resources, but resource coverage changes over time. Verify Terraform support for Pub/Sub Lite topics/subscriptions in the provider docs.

13) How do I monitor subscriber lag?
Use Cloud Monitoring metrics for subscriptions/backlog and set alerts when lag grows beyond your SLO thresholds. Metric names evolve—search for “pubsublite” in Metrics Explorer.

14) Is Pub/Sub Lite suitable for exactly-once processing?
End-to-end exactly-once is typically an application and sink concern. Verify Pub/Sub Lite delivery semantics and design idempotent processing.

15) How do I migrate from Kafka?
Plan for:
  • topic partition mapping
  • consumer group behavior differences
  • offsets/cursors vs Kafka offsets
  • capacity provisioning model
Often migration requires code and operational changes.

16) How do I avoid duplicates when replaying?
Use idempotency keys, deduplication in sinks, or transactional writes where possible. Treat replay as a deliberate operational action.

17) Is Pub/Sub Lite good for dev/test?
Yes for functional validation, but it can be cost-inefficient if you leave provisioned capacity running. Use aggressive cleanup automation.


17. Top Online Resources to Learn Pub/Sub Lite

| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | Pub/Sub Lite docs – https://cloud.google.com/pubsub/lite/docs | Authoritative overview, concepts, API behavior, and configuration guidance |
| Official pricing | Pub/Sub Lite pricing – https://cloud.google.com/pubsub/lite/pricing | Current SKUs, billing dimensions, and region-specific pricing |
| Pricing tool | Google Cloud Pricing Calculator – https://cloud.google.com/products/calculator | Build scenario-based cost estimates using the official calculator |
| Official samples | Pub/Sub Lite samples (Docs/Samples section) – https://cloud.google.com/pubsub/lite/docs/samples | Up-to-date code examples for publishing, subscribing, and admin operations |
| Client libraries | Google Cloud Pub/Sub Lite client libraries – start from docs: https://cloud.google.com/pubsub/lite/docs | Links to supported languages, auth patterns, and latest APIs |
| Architecture guidance | Google Cloud Architecture Center – https://cloud.google.com/architecture | Patterns for streaming analytics, data ingestion, and pipeline operations (not Lite-specific but highly relevant) |
| Dataflow integration | Apache Beam / Dataflow documentation – https://cloud.google.com/dataflow/docs | How to build streaming pipelines; verify Pub/Sub Lite IO connectors for your language/runner |
| Observability | Cloud Monitoring – https://cloud.google.com/monitoring/docs | Dashboards, Metrics Explorer, alerting for throughput/backlog/SLOs |
| Governance/audit | Cloud Audit Logs – https://cloud.google.com/logging/docs/audit | Track admin activity for security and compliance |
| Community learning | Google Cloud Community – https://www.googlecloudcommunity.com/ | Practical discussions and troubleshooting (validate against official docs) |

18. Training and Certification Providers

| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | DevOps engineers, SREs, cloud engineers | Google Cloud operations, CI/CD, infrastructure and monitoring practices that can support streaming systems | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Beginners to intermediate practitioners | Software configuration management and DevOps foundations relevant to operating cloud services | Check website | https://www.scmgalaxy.com/ |
| CloudOpsNow.in | Cloud ops and platform teams | Cloud operations practices; may include observability and cost control themes applicable to Pub/Sub Lite pipelines | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, operations teams | Reliability engineering, SLOs, incident response, monitoring—critical for streaming pipelines | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops teams adopting automation | AIOps concepts, monitoring, automation approaches for operating production systems | Check website | https://www.aiopsschool.com/ |

19. Top Trainers

| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training content (verify current offerings) | Engineers seeking guided learning paths | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps tooling and practices (verify current offerings) | Beginners to intermediate DevOps practitioners | https://www.devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps support/training platform (verify offerings) | Teams needing practical help implementing/operating pipelines | https://www.devopsfreelancer.com/ |
| devopssupport.in | DevOps support and guidance (verify offerings) | Ops teams needing troubleshooting and platform support help | https://www.devopssupport.in/ |

20. Top Consulting Companies

| Company Name | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify exact catalog) | Architecture reviews, platform setup, ops processes | Designing regional ingestion pipelines; setting up monitoring and IAM; cost optimization reviews | https://cotocus.com/ |
| DevOpsSchool.com | DevOps and cloud consulting/training (verify exact catalog) | Enablement, implementation support, skills development | Building CI/CD for streaming apps; operational runbooks; capacity planning processes | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting services (verify exact catalog) | DevOps transformation and cloud operations | Production readiness reviews; observability stack; incident response process | https://www.devopsconsulting.in/ |

21. Career and Learning Roadmap

What to learn before Pub/Sub Lite

  • Google Cloud fundamentals:
  • Projects, IAM, service accounts, billing
  • Networking basics:
  • Regions/zones, latency, egress
  • Messaging and streaming fundamentals:
  • pub/sub, partitions, offsets/cursors, retention
  • Basic observability:
  • Metrics, logs, alerting

What to learn after Pub/Sub Lite

  • Dataflow / Apache Beam for streaming transformations and windowing
  • BigQuery modeling for streaming analytics (partitioning, clustering, ingestion patterns)
  • Reliability patterns:
  • Idempotency, deduplication, replay runbooks
  • Infrastructure as Code:
  • Terraform modules for topics/subscriptions/IAM
  • Security hardening:
  • Org policies, audit log sinks, (verify) VPC Service Controls applicability

Job roles that use it

  • Data Engineer (streaming)
  • Cloud Engineer / Platform Engineer
  • DevOps Engineer / SRE operating data pipelines
  • Backend Engineer building event-driven systems
  • Security Engineer (telemetry pipelines and governance)

Certification path (Google Cloud)

There is typically no Pub/Sub Lite-specific certification. Relevant Google Cloud certifications to consider:
  • Professional Data Engineer
  • Professional Cloud Architect
  • Professional DevOps Engineer
Verify current certification names and exam scopes: https://cloud.google.com/learn/certification

Project ideas for practice

  • Build Pub/Sub Lite → Dataflow → BigQuery pipeline with windowed aggregations
  • Implement a multi-partition consumer on GKE with autoscaling based on lag metrics
  • Create a replay tool that seeks and reprocesses the last 2 hours into a separate BigQuery table
  • Implement end-to-end idempotency with dedup keys stored in a fast datastore
  • Build a capacity planning worksheet and load test harness for publish/subscribe throughput

22. Glossary

  • Pub/Sub Lite: Google Cloud’s provisioned-throughput, partitioned messaging service.
  • Pub/Sub (Standard): Google Cloud’s auto-scaling, global messaging service (commonly referred to as Pub/Sub).
  • Topic: Named message stream that producers publish to.
  • Subscription: A consumer’s view of a topic; tracks delivery progress.
  • Partition: An ordered append-only log segment of a topic used for parallelism and ordering.
  • Throughput capacity: Provisioned publish/subscribe bandwidth configured for a Lite topic.
  • Retention: How long messages remain stored and available for delivery/replay.
  • Cursor/Offset: Position in a partition indicating the subscriber’s progress.
  • Seek/Replay: Resetting a subscription cursor to reprocess messages.
  • Idempotency: Ability to process the same message multiple times without changing the final outcome incorrectly.
  • Backlog/Lag: Amount of unprocessed data/messages a subscription has accumulated.
  • ADC (Application Default Credentials): Google Cloud auth mechanism used by client libraries to obtain credentials.

23. Summary

Pub/Sub Lite is Google Cloud’s regional, partitioned, provisioned-throughput messaging service built for Data analytics and pipelines where throughput is predictable and cost control matters. It fits best as a durable ingestion buffer and streaming backbone for systems that can plan partitions and capacity, and that benefit from cursor-based replay.

Key points to remember:
  • Cost model: driven by provisioned publish/subscribe capacity, plus storage and egress; clean up resources to avoid ongoing baseline charges.
  • Security model: standard Google Cloud IAM with audit logging; apply least privilege and environment separation.
  • Where it fits: regional streaming ingest feeding Dataflow/BigQuery and other pipeline components, especially under sustained load.
  • When to use it: steady, well-modeled throughput plus a need for partitioned scale and replay.
  • Next learning step: build a streaming pipeline with Dataflow consuming Pub/Sub Lite and writing to BigQuery, then add monitoring alerts for backlog and throttling.