Google Cloud Vertex AI Studio Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for AI and ML

Category

AI and ML

1. Introduction

Vertex AI Studio is Google Cloud’s console-based workspace for prototyping, testing, and iterating on generative AI workflows—especially prompt design—using models hosted on Vertex AI (including Google’s Gemini models and other Model Garden options).

In simple terms: Vertex AI Studio is where you try ideas with foundation models quickly, without first building a full application. You can craft prompts, tune model behavior with parameters, test variations, and then export working code to use in production services.

Technically, Vertex AI Studio is a UI layer inside Vertex AI that helps you interact with Vertex AI’s generative model APIs (for example, the Gemini generateContent API). It supports rapid experimentation (prompt iteration, structured outputs, safety settings) and provides pathways to operationalize results through the Vertex AI API, IAM, audit logging, and standard Google Cloud governance controls.

What problem it solves: teams need a safe, repeatable way to go from “idea” → “validated prompt” → “API-backed implementation” while controlling access, cost, and security in an enterprise cloud environment.

Naming note (important): Google previously used the name Generative AI Studio for similar capabilities. In current Google Cloud terminology, these console experiences are commonly presented as Vertex AI Studio within Vertex AI. If you see “Generative AI Studio” in older posts or labs, treat it as legacy naming and verify against current Vertex AI Studio docs in Google Cloud.


2. What is Vertex AI Studio?

Official purpose (practical definition): Vertex AI Studio is the Google Cloud console experience in Vertex AI for interacting with and prototyping generative AI solutions—primarily by designing and testing prompts and model parameters against foundation models available on Vertex AI.

Core capabilities (what you can do)

  • Experiment with text/chat prompts against Gemini and other supported models.
  • Control inference behavior using parameters (temperature, max output tokens, etc., depending on model).
  • Produce and test structured outputs (for example, JSON) and refine prompts for reliability.
  • Optionally test multimodal prompts (model/region dependent) when supported.
  • Generate starter code (language/SDK dependent) to call Vertex AI APIs using the same model and settings.

Major components (conceptual)

  • Vertex AI Studio UI (console): prompt editors and testing panels.
  • Vertex AI Generative AI APIs: the actual API endpoints that execute model inference.
  • Model selection (Model Garden): choose Google models (Gemini) and other available publisher/open models, depending on what your project/region supports.
  • Project + IAM + Audit logs: Google Cloud governance around who can access models, who can run prompts, and what was called.

Service type

  • Managed service UI / console experience backed by Vertex AI APIs.
  • Not a standalone compute runtime: Vertex AI Studio does not “host” your app. Your production usage typically calls Vertex AI APIs from Cloud Run/GKE/Compute Engine/on-prem via HTTPS.

Scope (regional/global/project-scoped)

  • Project-scoped: Access is governed by the Google Cloud project you select in the console.
  • Regional behavior: Model availability and the API endpoints you call are location-based (you choose a Vertex AI region like us-central1, europe-west4, etc.). Some models/features are not available in all regions.
  • Identity-scoped via IAM: Permissions are granted to users/groups/service accounts through IAM roles.

How it fits into the Google Cloud ecosystem

Vertex AI Studio sits inside Vertex AI and pairs naturally with:

  • Cloud Run / GKE for serving apps that call Gemini via Vertex AI.
  • BigQuery / Cloud Storage for storing and processing data used to craft prompts and evaluate outputs.
  • Cloud Logging / Cloud Monitoring for operations.
  • IAM / VPC Service Controls / Cloud KMS for security and governance.


3. Why use Vertex AI Studio?

Business reasons

  • Faster time-to-value: validate a generative AI use case before investing engineering time.
  • Lower prototyping cost: test small prompt variants quickly (you still pay for model usage).
  • Better alignment: product, security, and engineering can review the same prompt behavior in a controlled environment.

Technical reasons

  • Repeatable prompt iteration: quickly test different instructions, examples (few-shot), and constraints.
  • Model + parameter exploration: compare responses with different decoding parameters.
  • Easier path to production: export code snippets aligned to Vertex AI APIs (then integrate with CI/CD).

Operational reasons

  • Centralized governance: usage is tied to a Google Cloud project; access is managed via IAM.
  • Auditability: API usage can be audited (subject to your logging configuration and what the service emits).
  • Environment separation: you can use separate projects for dev/test/prod.

Security/compliance reasons

  • IAM-based access control (users/groups/service accounts).
  • Data governance posture aligned to Google Cloud controls (verify the latest model data usage terms in official docs).
  • Enterprise guardrails: VPC Service Controls, organization policies, and controlled egress patterns for production callers (Studio itself is a console experience).

Scalability/performance reasons

  • Studio is for prototyping, but what you build calls Vertex AI APIs that scale as managed services.
  • You can scale production callers independently (Cloud Run autoscaling, GKE HPA).

When teams should choose Vertex AI Studio

Choose it when you need:

  • Rapid prompt prototyping for a real app (support automation, summarization, extraction, classification).
  • A governance-friendly environment to test generative AI inside Google Cloud.
  • A bridge from experimentation to API-based implementation on Vertex AI.

When teams should not choose it

Avoid relying on Vertex AI Studio when you need:

  • A full prompt/version lifecycle management platform with SDLC features comparable to a dedicated promptops tool (some features exist, but confirm current capabilities).
  • Offline or air-gapped experimentation (the console requires access to Google Cloud).
  • Full custom model training pipelines (use Vertex AI Training, Workbench, or Pipelines instead).
  • A chatbot product with channels/integrations out of the box (consider Dialogflow CX or Vertex AI Agent Builder; verify the current product lineup).


4. Where is Vertex AI Studio used?

Industries

  • Customer support and contact centers
  • E-commerce and retail
  • Financial services (with strong governance requirements)
  • Healthcare and life sciences (with strict data handling constraints)
  • Media and publishing
  • Software/SaaS
  • Manufacturing and logistics

Team types

  • Cloud/platform teams building internal AI platforms
  • Application engineering teams building AI features
  • Data/ML teams validating model behavior
  • Security and compliance reviewers validating controls and data handling
  • SRE/operations teams monitoring production inference services

Workloads

  • Summarization (tickets, calls, documents)
  • Extraction (entities, fields, structured data)
  • Classification and routing
  • Draft generation (emails, knowledge base articles)
  • Code assistance for internal tools (verify policy compliance)
  • Multimodal analysis (where enabled): image-to-text, document understanding (model dependent)

Architectures and deployment contexts

  • Prototyping in Studio → production calls from Cloud Run or GKE → logs/metrics in Cloud Operations
  • Enterprise governance: Organization policies + VPC Service Controls + centralized logging
  • Multi-project setup: dev/test/prod separation with different IAM and quotas

Production vs dev/test usage

  • Dev/test: Studio is ideal for prompt iteration and small evaluations.
  • Production: You typically do not serve requests “from Studio.” You call Vertex AI models through APIs from your runtime (Cloud Run/GKE/etc.), with proper auth, quotas, retries, and monitoring.

5. Top Use Cases and Scenarios

Below are realistic scenarios where Vertex AI Studio is commonly used to prototype and validate the prompt + model approach before productionization.

1) Customer support ticket summarization

  • Problem: Support agents waste time reading long ticket threads.
  • Why Vertex AI Studio fits: Rapidly test summarization prompts and formatting requirements.
  • Example: Summarize a 30-message ticket into “Issue / Steps Tried / Current Status / Next Action” and export code to integrate into a CRM workflow.

2) Email intent classification and routing

  • Problem: Incoming emails need fast triage to correct teams.
  • Why it fits: Prototype few-shot prompts that output a strict label set.
  • Example: Classify emails into BILLING, TECH_SUPPORT, CANCELLATION, SALES, returning JSON used by a routing service.

3) Structured data extraction from text

  • Problem: Extract order IDs, dates, amounts, customer names from unstructured messages.
  • Why it fits: Iterate until extraction is reliable and returns valid JSON.
  • Example: Extract {"order_id": "...", "refund_amount": ..., "currency": "...", "reason": "..."} from chat transcripts.

4) Knowledge base article drafting

  • Problem: Support teams need consistent documentation quickly.
  • Why it fits: Test tone, style guides, and templates with constrained outputs.
  • Example: Draft a troubleshooting article with sections and bullet points, then human review.

5) Meeting/call note transformation

  • Problem: Raw meeting notes aren’t actionable.
  • Why it fits: Validate prompts that produce action items and owners.
  • Example: Convert a transcript into “Decisions / Action Items / Risks / Follow-ups.”

6) Policy and compliance Q&A (with guardrails)

  • Problem: Employees need answers from internal policy docs.
  • Why it fits: Prototype response style, refusal behaviors, and citation patterns (the actual grounding solution may involve additional services).
  • Example: Draft prompt patterns for “answer with references and say ‘I don’t know’ if missing.”

7) Product review sentiment analysis

  • Problem: Large volumes of reviews need quick insights.
  • Why it fits: Prototype consistent sentiment labels and topic extraction.
  • Example: Output JSON with sentiment, topics, and urgency fields.

8) SQL generation (controlled)

  • Problem: Analysts want natural language to SQL, but must avoid unsafe queries.
  • Why it fits: Prototype prompt constraints and safe query patterns before building a tool.
  • Example: Only generate SELECT queries and include a LIMIT.
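
The constraint above can also be enforced outside the model. A minimal sketch (hypothetical helper; regex checks are not a real SQL parser, so use a proper parser or allowlisted views in production) that accepts only single SELECT statements containing a LIMIT:

```python
import re

ALLOWED = re.compile(r"^\s*SELECT\b", re.IGNORECASE)
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|GRANT|CREATE|MERGE)\b",
    re.IGNORECASE,
)

def is_safe_query(sql: str) -> bool:
    """Accept only a single SELECT statement that includes a LIMIT clause."""
    if ";" in sql.rstrip().rstrip(";"):   # reject multi-statement input
        return False
    if not ALLOWED.match(sql):            # must start with SELECT
        return False
    if FORBIDDEN.search(sql):             # no mutating keywords anywhere
        return False
    return bool(re.search(r"\bLIMIT\s+\d+\b", sql, re.IGNORECASE))

print(is_safe_query("SELECT id, total FROM orders WHERE region = 'EU' LIMIT 100"))  # True
print(is_safe_query("DROP TABLE orders"))  # False
```

Running model output through a check like this before execution keeps the prompt constraint from being your only line of defense.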

9) Code explanation for internal onboarding

  • Problem: New engineers struggle to understand legacy services.
  • Why it fits: Test prompts that explain code with architecture context and warnings.
  • Example: Summarize a service’s endpoints and dependencies from a README (ensure you follow your organization’s code/data policies).

10) Content moderation assistance (human-in-the-loop)

  • Problem: Moderation teams need prioritization signals.
  • Why it fits: Prototype labeling and explanations with low-temperature settings for more consistent output.
  • Example: Classify text into safety categories and provide “why,” feeding a review queue.

11) Marketing copy variants with brand tone

  • Problem: Need multiple ad copy options under constraints.
  • Why it fits: Fast iteration to meet length and tone constraints.
  • Example: Generate 10 variants under 90 characters with no banned terms.

12) Internal IT helpdesk automation draft responses

  • Problem: Helpdesk agents need suggested replies.
  • Why it fits: Prototype response templates and escalation triggers.
  • Example: Generate a draft reply and a “next diagnostic question” field.

6. Core Features

Feature availability can change by region and model. Verify the latest Vertex AI Studio feature set in official docs.

1) Prompt design and iteration (text/chat)

  • What it does: Lets you write prompts (instructions + user inputs) and test responses interactively.
  • Why it matters: Prompt quality strongly impacts correctness, safety, and cost.
  • Practical benefit: Short feedback loop to improve output format, tone, and compliance.
  • Limitations/caveats: Outputs can still be non-deterministic; rely on validation and guardrails for production.

2) Model selection (Vertex AI Model Garden integration)

  • What it does: Allows choosing from available models (commonly Gemini models hosted by Google; other publishers may appear depending on your org settings).
  • Why it matters: Different models trade off cost, latency, context length, and quality.
  • Practical benefit: Test the least expensive model that meets requirements.
  • Limitations/caveats: Model availability varies by region and may require allowlisting or specific org policies.

3) Parameter controls (decoding/inference settings)

  • What it does: Adjusts generation behavior (for example, temperature, max output tokens, top-p/top-k where supported).
  • Why it matters: Helps balance creativity vs consistency.
  • Practical benefit: Make outputs more stable for extraction/classification tasks.
  • Limitations/caveats: Supported parameters differ per model/API version.

4) Safety settings and content controls (model dependent)

  • What it does: Configure how the model handles potentially unsafe content (exact controls depend on model and API).
  • Why it matters: Reduces risk of harmful or policy-violating output.
  • Practical benefit: Safer prototypes and clearer expectations for production behavior.
  • Limitations/caveats: Safety controls are not a substitute for your own application-layer validation and access control.

5) Structured output prompting (for JSON or schemas)

  • What it does: Supports patterns that encourage consistent JSON output (and in some APIs, structured output features may exist—verify current docs for Gemini on Vertex AI).
  • Why it matters: Production apps need machine-parseable outputs.
  • Practical benefit: Faster integration into downstream systems (queues, ticketing, workflows).
  • Limitations/caveats: Even with strong prompts, you must validate JSON and handle failures.
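
As a concrete example of that caveat, a small sketch (hypothetical helper) that strips the markdown fences models sometimes wrap around JSON and returns None on invalid output:

```python
import json

def parse_model_json(raw_text: str):
    """Best-effort parse of a model response that should be JSON.

    Models sometimes wrap JSON in markdown code fences; strip them
    before parsing. Returns the parsed object, or None if the output
    is not valid JSON (caller should retry, fall back, or escalate).
    """
    text = raw_text.strip()
    if text.startswith("```"):
        # drop an opening fence such as ``` or ```json, and the closing fence
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit("```", 1)[0]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(parse_model_json('{"intent": "BILLING"}'))  # {'intent': 'BILLING'}
```

Treat a None result as a first-class failure path: log it, retry with a stricter prompt, or route to a human.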

6) Code export / “get code” workflow

  • What it does: Provides code snippets to call the same model via Vertex AI APIs.
  • Why it matters: Converts a successful prototype into an implementable call pattern.
  • Practical benefit: Reduces integration mistakes (wrong endpoint, wrong auth, wrong region).
  • Limitations/caveats: Generated code is a starting point—production needs retries, timeouts, observability, and secret management.
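
To illustrate the retries/timeouts point, a generic sketch; the `fn` callable is a stand-in for your actual Vertex AI request, and production code should also set per-request timeouts and a total deadline:

```python
import random
import time

def call_with_retries(fn, max_attempts=3, base_delay=0.5,
                      retriable=(TimeoutError, ConnectionError)):
    """Retry a callable with exponential backoff and jitter.

    `fn` stands in for your model request function (hypothetical here).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retriable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # exponential backoff with jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** (attempt - 1)) * (0.5 + random.random() / 2))
```

Note that blind retries multiply token cost, so cap attempts and retry only on transient errors.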

7) Multimodal prompting (where supported)

  • What it does: Use text + images (and potentially other modalities) with multimodal models.
  • Why it matters: Unlocks document and image understanding workflows.
  • Practical benefit: Prototype visual QA, image classification, extraction from screenshots, etc.
  • Limitations/caveats: Availability depends on model, region, and policy constraints; costs may be higher.

8) Evaluation mindset support (manual comparisons)

  • What it does: Helps compare prompt versions and outputs during experimentation.
  • Why it matters: Without evaluation, “it seems good” is not reliable.
  • Practical benefit: Encourages repeatability and test cases (golden prompts).
  • Limitations/caveats: For systematic evaluation at scale, you may need additional Vertex AI evaluation tooling or custom pipelines (verify current Vertex AI evaluation offerings).

9) Project/IAM integration

  • What it does: Studio access and model calls are controlled by IAM in a Google Cloud project.
  • Why it matters: Enterprise access control, separation of duties, and audit readiness.
  • Practical benefit: Manage who can test prompts, who can deploy code, who can view logs.
  • Limitations/caveats: Misconfigured IAM can either block progress (too strict) or increase risk (too broad).

7. Architecture and How It Works

High-level architecture

  • Vertex AI Studio is a console UI that sends requests to Vertex AI model endpoints in a chosen region.
  • The model runs on Google-managed infrastructure; results are returned to the UI.
  • In production, your application calls the same Vertex AI API endpoints directly using a service account.

Request/data/control flow (conceptual)

  1. User selects a Google Cloud project and opens Vertex AI Studio.
  2. User selects a region and model (for example, a Gemini model hosted on Vertex AI).
  3. User submits prompt content and parameter settings.
  4. Vertex AI receives the request, authenticates via Google identity/IAM, enforces quotas/policies, and returns the model output.
  5. Logs/metrics are emitted depending on service capabilities and your project settings (audit logs are commonly available for API calls; verify for your exact usage).
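
The request construction in steps 2 to 4 can be sketched without making a network call. The project, region, and model ID below are placeholders, and the endpoint form should be verified against the current generateContent documentation:

```python
import json

# Hypothetical identifiers for illustration; substitute your own values
# and verify the current model ID and API version in the official docs.
PROJECT_ID = "my-project"
REGION = "us-central1"
MODEL_ID = "gemini-1.5-flash"

def build_generate_content_request(prompt, temperature=0.2, max_tokens=256):
    """Assemble the regional generateContent URL and JSON body (no network call)."""
    url = (
        f"https://{REGION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}"
        f"/locations/{REGION}/publishers/google/models/{MODEL_ID}:generateContent"
    )
    body = {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {"temperature": temperature, "maxOutputTokens": max_tokens},
    }
    return url, json.dumps(body)

url, body = build_generate_content_request("Say hello.")
print(url)
```

This is the same call shape Studio's exported code targets; only the authentication (OAuth token from a service account) differs between prototype and production.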

Integrations with related services

Common integrations when moving from Studio to production:

  • Cloud Run / GKE: host an API that calls Vertex AI.
  • Secret Manager: store API keys for downstream systems (Vertex AI itself uses IAM auth; your app may still need other secrets).
  • Cloud Logging / Cloud Monitoring: observe latency, errors, and request volume.
  • Cloud Storage / BigQuery: store prompt test cases, evaluation sets, and model outputs (mind data governance).
  • Cloud KMS: customer-managed encryption keys for supported resources (not all generative inference paths use CMEK; verify).
  • VPC Service Controls: reduce data exfiltration risk for supported services (verify current support boundaries for Vertex AI generative endpoints).

Dependency services

  • Vertex AI API (aiplatform.googleapis.com) enabled in the project.
  • Billing enabled.
  • IAM roles for users/service accounts.

Security/authentication model

  • Users (Studio): authenticate via Google identity (Cloud Console) + IAM.
  • Workloads (production): authenticate with service accounts and OAuth 2.0 access tokens to call Vertex AI endpoints.
  • Use least privilege roles, and separate dev/test/prod projects.

Networking model

  • Vertex AI API endpoints are accessed over HTTPS.
  • From Google Cloud runtimes, you typically use:
  • Private Google Access (for VMs) or standard egress with controlled NAT
  • Organization controls (VPC-SC where applicable)
  • For strict environments, consider restricting egress, using regional endpoints, and reviewing whether private connectivity options apply to your exact Vertex AI usage (verify in official docs for generative endpoints).

Monitoring/logging/governance considerations

  • Track:
  • Request count, error rates, latency (from your calling service)
  • Model usage and quotas
  • Cost by project/labels
  • Govern:
  • IAM (who can call models)
  • Org policy constraints (service usage restrictions)
  • Data classification and retention policies

Simple architecture diagram (prototype)

flowchart LR
  U[User in Cloud Console] --> S[Vertex AI Studio]
  S -->|Prompt + parameters| VAI["Vertex AI<br/>Generative Model API (Gemini)"]
  VAI -->|Generated content| S
  S --> U

Production-style architecture diagram (operationalized)

flowchart TB
  subgraph Users
    A[Internal app users] --> UI[Web UI]
  end

  subgraph Prod["Google Cloud Project (Prod)"]
    UI --> LB[HTTPS Load Balancer / API Gateway]
    LB --> CR["Cloud Run service<br/>(GenAI Orchestrator)"]
    CR -->|OAuth token via SA| VAI["Vertex AI<br/>Gemini model endpoint<br/>(region)"]
    CR --> LOG[Cloud Logging]
    CR --> MON[Cloud Monitoring]
    CR --> SM["Secret Manager<br/>(non-Vertex secrets)"]
    CR --> BQ["BigQuery<br/>(optional: eval + analytics)"]
    CR --> GCS["Cloud Storage<br/>(optional: test sets/artifacts)"]
  end

  subgraph Governance
    IAM[IAM + Least Privilege]
    ORG["Org Policies / VPC Service Controls<br/>(where applicable)"]
    KMS["Cloud KMS<br/>(CMEK where supported)"]
  end

  CR -. governed by .-> IAM
  CR -. governed by .-> ORG
  GCS -. encrypted by .-> KMS
  BQ -. encrypted by .-> KMS

8. Prerequisites

Google Cloud requirements

  • A Google Cloud account with a project you can administer (or at least enable APIs and grant IAM).
  • Billing enabled on the project.

Required APIs

  • Vertex AI API:
  • aiplatform.googleapis.com

Some generative AI features may also reference other APIs in documentation or samples. Prefer official Vertex AI generative AI docs for your chosen model and SDK, and enable only what you need.

IAM permissions (common minimums)

Exact roles depend on your org and whether you’re just prototyping or also deploying workloads.

Typical roles:

  • For Studio usage and calling models: roles/aiplatform.user (commonly used to access Vertex AI resources).
  • To enable services: roles/serviceusage.serviceUsageAdmin (or a project owner/admin equivalent).
  • For production calling from a service account: often roles/aiplatform.user on the service account (or a more specific role if available for generative inference; verify current IAM roles in official docs).

Principles:

  • Use least privilege.
  • Separate roles for prototyping (human users), deployment (CI/CD service account), and runtime inference (Cloud Run service account).

Tools

  • Cloud Console access for Vertex AI Studio.
  • gcloud CLI (recommended for repeatable setup):
  • Cloud Shell works fine for the lab.

Region availability

  • Vertex AI is regional. Model availability is region-dependent.
  • Pick a region where the model you want is available (commonly us-central1 is a safe starting point, but verify current availability).

Quotas/limits

  • Expect quotas like:
  • Requests per minute
  • Tokens per minute/day
  • Concurrent requests
  • Quotas vary by model, region, and project. Verify in:
  • Google Cloud console quotas for Vertex AI
  • The model’s documentation page

Prerequisite services (for production)

  • Cloud Run (or GKE) for hosting an app that calls Vertex AI
  • Cloud Logging/Monitoring (enabled by default for many services)
  • Secret Manager (if your app needs secrets for non-Vertex dependencies)

9. Pricing / Cost

Vertex AI Studio itself is a console experience; the primary costs come from the underlying services you use, especially Vertex AI generative model inference.

Official pricing sources (use these)

  • Vertex AI pricing: https://cloud.google.com/vertex-ai/pricing
  • Google Cloud Pricing Calculator: https://cloud.google.com/products/calculator

Pricing for generative AI models changes and is SKU-specific. Always confirm current SKUs, units, and regional pricing in the official pricing pages.

Pricing dimensions (what you pay for)

Common cost dimensions when using Vertex AI Studio with generative models:

  • Model inference usage: often priced by input tokens and output tokens (and sometimes by modality, e.g., images). Some models have separate SKUs for different context lengths or throughput tiers (verify).
  • Tuning / training (optional): if you use model tuning features (for supported models), there may be training and storage costs.
  • Data storage (optional): Cloud Storage for datasets, test sets, and outputs; BigQuery for analytics/evaluation datasets.
  • Networking: egress from your app (for example, if outputs are sent to external systems).
  • Operational runtime: if you deploy a production service (Cloud Run/GKE/Compute Engine), you pay for that compute and its networking/logging.

Free tier / credits

  • Google Cloud frequently offers free trials/credits for new accounts, but they are not specific to Vertex AI Studio. Verify current offers in your Cloud Billing account and Google Cloud’s free trial page.

Key cost drivers

  • Output length: longer responses = more output tokens.
  • Prompt size: large system prompts, long documents, or large chat histories increase input tokens.
  • Retry behavior: client retries on errors/timeouts can multiply cost if not handled carefully.
  • High cardinality usage: many small requests can cost more operationally than fewer batched requests (though batching has latency/UX tradeoffs).
  • Model choice: higher-quality models usually cost more than faster/lower-cost variants.

Hidden or indirect costs

  • Logging: storing large payloads in logs can increase Cloud Logging costs and risk leaking sensitive data.
  • Data retention: storing prompts/outputs for evaluation without lifecycle policies increases storage costs and risk.
  • Egress: sending generated outputs to other clouds or external SaaS may incur egress and compliance overhead.
  • Human review: high-risk outputs often need human-in-the-loop processes.

Network/data transfer implications

  • Calls to Vertex AI are HTTPS requests to regional endpoints.
  • If your workload runs in Google Cloud in the same region, network performance is typically best.
  • Cross-region calling may increase latency and complicate data residency controls.

How to optimize cost (practical)

  • Start with the lowest-cost model that meets quality needs.
  • Constrain outputs:
  • Set max output tokens
  • Ask for concise formats (bullet lists, short JSON)
  • Reduce prompt size:
  • Summarize context
  • Use references (IDs) rather than repeating long text
  • Use caching patterns in your app:
  • Cache results for repeated prompts
  • Store intermediate summaries rather than re-sending raw threads
  • Implement guardrails to reduce retries:
  • Validate inputs
  • Use timeouts and circuit breakers
  • Log only what you need
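
The caching pattern above can be sketched in a few lines. An in-process dict is used for illustration only; a shared cache such as Memorystore, with TTLs and staleness rules, fits production better:

```python
import hashlib

_cache = {}

def cached_generate(prompt, model_call):
    """Return a cached response for repeated identical prompts.

    `model_call` stands in for your Vertex AI request function (hypothetical).
    Only the first occurrence of a prompt pays for model inference.
    """
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = model_call(prompt)  # pay once per unique prompt
    return _cache[key]
```

Hash-keyed caching only helps when prompts repeat exactly, which is common for classification of canned inputs and rare for free-form chat.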

Example low-cost starter estimate (conceptual, no fabricated numbers)

A low-cost prototype might involve:

  • A few dozen prompt tests per day
  • Short prompts (a few hundred tokens)
  • Short outputs (under a few hundred tokens)
  • Using a cost-optimized model variant when available

To estimate:

  1. Identify the model SKU in the Vertex AI pricing page.
  2. Estimate daily input/output tokens.
  3. Multiply by the per-token rate.
  4. Add operational costs only if deploying a service (Cloud Run, logging, storage).
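
The arithmetic behind those estimation steps can be sketched as follows. The rates below are placeholders, not real prices; always take actual per-token rates from the Vertex AI pricing page for your model and region:

```python
# PLACEHOLDER rates for illustration only; NOT real Vertex AI prices.
INPUT_RATE_PER_1K_TOKENS = 0.0001   # placeholder
OUTPUT_RATE_PER_1K_TOKENS = 0.0004  # placeholder

def estimated_daily_cost(requests_per_day, avg_input_tokens, avg_output_tokens):
    """Daily tokens in each direction, times the per-1K-token rate."""
    input_cost = requests_per_day * avg_input_tokens / 1000 * INPUT_RATE_PER_1K_TOKENS
    output_cost = requests_per_day * avg_output_tokens / 1000 * OUTPUT_RATE_PER_1K_TOKENS
    return input_cost + output_cost
```

Plugging in your expected volumes makes it obvious which lever (prompt size, output length, or request count) dominates your spend.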

Example production cost considerations (what to model)

For a production service, model:

  • Peak requests per second and daily volume
  • Average prompt tokens and output tokens
  • Latency SLOs (may influence model selection)
  • Retry rate and fallback strategy (secondary model, cached responses)
  • Logging retention and PII redaction costs
  • Separate dev/test/prod projects to avoid runaway spend


10. Step-by-Step Hands-On Tutorial

This lab builds a small but real workflow: use Vertex AI Studio to design a prompt that classifies customer emails into a strict JSON schema, then call the same model via the Vertex AI API from Cloud Shell.

Objective

  • Prototype a classification prompt in Vertex AI Studio
  • Enforce a strict JSON output
  • Export the working prompt to an API call
  • Validate results and clean up safely

Lab Overview

You will:

  1. Create or select a Google Cloud project and enable Vertex AI.
  2. Use Vertex AI Studio to test a Gemini model prompt for JSON classification.
  3. Call the model using curl and OAuth from Cloud Shell.
  4. Validate output and apply basic troubleshooting.
  5. Clean up by deleting any created service accounts/keys (if any) and optionally deleting the project.

Cost control: Keep prompts small, limit output tokens, and avoid repeated runs.


Step 1: Create/select a project and set variables

  1. In the Google Cloud Console, select an existing project or create a new one: Console → IAM & Admin → Manage resources → Create Project.

  2. Open Cloud Shell and set variables:

export PROJECT_ID="YOUR_PROJECT_ID"
export REGION="us-central1"   # pick a region where your chosen model is available
gcloud config set project "$PROJECT_ID"

Expected outcome: gcloud is now pointed at your project.


Step 2: Enable Vertex AI API

In Cloud Shell:

gcloud services enable aiplatform.googleapis.com

Expected outcome: The Vertex AI API is enabled successfully.

Verification:

gcloud services list --enabled --filter="name:aiplatform.googleapis.com"

Step 3: Confirm you have permissions to use Vertex AI Studio

You need IAM permissions to access Vertex AI resources.

  • In the console: IAM & Admin → IAM
  • Confirm your user has a role such as:
  • Vertex AI User (roles/aiplatform.user) (common)
  • or broader admin permissions in a sandbox project

Expected outcome: You can open Vertex AI without permission errors.


Step 4: Open Vertex AI Studio and select a model

  1. Go to Vertex AI in the console: – https://console.cloud.google.com/vertex-ai

  2. Open Vertex AI Studio (location in the UI may vary as Google updates the console navigation).

  3. Choose:
  • Your region (for example, us-central1)
  • A Gemini model available in your project/region (for example, a “Flash” variant for lower cost/latency)

If you don’t see Gemini models, check:

  • Region availability
  • Whether your organization restricts model access
  • Whether additional terms/allowlisting are required (verify in official docs)

Expected outcome: You can access a prompt editor and run a test prompt.


Step 5: Build a strict JSON classification prompt in Vertex AI Studio

Use this prompt pattern (adapt as needed). The keys are:

  • A fixed label set
  • An explicit JSON schema
  • A “Return JSON only” instruction
  • A low temperature (more consistent)

System / instruction text (example):

  • Role: “You are a classification engine…”
  • Output constraints: JSON only

User content (example email):

  • Provide a sample customer email text

Example prompt (single text block if Studio doesn’t split system/user explicitly):

You are a classification engine for a customer support inbox.

Task:
Classify the email into exactly one of these intents:
- BILLING
- TECH_SUPPORT
- CANCELLATION
- SALES
- OTHER

Output:
Return ONLY valid JSON that matches this schema:
{
  "intent": "BILLING|TECH_SUPPORT|CANCELLATION|SALES|OTHER",
  "confidence": number, 
  "summary": string,
  "requires_human": boolean
}

Rules:
- confidence must be between 0 and 1.
- summary must be <= 30 words.
- requires_human must be true if the email is ambiguous or requests account changes/refunds.

Email:
"""
Hi team, I was charged twice for my subscription this month. Please refund the extra charge.
Order ID: A-19333
Thanks!
"""

In the Studio parameter settings:

  • Set temperature low (for classification/extraction).
  • Set max output tokens modest (since the output is short JSON).

Expected outcome: The model returns JSON similar to:

{
  "intent": "BILLING",
  "confidence": 0.9,
  "summary": "Customer reports a double charge and requests a refund; provides an order ID.",
  "requires_human": true
}

Verification checklist:

  • Output is valid JSON
  • intent is one of the allowed labels
  • summary length is within constraints
  • confidence is numeric and between 0 and 1
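
That checklist can be automated with a small validator (a sketch matching this lab's schema):

```python
ALLOWED_INTENTS = {"BILLING", "TECH_SUPPORT", "CANCELLATION", "SALES", "OTHER"}

def validate_classification(obj):
    """Return a list of schema violations; an empty list means the output is valid."""
    errors = []
    if obj.get("intent") not in ALLOWED_INTENTS:
        errors.append("intent is not one of the allowed labels")
    confidence = obj.get("confidence")
    if (not isinstance(confidence, (int, float)) or isinstance(confidence, bool)
            or not 0 <= confidence <= 1):
        errors.append("confidence must be a number between 0 and 1")
    summary = obj.get("summary")
    if not isinstance(summary, str) or len(summary.split()) > 30:
        errors.append("summary must be a string of at most 30 words")
    if not isinstance(obj.get("requires_human"), bool):
        errors.append("requires_human must be a boolean")
    return errors
```

Running every test email through a validator like this turns "it seems good" into a pass/fail signal you can track across prompt versions.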


Step 6: Export the configuration to an API call (conceptual mapping)

Vertex AI Studio often provides a “Get code” or similar option. Even if UI wording differs, the underlying call pattern for Gemini on Vertex AI typically looks like:

  • Endpoint form:
  • https://REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/publishers/google/models/MODEL_ID:generateContent

Because model IDs and API fields evolve, verify the exact endpoint and payload in the official Gemini on Vertex AI docs: https://cloud.google.com/vertex-ai/docs/generative-ai

For the lab, we’ll demonstrate a curl call pattern using OAuth.

Expected outcome: You understand how Studio maps to a Vertex AI API request.


Step 7: Call the model from Cloud Shell using curl (Vertex AI API)

  1. Get an access token:

ACCESS_TOKEN="$(gcloud auth print-access-token)"
echo "${ACCESS_TOKEN:0:20}..."

  2. Choose a model ID that is available in your region. Common examples include Gemini variants (names change). Verify the current model ID in your console/model list or docs.

Set it:

export MODEL_ID="gemini-1.5-flash"  # VERIFY in your project/region

  3. Make the request:
curl -s -X POST \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:generateContent" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "text": "You are a classification engine for a customer support inbox.\n\nTask:\nClassify the email into exactly one of these intents:\n- BILLING\n- TECH_SUPPORT\n- CANCELLATION\n- SALES\n- OTHER\n\nOutput:\nReturn ONLY valid JSON that matches this schema:\n{\n  \"intent\": \"BILLING|TECH_SUPPORT|CANCELLATION|SALES|OTHER\",\n  \"confidence\": number,\n  \"summary\": string,\n  \"requires_human\": boolean\n}\n\nRules:\n- confidence must be between 0 and 1.\n- summary must be <= 30 words.\n- requires_human must be true if the email is ambiguous or requests account changes/refunds.\n\nEmail:\n\"\"\"\nHi team, I was charged twice for my subscription this month. Please refund the extra charge.\nOrder ID: A-19333\nThanks!\n\"\"\""
          }
        ]
      }
    ],
    "generationConfig": {
      "temperature": 0.2,
      "maxOutputTokens": 256
    }
  }' | sed 's/\\n/\n/g'

Expected outcome: You receive a JSON response envelope that contains the model’s generated text (exact response format can differ by API version). Extract the generated JSON content and validate it.

Verification tips:
  • Confirm the request returns HTTP 200.
  • Confirm the response includes a candidate with text.
  • Confirm the model output itself is parseable JSON.
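To automate that check, you can extract the generated text from the response envelope with jq. This sketch assumes the Step 7 response was saved to response.json (redirect the curl output to a file instead of piping it through sed) and the common Gemini shape that nests generated text under candidates[0].content.parts[0].text; verify the path against your API version.

```shell
# Pull the model's generated text out of the response envelope.
# The path below is the common Gemini shape -- verify for your API version.
jq -r '.candidates[0].content.parts[0].text' response.json > output.json

# Fail fast if the extracted text is not itself valid JSON.
jq -e . output.json > /dev/null || echo "model output is not valid JSON" >&2
```

If the second jq call prints the error, tighten the prompt's output constraints before moving on.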


Step 8: Add a lightweight JSON validation step (recommended)

Cloud Shell usually has jq preinstalled; install it if it is missing. Save the model's generated JSON to a file and validate it.

Because the API returns an envelope, you may need to manually copy the model’s JSON output into a file first for a beginner-friendly check:

cat > output.json <<'EOF'
{"intent":"BILLING","confidence":0.9,"summary":"Customer reports a double charge and requests a refund; provides an order ID.","requires_human":true}
EOF

jq . output.json

Expected outcome: jq pretty-prints the JSON. If it fails, your prompt needs stricter constraints or your extraction method needs adjustment.
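Beyond pretty-printing, the Step 5 rules themselves can be checked in one jq expression. This is a sketch; it assumes output.json holds the extracted model JSON and requires jq 1.5+ for IN:

```shell
# Verify the Step 5 rules against output.json. jq -e exits non-zero when
# the expression is false, so this works as a pass/fail gate in a script.
jq -e '
  (.intent | IN("BILLING","TECH_SUPPORT","CANCELLATION","SALES","OTHER"))
  and ((.confidence | type) == "number" and .confidence >= 0 and .confidence <= 1)
  and ((.requires_human | type) == "boolean")
  and ((.summary | split(" ") | length) <= 30)
' output.json > /dev/null && echo "schema OK" || echo "schema FAILED"
```

In a production service you would run the equivalent check in your application language (a JSON schema validator) rather than shelling out to jq.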


Validation

You have successfully completed the lab if:
  • You can run the prompt in Vertex AI Studio and get consistent JSON.
  • You can call the model via the Vertex AI API using curl and OAuth.
  • You can validate the JSON output and handle failures (at least manually).


Troubleshooting

Common issues and fixes:

  1. 403 PERMISSION_DENIED – Cause: Missing IAM role(s) for Vertex AI. – Fix:

    • Ensure your user (or service account) has roles/aiplatform.user (or appropriate role per your org).
    • Ensure Vertex AI API is enabled.
  2. 404 NOT_FOUND for model – Cause: Wrong MODEL_ID or model not available in that region. – Fix:

    • Confirm the model name in Vertex AI Studio model picker.
    • Switch region to one where the model is available (verify).
  3. 429 RESOURCE_EXHAUSTED – Cause: Quota exceeded (RPM/TPM). – Fix:

    • Reduce request frequency.
    • Request quota increase in console.
    • Use a smaller model or lower output tokens.
  4. Output is not valid JSON – Cause: Model deviates from format. – Fix:

    • Tighten instructions (“Return JSON only, no markdown, no backticks”).
    • Lower temperature.
    • Provide one example output (few-shot).
    • Add post-processing: parse best-effort and retry with a “fix JSON” prompt (be careful—retries add cost).
  5. Studio UI doesn’t match the tutorial – Cause: Console navigation changes. – Fix:

    • Use Vertex AI landing page and search for “Studio” within Vertex AI.
    • Follow the official Vertex AI Studio docs for current UI steps.

Cleanup

To avoid unexpected costs:
  • Stop running repeated prompts.
  • If you created any additional resources (service accounts, keys, storage buckets), remove them.

Optional cleanup approaches:

A) Delete the project (strongest cleanup)
  • Console → IAM & Admin → Manage resources → select project → Delete

B) Keep the project, but remove extra artifacts
  • If you created a service account key (not required for this lab), delete it immediately.
  • Review:
  • Cloud Logging retention and sinks
  • Cloud Storage buckets created for test data


11. Best Practices

Architecture best practices

  • Use Vertex AI Studio for prompt prototyping, not production serving.
  • In production:
  • Put a thin orchestration layer in Cloud Run or GKE
  • Add timeouts, retries with backoff, and circuit breakers
  • Cache stable results when appropriate
  • Keep prompts modular:
  • System instruction template
  • User input insertion
  • Output schema constraints
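The retry guidance above can be sketched as a small shell wrapper. retry_with_backoff is a hypothetical helper; a production service would usually implement this in its own language and add jitter and a retry budget.

```shell
# Retry a command with exponential backoff. "$@" is one model call
# (for example, the Step 7 curl with -f so HTTP errors exit non-zero).
# Route only transient failures (429/5xx) through this wrapper; do not
# retry auth or validation errors.
retry_with_backoff() {
  local attempt=1 max_attempts=4 delay=1
  while true; do
    "$@" && return 0                      # success: stop retrying
    [ "$attempt" -ge "$max_attempts" ] && return 1
    sleep "$delay"
    delay=$((delay * 2))                  # 1s, 2s, 4s ...
    attempt=$((attempt + 1))
  done
}

# Usage (sketch):
# retry_with_backoff curl -sf -X POST ... ":generateContent" -d @request.json
```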

IAM/security best practices

  • Enforce least privilege:
  • Human access to Studio in dev projects only (where possible)
  • Runtime inference via dedicated service accounts
  • Use separate projects for:
  • dev experimentation
  • staging validation
  • production workloads
  • Control who can:
  • call models
  • view logs (logs may contain sensitive prompts/outputs)

Cost best practices

  • Always set:
  • maxOutputTokens
  • low temperature for deterministic tasks
  • Prefer smaller/faster models for:
  • classification/extraction/summarization
  • Avoid storing full prompts/outputs in logs by default.

Performance best practices

  • Choose region close to your workload and users.
  • Optimize prompt size:
  • Summarize prior conversation
  • Send only needed context
  • Consider concurrency controls in your calling service.

Reliability best practices

  • Treat model calls as external dependencies:
  • Retry only on safe errors
  • Implement fallbacks (for example, rule-based fallback or smaller model)
  • Validate output:
  • JSON schema validation
  • allowed-label checks
  • length limits

Operations best practices

  • Add request IDs and trace correlation:
  • propagate a correlation ID in logs
  • Monitor:
  • error rate
  • latency
  • token usage (where measurable)
  • spend by project/label
  • Create runbooks for quota exhaustion and permission errors.

Governance/tagging/naming best practices

  • Label projects and workloads:
  • env=dev|staging|prod
  • team=...
  • cost_center=...
  • Use consistent naming for service accounts and Cloud Run services:
  • sa-genai-infer-prod
  • cr-email-classifier-prod
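For example, labels can be applied with gcloud (the project ID and label values here are hypothetical):

```shell
# Attach governance labels to a project so billing and cost reports
# can be grouped by env/team/cost_center.
gcloud projects update my-dev-project \
  --update-labels=env=dev,team=support-ai,cost_center=cc-1234
```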

12. Security Considerations

Identity and access model

  • Vertex AI Studio access is controlled by Google Cloud IAM.
  • Production usage should use service accounts, not user credentials.
  • Recommended pattern:
  • Developers: Studio access in dev project
  • CI/CD: deploy permissions only
  • Runtime: inference permissions only
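A minimal gcloud sketch of that runtime pattern (project and service-account names are hypothetical; roles/aiplatform.user is a common choice, but follow your org's role design):

```shell
# Create a dedicated runtime identity for inference (names are examples).
gcloud iam service-accounts create sa-genai-infer-prod \
  --project=my-prod-project \
  --display-name="GenAI inference runtime"

# Grant only the Vertex AI inference-capable role, nothing broader.
gcloud projects add-iam-policy-binding my-prod-project \
  --member="serviceAccount:sa-genai-infer-prod@my-prod-project.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
```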

Encryption

  • Google Cloud encrypts data at rest and in transit by default across most services.
  • For CMEK (customer-managed encryption keys):
  • Some Vertex AI resources and data stores can support CMEK.
  • Generative inference requests may not be CMEK-configurable in the same way as storage resources—verify current Vertex AI generative AI docs and CMEK support matrices.

Network exposure

  • Calls to Vertex AI APIs are HTTPS.
  • For production:
  • restrict egress where possible
  • avoid sending sensitive data unnecessarily
  • consider organization policies and VPC Service Controls where applicable (verify compatibility with your exact generative AI endpoints)

Secrets handling

  • Vertex AI API calls use OAuth tokens; avoid long-lived secrets.
  • Store non-Vertex credentials in Secret Manager.
  • Never store secrets in prompts.
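A Secret Manager sketch for a non-Vertex credential (secret name and value are hypothetical):

```shell
# Store a third-party API key as a secret (never in code, config, or prompts).
echo -n "example-third-party-key" | \
  gcloud secrets create external-mail-api-key \
    --replication-policy="automatic" \
    --data-file=-

# The runtime service account reads it at startup:
gcloud secrets versions access latest --secret=external-mail-api-key
```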

Audit/logging

  • Enable and retain Admin Activity audit logs (default in many orgs).
  • For Data Access logs, evaluate:
  • cost impact
  • sensitivity of payloads
  • Ensure logs do not unintentionally store PII/PHI in prompt or model output.

Compliance considerations

  • Data classification: define which data types can be sent to generative models.
  • For regulated workloads:
  • minimize personal data
  • use redaction/tokenization before sending prompts
  • document your DPIA/TRA as required by your org
  • Review Google Cloud’s terms and Vertex AI data governance statements:
  • Verify in official docs for whether prompts/outputs are used for training and what opt-out/opt-in controls exist.

Common security mistakes

  • Giving broad roles (Project Owner) to everyone “to make Studio work”
  • Logging full prompts and outputs containing sensitive data
  • Mixing dev and prod usage in one project, complicating access control and cost visibility
  • Building production workflows without output validation (leading to injection or malformed outputs)

Secure deployment recommendations

  • Put production callers behind:
  • authenticated APIs
  • rate limits
  • input validation
  • Use allowlists for:
  • intents/labels
  • tool/function names (if you use function calling in your app—verify supported features)
  • Implement “prompt injection” defenses:
  • separate system instructions from user content
  • refuse to reveal system prompt
  • sanitize and scope user-provided context

13. Limitations and Gotchas

  1. Model availability is region-dependent – A model visible in one region might not be in another.

  2. Quotas can block you unexpectedly – Especially during load tests. Plan quota checks early.

  3. Studio is not production serving – It’s a prototyping UI; production requires API integration and operational hardening.

  4. Non-determinism – Even with low temperature, outputs can vary. Always validate outputs.

  5. JSON output is not guaranteed – Prompting improves reliability but doesn’t guarantee strict formatting without validation.

  6. Logging can leak sensitive data – Prompts/outputs can contain PII. Be intentional about logging and retention.

  7. Costs scale with tokens – Long chat histories and large documents are major cost drivers.

  8. Console UI changes – Tutorials may go stale because Studio navigation and naming evolve.

  9. Org policy restrictions – Some orgs restrict model usage, regions, or external access.

  10. Data residency constraints – You must choose regions and storage locations that match policy.

  11. Integration assumptions – Vertex AI Studio is part of Vertex AI, but not all Vertex AI features are “in Studio.” For training/pipelines, use the appropriate Vertex AI components.


14. Comparison with Alternatives

Vertex AI Studio is a prototyping experience. Alternatives vary depending on whether you need prototyping, deployment, training, or end-to-end conversational products.

Comparison table

| Option | Best For | Strengths | Weaknesses | When to Choose |
| --- | --- | --- | --- | --- |
| Vertex AI Studio (Google Cloud) | Prompt prototyping and rapid iteration on Vertex AI models | Tight integration with Vertex AI, IAM/project governance, easy transition to API | Not a full production runtime; UI changes; evaluation/PromptOps depth varies | You’re building on Google Cloud and want a governed prototype-to-API flow |
| Vertex AI Workbench (Google Cloud) | Notebook-based ML/AI development | Jupyter environment, data science workflows, custom code | Heavier than Studio for quick prompt tests; notebook ops overhead | You need code-first experimentation, data processing, or ML workflows |
| Vertex AI Pipelines (Google Cloud) | MLOps pipelines for training and batch workflows | Repeatable pipelines, CI/CD integration | Not designed for interactive prompt iteration | You need production ML pipelines (training, batch inference, orchestration) |
| Dialogflow CX (Google Cloud) | Conversational bots with intent/flow management | Conversation design tooling, channels/integrations | Different focus than prompt prototyping; generative features vary | You want a managed conversational platform |
| Vertex AI Agent Builder (Google Cloud) | Search/agent experiences over enterprise content | Connectors, retrieval/grounding patterns (product dependent) | Separate product scope; may require more setup | You need enterprise search/agent patterns beyond simple prompting |
| Amazon Bedrock (AWS) | Managed foundation model access in AWS | Model choice, AWS-native integration | Different governance/tooling; migration effort | Your platform is AWS-first |
| Azure AI Studio (Microsoft Azure) | Model + prompt tooling in Azure | Azure-native tooling and governance | Different model catalog and APIs | Your platform is Azure-first |
| OpenAI platform (direct) | Direct access to OpenAI models | Fast iteration, strong ecosystem | Not inherently tied to Google Cloud IAM/governance | You’re building outside Google Cloud or prefer direct vendor APIs |
| Self-hosted LLM (vLLM/TGI on GKE) | Maximum control and data locality | Customization, potentially predictable costs at scale | Significant ops burden, scaling, security, model management | You need on-cluster hosting for policy, latency, or cost reasons |

15. Real-World Example

Enterprise example: Contact center ticket triage + summarization

  • Problem: A large enterprise contact center receives tens of thousands of tickets/day. Agents need summaries and correct routing; security requires controlled access and auditing.
  • Proposed architecture:
  • Agents’ system → ticket events published to Pub/Sub
  • Cloud Run service subscribes and calls Vertex AI (Gemini) for:
    • summary
    • intent label
    • priority
  • Output stored in BigQuery for analytics and appended to ticket
  • IAM roles restrict who can call models; logs are redacted; VPC-SC evaluated where applicable
  • Why Vertex AI Studio was chosen:
  • Security team could review prompts and expected outputs quickly in a controlled project.
  • Engineering exported the working prompt patterns to Vertex AI API calls.
  • Expected outcomes:
  • Reduced handle time per ticket
  • Better routing accuracy
  • Auditable model usage with project-level governance
  • Cost visibility by project/team and token usage patterns

Startup/small-team example: SaaS “inbox co-pilot”

  • Problem: A small SaaS team wants an “email assistant” feature that drafts replies and classifies intent, but they need to validate quality quickly.
  • Proposed architecture:
  • Web app → Cloud Run backend
  • Backend calls Vertex AI Gemini for:
    • intent classification JSON
    • draft reply suggestion
  • Minimal logging, strict max tokens, caching for repeated patterns
  • Why Vertex AI Studio was chosen:
  • Very fast iteration cycle without standing up notebooks or custom tooling.
  • Easy to test prompt templates and quickly move to API calls.
  • Expected outcomes:
  • Faster feature delivery
  • Controlled early-stage cost by limiting tokens and using cost-optimized models
  • A clear path to scale as user volume grows

16. FAQ

  1. Is Vertex AI Studio the same as Vertex AI?
    Vertex AI Studio is a console experience within Vertex AI focused on prototyping generative AI interactions. Vertex AI also includes training, pipelines, model registry, endpoints, and more.

  2. Is Vertex AI Studio the same as “Generative AI Studio”?
    “Generative AI Studio” is an older name you may see in posts or labs. Current console experiences are commonly presented as Vertex AI Studio within Vertex AI. Verify current naming in official docs.

  3. Do I pay for Vertex AI Studio?
    You generally pay for the underlying model inference and other resources you use (Vertex AI model calls, storage, logging, runtime services). Studio itself is a UI.

  4. Which models can I use in Vertex AI Studio?
    Typically models available in Vertex AI’s Model Garden for your project/region, including Google-hosted Gemini models. Availability depends on region, project, and org policy.

  5. Do I need to deploy anything to use Studio?
    No. Studio is for interactive prototyping. For production, you deploy your own service that calls Vertex AI APIs.

  6. How do I move from Studio to production?
    Use Studio to finalize prompt patterns and parameters, then call the model via Vertex AI APIs from Cloud Run/GKE/etc., with IAM, monitoring, and validation.

  7. Can I force the model to always return valid JSON?
    You can strongly encourage it with prompting and low temperature, but you must still validate outputs and handle failures.

  8. How do I control cost?
    Limit input context, set maxOutputTokens, pick appropriate model variants, cache results, and monitor token usage and spend per project.

  9. How do I restrict who can use Vertex AI Studio?
    Use IAM: grant Vertex AI roles only to appropriate groups, and keep production projects more restricted than dev projects.

  10. Are prompts and outputs used to train Google’s models?
    Google Cloud provides specific data governance terms for Vertex AI. The default posture is designed for enterprise use, but you must verify current terms in official docs for your org and model.

  11. Can I use Vertex AI Studio with VPC Service Controls?
    Some Vertex AI and related services can be used with VPC Service Controls, but boundaries and support vary. Verify in official docs for your exact generative AI endpoints.

  12. What region should I pick?
    Choose a region where your model is available, close to your workload/users, and aligned with data residency requirements.

  13. What’s the difference between Vertex AI Studio and Vertex AI Workbench?
    Studio is UI-first prompt/model testing. Workbench is notebook-based development for code-heavy workflows.

  14. How do I handle retries safely?
    Retry only on transient errors, implement exponential backoff, and cap retries because each retry can incur cost.

  15. What should I log in production?
    Log metadata (request ID, latency, status) and avoid storing raw prompts/outputs unless you have a clear business need and proper data handling controls.

  16. Can I do fine-tuning from Vertex AI Studio?
    Tuning capabilities depend on the model and current Vertex AI features. Verify in the official Vertex AI generative AI/tuning documentation.

  17. Is Vertex AI Studio suitable for regulated data (PII/PHI)?
    It can be, but only with strong governance: data minimization, redaction, approved regions, IAM restrictions, logging controls, and verified compliance posture. Confirm with your security/legal teams and official docs.


17. Top Online Resources to Learn Vertex AI Studio

Use official Google Cloud sources first, because the UI and model catalog evolve quickly.

| Resource Type | Name | Why It Is Useful |
| --- | --- | --- |
| Official documentation | Vertex AI documentation | Primary reference for Vertex AI concepts, IAM, regions, quotas: https://cloud.google.com/vertex-ai/docs |
| Official documentation | Vertex AI generative AI overview | Current entry point for Gemini on Vertex AI, APIs, and workflows: https://cloud.google.com/vertex-ai/docs/generative-ai |
| Official documentation | Vertex AI Studio (docs entry point) | Console-based Studio workflows and links (verify current page path from Vertex AI docs): https://cloud.google.com/vertex-ai |
| Official pricing | Vertex AI pricing | Authoritative SKUs and pricing dimensions: https://cloud.google.com/vertex-ai/pricing |
| Pricing tool | Google Cloud Pricing Calculator | Estimate spend across services: https://cloud.google.com/products/calculator |
| Official IAM docs | Vertex AI access control (IAM) | Roles and permissions model (navigate from Vertex AI docs): https://cloud.google.com/vertex-ai/docs |
| Architecture guidance | Google Cloud Architecture Center | Patterns for production architectures (search for Vertex AI / generative AI): https://cloud.google.com/architecture |
| Official samples | GoogleCloudPlatform GitHub | Official samples often live here; search for Vertex AI generative AI repos: https://github.com/GoogleCloudPlatform |
| Official videos | Google Cloud Tech YouTube | Product updates, demos, and best practices: https://www.youtube.com/@googlecloudtech |
| Hands-on labs | Google Cloud Skills Boost | Guided labs; search for Vertex AI / Gemini / generative AI: https://www.cloudskillsboost.google |

18. Training and Certification Providers

The following training providers are listed as additional learning options. Verify course syllabi, dates, and delivery modes on their websites.

| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
| --- | --- | --- | --- | --- |
| DevOpsSchool.com | DevOps engineers, architects, developers | DevOps + cloud engineering + practical workshops that may include AI integrations | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Beginners to intermediate engineers | DevOps/SCM foundations and tooling that can support AI project delivery | Check website | https://www.scmgalaxy.com/ |
| CloudOpsNow.in | Cloud ops and platform teams | Cloud operations practices and implementation skills | Check website | https://cloudopsnow.in/ |
| SreSchool.com | SREs, operations engineers | Reliability engineering practices relevant to production AI services | Check website | https://sreschool.com/ |
| AiOpsSchool.com | Ops + AI practitioners | AIOps concepts, monitoring/automation approaches for modern systems | Check website | https://aiopsschool.com/ |

19. Top Trainers

These sites are presented as trainer platforms/resources. Confirm offerings and credentials directly on each site.

| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
| --- | --- | --- | --- |
| RajeshKumar.xyz | DevOps/cloud training content | Engineers looking for hands-on guidance | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training services | Teams and individuals seeking DevOps upskilling | https://devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps services/training | Small teams needing practical help | https://devopsfreelancer.com/ |
| devopssupport.in | DevOps support and learning | Ops teams needing implementation support | https://devopssupport.in/ |

20. Top Consulting Companies

These consulting companies are listed as potential sources of professional services. Verify capabilities and case studies directly with the providers.

| Company Name | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
| --- | --- | --- | --- | --- |
| cotocus.com | Cloud/DevOps/engineering services | Architecture, implementation, operationalization | Designing Cloud Run → Vertex AI inference services; IAM hardening; cost optimization | https://cotocus.com/ |
| DevOpsSchool.com | DevOps and cloud consulting/training | Platform enablement, DevOps practices for AI services | CI/CD for Cloud Run services calling Vertex AI; observability and SRE practices | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting | Delivery support, automation, operations | Production readiness for generative AI microservices; monitoring and incident response | https://devopsconsulting.in/ |

21. Career and Learning Roadmap

What to learn before Vertex AI Studio

  • Google Cloud fundamentals:
  • Projects, IAM, billing, regions
  • Cloud Logging/Monitoring basics
  • Basic API concepts:
  • REST, OAuth tokens, service accounts
  • Prompting fundamentals:
  • instruction clarity
  • few-shot examples
  • output constraints and validation

What to learn after Vertex AI Studio

  • Production app hosting:
  • Cloud Run (recommended for many teams), GKE for advanced needs
  • Security hardening:
  • least privilege IAM
  • secrets management
  • logging redaction and retention
  • Evaluation and testing:
  • golden test sets
  • regression testing for prompts
  • load testing and quota management
  • Broader Vertex AI ecosystem:
  • Workbench (notebooks)
  • Pipelines (MLOps)
  • Model registry/deployment (if you move beyond hosted foundation models)

Job roles that use it

  • Cloud Engineer / Platform Engineer
  • Solutions Architect
  • DevOps Engineer / SRE
  • ML Engineer (for prototyping generative AI behaviors)
  • Application Developer integrating AI features
  • Security Engineer reviewing AI controls and governance

Certification path (if available)

  • Google Cloud certifications are role-based (Associate/Professional). While there isn’t a “Vertex AI Studio-only” certification, relevant tracks often include:
  • Professional Cloud Architect
  • Professional Machine Learning Engineer
    Verify current certification blueprints at: https://cloud.google.com/learn/certification

Project ideas for practice

  1. Email classifier microservice (Cloud Run + Vertex AI)
  2. Ticket summarizer with JSON output and BigQuery analytics
  3. Document-to-structured-data extractor with validation and retries
  4. Prompt regression test harness (store test cases in BigQuery/CSV and run nightly)
  5. Cost dashboard: tokens/output length vs cost by endpoint and team

22. Glossary

  • Vertex AI Studio: Console-based prototyping workspace within Vertex AI for generative AI prompts and model testing.
  • Vertex AI: Google Cloud managed ML/AI platform including training, deployment, model management, and generative AI APIs.
  • Gemini: Google’s family of foundation models available through Vertex AI (availability and versions vary).
  • Prompt: Input text/instructions given to a model to produce an output.
  • System instruction: High-priority instruction that defines the model’s role and constraints (API/UI dependent).
  • Temperature: Decoding parameter controlling randomness; lower values are more deterministic.
  • Tokens: Units of text used for billing and limits; both input and output tokens matter for cost.
  • JSON schema (informal): The expected JSON structure and constraints you enforce with prompting and validation.
  • IAM: Identity and Access Management; controls permissions in Google Cloud.
  • Service account: Non-human identity used by workloads to call Google Cloud APIs securely.
  • Quota: Service-imposed limits on usage (requests/minute, tokens/minute, etc.).
  • Cloud Run: Serverless container runtime on Google Cloud, often used to host inference callers.
  • VPC Service Controls (VPC-SC): Google Cloud security boundary to reduce data exfiltration risks for supported services.
  • CMEK: Customer-managed encryption keys via Cloud KMS, available for some resources/services.

23. Summary

Vertex AI Studio is Google Cloud’s practical, governed way to prototype generative AI prompts and validate model behavior (commonly Gemini on Vertex AI) before building production integrations. It matters because it shortens the cycle from idea to implementation while keeping work inside Google Cloud IAM and project boundaries.

From an architecture perspective, treat Studio as the design surface and Vertex AI APIs as the production interface. Your real production system should run in Cloud Run or GKE with strong output validation, retries, monitoring, and cost controls.

Cost is driven primarily by model inference usage (input/output tokens, model choice, and output length), plus indirect costs like logging and any deployed runtimes. Security success depends on least privilege IAM, careful handling of prompts/outputs (especially sensitive data), and clear governance around where data can go.

Use Vertex AI Studio when you need fast, repeatable prompt iteration on Google Cloud with a clean path to API-based production. Next step: operationalize your best prompt by deploying a small Cloud Run service that calls Vertex AI with proper logging, validation, and quota-aware resilience.