AWS Amazon CloudSearch Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Analytics

1. Introduction

Amazon CloudSearch is a fully managed search service on AWS that helps you add fast, scalable search capabilities to applications and datasets—without running and tuning your own search clusters.

In simple terms: you upload documents (JSON or XML) into a CloudSearch domain, define which fields are searchable/filterable/sortable, and then query the service through a search endpoint to get relevant results, facets, and suggestions.

Technically, Amazon CloudSearch provisions and operates the underlying search infrastructure (instances, indexing, scaling knobs like partitions and replicas). You manage the schema (index fields), indexing options, and access policies, then send documents to a document endpoint and run queries against a search endpoint over HTTP/HTTPS.

It solves the common problem of building production search (indexing, relevance, scaling, performance tuning, and high availability), especially for teams that want a managed, AWS-native service with predictable operational controls.

Service status note: Amazon CloudSearch is an established AWS service. AWS has announced that it closed CloudSearch to new customers effective July 25, 2024, while existing customers can continue to use it as before; check the current status in the official documentation before starting a new project. For many newer search needs (advanced analytics, log search, vector/semantic search, or a richer ecosystem), teams often evaluate Amazon OpenSearch Service or Amazon Kendra. This tutorial focuses on Amazon CloudSearch within its documented scope.

2. What is Amazon CloudSearch?

Official purpose (AWS): Amazon CloudSearch is a managed service for setting up, managing, and scaling a search solution for your website or application. You create a search domain, define your index fields, upload data, and then query it.

Core capabilities

Create a managed search index (a domain) for text and structured data
Configure index fields and behaviors (searching, filtering, faceting, sorting, highlighting)
Upload and update documents continuously
Query the index with structured query options
Provide autocomplete/suggestions (suggester)

Major components

Domain: The top-level resource that contains your indexed data and configuration.
Index fields (schema): Field definitions (e.g., text, literal, int, date, latlon, arrays) and per-field indexing options.
Document service endpoint: Receives document uploads (adds/updates/deletes).
Search service endpoint: Serves search requests.
Suggest service: Returns query suggestions (configured by a suggester).
Scaling controls: Instance type, partitions (for scale) and replicas (for HA/read scale).

Service type

Managed AWS service (you do not administer servers/OS)
Provisioned capacity model (instance types + partitions/replicas), not “serverless”

Scope and availability model

Regional service: Domains are created in a specific AWS Region.
Account-scoped: Resources belong to your AWS account (and Region).
Network exposure: CloudSearch endpoints are typically public AWS endpoints controlled by domain access policies (it is not the same networking model as VPC-only services). Verify current endpoint/networking options in official docs for your Region and account constraints.

Fit in the AWS ecosystem

Amazon CloudSearch commonly integrates with:
S3 (source data dumps), Lambda (transform + index), DynamoDB/RDS (source of truth), Kinesis (streaming updates), SQS (buffering), CloudWatch (monitoring), IAM (access control)
Application stacks on EC2, ECS, EKS, Elastic Beanstalk, and API Gateway + Lambda

3. Why use Amazon CloudSearch?

Business reasons

Faster time-to-market for search features vs. running Elasticsearch/Solr yourself
Managed operations reduce the need for a dedicated search platform team
Pay for provisioned search capacity rather than staffing + ops overhead

Technical reasons

Supports full-text search plus structured search patterns:
filtering, sorting, faceting
highlighting
suggestions/autocomplete
geospatial (latlon) queries
Simple data ingestion model (JSON/XML documents)
Schema-based indexing provides predictable query behavior

Operational reasons

AWS provisions instances and handles service maintenance
Built-in metrics and scaling knobs (replicas/partitions/instance type)
Managed failover patterns via replicas (depending on configuration)

Security/compliance reasons

Domain access policies for controlling endpoint access
HTTPS endpoints for encryption in transit
IAM integrates with AWS governance (CloudTrail for API calls; verify CloudSearch endpoint request logging options in docs)

Scalability/performance reasons

Partitions scale index size and write/read throughput
Replicas improve availability and read capacity

When teams should choose Amazon CloudSearch

You need a managed search index for application search over product catalogs, documentation, knowledge bases, or site content
You want straightforward operational controls and AWS-native provisioning
Your workload fits “classic search” needs (lexical relevance, facets, filtering)

When teams should not choose it

You need log analytics, heavy aggregations, or a rich analytics ecosystem → evaluate Amazon OpenSearch Service
You need semantic search, connectors, natural language Q&A → evaluate Amazon Kendra (or OpenSearch + vector extensions where appropriate)
You require VPC-only private endpoints and deep network isolation (verify CloudSearch capabilities; many teams choose OpenSearch in VPC for this)
You need extensive plugin ecosystems or low-level tuning (self-managed OpenSearch/Elasticsearch/Solr may be a better fit)

4. Where is Amazon CloudSearch used?

Industries

E-commerce and retail (product search, category filters)
Media and publishing (article search, site search)
SaaS platforms (search across customer-generated content)
Education (course and content search)
Travel and real estate (geo + filtering)
Internal enterprise portals (document metadata search, directory-like search)

Team types

Product engineering teams building app search
Platform teams offering “search-as-a-service” internally (smaller orgs)
DevOps/SRE teams that prefer managed services over self-hosting

Workloads

End-user search boxes (site search)
Faceted navigation (filters by category/brand/price)
Autocomplete/suggestions
Search-driven recommendation surfaces (lexical similarity, not ML-based semantic embeddings)

Architectures

Event-driven indexing (DB change → stream → transform → CloudSearch)
Batch indexing (nightly rebuild from S3)
Hybrid (initial bulk load + incremental updates)

Production vs dev/test usage

Dev/test: smaller instance types, fewer partitions/replicas, synthetic data
Production: at least one replica for availability, careful access policy design, monitored indexing latency and 4xx/5xx rates

5. Top Use Cases and Scenarios

Below are realistic scenarios where Amazon CloudSearch is a good fit. Each includes the problem, why it fits, and a short scenario.

E-commerce Product Search with Facets – Problem: Users must find products quickly and filter by attributes. – Why it fits: Text search + structured facets/filtering/sorting in one service. – Scenario: Index product title/description as text, brand/category as literal, price as double. Use facets for category and brand, sort by price.
Documentation and Knowledge Base Search – Problem: Users can’t find the right doc page among thousands. – Why it fits: Full-text relevance + highlighting to show matched snippets. – Scenario: Index articles with title, body, tags. Use highlighting on body to show the matched section.
SaaS Multi-Tenant Search (Logical Isolation) – Problem: Each tenant can only search their own data. – Why it fits: Filter queries by a tenant_id field; enforce the filter in the application layer, because domain access policies control who may call the endpoints, not which documents a query returns. – Scenario: Index tenant_id as literal and always add a filter constraint for the current tenant.
Customer Support Ticket Search – Problem: Agents need quick search across cases and metadata. – Why it fits: Combine full-text with structured fields like status/priority. – Scenario: Search in ticket body while filtering by status and sorting by last_updated date.
Real Estate Listings with Geo Search – Problem: Users want listings “near me” and with filters. – Why it fits: latlon fields and geo constraints, plus facets like bedrooms/price range. – Scenario: Index coordinates, filter within a radius, sort by distance (where supported; verify exact geo query/sort options in docs).
Internal App Search for a Portal – Problem: Employees need to search internal pages, tools, and links. – Why it fits: Lightweight search index with straightforward ingestion. – Scenario: Index portal entries with title, summary, department tags; add suggestions for common tools.
Inventory/Parts Lookup – Problem: Operators search by part number, partial codes, or text description. – Why it fits: Supports exact match (literal) and partial match (text) patterns. – Scenario: Index SKU as literal and also include it in a text field for flexible searches.
Autocomplete for Site Search – Problem: Users need type-ahead suggestions. – Why it fits: Suggester provides query suggestions driven by indexed fields. – Scenario: Use a suggester built from product titles; call suggest endpoint as users type.
Content Moderation Queue Search – Problem: Moderators need to find content by keywords, flags, and dates. – Why it fits: Text + filters + sorting by time windows. – Scenario: Index moderation flags as literal-array, timestamp as date, filter by flag and sort by date.
Catalog Search for B2B Pricing Tiers – Problem: Different customers see different products/prices. – Why it fits: Index includes visibility rules; filter results for entitlements. – Scenario: Index visibility_group and filter based on customer entitlements from your IAM/identity layer.
Event/Conference Session Search – Problem: Attendees search sessions by topic/speaker/time. – Why it fits: Great for combined full-text and structured filters. – Scenario: Search abstract while filtering by track and sorting by start time.
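
For the multi-tenant scenario above, the tenant constraint is usually added server-side so that clients cannot omit it. The following is a minimal sketch, assuming the tenant_id literal field from that scenario and the structured-query string syntax (single-quoted values, with backslashes and single quotes escaped by a backslash). The function name is hypothetical.

```python
def tenant_filter(tenant_id: str, field: str = "tenant_id") -> str:
    """Build a structured filter expression such as tenant_id:'acme'.

    Backslashes and single quotes are escaped so that a tenant value
    cannot break out of the quoted string.
    """
    if not tenant_id:
        raise ValueError("tenant_id must be a non-empty string")
    escaped = tenant_id.replace("\\", "\\\\").replace("'", "\\'")
    return f"{field}:'{escaped}'"


print(tenant_filter("acme"))      # tenant_id:'acme'
print(tenant_filter("o'reilly"))  # tenant_id:'o\'reilly'
```

The resulting string is meant to be sent as the filter query (the fq request parameter, or --filter-query in the cloudsearchdomain CLI) alongside whatever the user typed.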

6. Core Features

This section covers the key current features commonly documented for Amazon CloudSearch. If you rely on a feature for production, verify the latest constraints and API parameters in the official developer guide.

6.1 Domains (Managed Search Collections)

What it does: Creates an isolated search environment with its own schema and endpoints.
Why it matters: Domains are the boundary for configuration, billing, and access.
Practical benefit: You can separate environments (dev/stage/prod) and workloads.
Caveats: Domain configuration changes can take time to process and can trigger reindexing.

6.2 Schema / Index Field Definitions

What it does: Lets you define fields and types (text, literal, numeric, date, latlon, arrays).
Why it matters: Field types determine what queries and operations are possible (facet/filter/sort).
Practical benefit: Predictable query behavior and performance.
Caveats: Schema changes may require reindexing; plan schema carefully.

6.3 Full-Text Search

What it does: Searches text fields for keywords and phrases.
Why it matters: Core capability for site/app search.
Practical benefit: Users find relevant content quickly.
Caveats: Relevance tuning is schema/analysis dependent; CloudSearch is not a full custom IR platform.

6.4 Structured Search (Filter/Facet/Sort)

What it does: Filters and facets on structured fields; sorts results by numeric/date/literal fields where supported.
Why it matters: Most “real” search experiences rely on navigation facets and sorting.
Practical benefit: E-commerce-like experiences are feasible without additional databases.
Caveats: You must define fields appropriately (e.g., literal for facets).

6.5 Result Highlighting

What it does: Returns snippets showing matched terms in a field.
Why it matters: Improves UX and reduces pogo-sticking.
Practical benefit: Users see why a result matched.
Caveats: Highlighting adds query overhead; use selectively.

6.6 Suggestions (Autocomplete)

What it does: Configurable suggester for query suggestions.
Why it matters: Autocomplete improves conversion and reduces “no results” queries.
Practical benefit: Better UX with minimal engineering.
Caveats: Suggestions depend on data quality and configuration; not a semantic suggestion engine.

6.7 Geo (Location) Search

What it does: Indexes and queries latlon fields.
Why it matters: Required for “nearby” and location-based filtering.
Practical benefit: Enables real estate/travel/retail “near me” experiences.
Caveats: Validate exact geo query operators and limitations in official docs.

6.8 Scaling: Partitions and Replicas

What it does: Partitions scale data/throughput; replicas provide HA and more read capacity.
Why it matters: Lets you meet throughput and availability requirements.
Practical benefit: Scale without redesigning the app.
Caveats: More partitions/replicas directly increases cost; reconfiguration can take time.

6.9 Access Policies

What it does: Controls who can call a domain's document, search, and suggest endpoints. Access to the configuration (control plane) APIs is governed separately by IAM permissions.
Why it matters: Prevents data leaks and unauthorized indexing.
Practical benefit: You can restrict access by IAM principals and conditions.
Caveats: Misconfigured policies can unintentionally expose public search endpoints.

6.10 Monitoring via CloudWatch Metrics

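What it does: Publishes domain-level metrics to CloudWatch, such as SuccessfulRequests, SearchableDocuments, IndexUtilization, and Partitions (verify the current metric list in the official docs). Error rates and query latency generally have to be measured in your application or client.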
Why it matters: Search issues are often silent until users complain.
Practical benefit: Alert before relevance and latency become a problem.
Caveats: Metrics are necessary but not sufficient; you still need application-level monitoring and query analytics.

7. Architecture and How It Works

High-level architecture

Amazon CloudSearch centers around a domain containing:
Search instances responsible for indexing and queries
A document ingestion pipeline exposed via the document endpoint
A search endpoint for queries, which also serves suggest (autocomplete) requests

You interact with CloudSearch in two planes: – Control plane: Create domains, define fields, configure scaling, set policies (AWS API calls, usually IAM-signed). – Data plane: Upload documents and issue queries to endpoints (access controlled via domain access policy; can require signed requests depending on policy).

Request, data, and control flow

Create domain (control plane)
Define index fields (control plane)
Upload documents (data plane → document endpoint)
Indexing happens (managed by service)
Search queries (data plane → search endpoint)
Suggestions (data plane → suggest endpoint)

Integrations with related AWS services

Common patterns: – S3: bulk export documents; a job reads from S3 and uploads to CloudSearch – Lambda: transform records and push updates to CloudSearch – DynamoDB Streams: change data capture (CDC) → Lambda → CloudSearch – RDS/Aurora: source-of-truth DB; app emits change events for indexing – CloudWatch: alarms on latency/errors/indexing backlogs – CloudTrail: audit control plane API calls

Dependency services

IAM (policies, signing, principals)
CloudWatch (metrics)
CloudTrail (control plane audit logs)

Security/authentication model

Control plane calls are authorized by IAM permissions.
Domain endpoints (search/document) are governed by the domain’s access policy:
You can allow/deny actions to principals and add conditions (for example, source IP).
If you require IAM-signed requests for endpoints, your clients must use SigV4 signing (common in AWS SDKs).

Networking model

Endpoints are DNS names provided by AWS for each domain.
Many deployments treat CloudSearch as an internet-accessible managed endpoint with strict access policies.
If you require private-only networking, verify CloudSearch networking options in official docs and consider alternatives like Amazon OpenSearch Service in a VPC.

Monitoring/logging/governance considerations

Track:
search latency and 4xx/5xx error rates
indexing latency / document processing issues
capacity utilization that signals scaling needs
Governance:
tag domains by environment, owner, cost center
separate prod vs non-prod accounts or at least separate domains

Simple architecture diagram (Mermaid)

flowchart LR
  U[Users / App] -->|Search queries| SE[CloudSearch Search Endpoint]
  U -->|Autocomplete| SG[CloudSearch Suggest Endpoint]
  APP[Ingestion Worker] -->|Upload docs| DE[CloudSearch Document Endpoint]
  ADMIN[Admin / CI Pipeline] -->|Create domain, define fields| CP[CloudSearch Control Plane APIs]
  CP --> D[(CloudSearch Domain)]
  DE --> D
  SE --> D
  SG --> D

Production-style architecture diagram (Mermaid)

flowchart TB
  subgraph VPC[Application VPC]
    WEB[Web / API Tier<br/>ECS/EKS/EC2] -->|Search| SE
    WEB -->|Suggest| SG
    DB[(Aurora / DynamoDB)] -->|Changes| STREAM[DynamoDB Streams / App Events]
    STREAM --> L[Lambda Transformer]
    L -->|Upload documents| DE
    CW[CloudWatch Alarms] --> OPS[Ops On-call]
  end

  subgraph AWSManaged[AWS Managed Services]
    D[(Amazon CloudSearch Domain)]
    SE[Search Endpoint]
    SG[Suggest Endpoint]
    DE[Document Endpoint]
    CT[CloudTrail]
  end

  ADMIN[CI/CD or Admin] -->|Define schema, scaling, policy| CP[CloudSearch Control Plane API]
  CP --> D
  DE --> D
  SE --> D
  SG --> D
  CP --> CT

8. Prerequisites

AWS account and billing

An AWS account with billing enabled
Ability to create and delete CloudSearch domains (deleting is important for cost control)

Permissions / IAM

Minimum IAM permissions typically needed for the lab (scope down in real environments):
cloudsearch:CreateDomain
cloudsearch:DeleteDomain
cloudsearch:DescribeDomains
cloudsearch:DefineIndexField
cloudsearch:DescribeIndexFields
cloudsearch:DefineSuggester (used in Step 6)
cloudsearch:IndexDocuments
cloudsearch:UpdateServiceAccessPolicies
cloudsearch:UpdateScalingParameters
cloudsearch:DescribeScalingParameters

For uploading/searching via endpoints: – If your access policy allows anonymous access (not recommended for production), no signing needed. – If policy requires IAM-signed requests, ensure your client can sign requests (AWS SDKs can do this).

Tools

AWS CLI v2 (recommended)
Optional: Python 3.10+ for scripting uploads/queries (boto3), but the lab uses AWS CLI

CLI references: – CloudSearch: https://docs.aws.amazon.com/cli/latest/reference/cloudsearch/ – CloudSearch Domain: https://docs.aws.amazon.com/cli/latest/reference/cloudsearchdomain/

Region availability

Choose an AWS Region where Amazon CloudSearch is available. Verify the service availability in the AWS Regional Services List and CloudSearch docs.

Quotas / limits

CloudSearch has service quotas (domains per account, fields per domain, document size limits, etc.). Because these can change, verify in official docs: – AWS Service Quotas console (if CloudSearch is integrated there for your account) – CloudSearch Developer Guide limits section

Prerequisite services

None strictly required for the basic lab. For production pipelines you commonly use S3/Lambda/DynamoDB/RDS, but they are optional.

9. Pricing / Cost

Amazon CloudSearch pricing is usage-based and primarily driven by provisioned capacity.

Official pricing page: – https://aws.amazon.com/cloudsearch/pricing/

You can also estimate total costs with: – AWS Pricing Calculator: https://calculator.aws/#/

Pricing dimensions (typical)

While exact dimensions vary by Region and may evolve, CloudSearch costs generally relate to:
Instance-hours for search capacity (instance type)
Additional capacity through partitions and replicas (more instances → more cost)
Document batch uploads and IndexDocuments (re-indexing) requests, which the pricing page lists as separate charges (verify how they are metered in your Region)
Data transfer (especially data transfer out to the internet or cross-Region)

Do not assume pricing numbers from blogs or old posts. Always confirm on the official pricing page for your Region.

Free tier

CloudSearch has historically had limited free tier offerings at times, but this changes. Verify on the pricing page whether a free tier applies for your account/Region.

Primary cost drivers

Running a domain continuously (24/7 instance-hours)
Choosing larger instance types than necessary
Over-provisioning partitions and replicas
Heavy query volume (if it forces scaling up instance type/replicas)
High-volume document ingestion (may drive scaling to maintain indexing throughput)

Hidden/indirect costs

Data transfer from apps running outside AWS or cross-Region
NAT Gateway charges if your ingestion system uses NAT (CloudSearch endpoints are not inside your VPC in many designs)
Operational costs: monitoring, alerting, CI/CD automation time

Network/data transfer implications

Keep producers/consumers (indexing jobs and app servers) in the same Region as the CloudSearch domain to minimize latency and transfer cost.
Be careful with cross-Region traffic for multi-Region apps.

Cost optimization tips

Use the smallest instance type that meets latency/throughput objectives.
Start with minimal replicas/partitions; scale based on CloudWatch metrics.
Delete dev/test domains when they are not in use; a domain cannot be paused and bills for as long as it exists.
Use batching for document uploads to reduce overhead.
Cache common queries at the application layer (where appropriate).

Example low-cost starter estimate (conceptual)

A low-cost starter setup typically means: – 1 small domain – minimal partitions and replicas – low query volume – small dataset

Your estimate depends on: – chosen instance type – hours per month (continuous vs part-time) – Region pricing

Use the AWS Pricing Calculator with “Amazon CloudSearch” to model: – 1 domain × selected instance type × 730 hours/month – add replicas if needed – approximate data transfer

Example production cost considerations

In production, costs scale with: – larger instance types to hold bigger indexes or improve performance – additional partitions (larger datasets, higher write throughput) – at least one replica for availability – higher query volume requiring more replicas for read scaling

Rule of thumb: partitions × (replicas + 1) × instance-hour price is the mental model for compute-related cost, but confirm how CloudSearch bills each component in the official pricing details.
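
The rule of thumb above can be turned into a small calculator. This is a sketch of that mental model only, not of how AWS actually bills; the hourly price in the example is a placeholder, not a real rate, and the function name is hypothetical.

```python
HOURS_PER_MONTH = 730  # the same monthly figure used in the starter estimate


def monthly_compute_cost(partitions: int, replicas: int, hourly_price: float,
                         hours: int = HOURS_PER_MONTH) -> float:
    """Estimate compute cost as partitions x (replicas + 1) x price x hours."""
    if partitions < 1 or replicas < 0:
        raise ValueError("need at least one partition and zero or more replicas")
    instances = partitions * (replicas + 1)
    return round(instances * hourly_price * hours, 2)


# 0.10 is a placeholder hourly price; look up the real one on the pricing page.
print(monthly_compute_cost(partitions=2, replicas=1, hourly_price=0.10))  # 292.0
```

Doubling either partitions or the instance count per partition doubles the estimate, which is why over-provisioning both at once gets expensive quickly.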

10. Step-by-Step Hands-On Tutorial

This lab creates a small CloudSearch domain, defines fields, uploads sample documents, runs search/facet/suggest queries, and cleans up.

Objective

Build a working Amazon CloudSearch index that supports: – keyword search – faceting by genre – sorting by year – autocomplete suggestions from titles

Lab Overview

You will:
1. Create a CloudSearch domain
2. Define index fields (schema)
3. Configure a suggester
4. Upload sample movie documents (JSON)
5. Query the search and suggest endpoints
6. Delete the domain to avoid ongoing cost

Step 1: Set your environment variables

Choose a Region and domain name.

export AWS_REGION="us-east-1"
export DOMAIN_NAME="cs-movies-lab-$(date +%s)"

Expected outcome: You have a unique domain name and a target Region.

Verify AWS identity:

aws sts get-caller-identity

Step 2: Create the CloudSearch domain

aws cloudsearch create-domain \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME"

Check status:

aws cloudsearch describe-domains \
  --region "$AWS_REGION" \
  --domain-names "$DOMAIN_NAME"

Expected outcome: The domain exists and shows a Processing status while AWS provisions resources.

Wait until domain processing finishes before proceeding (you can re-run describe-domains every minute). Some updates later will also trigger processing.
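
Re-running describe-domains by hand works, but the wait can also be scripted. The following is a minimal polling sketch, assuming a response shaped like the describe-domains output (a DomainStatusList whose first entry has a boolean Processing field). The describe callable is injected so the loop can be exercised without AWS access; the function name and defaults are hypothetical.

```python
import time
from typing import Any, Callable


def wait_until_ready(describe: Callable[[], dict[str, Any]],
                     poll_seconds: float = 60.0,
                     max_attempts: int = 60,
                     sleep: Callable[[float], None] = time.sleep) -> dict[str, Any]:
    """Poll until the domain reports Processing == False, then return its status."""
    for attempt in range(1, max_attempts + 1):
        status = describe()["DomainStatusList"][0]
        if not status.get("Processing", False):
            return status
        if attempt < max_attempts:
            sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"domain still processing after {max_attempts} checks")
```

With boto3 installed, describe could be something like lambda: client.describe_domains(DomainNames=[name]), where client is boto3.client("cloudsearch"). The same helper can be reused after each later step that triggers processing.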

Step 3: Retrieve the domain endpoints

Once processing is complete, capture endpoints.

aws cloudsearch describe-domains \
  --region "$AWS_REGION" \
  --domain-names "$DOMAIN_NAME" \
  --query "DomainStatusList[0].{doc:DocService.Endpoint,search:SearchService.Endpoint}"

You should see two hostnames: the document service endpoint (DocService.Endpoint) and the search service endpoint (SearchService.Endpoint).

Store them:

DOC_ENDPOINT=$(aws cloudsearch describe-domains \
  --region "$AWS_REGION" \
  --domain-names "$DOMAIN_NAME" \
  --query "DomainStatusList[0].DocService.Endpoint" \
  --output text)

SEARCH_ENDPOINT=$(aws cloudsearch describe-domains \
  --region "$AWS_REGION" \
  --domain-names "$DOMAIN_NAME" \
  --query "DomainStatusList[0].SearchService.Endpoint" \
  --output text)

echo "DOC_ENDPOINT=$DOC_ENDPOINT"
echo "SEARCH_ENDPOINT=$SEARCH_ENDPOINT"

Expected outcome: You have two endpoint hostnames (no protocol yet).

Step 4: Set a safe access policy (lab-only)

For production, you should restrict access to specific IAM principals and conditions. For this lab, the safest low-friction approach is usually to restrict by your current public IP.

Get your public IP:

MY_IP=$(curl -s https://checkip.amazonaws.com | tr -d '\n')
echo "$MY_IP"

Create a policy file that allows only your IP to call document and search endpoints. Save as cloudsearch-access-policy.json:

cat > cloudsearch-access-policy.json <<EOF
{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Sid":"LabAccessBySourceIp",
      "Effect":"Allow",
      "Principal":"*",
      "Action":[
        "cloudsearch:search",
        "cloudsearch:suggest",
        "cloudsearch:document"
      ],
      "Condition":{
        "IpAddress":{"aws:SourceIp":"${MY_IP}/32"}
      },
      "Resource":"*"
    }
  ]
}
EOF

Apply it:

aws cloudsearch update-service-access-policies \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME" \
  --access-policies file://cloudsearch-access-policy.json

Expected outcome: Policy update triggers domain processing again. Wait for processing to complete.

Note: Some organizations prefer IAM-signed requests rather than IP-based controls. That is a better production posture in many cases, but it requires signed requests in your client. Verify recommended patterns in official docs.
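
If the policy is generated from a script rather than a heredoc, it helps to validate the IP address first, since an empty or malformed value would otherwise end up in the policy unnoticed. The following is a minimal sketch that reproduces the lab policy from this step; the function name is hypothetical, and the policy content is copied from the lab rather than being a recommended production policy.

```python
import ipaddress
import json


def ip_restricted_policy(ip: str) -> str:
    """Return the lab access policy as JSON, restricted to a single address."""
    # ip_network raises ValueError for empty or malformed input.
    network = ipaddress.ip_network(ip, strict=False)
    statement = {
        "Sid": "LabAccessBySourceIp",
        "Effect": "Allow",
        "Principal": "*",
        "Action": ["cloudsearch:search", "cloudsearch:suggest", "cloudsearch:document"],
        "Condition": {"IpAddress": {"aws:SourceIp": str(network)}},
        "Resource": "*",
    }
    return json.dumps({"Version": "2012-10-17", "Statement": [statement]}, indent=2)


# 203.0.113.7 is a documentation-range placeholder address.
print(ip_restricted_policy("203.0.113.7"))
```

The output can be written to cloudsearch-access-policy.json and applied with the update-service-access-policies command shown above.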

Step 5: Define index fields (schema)

Create common fields: – title as text (searchable) – year as int (filter/sort) – genres as literal-array (filter/facet) – plot as text (searchable) – id as literal (exact match)

The AWS CLI exposes field options on define-index-field as individual flags (for example --return-enabled) rather than as a JSON options document; run aws cloudsearch define-index-field help to confirm the flags in your CLI version. _en_default_ is the built-in English analysis scheme.

Define title:

aws cloudsearch define-index-field \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME" \
  --name "title" \
  --type "text" \
  --return-enabled true \
  --sort-enabled true \
  --highlight-enabled true \
  --analysis-scheme "_en_default_"

Define plot:

aws cloudsearch define-index-field \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME" \
  --name "plot" \
  --type "text" \
  --return-enabled true \
  --highlight-enabled true \
  --analysis-scheme "_en_default_"

Define year:

aws cloudsearch define-index-field \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME" \
  --name "year" \
  --type "int" \
  --return-enabled true \
  --sort-enabled true \
  --facet-enabled true \
  --search-enabled true

Define genres:

aws cloudsearch define-index-field \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME" \
  --name "genres" \
  --type "literal-array" \
  --return-enabled true \
  --facet-enabled true \
  --search-enabled true

Define id:

aws cloudsearch define-index-field \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME" \
  --name "id" \
  --type "literal" \
  --return-enabled true \
  --search-enabled true

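Expected outcome: Index fields are defined and flagged as needing indexing (describe-domains reports RequiresIndexDocuments as true until index-documents runs). If the upload in Step 7 is rejected because a field is not yet active, run aws cloudsearch index-documents first, wait for processing to finish, and retry.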

Check fields:

aws cloudsearch describe-index-fields \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME"
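
Because the field types decide what a document may contain, a client-side check before upload can catch mistakes early. The following is a minimal sketch, assuming the five lab fields defined above. CloudSearch performs the authoritative validation, and this check may be stricter or looser than the service; the names are hypothetical.

```python
from typing import Any

# Mirrors the lab schema defined in this step.
LAB_SCHEMA = {
    "id": "literal",
    "title": "text",
    "plot": "text",
    "year": "int",
    "genres": "literal-array",
}


def validate_fields(fields: dict[str, Any],
                    schema: dict[str, str] = LAB_SCHEMA) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for name, value in fields.items():
        kind = schema.get(name)
        if kind is None:
            problems.append(f"{name}: not defined in the schema")
        elif kind in ("text", "literal") and not isinstance(value, str):
            problems.append(f"{name}: expected a string")
        elif kind == "int" and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{name}: expected an integer")
        elif kind == "literal-array" and not (
            isinstance(value, list) and all(isinstance(v, str) for v in value)
        ):
            problems.append(f"{name}: expected a list of strings")
    return problems


print(validate_fields({"title": "Dune", "year": "2021", "rating": 8}))
```

Running it over every document before building a batch gives clearer error messages than a rejected upload.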

Step 6: Configure a suggester for autocomplete

Create a suggester that uses title.

aws cloudsearch define-suggester \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME" \
  --suggester '{"SuggesterName":"title_suggest","DocumentSuggesterOptions":{"SourceField":"title","FuzzyMatching":"none","SortExpression":"_score"}}'

Expected outcome: Suggester is defined; domain processes the change.

Wait for processing to complete.

Step 7: Create sample documents and upload them

CloudSearch document uploads use a JSON format with actions (add, delete). Create movies-docs.json:

cat > movies-docs.json <<'EOF'
[
  {
    "type": "add",
    "id": "m1",
    "fields": {
      "id": "m1",
      "title": "The Matrix",
      "year": 1999,
      "genres": ["sci-fi", "action"],
      "plot": "A hacker learns about the true nature of reality and his role in the war against its controllers."
    }
  },
  {
    "type": "add",
    "id": "m2",
    "fields": {
      "id": "m2",
      "title": "Inception",
      "year": 2010,
      "genres": ["sci-fi", "thriller"],
      "plot": "A thief who steals corporate secrets through dream-sharing technology is given a chance at redemption."
    }
  },
  {
    "type": "add",
    "id": "m3",
    "fields": {
      "id": "m3",
      "title": "Interstellar",
      "year": 2014,
      "genres": ["sci-fi", "drama"],
      "plot": "A team travels through a wormhole in space in an attempt to ensure humanity's survival."
    }
  },
  {
    "type": "add",
    "id": "m4",
    "fields": {
      "id": "m4",
      "title": "The Social Network",
      "year": 2010,
      "genres": ["drama"],
      "plot": "The founding of a social networking site and the resulting legal battles."
    }
  }
]
EOF

Upload documents using the cloudsearchdomain CLI. Pass the document endpoint as --endpoint-url and the batch as a local file path in --documents.

aws cloudsearchdomain upload-documents \
  --region "$AWS_REGION" \
  --endpoint-url "https://$DOC_ENDPOINT" \
  --content-type "application/json" \
  --documents "movies-docs.json"

Expected outcome: A response indicating how many documents were added and the status of the upload.
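
The four lab documents fit in a single request, but larger datasets have to be split into batches. The following is a minimal sketch, assuming a 5 MB per-batch limit (the documented figure at the time of writing; verify it in the developer guide) and operations in the add/delete format shown above. The function name is hypothetical.

```python
import json
from typing import Any, Iterable, Iterator

MAX_BATCH_BYTES = 5 * 1024 * 1024  # assumed batch limit; verify in the docs


def chunk_batches(operations: Iterable[dict[str, Any]],
                  max_bytes: int = MAX_BATCH_BYTES) -> Iterator[bytes]:
    """Yield JSON array payloads, each no larger than max_bytes."""
    batch: list[bytes] = []
    size = 2  # the enclosing "[" and "]"
    for op in operations:
        encoded = json.dumps(op, separators=(",", ":")).encode("utf-8")
        if len(encoded) + 2 > max_bytes:
            raise ValueError(f"operation {op.get('id')!r} exceeds max_bytes on its own")
        needed = len(encoded) + (1 if batch else 0)  # +1 for the separating comma
        if size + needed > max_bytes:
            yield b"[" + b",".join(batch) + b"]"
            batch, size = [], 2
            needed = len(encoded)
        batch.append(encoded)
        size += needed
    if batch:
        yield b"[" + b",".join(batch) + b"]"
```

Each yielded payload can be written to a file and uploaded with upload-documents, or sent directly from an SDK client.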

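Because Steps 5 and 6 changed the index configuration, run index-documents so the new fields and the suggester are built into the index:

aws cloudsearch index-documents \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME"

Wait for indexing to complete (describe-domains shows Processing as false); this can take several minutes.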

Step 8: Run search queries (keyword, facet, sort, highlight)

8.1 Basic keyword search

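Search for “dream”. The cloudsearchdomain CLI takes the query string as --search-query, because --query is reserved for the CLI's global JMESPath output filter:

aws cloudsearchdomain search \
  --region "$AWS_REGION" \
  --endpoint-url "https://$SEARCH_ENDPOINT" \
  --search-query "dream" \
  --query-parser "simple" \
  --return "_all_fields"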

Expected outcome: You should see Inception in results because “dream” appears in plot.

8.2 Faceting by genres

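Filter to the sci-fi genre and facet on genres. Because genres is a literal-array field, it is matched here with the structured parser; the simple parser searches text fields by default. The --facet option takes a JSON object keyed by field name:

aws cloudsearchdomain search \
  --region "$AWS_REGION" \
  --endpoint-url "https://$SEARCH_ENDPOINT" \
  --search-query "genres:'sci-fi'" \
  --query-parser "structured" \
  --facet '{"genres":{}}' \
  --return "title,year,genres"

Expected outcome: Three hits and a facets section with counts by genre.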

8.3 Sorting by year

Search all docs and sort by year descending:

aws cloudsearchdomain search \
  --region "$AWS_REGION" \
  --endpoint-url "https://$SEARCH_ENDPOINT" \
  --search-query "matchall" \
  --query-parser "structured" \
  --sort "year desc" \
  --return "title,year"

Expected outcome: Interstellar (2014) appears before older movies.

8.4 Highlighting plot matches

Search for “reality” and highlight plot:

aws cloudsearchdomain search \
  --region "$AWS_REGION" \
  --endpoint-url "https://$SEARCH_ENDPOINT" \
  --search-query "reality" \
  --query-parser "simple" \
  --highlight '{"plot":{"format":"text","max_phrases":2}}' \
  --return "title,plot"

Expected outcome: Highlighted snippet shows the matching text in plot.

Step 9: Query suggestions (autocomplete)

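Get suggestions for “int”. The suggest command takes the partial string as --suggest-query and is sent to the search endpoint:

aws cloudsearchdomain suggest \
  --region "$AWS_REGION" \
  --endpoint-url "https://$SEARCH_ENDPOINT" \
  --suggester "title_suggest" \
  --suggest-query "int"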

Expected outcome: Suggestions should include “Interstellar”. With FuzzyMatching set to none, only titles that begin with the typed prefix are expected, so “Inception” should not appear unless you enable fuzzy matching in the suggester.
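
An autocomplete widget usually needs only the suggestion strings from the response. The following is a minimal sketch, assuming a response with a suggest object containing a suggestions list whose entries have suggestion, score, and id keys. The sample response is illustrative, not captured output, and the function name is hypothetical.

```python
from typing import Any


def suggestion_texts(response: dict[str, Any], limit: int = 5) -> list[str]:
    """Return up to `limit` suggestion strings, dropping case-insensitive duplicates."""
    seen: set[str] = set()
    texts: list[str] = []
    for item in response.get("suggest", {}).get("suggestions", []):
        text = item["suggestion"]
        key = text.casefold()
        if key not in seen:
            seen.add(key)
            texts.append(text)
    return texts[:limit]


# Illustrative response shape, not real service output.
sample = {
    "status": {"timems": 2, "rid": "example"},
    "suggest": {
        "query": "int",
        "found": 1,
        "suggestions": [{"suggestion": "Interstellar", "score": 0, "id": "m3"}],
    },
}
print(suggestion_texts(sample))  # ['Interstellar']
```

De-duplicating matters once several documents share a title, since the suggester returns one entry per matching document.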

Validation

Use this checklist to confirm the lab worked:
describe-domains shows Processing as false and RequiresIndexDocuments as false for the domain.
upload-documents returned success counts.
The search for dream returns Inception.
The facet query returns counts for genres.
The sort query returns results ordered by year.
The suggest query returns suggestions for int.

Troubleshooting

Common issues and fixes:

Access denied / 403 when searching or uploading
Cause: Access policy doesn’t allow your source IP or requires signed requests.
Fix:
- Re-check your public IP (it can change).
- Update the access policy and wait for processing.
- If your organization requires IAM signing, use an SDK or signing-capable HTTP client and adjust policy accordingly.

Endpoint is empty in describe-domains
Cause: Domain is still provisioning or processing.
Fix: Wait and re-run describe-domains.

No results after uploading
Cause: Indexing hasn’t completed yet or fields aren’t configured as searchable/returnable.
Fix:
- Run index-documents and wait.
- Confirm field options: SearchEnabled, ReturnEnabled.
- Verify you’re querying the right parser (simple vs structured).

CLI says unknown operation cloudsearchdomain
Cause: AWS CLI installation is incomplete/outdated.
Fix: Upgrade to AWS CLI v2 and confirm aws cloudsearchdomain help works.

Validation errors when defining fields
Cause: Field options incompatible with field type.
Fix: Verify field option JSON for the type in the CloudSearch developer guide.

Cleanup

To avoid ongoing charges, delete the domain:

aws cloudsearch delete-domain \
  --region "$AWS_REGION" \
  --domain-name "$DOMAIN_NAME"

Confirm it is removed:

aws cloudsearch describe-domains \
  --region "$AWS_REGION" \
  --domain-names "$DOMAIN_NAME"

Expected outcome: The domain no longer appears (or shows deleting status briefly).

11. Best Practices

Architecture best practices

Separate domains by environment: dev/stage/prod should not share a domain.
Treat CloudSearch as a read-optimized index: keep a source-of-truth database (RDS/DynamoDB/S3).
Choose ingestion style intentionally:
Batch rebuilds for large nightly pipelines
Event-driven updates for near-real-time search
Design for reindexing: schema changes can require reprocessing; build repeatable ingestion jobs.

IAM/security best practices

Prefer least privilege for CloudSearch control-plane permissions.
Restrict endpoint access with:
IAM principals (preferred for many enterprises) and/or
IP conditions (useful for labs, limited admin networks)
Avoid public, anonymous policies for document endpoints in production.

Cost best practices

Start small and scale based on metrics.
Use minimal replicas/partitions required by your SLOs.
Delete unused dev/test domains promptly.
Monitor growth of indexed data and query volume to forecast scaling.

Performance best practices

Use correct field types:
literal for exact matches, facets, filters
text for full-text relevance
numeric/date for sorting and range filtering
Limit returned fields to only what you need (return parameter).
Cache “hot” queries at the application or CDN layer if suitable.
Batch document uploads to reduce overhead and improve throughput.
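
The batching advice above can be sketched as follows. This assumes the JSON document batch format (add and delete operations), a DOC_ENDPOINT variable holding the domain's document endpoint, and a 5 MB per-batch limit. The variable name, the sample IDs, and the limit value are assumptions here; confirm the current limit in the CloudSearch limits documentation.

```shell
# write_sample_batch FILE
# Writes a tiny batch: one "add" with fields and one "delete" by id.
# The IDs and field values are made-up sample data.
write_sample_batch() {
  cat > "$1" <<'EOF'
[
  {"type": "add", "id": "doc-1",
   "fields": {"title": "Interstellar", "year": 2014, "genres": ["Sci-Fi"]}},
  {"type": "delete", "id": "doc-0"}
]
EOF
}

# batch_within_limit FILE [MAX_BYTES]
# Succeeds if FILE is no larger than MAX_BYTES (default 5 MB, assumed).
batch_within_limit() {
  max="${2:-5242880}"
  bytes=$(( $(wc -c < "$1") ))
  [ "$bytes" -le "$max" ]
}

# upload_batch FILE
# Uploads one JSON batch to the document endpoint after the size check.
upload_batch() {
  batch_within_limit "$1" || { echo "batch too large: $1" >&2; return 1; }
  aws cloudsearchdomain upload-documents \
    --region "$AWS_REGION" \
    --endpoint-url "https://$DOC_ENDPOINT" \
    --content-type "application/json" \
    --documents "$1"
}
```

Grouping many documents into one file and calling upload_batch once keeps per-request overhead low compared with uploading documents one at a time.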

Reliability best practices

Use replicas for availability and read scaling.
Design clients with retries and backoff for transient failures (5xx).
Implement ingestion retry and dead-letter patterns (e.g., SQS DLQ) if using event-driven pipelines.
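
A retry wrapper for CLI-based clients might look like the sketch below. The function name retry_with_backoff is hypothetical, and the wrapper retries on any non-zero exit status rather than on 5xx responses specifically. It doubles the delay after each failure and adds no jitter, which a production client would likely want.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, doubling the
# delay after each failure. Returns the last exit status when giving up.
retry_with_backoff() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    status=$?
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return "$status"
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, retry_with_backoff 5 2 followed by an aws cloudsearchdomain upload-documents command would try up to five times, waiting 2, 4, 8, and 16 seconds between attempts.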

Operations best practices

Set CloudWatch alarms on:
elevated 5xx/4xx rates
increased search latency
indexing latency/backlogs (metrics vary; verify relevant metrics in docs)
Track schema changes in version control.
Automate domain provisioning (IaC) where possible; verify CloudSearch support in your chosen IaC tool.
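
Before defining alarms, it helps to check which metrics your domain actually publishes. The sketch below assumes the metrics live in the AWS/CloudSearch namespace with a DomainName dimension; treat both as assumptions to confirm in the CloudWatch documentation. The function name is hypothetical.

```shell
# list_cloudsearch_metrics DOMAIN
# Prints the distinct metric names CloudWatch currently holds for DOMAIN,
# one per line. Assumes AWS_REGION is set.
list_cloudsearch_metrics() {
  aws cloudwatch list-metrics \
    --region "$AWS_REGION" \
    --namespace "AWS/CloudSearch" \
    --dimensions "Name=DomainName,Value=$1" \
    --query 'Metrics[].MetricName' \
    --output text | tr '\t' '\n' | sort -u
}
```

Base alarm definitions on the names this returns instead of assuming that a particular latency or error-rate metric exists.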

Governance/tagging/naming best practices

Tag domains with:
Environment, Owner, CostCenter, DataClassification
Use clear naming:
myapp-search-prod, myapp-search-dev
Document access policies and review them periodically.

12. Security Considerations

Identity and access model

Control plane: IAM permissions govern domain creation and configuration.
Data plane: Domain access policy governs who can call:
cloudsearch:search
cloudsearch:suggest
cloudsearch:document

Use policies to:
- restrict by IAM principal (role/user)
- restrict by source IP range
- add other conditions supported by AWS policy language (verify CloudSearch policy evaluation in docs)
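
As an illustration of an IP-restricted policy, the sketch below generates the policy JSON for a given CIDR and applies it to a domain. The function names are hypothetical, and this shape suits a lab. Production policies would more likely name specific IAM principals and omit cloudsearch:document for search-only clients.

```shell
# ip_restricted_policy CIDR
# Prints an access policy that allows search, suggest, and document
# requests only from CIDR (for example, 203.0.113.10/32).
ip_restricted_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "*"},
      "Action": ["cloudsearch:search", "cloudsearch:suggest", "cloudsearch:document"],
      "Condition": {"IpAddress": {"aws:SourceIp": "$1"}}
    }
  ]
}
EOF
}

# apply_policy DOMAIN CIDR
# Replaces the domain's access policy; the change triggers processing.
apply_policy() {
  aws cloudsearch update-service-access-policies \
    --region "$AWS_REGION" \
    --domain-name "$1" \
    --access-policies "$(ip_restricted_policy "$2")"
}
```

Because the update replaces the whole policy, review the generated JSON (for example, by running ip_restricted_policy on its own) before applying it.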

Encryption

In transit: Use HTTPS endpoints for document uploads and search queries.
At rest: CloudSearch is managed; AWS handles underlying storage. The exact encryption-at-rest behavior and whether you can choose customer-managed KMS keys should be verified in official docs (CloudSearch historically has fewer customer-managed encryption controls than some newer services).

Network exposure

CloudSearch endpoints are typically reachable via public AWS DNS.
Rely on access policies to prevent exposure.
If your compliance posture requires private-only endpoints, evaluate alternatives (often OpenSearch Service in VPC).

Secrets handling

If using IAM signing from apps:
prefer IAM roles (EC2 instance profiles, ECS task roles, EKS IRSA)
avoid long-lived access keys in code or configs
Store any needed secrets in AWS Secrets Manager or SSM Parameter Store.

Audit/logging

CloudTrail logs CloudSearch API actions on the control plane (create domain, define fields, policy updates).
For query-level auditing, implement application-side request logging (user, query, filters, timing); do not assume that CloudSearch provides managed request logs for the search and document endpoints. Verify current CloudSearch logging options in official docs.

Compliance considerations

Data classification: ensure you understand what data is being indexed (PII, secrets).
Retention/deletion: implement deletes in the indexing pipeline when records are removed in the source of truth.
Access reviews: regularly review domain policies and IAM roles.

Common security mistakes

Setting an access policy with "Principal":"*" and no conditions in production
Allowing document endpoint writes from broad networks
Indexing sensitive fields that don’t belong in a search index (password resets, tokens, secrets)
Not separating tenant data properly (missing tenant filter)

Secure deployment recommendations

Use separate AWS accounts or at least separate domains per environment.
Restrict document endpoint to ingestion roles and trusted networks only.
Enforce tenant isolation by design: tenant_id fields + mandatory filters at the app layer.
Use least privilege IAM for control plane operations.
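
Mandatory tenant filtering can be sketched as a wrapper that refuses to search without a valid tenant ID. This assumes a literal field named tenant_id in the domain schema and uses the structured field:'value' form for the filter. The function names and the allowed character set are hypothetical choices.

```shell
# tenant_filter TENANT_ID
# Prints a structured filter expression for the tenant, or fails if the
# ID is empty or contains anything other than letters, digits, "-" or "_".
tenant_filter() {
  case "$1" in
    ''|*[!A-Za-z0-9_-]*)
      echo "invalid tenant id" >&2
      return 1
      ;;
  esac
  printf "tenant_id:'%s'\n" "$1"
}

# tenant_search TENANT_ID TERM
# Runs a search that is always scoped to one tenant via --filter-query.
# Assumes AWS_REGION and SEARCH_ENDPOINT are set.
tenant_search() {
  fq=$(tenant_filter "$1") || return 1
  aws cloudsearchdomain search \
    --region "$AWS_REGION" \
    --endpoint-url "https://$SEARCH_ENDPOINT" \
    --search-query "$2" \
    --query-parser "simple" \
    --filter-query "$fq" \
    --return "title"
}
```

Validating the ID before building the filter keeps a crafted tenant value from widening the query. The same check belongs in whatever application layer actually issues searches.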

13. Limitations and Gotchas

Because limits evolve, confirm current values in the official docs. Common practical constraints include:

Known limitations / design constraints

CloudSearch is optimized for “classic” lexical search, not semantic/vector search.
Limited ecosystem compared with OpenSearch (plugins, dashboards, broad ingest tooling).
Schema changes can trigger processing and may require careful rollout planning.

Quotas

Maximum domains per account/Region
Maximum index fields per domain
Document size and batch size limits
Throughput constraints per instance type

Verify in official docs: CloudSearch quotas and limits.

Regional constraints

Not available in every Region; confirm before you design multi-Region architectures.

Pricing surprises

Domains bill continuously while running.
Replicas and partitions multiply instance-hours.
Cross-Region traffic adds latency and transfer costs.
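
The multiplication behind instance-hours can be made concrete with a small estimator. This is a sketch under stated assumptions: the hourly rate is a placeholder you supply from the pricing page, REPLICAS counts every instance per partition (including the first), a month is taken as 730 hours, and data transfer is ignored.

```shell
# estimate_monthly_cost HOURLY_RATE PARTITIONS REPLICAS [HOURS]
# Prints the instance count and an estimated monthly instance cost.
# HOURLY_RATE is a placeholder; take the real figure from the pricing page.
estimate_monthly_cost() {
  awk -v rate="$1" -v parts="$2" -v reps="$3" -v hours="${4:-730}" 'BEGIN {
    instances = parts * reps
    printf "instances=%d monthly_usd=%.2f\n", instances, rate * instances * hours
  }'
}
```

With a hypothetical rate of 0.10 per hour, estimate_monthly_cost 0.10 2 2 reports four instances, which shows how quickly adding one partition or replica changes the bill.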

Compatibility issues

Query syntax and field configuration options are specific to CloudSearch.
Migrating from Elasticsearch/OpenSearch or Solr may require query and schema translation.

Operational gotchas

Access policy mistakes can lock out ingestion or unintentionally expose endpoints.
Scaling changes and schema updates can take time and temporarily affect ingestion/query behavior.
Autocomplete suggestions require correct suggester configuration and relevant source field content.

Migration challenges

Relevance differences between engines require testing.
Reindex pipelines must be rebuilt around CloudSearch document upload format and endpoints.

Vendor-specific nuances

CloudSearch has a distinct API model (control plane vs domain endpoints).
Many “modern search platform” features are not part of CloudSearch; validate requirements early.

14. Comparison with Alternatives

Amazon CloudSearch often competes with other search options depending on scale, analytics needs, and operational preferences.

Option	Best For	Strengths	Weaknesses	When to Choose
Amazon CloudSearch	Application/site search with facets and managed ops	Simple managed service; structured + text search; suggest; straightforward APIs	Smaller ecosystem; fewer advanced analytics/observability tools; networking model may be limiting for some	You need classic app search and want a managed AWS-native service with minimal ops
Amazon OpenSearch Service (AWS)	Search + analytics (logs, metrics, text search)	Rich query/aggregation, dashboards, VPC support, broader ecosystem	More operational tuning; cluster sizing and index management complexity	You need analytics, aggregations, dashboards, VPC-only access, or broader compatibility
Amazon Kendra	Enterprise semantic search across sources	Connectors, relevance tuning, natural language capabilities	Different cost model; may be overkill for simple catalogs	You need semantic/enterprise search and managed connectors
Aurora/RDS + full-text features	Small-scale search within a relational app	Fewer moving parts; transactional + search in one	Limited relevance and scaling for complex search; can overload DB	You have small datasets and want basic search without another service
Self-managed OpenSearch/Elasticsearch	Maximum control, custom plugins	Full control; flexible	Highest ops burden; patching, scaling, failures	You have a platform team and need deep customization
Apache Solr (self-managed)	Solr-native organizations	Mature search engine	Ops-heavy	You already run Solr and need full control
Azure AI Search	Managed search on Azure	Tight Azure integration; AI enrichment options	Cross-cloud latency/integration if on AWS	You’re primarily on Azure
Google Cloud Discovery Engine / Vertex AI Search	Managed search on GCP	Google-managed search features	Cross-cloud	You’re primarily on GCP

15. Real-World Example

Enterprise example: Customer support portal search at a SaaS company

Problem
A SaaS company has:
- 2 million support tickets
- internal notes and customer-visible articles
- agents who need fast search with filters (status, priority, product area)
They want to avoid operating their own search clusters.

Proposed architecture
- Source of truth: Aurora or DynamoDB
- CDC/event stream: DynamoDB Streams or app events to EventBridge
- Lambda transformer:
  - normalizes fields
  - removes sensitive data
  - formats CloudSearch document batches
- CloudSearch domain:
  - title and body as text
  - status, priority, product as literal
  - updated_at as date
- Web/API tier calls CloudSearch search endpoint
- CloudWatch alarms on latency and errors
- Access policy: allow only app roles and ingestion roles; deny public access

Why Amazon CloudSearch was chosen
- Classic text+facet search is sufficient
- Managed service reduces ops overhead
- Straightforward ingestion and schema approach

Expected outcomes
- Agents find tickets faster with facets and highlighting
- Reduced DB load (search offloaded from relational queries)
- Predictable operational model with CloudWatch monitoring

Startup/small-team example: E-commerce storefront search

Problem
A small team needs product search with:
- title/description search
- filters by brand/category
- sort by price
They have limited DevOps capacity.

Proposed architecture
- Product catalog in DynamoDB (or Shopify export) as source of truth
- Nightly batch export to S3
- A scheduled job (Lambda or ECS task) pushes batch updates to CloudSearch
- One CloudSearch domain for production, one for staging

Why Amazon CloudSearch was chosen
- Faster to implement than running OpenSearch clusters
- Good enough features for catalog search (facets/sort)
- Lower operational overhead

Expected outcomes
- Better conversion (search + autocomplete)
- Faster page load with optimized search queries
- Clear cost control by right-sizing instances and deleting dev domains

16. FAQ

Is Amazon CloudSearch still available on AWS?
Yes, Amazon CloudSearch remains available. For new projects, AWS customers often also evaluate Amazon OpenSearch Service or Amazon Kendra depending on requirements.
What is a CloudSearch “domain”?
A domain is the primary resource that contains your indexed data, schema (index fields), endpoints, scaling settings, and access policy.
Is Amazon CloudSearch serverless?
No. CloudSearch uses a provisioned capacity model (instance types, partitions, replicas). You pay for running capacity.
How do I ingest data into CloudSearch?
You upload documents (JSON or XML) to the document endpoint using the CloudSearch document format (add/delete). Many teams use batch jobs or event-driven pipelines (Lambda).
Can CloudSearch replace my database?
No. Treat it as a search index. Keep a source-of-truth system (RDS/DynamoDB/S3). Use CloudSearch for discovery and retrieval of IDs, then fetch authoritative records from your DB if needed.
Does CloudSearch support faceted navigation?
Yes, via facets on fields configured as facetable (commonly literal/literal-array, sometimes numeric fields depending on configuration).
Can I do sorting (e.g., by date or price)?
Yes, if the field is defined with sorting enabled and the data type supports it.
Does CloudSearch support autocomplete?
Yes, via suggesters configured on one or more source fields.
How do I secure my CloudSearch endpoints?
Use domain access policies to restrict actions (search, suggest, document) to specific IAM principals and/or IP ranges. Avoid public write access.
Can CloudSearch be placed inside a VPC?
CloudSearch domain endpoints are typically public AWS endpoints secured by access policies rather than resources deployed inside your VPC. Verify current networking and access options in official docs. If you require VPC-only endpoints, evaluate Amazon OpenSearch Service.
How do I handle multi-tenancy?
Add a tenant_id field and enforce tenant filters in every query. Consider access controls and separate domains for stronger isolation when required.
How do schema changes affect indexing?
Many schema changes trigger domain processing and can require reindexing or at least reprocessing. Plan schema with versioning and test changes in staging first.
How do I monitor CloudSearch health?
Use Amazon CloudWatch metrics for latency, error rates, and capacity signals. Add app-level monitoring for query success and business KPIs.
What are common reasons for “no results”?
Indexing not completed, wrong query parser (simple vs structured), fields not searchable/returnable, or searching the wrong field configuration.
How do I estimate costs?
Use the CloudSearch pricing page for your Region and model instance-hours based on instance type × partitions × replicas. Use the AWS Pricing Calculator for monthly estimates.
Is CloudSearch good for analytics workloads?
For “analytics” in the sense of searching content and exploring facets, it can help. For log analytics and heavy aggregations, Amazon OpenSearch Service is usually the closer fit.
Can I migrate from CloudSearch to OpenSearch later?
Yes, but expect rework: schema mapping, query syntax differences, reindexing pipeline changes, and relevance retuning.

17. Top Online Resources to Learn Amazon CloudSearch

Resource Type	Name	Why It Is Useful
Official Documentation	CloudSearch Developer Guide: https://docs.aws.amazon.com/cloudsearch/latest/developerguide/	Primary authoritative reference for domains, index fields, access policies, and querying
Official API Reference	CloudSearch API Reference: https://docs.aws.amazon.com/cloudsearch/latest/APIReference/	Details for control plane APIs (create domain, define fields, policies)
Official CLI Reference	AWS CLI `cloudsearch`: https://docs.aws.amazon.com/cli/latest/reference/cloudsearch/	Practical commands for managing domains and schema
Official CLI Reference	AWS CLI `cloudsearchdomain`: https://docs.aws.amazon.com/cli/latest/reference/cloudsearchdomain/	Commands for document upload, search, and suggest
Official Pricing	Amazon CloudSearch Pricing: https://aws.amazon.com/cloudsearch/pricing/	Accurate pricing dimensions by Region
Cost Estimation	AWS Pricing Calculator: https://calculator.aws/#/	Build monthly estimates including instance-hours and data transfer
Security Guidance	IAM JSON Policy Reference: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements.html	Helps correctly craft and validate CloudSearch access policies
Monitoring	Amazon CloudWatch: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/	Guidance on metrics, alarms, and operational monitoring patterns
Logging/Auditing	AWS CloudTrail: https://docs.aws.amazon.com/awscloudtrail/latest/userguide/	Audit control-plane actions for governance and compliance
Related Architecture	AWS Architecture Center: https://aws.amazon.com/architecture/	Patterns for ingestion pipelines and managed search alternatives
Alternative Service	Amazon OpenSearch Service Docs: https://docs.aws.amazon.com/opensearch-service/	Useful for deciding when OpenSearch is a better fit
Alternative Service	Amazon Kendra Docs: https://docs.aws.amazon.com/kendra/	Useful when you need semantic enterprise search rather than classic search

18. Training and Certification Providers

Institute	Suitable Audience	Likely Learning Focus	Mode	Website URL
DevOpsSchool.com	DevOps engineers, cloud engineers, architects	AWS + DevOps practices; managed services integration; operational readiness	Check website	https://www.devopsschool.com/
ScmGalaxy.com	Students, engineers learning tooling and cloud	DevOps/SCM foundations and cloud operations basics	Check website	https://www.scmgalaxy.com/
CloudOpsNow.in	Cloud ops and platform teams	Cloud operations practices, monitoring, automation	Check website	https://www.cloudopsnow.in/
SreSchool.com	SREs, production engineers	Reliability engineering, SLOs, monitoring, incident response	Check website	https://www.sreschool.com/
AiOpsSchool.com	Ops teams exploring AIOps	Observability, automation, event correlation concepts	Check website	https://www.aiopsschool.com/

19. Top Trainers

Platform/Site	Likely Specialization	Suitable Audience	Website URL
RajeshKumar.xyz	DevOps/cloud training content	Beginners to intermediate engineers	https://rajeshkumar.xyz/
devopstrainer.in	DevOps coaching and courses	Engineers seeking guided learning	https://www.devopstrainer.in/
devopsfreelancer.com	Freelance DevOps help/training	Teams needing practical enablement	https://www.devopsfreelancer.com/
devopssupport.in	DevOps support/training platform	Ops teams and engineers	https://www.devopssupport.in/

20. Top Consulting Companies

Company Name	Likely Service Area	Where They May Help	Consulting Use Case Examples	Website URL
cotocus.com	Cloud/DevOps consulting	Architecture reviews, implementation support, operations	Build ingestion pipelines, secure access policies, monitoring setup	https://cotocus.com/
DevOpsSchool.com	DevOps and cloud consulting	Delivery, automation, training + consulting	IaC rollout, CI/CD for schema changes, operational best practices	https://www.devopsschool.com/
DEVOPSCONSULTING.IN	DevOps consulting services	Platform automation, reliability improvements	Observability, incident response processes, managed search integration patterns	https://www.devopsconsulting.in/

21. Career and Learning Roadmap

What to learn before Amazon CloudSearch

AWS fundamentals: IAM, Regions, networking basics, CloudWatch/CloudTrail
Data modeling basics: JSON documents, schema design
Search concepts: inverted indexes, relevance, faceting, filtering vs searching

What to learn after Amazon CloudSearch

Amazon OpenSearch Service for richer analytics/search ecosystems
Event-driven ingestion on AWS:
DynamoDB Streams, EventBridge, Kinesis
Lambda patterns (retries, DLQs)
Observability: metrics, logs, tracing; alert design
Security deep dives: IAM policies, least privilege, data classification

Job roles that use it

Cloud engineer / cloud developer
Solutions architect
DevOps engineer / SRE
Backend engineer building search-driven features
Platform engineer (managed services enablement)

Certification path (AWS)

CloudSearch is not typically a stand-alone certification topic, but it appears as part of broader AWS architecture knowledge. Relevant certifications:
- AWS Certified Solutions Architect – Associate/Professional
- AWS Certified Developer – Associate
- AWS Certified SysOps Administrator – Associate
(Verify the current exam guides for coverage emphasis.)

Project ideas for practice

Build a product catalog search with facets and sort.
Implement a CDC pipeline: DynamoDB Streams → Lambda → CloudSearch.
Add autocomplete and measure “no results” rate improvements.
Create a multi-tenant search API with enforced tenant filters and audit logs.
Build a reindex pipeline that can rebuild from S3 on demand.

22. Glossary

Access Policy: A resource policy attached to a CloudSearch domain controlling who can call search/suggest/document endpoints.
Autocomplete/Suggester: A CloudSearch feature that returns suggestions for partial queries.
Control Plane: AWS APIs used to create/configure domains (schema, scaling, policies).
Data Plane: Endpoints used to upload documents and run queries.
Domain: The main CloudSearch resource that contains your index and endpoints.
Facet: Aggregated counts per field value used for filters (e.g., genre counts).
Field / Index Field: A schema-defined attribute in your documents (title, year, tags).
Filtering: Restricting results by structured constraints (e.g., genre = sci-fi).
Highlighting: Returning snippets that show where matches occurred in a text field.
Partition: A scaling unit used to distribute index data and throughput.
Replica: A copy of a partition to improve availability and read performance.
Reindexing: Rebuilding the index after schema changes or large ingestion changes.
Schema: The set of field definitions and indexing options.
Search Endpoint: The endpoint used to execute search queries.
Document Endpoint: The endpoint used to upload documents for indexing.
Structured Query: A query format that supports boolean logic and field constraints (syntax defined by CloudSearch).

23. Summary

Amazon CloudSearch is a managed AWS service for building classic application search: ingest documents, define fields, and query with full-text relevance plus filters, facets, sorting, highlighting, and suggestions. It matters when you need reliable search features without operating your own search clusters, especially for product catalogs, documentation, and internal portals.

Architecturally, CloudSearch fits as a search index alongside a source-of-truth database, fed by batch exports or event-driven pipelines. Cost is primarily driven by provisioned instance-hours, multiplied by partitions and replicas, plus data transfer. Security hinges on correctly scoping domain access policies and using HTTPS; validate encryption-at-rest and private networking requirements in official docs if you have strict compliance constraints.

Use Amazon CloudSearch when your requirements align with classic search and you value simplicity; consider Amazon OpenSearch Service or Amazon Kendra when you need deeper analytics, VPC-native deployment, or semantic search features. Next step: build a production-grade ingestion pipeline (CDC + retries + DLQ) and set up CloudWatch alarms based on real query and indexing behavior.
