AWS Amazon Kendra Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Machine Learning (ML) and Artificial Intelligence (AI)

Category

Machine Learning (ML) and Artificial Intelligence (AI)

1. Introduction

Amazon Kendra is an AWS-managed enterprise search service that uses Machine Learning (ML) to help users find accurate answers and relevant documents across many content repositories (documents, wikis, knowledge bases, file shares, and SaaS tools).

In simple terms: you connect Amazon Kendra to your data (for example, an Amazon S3 bucket or a wiki), let it index content, and then your users can search using natural language (for example, “How do I reset my VPN token?”). Kendra returns ranked results and often highlights the exact passage that answers the question.

Technically, Amazon Kendra builds and manages a search index that combines semantic ranking, document understanding, metadata filtering, and (optionally) access control enforcement. It supports ingesting data through pre-built connectors (data sources), custom ingestion APIs, and document enrichment pipelines. Applications query Kendra via AWS APIs/SDKs, and can integrate the results into portals, chatbots, and Retrieval-Augmented Generation (RAG) workflows (for example, using Amazon Bedrock) without having to operate their own search infrastructure.
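The query path described above can be sketched in Python with boto3 (the AWS SDK the tutorial later uses). The index ID and query text are placeholders; the helper that builds the request parameters is separated out so it can be inspected without AWS credentials.

```python
# Sketch of querying an Amazon Kendra index with boto3.
# The index ID is a placeholder; results come back ranked by Kendra.

def build_query_params(index_id: str, query_text: str, page_size: int = 5) -> dict:
    """Build the keyword arguments for kendra.query()."""
    return {
        "IndexId": index_id,
        "QueryText": query_text,
        "PageSize": page_size,
    }

def run_query(index_id: str, query_text: str) -> None:
    # Requires AWS credentials and a real index; shown for illustration.
    import boto3
    kendra = boto3.client("kendra")
    response = kendra.query(**build_query_params(index_id, query_text))
    for item in response.get("ResultItems", []):
        # Each result carries a type, a title, and an excerpt with highlights.
        print(item.get("Type"), item.get("DocumentTitle", {}).get("Text"))

# Usage (requires AWS credentials):
# run_query("<your-index-id>", "How do I reset my VPN token?")
```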

What problem it solves: Traditional keyword search often fails in enterprise environments because content is scattered, titles are inconsistent, users ask questions (not keywords), and relevance depends on context and permissions. Amazon Kendra aims to deliver “enterprise-grade search” with better relevance, easier integration, and managed operations.

2. What is Amazon Kendra?

Official purpose: Amazon Kendra is a fully managed intelligent search service for enterprise content. It is designed to help organizations index and search large volumes of unstructured and semi-structured data stored across AWS and third-party systems.

Core capabilities (high level):
  • Create and manage indexes for enterprise search.
  • Ingest content from data sources/connectors (for example Amazon S3, wiki platforms, CRM/ITSM tools, and web pages) and/or via custom ingestion APIs.
  • Run natural-language queries and return ranked results with highlighted passages and metadata.
  • Apply filters/facets and relevance tuning.
  • Enforce document visibility with access control (when configured and supported by the ingestion method/connector).
  • Improve ingestion quality with document enrichment (for example, extracting metadata, transforming text, or adding tags via AWS Lambda).

Major components:
  • Index: The core searchable repository that stores processed content and metadata.
  • Data sources (connectors): Managed connectors and sync jobs that pull documents, metadata, and (in some cases) ACLs from repositories into the index.
  • Custom document ingestion: APIs to push documents directly (useful for proprietary systems or custom pipelines).
  • Query APIs: APIs to query/search the index and retrieve results.
  • Access control configuration: Options to associate user identity/group information with documents and enforce “who can see what” during query.
  • Relevance tuning & metadata: Field mapping, boosting, facets, and query-time filtering.

Service type: Fully managed AWS service (SaaS-like within AWS). You do not manage servers, clusters, or shards.

Scope (regional/global/account):
  • Amazon Kendra is a regional service. You create indexes in a specific AWS Region, and data sources/sync jobs run in that Region.
  • Resources are account-scoped within the Region (subject to IAM permissions).
Verify current Region availability in the official docs: https://docs.aws.amazon.com/kendra/latest/dg/what-is-kendra.html

How it fits into the AWS ecosystem:
  • Uses IAM for authentication/authorization to Kendra APIs and for granting Kendra permission to read from your repositories (for example, S3).
  • Integrates with AWS KMS for encryption at rest (service-managed encryption and/or customer-managed keys depending on configuration; verify exact options in docs).
  • Works well with Amazon S3 (common document store), AWS Lambda (document enrichment), Amazon CloudWatch (metrics), and AWS CloudTrail (API auditing).
  • Commonly paired with Amazon Lex (chatbots), Amazon Bedrock (RAG), AWS IAM Identity Center (enterprise identity), and application front ends via Amazon Cognito, API Gateway, or custom web apps.

3. Why use Amazon Kendra?

Business reasons

  • Faster answers, less time wasted: Reduce time employees spend hunting through wikis, PDFs, ticket systems, and shared drives.
  • Better self-service: Improve customer or employee self-service by indexing knowledge bases and support documentation.
  • Improved knowledge reuse: Make institutional knowledge discoverable even when content is poorly titled or inconsistently tagged.

Technical reasons

  • Natural language search: Designed for question-like queries, not just keywords.
  • Connectors reduce integration time: Many common repositories can be indexed without writing a custom crawler.
  • Metadata filtering and facets: Combine semantic ranking with structured filtering (department, product, date, confidentiality, etc.).
  • APIs for application integration: Use AWS SDKs to embed search into portals, apps, and chat systems.

Operational reasons

  • Managed infrastructure: No cluster provisioning, patching, scaling, or shard management.
  • Repeatable ingestion: Scheduled or on-demand sync jobs with status tracking.
  • Observability hooks: Metrics and audit events integrate into standard AWS operational tooling.

Security/compliance reasons

  • AWS IAM integration: Fine-grained control over who can administer indexes and data sources.
  • Encryption and auditing: Standard AWS encryption and API audit patterns (verify exact encryption capabilities for your configuration in official docs).
  • Access control-aware search (when configured): Search results can be filtered by user identity/ACL rules, which is critical for enterprise content.

Scalability/performance reasons

  • Designed for enterprise content volumes: Kendra is intended for large document sets and many concurrent users (subject to quotas and edition limits).
  • Relevance at scale: ML ranking is managed by AWS; you focus on content quality and metadata.

When teams should choose Amazon Kendra

Choose Amazon Kendra when you need:
  • Enterprise search across multiple repositories
  • Strong relevance for natural language queries
  • Managed connectors and ingestion workflows
  • Access control-aware search in a managed service
  • A search layer that can feed RAG systems (retrieve relevant passages for an LLM)

When teams should not choose Amazon Kendra

Consider alternatives when:
  • You only need simple keyword search on one small dataset (OpenSearch or database full-text search may be cheaper/simpler).
  • You need full control of ranking algorithms, analyzers, or low-level search internals (OpenSearch/Elasticsearch gives more control).
  • You are primarily building vector similarity search with custom embeddings and scoring (Amazon OpenSearch Service vector search or purpose-built vector databases may fit better; Kendra is not marketed as a general vector database).
  • You have strict constraints that require on-prem-only operation (Kendra is an AWS managed service).

4. Where is Amazon Kendra used?

Industries

  • Technology & SaaS: Internal engineering knowledge search, runbooks, product docs.
  • Financial services: Policy/procedure search, compliance documents, internal knowledge bases (with strong access control requirements).
  • Healthcare & life sciences: Research document discovery, internal SOPs (ensure compliance requirements are met).
  • Manufacturing: Maintenance manuals, part catalogs, troubleshooting docs.
  • Retail & e-commerce: Customer support knowledge, product information aggregation.
  • Public sector/education: Policy search, intranet knowledge, research repositories (subject to governance requirements).

Team types

  • Platform teams building internal portals
  • Support engineering / IT service management teams
  • Data/ML teams building RAG assistants
  • Security and compliance teams managing knowledge access
  • DevOps/SRE teams indexing operational runbooks and incident retrospectives

Workloads

  • Enterprise search portals and intranets
  • Support agent assist tools (“suggest the best KB article for this ticket”)
  • Knowledge retrieval layer for chatbots and virtual assistants
  • Compliance and policy discovery
  • Document discovery across multiple silos

Architectures

  • Central search index with multiple connectors
  • Per-department index model with strict separation (sometimes used for governance/cost control)
  • RAG architecture: Kendra retrieval → LLM summarization/answering (via Amazon Bedrock or another LLM platform)

Real-world deployment contexts

  • Indexing documents from S3 + SharePoint + Confluence into one index
  • Adding an internal search bar to a company portal
  • Integrating with ticketing systems for support knowledge

Production vs dev/test usage

  • Dev/test: Usually one small index, limited documents, infrequent sync. Delete when not needed to control hourly costs.
  • Production: Carefully designed index strategy, ingestion schedules, monitoring, ACL enforcement, and change management for metadata/schema.

5. Top Use Cases and Scenarios

Below are realistic scenarios where Amazon Kendra is commonly a strong fit.

1) Internal IT Helpdesk Knowledge Search

  • Problem: Employees submit repetitive tickets because solutions are hard to find.
  • Why Kendra fits: Indexes IT knowledge articles, PDFs, and runbooks; supports natural language questions.
  • Scenario: “How do I connect to the corporate VPN from macOS?” returns the exact step-by-step doc and highlights the relevant passage.

2) Support Agent Assist for Faster Ticket Resolution

  • Problem: Support agents waste time searching multiple systems while on a call.
  • Why Kendra fits: Single search layer over KB + product docs + past resolutions; can filter by product/version.
  • Scenario: A support console calls Kendra for each ticket, showing top 5 suggested articles and known fixes.

3) Enterprise Policy and Compliance Search

  • Problem: Policies exist in many PDFs and sites; people can’t find the “right” version.
  • Why Kendra fits: Indexes policies with metadata (effective date, owner, department). Facets improve discovery.
  • Scenario: “Travel reimbursement for contractors” returns the correct policy section with excerpt.

4) Engineering Runbook and Incident Retrospective Discovery

  • Problem: On-call engineers can’t quickly find relevant runbooks and past incidents.
  • Why Kendra fits: Natural language works well (“latency spike in us-east-1”), and metadata filters help (service/team).
  • Scenario: During an outage, a chatbot uses Kendra to retrieve relevant runbooks and links.

5) HR and Employee Self-Service Portal

  • Problem: Employees ask HR the same questions repeatedly.
  • Why Kendra fits: Index HR policies, benefits docs, and internal wiki pages; support synonyms (PTO vs vacation).
  • Scenario: “How many vacation days do I have?” returns the benefits guide and highlights the PTO accrual section.

6) Knowledge Search Across Multiple SaaS Tools

  • Problem: Critical content is spread across Confluence, SharePoint, Google Drive, and internal docs.
  • Why Kendra fits: Connectors reduce custom development; unified index improves user experience.
  • Scenario: A single search UI queries Kendra and returns results from multiple sources with source badges.

7) Product Documentation Search for Customers (Authenticated)

  • Problem: Customers can’t find relevant product docs quickly; search results are noisy.
  • Why Kendra fits: Can index documentation and support semantic ranking; combine with authentication and filtering.
  • Scenario: Logged-in customers search “configure SSO for Okta”, getting the best matching docs.

8) Retrieval Layer for RAG (LLM Assistants)

  • Problem: LLMs hallucinate without trustworthy context and citations.
  • Why Kendra fits: Retrieves relevant documents/passages; your app can provide sources to the LLM.
  • Scenario: A Bedrock-powered assistant uses Kendra results as context and returns an answer with citations.

9) Document Discovery for Research/Legal Teams

  • Problem: Teams need to find documents and clauses quickly across large corpora.
  • Why Kendra fits: Semantic ranking and excerpt highlighting help locate relevant sections.
  • Scenario: “Indemnification clause termination” retrieves and highlights the clause across templates.

10) Central Search for Technical Training Materials

  • Problem: Training content is fragmented; learners can’t find the right lab or module.
  • Why Kendra fits: Indexes PDFs, HTML, and wiki pages; metadata facets by course/topic.
  • Scenario: “Kubernetes ingress troubleshooting lab” returns the lab guide and prerequisites.

11) M&A Knowledge Integration

  • Problem: After an acquisition, documentation is split across two toolchains.
  • Why Kendra fits: Connectors can index both repositories into one searchable index (with governance).
  • Scenario: Users search once and see results labeled by legacy company source.

12) Field Service / Maintenance Manual Search

  • Problem: Technicians need answers quickly from manuals and service bulletins.
  • Why Kendra fits: PDF-heavy content and question queries are common; excerpt highlighting is useful.
  • Scenario: “Error code E17 compressor” returns the manual section and troubleshooting steps.

6. Core Features

Note: Amazon Kendra features evolve. Always verify the latest capabilities and connector list in the official documentation: https://docs.aws.amazon.com/kendra/

1) Managed indexes (Developer/Enterprise editions)

  • What it does: Provides managed search indexes without running servers.
  • Why it matters: Removes operational burden of scaling and maintaining search clusters.
  • Practical benefit: Faster time to value; consistent managed experience.
  • Caveats: Pricing is typically hourly and can be significant; choose the correct edition and delete unused indexes.

2) Data source connectors (managed ingestion)

  • What it does: Connects to supported repositories and syncs documents and metadata into Kendra.
  • Why it matters: Integration is often the hardest part of enterprise search.
  • Practical benefit: Faster onboarding for common systems (for example S3 and popular collaboration tools).
  • Caveats: Connector availability and ACL support vary by connector; verify capabilities per connector in docs.

3) Custom document ingestion APIs

  • What it does: Allows you to push documents directly using APIs (for proprietary systems or event-driven pipelines).
  • Why it matters: Not every repository has a connector.
  • Practical benefit: You can index content generated by applications or stored in custom databases.
  • Caveats: You must manage batching, retries, idempotency, and mapping metadata fields correctly.
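Because batching is left to the caller, a common pattern is to chunk documents client-side before calling BatchPutDocument and to inspect the per-document failure list. The batch size used below is an assumption; verify the current per-call document limit in the Kendra quotas documentation.

```python
# Sketch of pushing documents to Kendra via BatchPutDocument with client-side batching.

BATCH_LIMIT = 10  # assumed per-call limit; verify against current Kendra quotas

def chunk(items: list, size: int = BATCH_LIMIT) -> list:
    """Split a document list into batches no larger than the API limit."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def push_documents(index_id: str, documents: list) -> None:
    # documents: [{"Id": "...", "Title": "...", "Blob": b"...", "ContentType": "PLAIN_TEXT"}, ...]
    import boto3
    kendra = boto3.client("kendra")
    for batch in chunk(documents):
        resp = kendra.batch_put_document(IndexId=index_id, Documents=batch)
        # FailedDocuments lists per-document errors; log or retry these.
        for failure in resp.get("FailedDocuments", []):
            print("failed:", failure.get("Id"), failure.get("ErrorMessage"))
```

Retries and idempotency remain your responsibility: reusing a stable document Id makes re-pushing the same content an update rather than a duplicate.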

4) Natural language query understanding and semantic ranking

  • What it does: Interprets user questions and ranks results using ML-based relevance.
  • Why it matters: Users ask questions (“How do I…”) rather than exact keywords.
  • Practical benefit: Better top results and fewer “no results found” experiences.
  • Caveats: Relevance depends on content quality, metadata, and correct field mappings.

5) Excerpts, highlights, and answer-like results

  • What it does: Returns snippets from documents that match the query and highlights relevant passages.
  • Why it matters: Users can quickly validate if a result contains the answer.
  • Practical benefit: Faster click-through and less time scanning long documents.
  • Caveats: Quality varies by document format and extraction; scanned PDFs may require OCR before indexing.

6) Metadata schema, facets, and filtering

  • What it does: Supports metadata fields and query-time filters (and faceted navigation in UIs).
  • Why it matters: Enterprise search often needs “filter by department/product/date/confidentiality”.
  • Practical benefit: Higher precision searches, better UX for large corpora.
  • Caveats: You must design metadata carefully; incorrect field types or sparse metadata reduces usefulness.
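As an illustration, a query-time filter is expressed as an AttributeFilter object passed to the query call. In the sketch below, `department` is a hypothetical custom metadata field and `_last_updated_at` is a Kendra reserved field; verify both against your index schema.

```python
# Sketch of a Kendra AttributeFilter combining an equality match on a custom
# string field with a date lower bound, joined with AND semantics.

from datetime import datetime, timezone

def department_after(department: str, since: datetime) -> dict:
    """Build a filter for kendra.query(..., AttributeFilter=...)."""
    return {
        "AndAllFilters": [
            {
                "EqualsTo": {
                    "Key": "department",  # hypothetical custom field
                    "Value": {"StringValue": department},
                }
            },
            {
                "GreaterThanOrEquals": {
                    "Key": "_last_updated_at",  # Kendra reserved field
                    "Value": {"DateValue": since},
                }
            },
        ]
    }

flt = department_after("HR", datetime(2024, 1, 1, tzinfo=timezone.utc))
```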

7) Relevance tuning (boosting, field importance)

  • What it does: Adjusts ranking by boosting certain fields or data sources.
  • Why it matters: Business context matters (official policies > drafts, latest version > old).
  • Practical benefit: Aligns search results with what users actually need.
  • Caveats: Over-boosting can hide relevant results; test changes and monitor user feedback.

8) Synonyms / thesaurus (terminology alignment)

  • What it does: Helps treat related terms as equivalent (for example, “PTO” and “vacation”).
  • Why it matters: Organizations use inconsistent terminology.
  • Practical benefit: Better recall and fewer missed results.
  • Caveats: Poor synonym design can increase noise; treat as a controlled vocabulary.
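A thesaurus is typically supplied as a file of synonym rules; Kendra accepts the Solr synonym format (verify current format details and size limits in the docs). A minimal sketch:

```
# Lines starting with "#" are comments.
# Bidirectional synonyms (comma-separated):
PTO, paid time off, vacation
# One-way mapping (left-hand terms expand to the right-hand side):
infosec => information security
```

Treat this file as a controlled vocabulary: review additions, and prefer a small set of high-value mappings over a large noisy one.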

9) Document enrichment (preprocessing and metadata extraction)

  • What it does: Applies transformations to documents during ingestion (often via AWS Lambda) to add metadata, redact content, or normalize text.
  • Why it matters: “Garbage in, garbage out” applies strongly to enterprise search.
  • Practical benefit: Adds tags, cleans up content, extracts key fields for filtering.
  • Caveats: Enrichment adds complexity, latency, and Lambda costs; ensure idempotency and handle failures.
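To illustrate the kind of logic an enrichment step performs, the sketch below derives a metadata attribute from a document's S3 key. The actual event/response contract for Kendra's enrichment Lambda is defined in the official docs; only the tagging logic is shown here, and the `department` field is hypothetical.

```python
# Sketch of enrichment logic: map a key like "docs/hr/benefits.pdf" to a
# Kendra document attribute. Pure function, so it is easy to test and keep
# idempotent; wire it into your enrichment Lambda per the official contract.

def derive_attributes(s3_key: str) -> list:
    """Derive a 'department' attribute (hypothetical field) from the S3 key."""
    parts = s3_key.split("/")
    attributes = []
    if len(parts) >= 3 and parts[0] == "docs":
        attributes.append({
            "Key": "department",
            "Value": {"StringValue": parts[1]},
        })
    return attributes
```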

10) Access control-aware search (ACLs and user context)

  • What it does: Restricts results so users only see documents they are permitted to view.
  • Why it matters: Enterprise content is rarely all-public.
  • Practical benefit: Enables indexing sensitive repositories while respecting permissions.
  • Caveats: Correct ACL ingestion and identity mapping is critical; support varies by connector and configuration. Verify connector ACL support and identity requirements.
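When ACLs have been ingested, the query can carry end-user context so Kendra filters results to what that user may see. The sketch below passes an explicit user and groups; whether you use this shape or a signed token depends on your identity setup and connector support (verify in the docs).

```python
# Sketch of adding UserContext to a Kendra query so results respect ACLs.
# The user ID and group name are illustrative values.

def query_params_with_user(index_id: str, query_text: str,
                           user_id: str, groups: list) -> dict:
    """Build kendra.query() kwargs that include end-user identity context."""
    return {
        "IndexId": index_id,
        "QueryText": query_text,
        "UserContext": {
            "UserId": user_id,
            "Groups": groups,
        },
    }

params = query_params_with_user("<index-id>", "vacation policy",
                                "alice@example.com", ["hr-readers"])
```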

11) Query suggestions (type-ahead)

  • What it does: Can suggest queries as users type (depending on configuration and API usage).
  • Why it matters: Improves UX and helps users discover common queries.
  • Practical benefit: Faster search and more consistent query patterns.
  • Caveats: Suggestions are not always desired for sensitive environments; evaluate privacy and UX.

12) Amazon Kendra Intelligent Ranking (related capability)

  • What it does: Provides ML-based re-ranking for search results from other search engines (for example, OpenSearch/Elasticsearch), so you can improve relevance without migrating the index.
    Verify current compatibility in official docs.
  • Why it matters: Many organizations already have search engines but want better ranking.
  • Practical benefit: Incremental improvement path.
  • Caveats: This is a distinct capability with its own setup and pricing model; not the same as running a full Kendra index.

7. Architecture and How It Works

High-level service architecture

Amazon Kendra sits between your content repositories and your search applications:

  1. Ingestion: Kendra connects to content repositories via data source connectors or receives documents via APIs.
  2. Processing: It extracts text and metadata, applies enrichment (optional), and builds an index.
  3. Query: Applications call Kendra query APIs. Kendra evaluates user query + filters + (optional) user context for access control.
  4. Results: Returns ranked documents/snippets, metadata, and links back to the source.

Request / data / control flows

  • Control plane: Create index, configure schema, configure data sources, run syncs, manage relevance tuning, and configure access control.
  • Data plane: Documents flow into the index during sync/ingestion; queries flow from apps to Kendra and results back.
  • Security plane: IAM policies govern who can administer and query. IAM roles govern what Kendra can read from data sources (for example, S3). Optional user identity context can constrain results.

Integrations with related AWS services

Common integrations include:
  • Amazon S3 for document storage and as a primary data source.
  • AWS Lambda for document enrichment during ingestion (custom metadata extraction, normalization, redaction workflows).
  • Amazon CloudWatch for metrics (and operational dashboards/alarms).
  • AWS CloudTrail for auditing API calls.
  • AWS KMS for encryption keys (depending on configuration).
  • Amazon Cognito / IAM Identity Center for authenticating end users of a search portal.
  • Amazon Bedrock (or other LLM providers) for RAG: Kendra retrieves relevant context; the LLM generates an answer with citations.

Dependency services

  • For data sources: the repository itself (S3, SaaS, on-prem connectors via required connectivity).
  • For enrichment: Lambda (and any services your Lambda calls).
  • For identity/ACL: your identity provider and group mapping strategy.

Security/authentication model

  • AWS API authentication: IAM (SigV4). Applications use IAM roles/users to call Kendra APIs.
  • Repository access: Kendra assumes a service role you provide for connectors (for example, an IAM role granting s3:GetObject on a bucket).
  • End-user authorization: If you need “per-user” result filtering, you typically pass user context to the query and ensure ACLs were ingested correctly. The exact approach depends on connector and identity strategy—verify in official docs.

Networking model

  • Kendra is a managed regional AWS service with public regional endpoints.
  • Many AWS services support VPC interface endpoints (AWS PrivateLink) to keep traffic on the AWS network. Availability can vary by Region and service—verify PrivateLink support for Amazon Kendra in your Region in official AWS documentation.

Monitoring/logging/governance considerations

  • Track:
    – Index status and data source sync status (success/failure, document counts).
    – Query volume and latency (metrics).
    – API activity (CloudTrail).
  • Govern:
    – Index naming/tagging standards.
    – IAM least privilege for admins, sync roles, and query clients.
    – Cost controls: number of indexes, edition choice, and sync schedules.

Simple architecture diagram (Mermaid)

flowchart LR
  U[Users / App] -->|Query API| K[Amazon Kendra Index]
  S3[(Amazon S3 Documents)] -->|Data Source Sync| K
  K -->|Ranked results + excerpts| U

Production-style architecture diagram (Mermaid)

flowchart TB
  subgraph Identity
    IDP[Corporate IdP] --> IC[IAM Identity Center / SSO Mapping]
    IC --> COG[Amazon Cognito / App Auth Layer]
  end

  subgraph Content
    S3[(Amazon S3)]
    CONF[Confluence / Wiki]
    SP[SharePoint]
    ITSM[Service Desk / ITSM]
  end

  subgraph Ingestion
    DS[Amazon Kendra Data Sources] --> IDX[Amazon Kendra Index]
    L["Document Enrichment (AWS Lambda)"] --> IDX
  end

  S3 --> DS
  CONF --> DS
  SP --> DS
  ITSM --> DS

  subgraph Apps
    PORTAL[Internal Search Portal]
    BOT[Chatbot / Agent Assist]
    RAG[RAG Service]
    LLM["Amazon Bedrock (LLM)"]
  end

  COG --> PORTAL
  PORTAL -->|Query + Filters + User Context| IDX
  BOT -->|Query| IDX

  RAG -->|Retrieve relevant passages| IDX
  RAG -->|Context + citations| LLM
  LLM -->|Answer| RAG

  subgraph Security_Operations
    IAM[IAM Roles/Policies]
    KMS[AWS KMS]
    CW[Amazon CloudWatch Metrics/Alarms]
    CT[AWS CloudTrail]
  end

  IAM --> DS
  IAM --> IDX
  KMS --> IDX
  IDX --> CW
  DS --> CW
  IDX --> CT
  DS --> CT

8. Prerequisites

AWS account and billing

  • An active AWS account with billing enabled.
  • Amazon Kendra is not a “free” service by default; expect hourly charges for indexes. Plan to clean up resources after labs.

Permissions / IAM roles

You will need permissions to:
  • Create and manage Kendra resources: index, data sources, sync jobs.
  • Create and manage an S3 bucket and upload sample documents.
  • Create or pass an IAM role for Kendra to access S3 (and optionally KMS).

Typical IAM permissions (high level):
  • kendra:* for lab/admin (scope down for production).
  • s3:CreateBucket, s3:PutObject, s3:GetObject, s3:ListBucket.
  • iam:CreateRole, iam:PutRolePolicy, iam:PassRole.
  • kms:* only if using customer-managed keys (scope down in production).
  • cloudtrail:LookupEvents and CloudWatch read permissions for validation.

Tools

  • AWS Management Console access, or:
  • AWS CLI v2 (optional, used in validation steps)
  • (Optional) Python 3 + boto3 for query examples

Region availability

  • Choose an AWS Region where Amazon Kendra is available.
  • Verify availability and supported Regions: https://docs.aws.amazon.com/kendra/latest/dg/what-is-kendra.html

Quotas / limits

  • Amazon Kendra has service quotas (for example, number of indexes, document limits, throughput).
  • Always verify current quotas in Service Quotas and Kendra documentation:
  • https://docs.aws.amazon.com/servicequotas/
  • https://docs.aws.amazon.com/kendra/

Prerequisite services

For this tutorial:
  • Amazon S3 (for storing sample documents)
  • IAM (for roles/policies)

Optional (production patterns):
  • CloudTrail (recommended)
  • CloudWatch alarms/dashboards (recommended)
  • KMS customer-managed key (optional; verify support/configuration requirements)

9. Pricing / Cost

Amazon Kendra pricing changes over time and varies by Region and edition. Do not rely on blog posts for exact numbers—use official pricing.

Official pricing page: https://aws.amazon.com/kendra/pricing/
AWS Pricing Calculator: https://calculator.aws/

Pricing dimensions (typical model)

Amazon Kendra cost is generally driven by:
  • Index capacity billed per unit time (hourly) and the edition (for example, Developer vs Enterprise). The exact included capacity and scaling model is documented on the pricing page; verify the current edition definitions and hourly rates.
  • Potential additional charges for related capabilities such as Amazon Kendra Intelligent Ranking (if used). Verify on the official pricing page and the Intelligent Ranking docs.

Kendra also indirectly drives costs in connected services:
  • S3 storage for your documents.
  • Data transfer (for example, if connectors pull from outside AWS or across Regions).
  • AWS Lambda costs if you use enrichment functions.
  • Secrets management costs if connectors require stored credentials (for example AWS Secrets Manager).
  • CloudWatch costs for metrics, dashboards, and alarms (usually modest, but not always zero).
  • KMS costs if using customer-managed keys (API requests, key usage).

Free tier

Amazon Kendra historically has not had a broad “always-free” tier like some AWS services. Some AWS services offer limited free usage, but verify whether Amazon Kendra currently offers any free tier or trial on the pricing page.

Key cost drivers

  • Number of indexes: Each index typically incurs hourly charges. Multiple environments (dev/test/prod) can multiply cost quickly.
  • Edition choice: Developer vs Enterprise can change the baseline hourly cost and capacity.
  • Document volume and update frequency: More content and frequent re-syncs may require larger capacity or more operational effort (pricing impact depends on current Kendra model—verify).
  • Query volume: Depending on pricing model, queries may or may not be a direct line item. Verify on pricing page.
  • Enrichment: Lambda-based enrichment can add compute cost and ingestion latency.

Network/data transfer implications

  • Uploading documents to S3 in the same Region as the Kendra index is typically the simplest and avoids cross-Region transfer.
  • If indexing external SaaS or on-prem systems, network egress and connector connectivity can introduce costs (and security constraints).

How to optimize cost

  • Start with one index and prove value before scaling to many per team/department.
  • Use Developer edition for labs/dev when appropriate (verify edition constraints).
  • Minimize idle indexes: If you don’t need an index, delete it. (Kendra is managed; you typically can’t “stop” it to avoid hourly costs.)
  • Control sync frequency: Don’t sync every 5 minutes if daily is enough.
  • Keep documents clean: Avoid indexing duplicates, stale versions, and low-value content.
  • Evaluate alternatives for simple search: If requirements are basic keyword search, OpenSearch can be more cost-efficient.

Example low-cost starter estimate (no fabricated numbers)

A typical low-cost lab scenario includes:
  • 1 Kendra index (Developer edition if supported/appropriate)
  • 1 S3 data source
  • A small set of documents (10–100)
  • A single manual sync
  • A few interactive queries for validation

How to estimate:
  1. Go to https://aws.amazon.com/kendra/pricing/ and identify the hourly rate for the chosen edition in your Region.
  2. Multiply by the number of hours you plan to keep the index.
  3. Add S3 storage (small) and any Lambda enrichment costs (if used).
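The estimation arithmetic fits in a one-line helper. The rate used in the example is a placeholder, not a real price; look up the current hourly rate on the official pricing page.

```python
# Sketch of the lab cost estimate: hourly index rate x hours kept, plus extras
# (S3 storage, Lambda enrichment). Rates below are placeholders, NOT real prices.

def index_cost(hourly_rate: float, hours: float, other_costs: float = 0.0) -> float:
    """Estimated total cost for keeping an index running."""
    return hourly_rate * hours + other_costs

# Hypothetical 1.00/hour rate for a 48-hour lab, plus 0.50 of extras:
estimate = index_cost(hourly_rate=1.0, hours=48, other_costs=0.5)
```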

Example production cost considerations

For production, account for:
  • Multiple indexes (prod + staging + dev)
  • Higher availability requirements and governance (more tooling and operational work)
  • Ongoing sync schedules
  • Identity/ACL integration (often increases complexity and operational overhead)
  • Potential need for multiple data sources and content growth
  • RAG usage: additional costs for LLM inference (Amazon Bedrock) and any caching layers

10. Step-by-Step Hands-On Tutorial

Objective

Build a small, realistic Amazon Kendra search experience on AWS by:
  1. Creating a Kendra index
  2. Indexing documents stored in Amazon S3
  3. Querying the index via the console and AWS CLI
  4. Cleaning up resources to avoid ongoing charges

Lab Overview

You will create:
  • An S3 bucket with a few small text/HTML/PDF documents (keep it minimal)
  • An IAM role that Amazon Kendra assumes to read from the bucket
  • An Amazon Kendra index (choose the lowest-cost edition appropriate for labs; verify current options)
  • An S3 data source and a one-time sync job
  • A few test queries to validate results

Expected outcome: You can type a natural language query and get ranked results with excerpts from your uploaded documents.


Step 1: Choose a Region and confirm service access

  1. In the AWS Console, select an AWS Region where Amazon Kendra is supported.
  2. Confirm Amazon Kendra console loads: https://console.aws.amazon.com/kendra/
  3. (Recommended) Confirm you have permission to create IAM roles and S3 buckets.

Expected outcome: You can open the Amazon Kendra console without permission errors.


Step 2: Create an S3 bucket and upload sample documents

  1. Open the S3 console: https://console.aws.amazon.com/s3/
  2. Create a bucket (example):
    – Bucket name: kendra-lab-<your-unique-suffix>
    – Region: same as your Kendra index Region
    – Keep defaults for a lab (do not enable public access)

  3. Upload a few small documents. Create three files locally and upload them:

vpn-reset.txt

VPN Token Reset Procedure
1) Open the VPN portal.
2) Click "Reset token".
3) Confirm using MFA.
If you are locked out, contact IT Support.

expense-policy.txt

Travel Expense Policy (Summary)
- Meals are reimbursable up to the daily limit.
- Receipts are required for expenses over $25.
- Contractors must obtain manager approval before booking travel.

oncall-runbook.txt

On-Call Runbook: API Latency Spikes
1) Check dashboards for error rates and p95 latency.
2) Review recent deployments.
3) Verify upstream dependencies.
4) If needed, rollback the last deployment.

Upload these into a prefix like docs/ (optional but tidy):
  • docs/vpn-reset.txt
  • docs/expense-policy.txt
  • docs/oncall-runbook.txt

Expected outcome: Your S3 bucket contains at least 3 documents.

Verification: In S3 console, open the bucket → verify objects exist and you can view/download them.


Step 3: Create an IAM role for Amazon Kendra to read from S3

Amazon Kendra needs permission to read objects from your S3 bucket when running the data source sync.

  1. Open IAM console: https://console.aws.amazon.com/iam/
  2. Create a role:
    – Trusted entity type: AWS service
    – Use case: look for Amazon Kendra (or a generic service trust if presented differently)
    – If the console doesn’t provide a Kendra-specific wizard, use the trust policy recommended by Kendra documentation.
    Verify in official docs: https://docs.aws.amazon.com/kendra/

  3. Attach a policy that allows reading the bucket (minimum for this lab):

Replace kendra-lab-<your-unique-suffix> with your bucket name:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::kendra-lab-<your-unique-suffix>"]
    },
    {
      "Sid": "ReadObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::kendra-lab-<your-unique-suffix>/*"]
    }
  ]
}

Name the role something like: KendraS3DataSourceRole.

Expected outcome: You have an IAM role that Amazon Kendra can assume to read your S3 documents.

Verification: In IAM → Roles → open the role → confirm:
  – Trust relationship includes the Kendra service principal (as documented)
  – Permissions include s3:ListBucket and s3:GetObject for the bucket
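The same role and policy can be created programmatically. A minimal sketch follows; the trust principal `kendra.amazonaws.com` is the commonly documented Kendra service principal, and the role and policy names are this lab's examples—verify the principal in the current Kendra docs before relying on it.

```python
import json

ROLE_NAME = "KendraS3DataSourceRole"

# Commonly documented Kendra service principal -- verify in the Kendra docs.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "kendra.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

def bucket_read_policy(bucket):
    """Minimal read-only policy for the sync role (mirrors the JSON above)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "ListBucket", "Effect": "Allow",
             "Action": ["s3:ListBucket"],
             "Resource": [f"arn:aws:s3:::{bucket}"]},
            {"Sid": "ReadObjects", "Effect": "Allow",
             "Action": ["s3:GetObject"],
             "Resource": [f"arn:aws:s3:::{bucket}/*"]},
        ],
    }

def create_sync_role(bucket):
    """Create the role and attach the inline policy. Requires IAM permissions."""
    import boto3
    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=ROLE_NAME,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.put_role_policy(
        RoleName=ROLE_NAME,
        PolicyName="KendraLabS3Read",
        PolicyDocument=json.dumps(bucket_read_policy(bucket)),
    )
    return role["Role"]["Arn"]
```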


Step 4: Create an Amazon Kendra index

  1. Open Amazon Kendra console: https://console.aws.amazon.com/kendra/
  2. Choose Create an index
  3. Configure:
    – Index name: kendra-lab-index
    – Description: optional
    – IAM role: choose/create the service role for Kendra index management as prompted
    – Edition: choose the lowest-cost edition suitable for labs (often “Developer edition”, if available).
    Verify current edition options and constraints in the console and pricing page.

  4. Create the index.

Expected outcome: The index enters a “Creating” state, then becomes “Active/Ready”.

Verification: In Kendra console, the index status shows Active (or equivalent) before proceeding.

Common wait time: Several minutes.
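For automation, index creation can be sketched with boto3. The edition names (`DEVELOPER_EDITION`, `ENTERPRISE_EDITION`) and the separate index-management role ARN are assumptions to verify against the current API reference; note this role is distinct from the S3 sync role.

```python
import time

def index_params(name, role_arn, edition="DEVELOPER_EDITION"):
    """Arguments for kendra.create_index. Edition values may change over
    time -- verify the current options in the API reference."""
    return {"Name": name, "RoleArn": role_arn, "Edition": edition}

def create_index_and_wait(name, role_arn):
    """Create the index and poll until it leaves CREATING (several minutes)."""
    import boto3
    kendra = boto3.client("kendra")
    index_id = kendra.create_index(**index_params(name, role_arn))["Id"]
    while kendra.describe_index(Id=index_id)["Status"] == "CREATING":
        time.sleep(30)
    return index_id
```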


Step 5: Add an S3 data source to the index

  1. In Kendra console, open your index: kendra-lab-index
  2. Go to Data sources → Add data source
  3. Choose Amazon S3
  4. Configure the data source:
    – Name: kendra-lab-s3
    – S3 bucket: kendra-lab-<your-unique-suffix>
    – (Optional) Inclusion prefix: docs/ to limit indexing
    – IAM role: select KendraS3DataSourceRole
    – Sync schedule: set to Run on demand (or disable schedule) for the lab

  5. Save/add the data source.

Expected outcome: The data source is created and ready to run a sync.

Verification: The data source appears in the list with a status like “Ready” (wording may vary).
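The same configuration can be expressed with boto3. A sketch, assuming the `S3Configuration` field name from the boto3 API reference (verify against your SDK version); omitting `Schedule` leaves the data source on demand.

```python
def s3_data_source_params(index_id, bucket, role_arn, prefix="docs/"):
    """Arguments for kendra.create_data_source. No Schedule key means the
    source syncs on demand only."""
    return {
        "IndexId": index_id,
        "Name": "kendra-lab-s3",
        "Type": "S3",
        "RoleArn": role_arn,
        "Configuration": {
            "S3Configuration": {
                "BucketName": bucket,
                "InclusionPrefixes": [prefix],
            }
        },
    }

def create_data_source(index_id, bucket, role_arn):
    """Create the S3 data source and return its ID."""
    import boto3
    kendra = boto3.client("kendra")
    return kendra.create_data_source(
        **s3_data_source_params(index_id, bucket, role_arn))["Id"]
```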


Step 6: Run a sync job

  1. In the data source details page, choose Sync now (or Run).
  2. Monitor the sync status: it will move through states such as “Syncing”, then “Succeeded” or “Failed”.

Expected outcome: Sync completes successfully and documents are indexed.

Verification:
  – Data source shows Last sync status: Succeeded
  – Document count in index statistics increases (exact UI varies)
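Triggering and watching a sync can also be scripted. A sketch, assuming the sync-job status names listed below (verify the current enum values in the API reference):

```python
import time

# Terminal states for a data source sync job -- verify current values.
TERMINAL_SYNC_STATES = {"SUCCEEDED", "FAILED", "ABORTED", "INCOMPLETE"}

def sync_finished(status):
    """True once a sync job has reached a terminal state."""
    return status in TERMINAL_SYNC_STATES

def run_sync(index_id, data_source_id):
    """Start a sync and poll its status until it finishes."""
    import boto3
    kendra = boto3.client("kendra")
    kendra.start_data_source_sync_job(Id=data_source_id, IndexId=index_id)
    while True:
        history = kendra.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id)["History"]
        status = history[0]["Status"] if history else "SYNCING"
        print("Sync status:", status)
        if sync_finished(status):
            return status
        time.sleep(30)
```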


Step 7: Query the index in the Kendra console

  1. In the index view, open the Search console (Kendra provides a built-in test search UI)
  2. Run queries such as:
    – How do I reset my VPN token?
    – expense receipts required
    – what to do during API latency spikes

Expected outcome: You get ranked results and excerpts matching the correct document.

Verification tips:
  – The VPN query should return vpn-reset.txt near the top.
  – The expense query should return expense-policy.txt.
  – The on-call query should return oncall-runbook.txt.


Step 8 (Optional): Query using AWS CLI

This helps you validate programmatic access—how real applications will query Kendra.

8.1 Configure AWS CLI

aws configure
aws sts get-caller-identity

8.2 Find your index ID

In the Kendra console, open the index details and copy the Index ID.

Or use CLI (if permissions allow):

aws kendra list-indices

8.3 Run a query

Replace INDEX_ID:

aws kendra query \
  --index-id "INDEX_ID" \
  --query-text "How do I reset my VPN token?"

Expected outcome: JSON output includes matching document(s), titles, URIs, and excerpt text fields.


Step 9 (Optional): Query with Python (boto3)

Install dependencies:

python3 -m pip install boto3

Example script (query_kendra.py):

import boto3

INDEX_ID = "INDEX_ID"  # replace with your index ID

kendra = boto3.client("kendra")

resp = kendra.query(
    IndexId=INDEX_ID,
    QueryText="expense receipts required",
)

# Each result item has a type (for example ANSWER or DOCUMENT), a title,
# and an excerpt. DocumentTitle and DocumentExcerpt are objects, so read
# their "Text" fields rather than printing the raw dicts.
for item in resp.get("ResultItems", []):
    title = item.get("DocumentTitle", {}).get("Text", "")
    print(item.get("Type"), title)
    excerpt = item.get("DocumentExcerpt", {}).get("Text", "")
    if excerpt:
        print("Excerpt:", excerpt[:200])
    print("---")

Run:

python3 query_kendra.py

Expected outcome: Printed results include excerpts referencing receipts and policy limits.


Validation

Use this checklist:

  • [ ] Index status is Active
  • [ ] Data source last sync status is Succeeded
  • [ ] Query in console returns correct top documents
  • [ ] (Optional) CLI query returns structured results
  • [ ] (Optional) Python query prints excerpts

If validation fails, use the troubleshooting section.


Troubleshooting

Common issues and fixes:

  1. Data source sync failed: AccessDenied to S3
    – Cause: IAM role doesn’t have correct s3:ListBucket/s3:GetObject permissions, or bucket policy blocks access.
    – Fix:

    • Recheck IAM role permissions.
    • Ensure bucket policy doesn’t deny access.
    • Confirm the data source is using the intended role.
  2. Index never becomes Active
    – Cause: Missing service-linked roles/permissions, or account restrictions.
    – Fix:

    • Check IAM permissions for creating Kendra resources.
    • Check AWS Health Dashboard and service limits.
    • Review CloudTrail for failed API calls.
  3. No results returned
    – Cause: Sync didn’t index documents, wrong prefix, unsupported file type, or query mismatch.
    – Fix:

    • Confirm objects exist under the inclusion prefix.
    • Confirm the sync status and document counts.
    • Try simpler queries (keywords) to sanity-check.
  4. Results returned but excerpts look empty
    – Cause: Document text extraction issues (format, encoding, scanned PDFs).
    – Fix:

    • Use plain text files to validate.
    • For PDFs, ensure they contain selectable text (OCR may be required upstream).
  5. CLI returns AccessDeniedException for kendra:Query
    – Cause: Your IAM identity doesn’t have query permissions.
    – Fix: Attach an IAM policy allowing kendra:Query on the index ARN.
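A minimal query-only policy for the application identity might look like the sketch below. The Region, account ID, and index ID are placeholders, and the index ARN format and resource-level permission support should be verified in the Kendra documentation.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowKendraQuery",
      "Effect": "Allow",
      "Action": ["kendra:Query"],
      "Resource": ["arn:aws:kendra:REGION:ACCOUNT_ID:index/INDEX_ID"]
    }
  ]
}
```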


Cleanup

To avoid ongoing charges, delete resources in this order:

  1. Delete the Kendra index – Kendra console → Indexes → select kendra-lab-index → Delete. This is the main cost driver; delete it even if you keep the S3 bucket.

  2. Delete Kendra data source (if required separately by the console flow)

  3. Delete S3 objects and bucket – Empty the bucket – Delete the bucket

  4. Delete IAM role – IAM → Roles → delete KendraS3DataSourceRole (and any inline policies)

Expected outcome: No Kendra index remains, preventing further hourly charges.

11. Best Practices

Architecture best practices

  • Design your index strategy intentionally
    – One index for the whole organization is simpler for users, but can be harder for governance and cost attribution.
    – Multiple indexes (per department/app) can simplify access control boundaries but increase cost and operational overhead.
  • Keep data in-region
    – Store documents in S3 in the same Region as the Kendra index to reduce latency and cross-Region transfer.
  • Use metadata as a first-class design element
    – Define fields like department, product, document_type, effective_date, owner, confidentiality.
    – Enforce consistent tagging at ingestion time.

IAM/security best practices

  • Least privilege
    – Separate roles for: Kendra admins, data source sync role(s), and application query role(s).
  • Restrict who can modify relevance tuning
    – Ranking changes can impact business outcomes; treat them as controlled configuration with change management.
  • Use explicit iam:PassRole constraints
    – Allow passing only the specific data source role(s) to Kendra.

Cost best practices

  • Delete dev/test indexes quickly
    – Kendra index hourly charges can accumulate.
  • Avoid duplicate content
    – Deduplicate documents and remove outdated versions.
  • Tune sync schedules
    – Sync only as often as needed; use on-demand sync for low-change repositories.

Performance best practices

  • Use filters to narrow broad queries
    – Expose facets in UIs where possible.
  • Optimize document quality
    – Prefer machine-readable PDFs and clean text. Extract text from scanned docs before indexing.

Reliability best practices

  • Monitor sync health
    – Alert on sync failures or prolonged sync durations.
  • Have a rollback plan for schema changes
    – Metadata schema changes can affect filters and relevance.

Operations best practices

  • Tag resources
    – Add tags for Env, App, Owner, CostCenter, DataClassification.
  • Use CloudTrail for auditing
    – Track who changed data sources, index settings, or access control configuration.
  • Document the connector credentials lifecycle
    – Rotate credentials and store them in secure services (for example AWS Secrets Manager) when applicable.

Governance/tagging/naming best practices

  • Naming:
    – kendra-<app>-<env>-index
    – kendra-<app>-<env>-ds-<source>
  • Tagging:
    – Environment=dev|staging|prod
    – OwnerEmail=...
    – CostCenter=...
    – DataSensitivity=public|internal|confidential

12. Security Considerations

Identity and access model

  • Administrative access is controlled by IAM permissions (create/update/delete indexes and data sources).
  • Query access should be granted to application roles/users with kendra:Query (and related APIs needed).
  • Data source access is controlled by the IAM role Kendra assumes to read the repository (for example S3).
  • End-user document access control requires:
    – Correct ACL ingestion (connector-dependent)
    – Correct identity mapping (user/group) provided to Kendra at query time (implementation varies—verify in docs)

Encryption

  • In transit: Use HTTPS endpoints for Kendra APIs.
  • At rest: Kendra stores indexed content and metadata. Encryption at rest is expected in AWS services; confirm your exact options (AWS-owned keys vs customer-managed keys) in Kendra docs and console for your Region.

Network exposure

  • By default, applications call regional Kendra endpoints over the internet (HTTPS).
  • If your environment requires private connectivity, check whether VPC interface endpoints (PrivateLink) are available for Kendra in your Region and architecture. Verify in official AWS docs.

Secrets handling

  • For connectors that require credentials (SaaS systems), store secrets in a managed secret store (commonly AWS Secrets Manager) and restrict access with IAM.
  • Rotate credentials and audit secret access.

Audit/logging

  • Enable CloudTrail in all Regions (or at least the Kendra Region) to log Kendra API calls.
  • Use CloudWatch metrics and alarms for:
    – Data source sync failures
    – Unusual query spikes
    – Operational anomalies

Compliance considerations

  • Understand what content is indexed (including sensitive fields).
  • Ensure your access control model matches compliance requirements (least privilege, separation of duties).
  • For regulated industries, validate:
    – Data residency (Region)
    – Encryption model (KMS options)
    – Audit requirements (CloudTrail retention, log immutability)
    – Connector handling of ACLs and permissions

Common security mistakes

  • Granting kendra:* to broad roles used by applications (over-privileged).
  • Indexing sensitive repositories without ACL enforcement, then exposing a search UI broadly.
  • Forgetting to restrict iam:PassRole, allowing users to attach overly permissive roles to data sources.
  • Sync roles with overly broad S3 permissions (for example s3:* on *).

Secure deployment recommendations

  • Use separate IAM roles for:
    – Index administration
    – Data source sync
    – Application query
  • Use resource-level permissions where supported (restrict to specific index ARNs).
  • Keep documents in private S3 buckets; avoid public access.
  • Implement defense-in-depth: authentication (Cognito/SSO), authorization, logging, and monitoring.

13. Limitations and Gotchas

Always verify current service quotas and connector limitations in official docs.

Known limitations / constraints (common categories)

  • Connector variability: Not all connectors support all features (especially ACL ingestion). Verify per-connector capabilities.
  • Document format limitations: Some file types may not extract well; scanned PDFs often require OCR before indexing.
  • Quota limits: Number of indexes per account, document limits, data source limits, and query throughput limits exist.
  • Regional availability: Kendra is not available in every Region; connector support can also vary by Region.
  • Cost visibility: Index hourly costs can surprise teams if indexes are left running in dev/test.
  • Access control complexity: Proper ACL enforcement requires careful identity mapping and testing; mistakes can cause overexposure or missing results.
  • Schema changes require planning: Metadata changes can break filters/facets in UIs and require re-indexing behaviors depending on configuration.

Operational gotchas

  • Sync failures may not be obvious to end users
    – If sync silently stops, search results become stale. Add alarms/workflows.
  • Inclusion/exclusion prefix mistakes
    – A wrong S3 prefix can lead to “0 documents indexed”.
  • Duplicate content
    – Indexing multiple repositories with duplicates can degrade relevance.
  • Over-broad synonyms
    – Poor thesaurus design can significantly reduce precision.

Migration challenges

  • Moving from an existing search engine to Kendra often requires:
    – Metadata normalization
    – New ingestion pipelines
    – Access control mapping
    – Query UX updates (filters, facets)
  • Consider a phased rollout: one repository first, then expand.

14. Comparison with Alternatives

Amazon Kendra is one option in AWS’s broader search + AI ecosystem. The best choice depends on relevance needs, control requirements, and cost.

Comparison table

Option | Best For | Strengths | Weaknesses | When to Choose
Amazon Kendra | Enterprise search across multiple repositories with ML relevance | Managed connectors, semantic ranking, excerpt highlighting, enterprise-focused features | Can be costly; less low-level control than self-managed engines; ACL/identity can be complex | You need managed enterprise search with strong relevance and connectors
Amazon OpenSearch Service | Custom search applications, logs/observability search, keyword + vector search | Deep control, flexible indexing/analyzers, predictable cluster sizing, broad ecosystem | You manage cluster sizing/tuning; relevance tuning is more manual; connectors are DIY | You need control, custom scoring, vector search, or already run OpenSearch
OpenSearch (self-managed) | Maximum control, on-prem/hybrid | Full control, extensibility | High ops burden, scaling, patching, security hardening | You must run on-prem or need custom extensions not available managed
Database full-text search (Aurora/RDS engine-specific) | Simple search in app databases | Minimal extra infra, simple integration | Not designed for enterprise multi-repo search; limited semantic relevance | Small apps with basic search requirements
Azure AI Search (other cloud) | Enterprise search in Azure ecosystem | Tight Azure integration; managed search | Cross-cloud complexity; data gravity | Organization is standardized on Azure
Google Cloud Vertex AI Search (other cloud) | Enterprise search in GCP ecosystem | Tight GCP integration | Cross-cloud complexity; data gravity | Organization is standardized on GCP
RAG-only vector DB approach | Similarity search for LLM context | Strong semantic similarity; embeddings-driven retrieval | Requires embedding pipelines; governance/ACL must be designed carefully | You primarily need embedding similarity retrieval for LLMs, not enterprise connector search

15. Real-World Example

Enterprise example: Global financial services internal policy + procedures search

Problem
  – Policies and procedures exist in SharePoint, PDFs in S3, and wiki pages.
  – Employees need quick answers, but content is sensitive and access differs by department and region.
  – Compliance requires auditable access and controlled changes.

Proposed architecture
  – Amazon Kendra index in the primary Region for the organization.
  – Data sources:
    – SharePoint connector for controlled sites
    – S3 connector for policy PDFs
    – Confluence connector for engineering procedures
  – Enrichment:
    – Lambda enrichment extracts metadata: policy_owner, effective_date, region, classification
  – Access control:
    – Connector-level ACL ingestion where supported
    – Query-time user context tied to corporate identity (verify the best practice for your identity setup in Kendra docs)
  – Front end:
    – Internal portal using Cognito/SSO authentication
    – API layer (API Gateway + Lambda) calling Kendra Query APIs
  – Monitoring:
    – CloudWatch alarms on sync failures and index health
    – CloudTrail for audit trails

Why Amazon Kendra was chosen
  – Managed connectors reduce integration effort.
  – Better relevance for question-like queries compared to legacy keyword search.
  – Enterprise features (metadata, access control patterns) align with compliance needs.

Expected outcomes
  – Reduced time to locate policies.
  – Fewer compliance escalations due to outdated information usage.
  – Centralized search with clear auditing and governance.


Startup/small-team example: Support knowledge base and RAG assistant

Problem
  – A fast-growing startup has docs in S3 and a wiki, but support agents can’t find answers quickly.
  – They want an LLM assistant, but need citations and reliable grounding.

Proposed architecture
  – Single Amazon Kendra index (start small).
  – S3 as the primary source for product docs and troubleshooting guides.
  – Simple metadata: product_area, version.
  – RAG service:
    – Application queries Kendra to retrieve top passages
    – Sends passages + citations to an LLM (for example, Amazon Bedrock)
  – Minimal ops:
    – On-demand sync after doc updates, then move to scheduled sync when stable

Why Amazon Kendra was chosen
  – Faster setup than building a search cluster.
  – Works as a retrieval layer for RAG with citations back to source docs.
  – Reduces engineering time spent building search infrastructure.

Expected outcomes
  – Faster support resolution time.
  – Fewer escalations to engineering.
  – Higher customer satisfaction due to consistent answers with citations.

16. FAQ

1) Is Amazon Kendra the same as Amazon OpenSearch Service?
No. Amazon Kendra is a managed enterprise search service focused on ML relevance and connectors. Amazon OpenSearch Service is a managed search/analytics engine (OpenSearch) that offers more low-level control and broader use cases (logs, metrics, custom search).

2) Is Amazon Kendra a vector database?
Amazon Kendra is not typically positioned as a general-purpose vector database. It provides ML-based relevance for enterprise search and retrieval. For dedicated vector similarity search, evaluate OpenSearch vector search or specialized vector stores. Verify current Kendra retrieval capabilities in the docs.

3) Can Amazon Kendra enforce document permissions?
Yes, when configured correctly and when ACL ingestion/user context is supported for your connector or ingestion approach. This requires careful identity mapping and testing—verify the recommended approach in official docs.

4) Does Amazon Kendra support Amazon S3 as a data source?
Yes. S3 is one of the most common Kendra data sources. You provide an IAM role for Kendra to read objects.

5) Can I index content from SaaS tools like Confluence or SharePoint?
Kendra supports multiple connectors for third-party repositories. Connector availability and features vary—check the current connector list in the Kendra documentation.

6) How long does indexing take?
It depends on document count, size, and connector. For small labs, minutes. For large repositories, longer. Monitor sync status in the console.

7) Can I run Kendra in multiple Regions?
You can create indexes in multiple Regions, but each is separate. Consider data residency, latency, and cost.

8) Can I “pause” a Kendra index to stop hourly charges?
Typically, managed indexes are billed while they exist. The reliable way to stop charges is usually to delete the index. Verify current billing behavior on the pricing page.

9) How do I improve poor search relevance?
Start with content hygiene and metadata:
  – Ensure titles and headings are meaningful
  – Add consistent metadata fields
  – Use relevance tuning (boost fields/sources)
  – Add synonyms carefully
  – Remove duplicates and outdated versions

10) What’s the best way to support RAG with Amazon Kendra?
Use Kendra to retrieve top relevant passages/documents, then provide them as grounded context to an LLM (for example Amazon Bedrock). Keep citations (source URIs) and implement guardrails (don’t let the model answer without retrieved context).
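The retrieve-then-generate flow can be sketched as below. The prompt format is illustrative, and the Kendra Retrieve API usage is an assumption to verify in current docs; the guardrail here is simply refusing to build a prompt when nothing was retrieved.

```python
def build_grounded_prompt(question, passages):
    """Assemble a citation-preserving prompt from retrieved passages.
    Returns None when nothing was retrieved -- a simple "don't answer
    without context" guardrail."""
    if not passages:
        return None
    context = "\n\n".join(
        f"[{i}] {p['text']} (source: {p['uri']})"
        for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using ONLY the sources below and cite them as [n].\n\n"
        f"{context}\n\nQuestion: {question}"
    )

def retrieve_passages(index_id, question):
    """Fetch passages via the Kendra Retrieve API (verify availability/shape),
    then pass the resulting prompt to your LLM (for example Amazon Bedrock)."""
    import boto3
    kendra = boto3.client("kendra")
    resp = kendra.retrieve(IndexId=index_id, QueryText=question)
    return [{"text": r.get("Content", ""), "uri": r.get("DocumentURI", "")}
            for r in resp.get("ResultItems", [])]
```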

11) Does Amazon Kendra integrate with Amazon Lex?
Kendra is commonly used as a knowledge search backend for chatbots. Validate the current best practice integration patterns in AWS docs for Lex and Kendra.

12) How do I secure a public-facing search experience?
Don’t expose Kendra directly to browsers. Put an authenticated API layer in front (API Gateway + Lambda) and apply IAM least privilege, rate limiting, and logging.

13) How do I handle scanned PDFs?
Kendra may not extract text well from scanned images. Perform OCR upstream (for example using Amazon Textract or another OCR solution) and index the extracted text.

14) Can I index multiple S3 buckets?
Yes, typically by creating multiple S3 data sources, each with its own configuration and IAM access. Validate quotas and best practices for your scale.

15) How do I track who changed index settings?
Enable CloudTrail and review events for Kendra API calls (create/update/delete index, data source changes, sync triggers).

16) What’s the difference between a data source sync and custom ingestion?
  – Data source sync: Kendra pulls from a repository on a schedule or on demand via a connector.
  – Custom ingestion: your pipeline pushes documents into Kendra using APIs.
Choose based on repository type and control needs.
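Custom (push) ingestion can be sketched with the BatchPutDocument API. The `ContentType` value and the per-call batch limit are assumptions to verify against the current API reference.

```python
def make_document(doc_id, title, text):
    """One inline document for kendra.batch_put_document (push ingestion).
    ContentType values such as PLAIN_TEXT come from the API reference -- verify."""
    return {
        "Id": doc_id,
        "Title": title,
        "Blob": text.encode("utf-8"),
        "ContentType": "PLAIN_TEXT",
    }

def push_documents(index_id, docs):
    """Push a small batch of documents (check the current per-call limit)."""
    import boto3
    kendra = boto3.client("kendra")
    return kendra.batch_put_document(IndexId=index_id, Documents=docs)
```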

17) How do I avoid indexing sensitive data accidentally?
Use inclusion/exclusion patterns, metadata classification, and (if needed) enrichment-based redaction/tagging. Apply IAM controls and review the content scope during onboarding.

17. Top Online Resources to Learn Amazon Kendra

Resource Type | Name | Why It Is Useful
Official Documentation | Amazon Kendra Documentation | Primary source for features, APIs, connectors, quotas, and security guidance. https://docs.aws.amazon.com/kendra/
Official “What is” page | What is Amazon Kendra? | Good conceptual overview and core terminology. https://docs.aws.amazon.com/kendra/latest/dg/what-is-kendra.html
Official Pricing | Amazon Kendra Pricing | Current pricing by edition/Region and related capabilities. https://aws.amazon.com/kendra/pricing/
Pricing Tool | AWS Pricing Calculator | Build Region-specific estimates including related services. https://calculator.aws/
API Reference | Amazon Kendra API Reference | Exact request/response shapes for Query, ingestion, and admin APIs. Verify latest endpoints via docs navigation: https://docs.aws.amazon.com/kendra/
Security/Auditing | AWS CloudTrail User Guide | How to audit Kendra API calls and set retention. https://docs.aws.amazon.com/awscloudtrail/latest/userguide/
Monitoring | Amazon CloudWatch Documentation | Metrics, dashboards, and alarms for operational visibility. https://docs.aws.amazon.com/cloudwatch/
Architecture Guidance | AWS Architecture Center | Patterns for building secure, scalable AWS solutions (search for Kendra references). https://aws.amazon.com/architecture/
Samples (Trusted) | AWS Samples on GitHub (search: “amazon kendra”) | Example code for querying and integration patterns. https://github.com/aws-samples (use search for “kendra”)
Videos | AWS YouTube Channel (search: “Amazon Kendra”) | Service deep dives, demos, and integration examples. https://www.youtube.com/user/AmazonWebServices

18. Training and Certification Providers

Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL
DevOpsSchool.com | Beginners to experienced cloud/DevOps practitioners | AWS fundamentals, DevOps, and practical cloud labs (verify current Kendra coverage on site) | Check website | https://www.devopsschool.com/
ScmGalaxy.com | Students and early-career engineers | DevOps/SCM basics, cloud introductions, hands-on learning | Check website | https://www.scmgalaxy.com/
CloudOpsNow.in | Cloud operations and platform teams | Cloud operations practices, automation, operational readiness | Check website | https://www.cloudopsnow.in/
SreSchool.com | SREs, DevOps, operations engineers | Reliability engineering practices, monitoring, incident response | Check website | https://www.sreschool.com/
AiOpsSchool.com | Ops + AI/ML practitioners | AIOps concepts, automation, AI-assisted operations | Check website | https://www.aiopsschool.com/

19. Top Trainers

Platform/Site | Likely Specialization | Suitable Audience | Website URL
RajeshKumar.xyz | Cloud/DevOps training content (verify specific offerings) | Learners seeking guided training and mentorship | https://rajeshkumar.xyz/
devopstrainer.in | DevOps and cloud training | Beginners to intermediate engineers | https://www.devopstrainer.in/
devopsfreelancer.com | Freelance DevOps help/training platform (verify services) | Teams needing short-term coaching or implementation help | https://www.devopsfreelancer.com/
devopssupport.in | DevOps support and training resources (verify scope) | Operations teams seeking practical support-style learning | https://www.devopssupport.in/

20. Top Consulting Companies

Company Name | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL
cotocus.com | Cloud/DevOps consulting (verify offerings) | Architecture reviews, implementation support, automation | Designing an AWS search portal architecture; setting up IAM governance; CI/CD for related apps | https://cotocus.com/
DevOpsSchool.com | Training + consulting (verify offerings) | Enablement, cloud/DevOps delivery, workshops | Kendra proof-of-concept; operational readiness; cost and security review for a search/RAG rollout | https://www.devopsschool.com/
DEVOPSCONSULTING.IN | DevOps consulting (verify offerings) | DevOps transformation, cloud operations, platform practices | Secure AWS deployment patterns; monitoring strategy; IAM least-privilege design for Kendra apps | https://www.devopsconsulting.in/

21. Career and Learning Roadmap

What to learn before Amazon Kendra

  • AWS fundamentals
    – IAM users/roles/policies, least privilege, iam:PassRole
    – S3 buckets, object permissions, bucket policies
    – CloudWatch and CloudTrail basics
  • Search fundamentals
    – Precision/recall, relevance, metadata, facets
    – Content lifecycle and governance
  • Security fundamentals
    – Data classification, encryption basics, audit logging

What to learn after Amazon Kendra

  • RAG architectures
    – Retrieval strategies, chunking, citations, evaluation
    – Integrating Kendra retrieval with Amazon Bedrock
  • Enterprise identity
    – IAM Identity Center, SAML/OIDC concepts, user/group mapping
  • Operational excellence
    – Dashboards, alarms, incident playbooks for ingestion failures
  • Alternatives
    – Amazon OpenSearch Service for custom ranking/vector search

Job roles that use it

  • Cloud engineer / cloud developer
  • Solutions architect
  • DevOps / SRE (internal tooling and portals)
  • Knowledge management engineer
  • ML engineer (RAG integrations)
  • Security engineer (governance and access control validation)

Certification path (if available)

AWS certifications don’t typically focus on a single service, but relevant paths include:
  – AWS Certified Solutions Architect (Associate/Professional)
  – AWS Certified Developer (Associate)
  – AWS Certified Machine Learning / AI-related certifications (check current AWS certification catalog)
Verify current certification names and availability: https://aws.amazon.com/certification/

Project ideas for practice

  • Build a secure internal search portal with:
    – Cognito auth
    – API Gateway + Lambda
    – Kendra query + metadata filters
  • Implement document enrichment:
    – Add tags like team, severity, service based on content
  • Build a RAG assistant:
    – Kendra retrieval + Bedrock generation + citations
  • Implement governance:
    – Multiple indexes by environment with tagging + cost reporting
    – CloudWatch alarms on sync failures

22. Glossary

  • ACL (Access Control List): Rules that define which users/groups can access a document. In search, ACLs must be enforced so users only see permitted results.
  • Connector (Data source): A managed integration that syncs documents from a repository (S3, wiki, SaaS tool) into Kendra.
  • Data source sync: The job that reads from the repository and updates the Kendra index.
  • Document enrichment: A pipeline step that transforms documents or adds metadata during ingestion (often using AWS Lambda).
  • Facet: A UI element that lets users filter results by a metadata field (for example department or date).
  • IAM: AWS Identity and Access Management—controls permissions for AWS API calls and role assumption.
  • Index: The searchable structure Kendra builds from ingested documents.
  • Metadata: Structured fields attached to documents (owner, department, date, tags) used for filtering and relevance.
  • RAG (Retrieval-Augmented Generation): An architecture where a retrieval system fetches relevant context (documents/passages) for an LLM to generate grounded answers.
  • Relevance tuning: Adjusting ranking behavior (boosting fields/sources) to improve result quality.
  • Synonyms/Thesaurus: Configuration that maps related terms (PTO/vacation) to improve recall.
  • User context: Information about the querying user (identity/groups) used to enforce access control during query.
  • VPC interface endpoint (PrivateLink): A private network path to AWS service APIs without traversing the public internet (availability varies; verify for Kendra).

23. Summary

Amazon Kendra is AWS’s managed enterprise search service in the Machine Learning (ML) and Artificial Intelligence (AI) category, built to index documents across repositories and return highly relevant results for natural language queries. It matters when your organization needs better-than-keyword relevance, unified search across silos, and a managed operational model with strong AWS integrations.

From an architecture perspective, Kendra sits between content sources (often S3 and SaaS tools) and applications (portals, chatbots, and RAG assistants). Cost is primarily driven by the existence and edition/capacity of indexes (often billed hourly), so cost control depends on minimizing unnecessary indexes, choosing the correct edition, and tuning sync schedules. Security depends on IAM least privilege, careful connector role design, encryption configuration (verify options), and—if needed—correct ACL and identity mapping so users only see what they should.

Use Amazon Kendra when you want managed enterprise search with connectors and ML relevance. Consider alternatives like Amazon OpenSearch Service when you need deeper control, lower-level customization, or dedicated vector search. Next, extend this tutorial by adding metadata schema design, document enrichment via Lambda, and a production-grade authenticated search API—then evaluate a RAG workflow using Amazon Bedrock with Kendra as the retrieval layer.