Category
Containers
1. Introduction
Red Hat OpenShift Service on AWS (ROSA) is a managed Red Hat OpenShift platform that runs natively on AWS. It’s designed to let teams build, deploy, and operate containerized applications on Kubernetes with OpenShift’s enterprise features—without having to self-manage the underlying OpenShift control plane and core platform operations.
In simple terms: ROSA gives you an OpenShift cluster on AWS where major platform lifecycle tasks (like installation and upgrades) are handled by Red Hat (and, depending on the deployment model, AWS), while you focus on deploying apps, configuring namespaces/projects, and managing your application workloads.
Technically, ROSA provisions OpenShift clusters into your AWS account and VPC (for many cluster types), integrates with AWS Identity and Access Management (IAM) using AWS Security Token Service (STS), and uses OpenShift’s built-in operators and platform components (Ingress, monitoring, logging integrations, image registry options, etc.) to provide an opinionated, enterprise-grade Kubernetes experience. Control plane and worker architecture depends on the ROSA deployment model you choose (for example, “classic” vs. hosted control planes—verify current options in official docs).
The problem ROSA solves is operational complexity: running production OpenShift/Kubernetes reliably is hard. ROSA reduces the burden of cluster provisioning, upgrades, and core platform SRE work while still giving you the flexibility of OpenShift and the breadth of AWS infrastructure services.
2. What is Red Hat OpenShift Service on AWS (ROSA)?
Official purpose: Red Hat OpenShift Service on AWS (ROSA) is a managed OpenShift service jointly offered by Red Hat and AWS to run Red Hat OpenShift on AWS with integrated billing and support options (availability and support model can vary by offering and region—verify in official docs).
Core capabilities:
- Managed OpenShift clusters on AWS
- OpenShift APIs and developer experience (Projects, Routes, Operators, Build/Deploy workflows)
- Kubernetes orchestration with OpenShift’s enterprise platform capabilities
- Integration with AWS services for networking, compute, and storage
- Enterprise-grade security patterns (IAM integration, private networking options, encryption controls depending on the AWS services used)
Major components (conceptual):
- OpenShift control plane (Kubernetes API, etcd, controllers, OpenShift API components)
- Worker nodes (EC2 instances running application workloads)
- OpenShift Operators (platform lifecycle controllers that manage components)
- Ingress / Routes (OpenShift Router providing HTTP(S) routing)
- Container image registry (options vary; OpenShift has internal registry patterns; many production teams use external registries such as Amazon ECR; verify the recommended approach for ROSA)
- Identity and access (OpenShift RBAC + AWS IAM integration patterns)
- Networking (VPC, subnets, Security Groups, Load Balancers, DNS)
Service type: Managed Kubernetes/OpenShift platform (PaaS-like Kubernetes platform), running on AWS infrastructure.
Scope and placement:
- Regional: Clusters are created in a specific AWS region and use regional AWS services (VPC, EC2, EBS, ELB, etc.).
- Account-scoped resources: The cluster consumes resources in your AWS account (and typically your VPC), with AWS IAM roles created for ROSA and cluster operators (especially in STS-based deployments).
- Project/namespace-scoped workloads: Applications are deployed into OpenShift Projects (namespaces), with RBAC and quotas applied at that scope.
How it fits into the AWS ecosystem:
- Uses EC2 for worker nodes and possibly other compute depending on cluster architecture.
- Uses VPC networking primitives (subnets, routing, Security Groups).
- Uses Elastic Load Balancing for external/internal service exposure.
- Commonly integrates with EBS (block storage) and/or EFS (shared file storage), depending on storage class configuration and workload needs.
- Often pairs with Amazon ECR for container images, CloudWatch for metrics/logs ingestion patterns (implementation depends on your logging stack), and KMS for encryption keys (where applicable).
Service name status: “Red Hat OpenShift Service on AWS (ROSA)” is the current official name. ROSA has evolved in deployment options over time (for example, STS-based clusters and hosted control plane options). Always confirm the current supported cluster types and architecture choices in the official ROSA documentation.
3. Why use Red Hat OpenShift Service on AWS (ROSA)?
Business reasons
- Faster time to platform: Avoid building and maintaining a self-managed OpenShift practice for baseline cluster lifecycle tasks.
- Enterprise support model: Centralized support and defined responsibility boundaries (shared responsibility varies by offering—verify in docs).
- Standardization: Create a consistent platform for multiple teams with governance guardrails (RBAC, projects, quotas, policies).
Technical reasons
- OpenShift developer workflow: Built-in constructs like Routes, integrated OAuth, OperatorHub, and CI/CD-friendly patterns.
- Kubernetes plus opinionated platform: OpenShift includes platform components and defaults that reduce “choose-your-own-adventure” Kubernetes complexity.
- AWS-native placement: Your workloads run in AWS regions near your data sources, AWS-managed databases, and existing network topology.
Operational reasons
- Reduced SRE burden for cluster lifecycle: Managed installation and upgrade paths (exact upgrade responsibilities and scheduling capabilities depend on ROSA offering—verify in docs).
- Repeatable multi-cluster patterns: Supports standardized cluster creation with CLI and automation.
- Built-in monitoring: OpenShift includes a Prometheus-based monitoring stack for cluster and workload metrics (details vary by cluster profile).
Security/compliance reasons
- IAM integration with AWS STS: Minimizes long-lived cloud credentials in the cluster.
- Isolation with VPC: Use private subnets, route tables, and security groups to control ingress/egress.
- Policy-based governance: Use OpenShift RBAC, Security Context Constraints (SCCs), and admission policies (options evolve—verify current supported policy toolchain).
Scalability/performance reasons
- Horizontal scaling: Scale worker nodes and pods, use cluster autoscaling patterns (availability depends on your configuration).
- Workload flexibility: Run stateless services, APIs, event-driven workloads, and selected stateful workloads with appropriate storage.
When teams should choose ROSA
- You want OpenShift specifically (not just Kubernetes), including its operational model and developer tooling.
- You need managed OpenShift lifecycle on AWS to reduce platform ops burden.
- You are standardizing a platform for multiple product teams and want consistent governance and RBAC.
- You have compliance needs where OpenShift’s enterprise controls and support model are beneficial.
When teams should not choose ROSA
- You only need a lightweight Kubernetes cluster and prefer AWS-native Kubernetes (Amazon EKS) with minimal platform opinionation.
- You have strict requirements to fully control control-plane configuration beyond what ROSA allows.
- You need extremely low-cost dev clusters and are willing to accept DIY operations; ROSA’s subscription and baseline infrastructure can be relatively expensive.
- You have workloads that require unsupported kernel modules, privileged node customizations, or niche networking/storage features not available in managed OpenShift (verify constraints in docs).
4. Where is Red Hat OpenShift Service on AWS (ROSA) used?
Industries
- Financial services and insurance (governance and standardization)
- Healthcare and life sciences (compliance-focused platform engineering)
- Retail and e-commerce (microservices, seasonal scaling)
- Telecommunications (multi-tenant platforms, internal developer platforms)
- Media and SaaS providers (CI/CD-driven delivery, multi-cluster deployments)
- Government/regulated industries (region-specific compliance requirements—verify)
Team types
- Platform engineering teams building Internal Developer Platforms (IDPs)
- DevOps/SRE teams standardizing deployment patterns
- Application teams that want self-service namespaces and pipelines
- Security and compliance teams enforcing baseline controls
Workloads
- Microservices APIs (REST/gRPC)
- Web frontends and backends
- Event-driven services (with Kafka or managed messaging—verify integration choices)
- Batch jobs and scheduled workloads
- CI/CD build pipelines (OpenShift builds or external CI)
- Selected stateful services (with careful storage planning)
Architectures
- Multi-tier apps with private services + public ingress
- Hybrid integrations with AWS services (RDS, DynamoDB, S3, MSK, etc.)
- Multi-account landing zone deployments (network hub/spoke patterns)
- Multi-cluster patterns for environment separation (dev/test/prod)
Real-world deployment contexts
- Running OpenShift for regulated workloads where support and consistent patching matter
- Standardizing deployments across teams using Operators and templates/Helm
- Migrating from on-prem OpenShift to AWS while keeping OpenShift consistency
Production vs dev/test usage
- Production: Common, with private cluster patterns, controlled ingress, and integration with enterprise IAM and monitoring.
- Dev/test: Used, but cost and baseline resource requirements can be higher than minimal Kubernetes. Many teams use smaller worker profiles and strict shutdown/cleanup discipline.
5. Top Use Cases and Scenarios
Below are realistic scenarios where Red Hat OpenShift Service on AWS (ROSA) is commonly used.
1) Enterprise Kubernetes platform standardization
- Problem: Multiple teams deploy to inconsistent Kubernetes clusters with different add-ons and security posture.
- Why ROSA fits: OpenShift provides a consistent platform layer with governance controls and managed lifecycle.
- Example: A bank standardizes all new microservices on ROSA to enforce RBAC, namespace isolation, and controlled ingress.
2) Lift-and-shift from on-prem OpenShift to AWS
- Problem: Existing OpenShift workloads need cloud elasticity and reduced datacenter operations.
- Why ROSA fits: Maintains OpenShift compatibility while running on AWS infrastructure.
- Example: A manufacturer migrates OpenShift 4 workloads to ROSA and reuses pipelines and deployment manifests with minimal change.
3) Regulated workload hosting with clearer responsibility boundaries
- Problem: Auditors require demonstrable patching, access controls, and support.
- Why ROSA fits: Managed service model plus OpenShift security features and logging/auditing.
- Example: A healthcare SaaS hosts patient-facing APIs in ROSA with private networking and strict RBAC.
4) Internal Developer Platform (IDP) for self-service namespaces
- Problem: Developers wait days for infrastructure and permissions.
- Why ROSA fits: OpenShift Projects, templates/operators, and consistent CI/CD integration enable self-service.
- Example: Platform team provides a “golden path” namespace with standardized build/deploy and policy.
5) Microservices modernization on AWS
- Problem: Monolith applications need to be decomposed and deployed continuously.
- Why ROSA fits: Kubernetes + OpenShift routing, operators, and CI/CD-friendly patterns.
- Example: Retailer decomposes checkout into services deployed on ROSA with rolling updates and autoscaling.
6) Multi-tenant cluster for multiple internal teams
- Problem: Too many clusters create operational sprawl; too few clusters create policy conflicts.
- Why ROSA fits: Projects, quotas, RBAC, and network policies support multi-team operations.
- Example: A telco runs shared non-prod ROSA clusters with strict quotas per team.
7) Hybrid integration with AWS managed data services
- Problem: Apps need Kubernetes portability but rely on managed databases and queues.
- Why ROSA fits: Runs close to AWS services while keeping OpenShift app platform patterns.
- Example: App pods on ROSA use Amazon RDS for PostgreSQL and S3 for object storage.
8) GitOps-based application delivery at scale
- Problem: Manual deployments create drift and inconsistent environments.
- Why ROSA fits: OpenShift supports GitOps workflows (often via Argo CD-based tooling; verify current packaged options).
- Example: A SaaS uses GitOps to sync 200+ services across multiple ROSA clusters.
9) Blue/green and canary releases for critical services
- Problem: Need safer releases without downtime.
- Why ROSA fits: OpenShift Routes and Kubernetes deployment strategies enable traffic shifting patterns (implementation depends on tooling).
- Example: Payments API uses canary releases with monitoring-based rollback.
10) Container security standardization (build → scan → deploy)
- Problem: Inconsistent image provenance and runtime constraints.
- Why ROSA fits: OpenShift integrates well with enterprise image policies, RBAC, and security controls; can integrate with external scanning tools.
- Example: CI pipelines push signed images to Amazon ECR; OpenShift enforces runtime policies and restricted SCCs.
11) Data processing and ML inference services
- Problem: Need scalable inference endpoints and batch processing.
- Why ROSA fits: Kubernetes scaling plus AWS infrastructure integration; can pair with AWS AI services as needed.
- Example: A marketing analytics team deploys inference microservices and scheduled batch ETL jobs.
12) Partner-managed platform operations
- Problem: Organization lacks deep Kubernetes/OpenShift SRE expertise.
- Why ROSA fits: Managed platform reduces baseline complexity; partners focus on app operations and governance.
- Example: A mid-size company uses ROSA plus a managed services partner for day-2 operations.
6. Core Features
Feature availability can depend on ROSA cluster type, region, and OpenShift version. Verify in official docs for your exact configuration.
Managed OpenShift cluster lifecycle
- What it does: Helps automate installation, upgrades, and core platform maintenance tasks under a managed service model.
- Why it matters: Reduces time and risk compared to self-managing OpenShift.
- Practical benefit: More predictable operations and patch posture.
- Caveats: Upgrade timing and control boundaries vary by ROSA offering; confirm responsibility matrix.
AWS-native infrastructure integration
- What it does: Runs worker nodes on Amazon EC2 inside AWS VPC networking, using AWS load balancers and storage backends.
- Why it matters: Leverages AWS maturity for compute, networking, and storage.
- Practical benefit: Deploy near AWS data services; integrate with existing AWS accounts/VPC designs.
- Caveats: You must design VPC/subnet capacity carefully for growth and multi-AZ needs.
STS-based IAM integration (recommended pattern)
- What it does: Uses AWS Security Token Service (STS) and IAM roles for cluster and operator access to AWS resources, reducing long-lived credentials.
- Why it matters: Stronger security posture and credential rotation properties.
- Practical benefit: Aligns with AWS best practices for temporary credentials.
- Caveats: Requires creating and managing specific IAM roles and (often) an OIDC provider; naming and permission boundaries matter.
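To make the IAM role caveat concrete, an STS-consuming role typically carries a trust policy that names the cluster's OIDC provider and scopes assumption to one Kubernetes service account. The sketch below is hypothetical: the account ID, OIDC provider path, namespace, and service account name are all placeholders; use the role definitions generated or documented by ROSA rather than hand-rolling production policies.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::111122223333:oidc-provider/oidc.example.com/abc123"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.example.com/abc123:sub": "system:serviceaccount:my-namespace:my-service-account"
        }
      }
    }
  ]
}
```

The `Condition` block is what keeps the role from being assumable by every workload in the cluster.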
OpenShift developer experience
- What it does: Provides OpenShift Console (web UI), Projects, Routes, builds (depending on configuration), and operator-driven capabilities.
- Why it matters: Improves developer self-service and discoverability.
- Practical benefit: Faster onboarding and consistent deployment workflows.
- Caveats: Some built-in capabilities may be limited or configured differently in managed service profiles.
OperatorHub and Kubernetes Operators
- What it does: Enables installation and lifecycle management of platform add-ons and software operators.
- Why it matters: Reduces manual add-on configuration drift.
- Practical benefit: Standardized installs for supported operators; easier updates.
- Caveats: Not every community operator is appropriate for production; validate operator supportability and security.
Multi-AZ high availability patterns (where supported)
- What it does: Supports spreading nodes across Availability Zones for resilience.
- Why it matters: Reduces impact of AZ failures.
- Practical benefit: Higher availability for critical services.
- Caveats: Multi-AZ increases cost and complexity (subnets, load balancers, cross-AZ data).
Ingress and routing via OpenShift Routes
- What it does: Provides HTTP(S) routing to services via OpenShift Router.
- Why it matters: Simplifies app exposure patterns.
- Practical benefit: Consistent TLS termination and host-based routing patterns.
- Caveats: Underlying load balancer types, TLS options, and route sharding patterns depend on configuration and version.
Integrated monitoring stack (Prometheus/Alerting)
- What it does: Cluster and workload monitoring with Prometheus-based metrics, Alertmanager, and dashboards (exact UI integration depends on version).
- Why it matters: Observability is mandatory for production Kubernetes.
- Practical benefit: Faster troubleshooting and SLO tracking.
- Caveats: Long-term metrics retention and enterprise observability integration usually require external systems.
Logging and auditability (platform + workload)
- What it does: OpenShift provides cluster logging patterns and auditing; integrations can forward logs to external systems.
- Why it matters: Required for incident response and compliance.
- Practical benefit: Centralized logs and audit trails.
- Caveats: Logging stacks can be resource-intensive; verify supported logging operator options for ROSA and plan retention costs externally.
Kubernetes storage integrations on AWS
- What it does: Uses CSI drivers to provision volumes (commonly EBS; EFS may be used for shared storage use cases).
- Why it matters: Enables stateful workloads and persistent data.
- Practical benefit: Dynamic provisioning with StorageClasses.
- Caveats: Stateful workloads require careful design (backups, IOPS planning, AZ pinning).
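As an illustration of the dynamic provisioning noted above, a PersistentVolumeClaim against an EBS-backed StorageClass might look like the sketch below. The StorageClass name `gp3-csi` is an assumption; list the classes actually present in your cluster with `oc get storageclass`.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume
  namespace: my-app
spec:
  accessModes:
    - ReadWriteOnce           # EBS volumes attach to a single node at a time
  storageClassName: gp3-csi   # assumed name; verify with `oc get storageclass`
  resources:
    requests:
      storage: 10Gi
```

Remember that a ReadWriteOnce EBS volume lives in one AZ, so pods using it are effectively pinned to that AZ.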
Network policy and workload isolation
- What it does: Supports Kubernetes network policies (implementation depends on cluster networking).
- Why it matters: Limits lateral movement inside the cluster.
- Practical benefit: Enforce tenant boundaries between projects.
- Caveats: Network policies require explicit design; defaults may be permissive.
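Because defaults may be permissive, many teams start each project with a default-deny ingress policy and then open only required paths. A minimal sketch of that pattern:

```yaml
# Deny all ingress to pods in this namespace unless another policy allows it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-app
spec:
  podSelector: {}      # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress
---
# Then re-allow traffic between pods in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: my-app
spec:
  podSelector: {}
  ingress:
    - from:
        - podSelector: {}
```

Policies are additive, so the allow rule punches a hole in the deny baseline rather than replacing it.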
7. Architecture and How It Works
High-level service architecture
At a high level, ROSA provisions an OpenShift cluster on AWS:
- Worker nodes are EC2 instances in your selected VPC/subnets.
- The OpenShift control plane is managed according to your ROSA cluster type (for example, “classic” clusters often place control plane components in your AWS account; hosted control plane options may place the control plane in a service-managed account; verify current behavior and offerings).
- Ingress is provided via OpenShift routers, backed by AWS load balancers.
- Storage uses AWS-backed CSI drivers.
- IAM integration uses AWS STS roles, with an OIDC provider to map Kubernetes service accounts to AWS IAM roles (a pattern similar to IRSA).
Request/data/control flow (typical)
- A user or CI pipeline authenticates to the OpenShift API (via `oc login` or console OAuth).
- Deployments create pods scheduled by Kubernetes onto worker nodes.
- External traffic hits an AWS Load Balancer, then routes to OpenShift router pods, then to your service/pods.
- Pods pull images from a registry (often Amazon ECR or another registry).
- If pods need AWS API access, they assume IAM roles via STS (service account token → OIDC → IAM role).
- Metrics and logs are collected by in-cluster components and optionally forwarded to external systems.
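The STS step in the flow above (service account token → OIDC → IAM role) is typically wired through a service account annotation naming the role to assume. The annotation key below follows the common pod-identity-webhook convention and the ARN is a placeholder; verify the exact mechanism and annotation for your ROSA version.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-reader
  namespace: my-app
  annotations:
    # Placeholder ARN; a webhook injects temporary STS credentials
    # into pods that run under this service account.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-app-s3-reader
```

Pods referencing `serviceAccountName: s3-reader` then receive short-lived AWS credentials instead of static keys.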
Integrations with related AWS services
Common integrations include:
- Amazon ECR for container images
- Amazon Route 53 for DNS (especially for custom domains and private zones)
- AWS Certificate Manager (ACM) for TLS certificates (often paired with load balancers; OpenShift router TLS is configurable; verify best practice for your setup)
- Amazon VPC constructs: private subnets, NAT gateways, VPC endpoints (PrivateLink) for private egress to AWS APIs
- AWS KMS for encryption keys used by AWS services (EBS encryption, etc.)
- CloudWatch for infrastructure metrics/logs (often via node exporters/agents and log forwarding)
Dependency services (typical)
- EC2, EBS, ELB, VPC, IAM, STS
- Optional: EFS, Route 53, KMS, CloudWatch, S3 (backups/artifacts), Secrets Manager (application secrets—requires integration approach)
Security/authentication model (summary)
- AWS IAM controls who can create/modify ROSA clusters and underlying AWS resources.
- OpenShift OAuth controls cluster user authentication (can integrate with external identity providers; verify supported IdPs for ROSA).
- OpenShift RBAC controls authorization inside the cluster.
- STS and OIDC enable fine-grained AWS permissions for workloads.
Networking model (summary)
- Cluster runs inside a VPC with subnets across one or more AZs.
- Ingress uses AWS load balancers in public or private subnets depending on exposure.
- Egress commonly goes through NAT gateways for private clusters; VPC endpoints can reduce NAT and improve security for AWS API access.
- NetworkPolicy can enforce pod-to-pod isolation.
Monitoring/logging/governance considerations
- Use cluster and workload monitoring; integrate alerting with incident management.
- Centralize logs (application + audit) into an enterprise log platform.
- Use tagging standards for AWS resources created for the cluster (where possible) and consistent naming for clusters/projects.
- Consider multi-cluster governance tooling (policy-as-code) for fleets.
Simple architecture diagram (conceptual)
flowchart LR
U[User / CI] -->|OAuth / oc login| API[OpenShift API]
API --> SCH[Kubernetes Scheduler]
SCH --> N[Worker Nodes on EC2]
N --> PODS[Application Pods]
EXT[Internet / Corp Network] --> LB[AWS Load Balancer]
LB --> RT["OpenShift Router (Ingress)"]
RT --> SVC[K8s Service]
SVC --> PODS
PODS -->|pull| REG["Container Registry (e.g., Amazon ECR)"]
PODS -->|assume role| STS[AWS STS / IAM]
Production-style architecture diagram (multi-AZ, private networking)
flowchart TB
subgraph AWSRegion[AWS Region]
subgraph VPC[Customer VPC]
subgraph AZ1[AZ-A]
W1[Worker Nodes]
R1[Router Pods]
end
subgraph AZ2[AZ-B]
W2[Worker Nodes]
R2[Router Pods]
end
subgraph AZ3[AZ-C]
W3[Worker Nodes]
R3[Router Pods]
end
ILB[Internal or Public AWS Load Balancer]
ILB --> R1
ILB --> R2
ILB --> R3
W1 --> EBS1[(EBS Volumes)]
W2 --> EBS2[(EBS Volumes)]
W3 --> EBS3[(EBS Volumes)]
end
IAM[IAM Roles + STS]
OIDC[OIDC Provider]
IAM <---> OIDC
W1 -->|AWS API| IAM
W2 -->|AWS API| IAM
W3 -->|AWS API| IAM
ECR[Amazon ECR]
W1 --> ECR
W2 --> ECR
W3 --> ECR
CW[CloudWatch / External Observability]
W1 --> CW
W2 --> CW
W3 --> CW
end
Corp[Corporate Network] -->|VPN/Direct Connect| ILB
8. Prerequisites
Accounts, subscriptions, and access
- AWS account with permissions to create IAM roles/policies, VPC resources, EC2, load balancers, and related dependencies.
- Red Hat account (or access method required by ROSA) to obtain a ROSA token and access ROSA CLI. Verify current onboarding steps in official docs.
- ROSA entitlement/subscription: ROSA is a managed service with subscription pricing and AWS infrastructure costs. Ensure your account is eligible and billing is set up.
Permissions / IAM roles
You typically need:
– AWS IAM permissions to create and attach policies, create roles, and manage an OIDC provider for STS-based clusters.
– Permissions to create VPC/subnets or to use an existing VPC (depending on how you provision the cluster).
– Ability to pass roles (iam:PassRole) if automation creates roles for cluster operators.
Exact IAM policies are detailed in ROSA docs; do not improvise permissions in production—use official least-privilege guidance.
Billing requirements
- A valid AWS payment method.
- Ability to accept ROSA charges (subscription) plus underlying AWS infrastructure costs (EC2/EBS/ELB/NAT/data transfer).
CLI/tools needed
- AWS CLI: https://docs.aws.amazon.com/cli/
- ROSA CLI (`rosa`): Official installation instructions are in ROSA docs.
- OpenShift CLI (`oc`): https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/getting-started-cli.html
- Optional: `kubectl` (mostly redundant if you have `oc`), `jq`, and a terminal shell.
Region availability
ROSA is not available in every AWS region. Always verify supported regions in the official ROSA documentation and/or AWS ROSA product pages.
Quotas/limits
Plan for:
- EC2 instance quotas for your chosen instance families
- VPC limits (subnets, route tables, security groups)
- Elastic IP and NAT gateway scaling considerations
- Load balancer quotas
- OpenShift cluster sizing constraints (minimum worker count and instance type requirements vary; verify in docs)
Prerequisite services
- VPC and subnet strategy (public/private)
- DNS strategy (Route 53 public/private hosted zones if you use custom domains)
- Container registry strategy (Amazon ECR or alternative)
- Observability strategy (in-cluster vs external aggregation)
9. Pricing / Cost
ROSA cost is typically the combination of:
1. ROSA service/subscription pricing (managed OpenShift fee)
2. AWS infrastructure costs (EC2, EBS, ELB, NAT gateways, data transfer, etc.)
Pricing dimensions (typical)
- Per-cluster hourly rate (or similar subscription metric) for ROSA management and OpenShift subscription.
- Worker node compute: EC2 instance hours, including on-demand/reserved/savings plans (depending on what you choose).
- Storage: EBS volume GB-month, IOPS/throughput (for certain volume types), snapshots.
- Load balancers: hourly + LCU/processed bytes (depends on LB type and usage).
- NAT gateways: hourly + per-GB processed (often a major hidden cost in private clusters).
- Data transfer: inter-AZ traffic, internet egress, and traffic to/from other AWS services.
- Optional services: Route 53 hosted zones/queries, CloudWatch ingestion/retention, KMS key usage, etc.
Free tier
ROSA is not a typical AWS “free tier” service. Promotions or trials may exist from time to time; treat them as time-limited and verify.
Cost drivers
Major cost drivers in real deployments:
- Worker node count and instance size
- Multi-AZ vs single-AZ
- NAT gateways and outbound traffic volume
- Persistent storage consumption and performance tier
- Observability stack footprint (logging can be costly)
- Load balancer count (each app/route pattern can influence LB usage)
Hidden/indirect costs to watch
- NAT gateways: Private clusters that pull images, talk to external APIs, or ship logs can generate significant NAT data processing charges.
- Cross-AZ traffic: Microservice chatter across AZs can add up.
- Logging: High-volume application logs can become a top cost if forwarded to paid ingestion platforms.
Network/data transfer implications
- Prefer VPC endpoints for AWS APIs (ECR, S3, STS, CloudWatch, etc.) to reduce NAT egress and improve security.
- Keep data-heavy services in the same AZ when possible if latency and transfer cost matter (but balance with HA requirements).
How to optimize cost (practical)
- Start with the smallest supported worker node size and count for dev/test (verify minimums).
- Use cluster autoscaling where appropriate (verify support).
- Use Savings Plans/Reserved Instances for steady-state worker compute.
- Use VPC endpoints to reduce NAT usage.
- Use external managed data services (RDS, DynamoDB) when they are cheaper and simpler than stateful pods.
- Enforce resource requests/limits and quotas per namespace to prevent noisy neighbor waste.
- Be deliberate about logging verbosity and retention.
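The requests/limits guidance above is commonly enforced with a per-project ResourceQuota. A minimal sketch, with purely illustrative values:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"           # total CPU the namespace may request
    requests.memory: 8Gi
    limits.cpu: "8"             # total CPU limit across all pods
    limits.memory: 16Gi
    persistentvolumeclaims: "5" # cap on PVC count (storage cost control)
```

Pairing this with a LimitRange (per-pod defaults) prevents a single workload from silently consuming the whole quota.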
Example low-cost starter estimate (model, not a price quote)
A minimal dev cluster typically includes:
- ROSA cluster management subscription fee (per hour)
- 2–3 worker nodes (EC2) of the smallest supported type
- EBS for system and small persistent volumes
- At least one load balancer for ingress
- NAT gateway if private (potentially expensive)
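For a quick back-of-envelope sense of two of these line items, the sketch below computes monthly worker compute and NAT gateway cost. Every rate is an assumed placeholder, not a real price; substitute current values from the official pricing pages.

```shell
#!/bin/sh
# Back-of-envelope monthly cost sketch for a small ROSA dev cluster.
# All rates below are ASSUMED placeholders; look up real regional prices.
HOURS=730            # hours in an average month
WORKERS=3            # worker node count
EC2_HOURLY=0.17      # assumed on-demand rate per worker ($/hr)
NAT_HOURLY=0.045     # assumed NAT gateway rate ($/hr)
NAT_GB=500           # assumed GB processed through NAT per month
NAT_PER_GB=0.045     # assumed NAT data-processing rate ($/GB)

compute=$(awk "BEGIN{printf \"%.2f\", $WORKERS*$EC2_HOURLY*$HOURS}")
nat=$(awk "BEGIN{printf \"%.2f\", $NAT_HOURLY*$HOURS+$NAT_GB*$NAT_PER_GB}")
echo "worker compute: \$${compute}/month"
echo "NAT gateway:    \$${nat}/month"
```

Note what is deliberately missing: the ROSA subscription fee, EBS, load balancers, and data transfer, which the official pricing calculator covers.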
Because region, instance type, and ROSA pricing vary, use official sources:
- AWS ROSA pricing: https://aws.amazon.com/rosa/pricing/
- AWS Pricing Calculator: https://calculator.aws/
Example production cost considerations (what changes)
In production you often add:
- Multi-AZ worker pools
- Larger instances or more nodes
- Higher-performance storage (IOPS/throughput)
- Multiple load balancers / ingress sharding patterns
- Centralized logging and long-term metrics systems
- Higher egress and inter-service traffic
A realistic production cost review should include:
- A workload sizing estimate (CPU/memory and storage)
- A networking bill forecast (NAT, cross-AZ)
- A logging ingestion forecast (GB/day)
- A savings strategy (Savings Plans, scaling)
10. Step-by-Step Hands-On Tutorial
This lab creates a ROSA cluster using the ROSA CLI, deploys a sample app, exposes it with an OpenShift Route, verifies access, and then cleans up.
Important: ROSA clusters incur charges while running (subscription + AWS infrastructure). If you are cost-sensitive, plan to complete the lab promptly and delete the cluster.
Objective
- Provision a Red Hat OpenShift Service on AWS (ROSA) cluster using supported defaults.
- Log in with `oc`.
- Deploy a small sample application.
- Expose it externally using an OpenShift Route.
- Validate and clean up safely.
Lab Overview
You will:
1. Install and configure tools (aws, rosa, oc).
2. Authenticate to ROSA and AWS.
3. Create required account roles (STS pattern).
4. Create a ROSA cluster.
5. Create an OpenShift admin user and log in.
6. Deploy a sample app and expose it with a Route.
7. Validate functionality.
8. Delete the cluster and roles.
Step 1: Install CLI tools (aws, rosa, oc)
Expected outcome: You can run aws --version, rosa version, and oc version.
1) Install AWS CLI:
– Follow official AWS instructions: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
2) Install ROSA CLI:
– Follow official ROSA docs for your OS (the binary name is typically rosa).
– Official docs landing page (verify latest):
https://docs.openshift.com/rosa/
or Red Hat docs portal:
https://docs.redhat.com/ (search for “Red Hat OpenShift Service on AWS”)
3) Install OpenShift CLI (oc):
– Official OpenShift CLI docs: https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/getting-started-cli.html
Verify:
aws --version
rosa version
oc version
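The three version checks above can be wrapped in a small preflight sketch that fails fast if anything is missing from PATH, which is handy in CI or onboarding scripts:

```shell
#!/bin/sh
# Preflight sketch: confirm the required CLIs exist before starting the lab.
require() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

status=0
for tool in aws rosa oc; do
  require "$tool" || status=1
done
if [ "$status" -eq 0 ]; then
  echo "all tools present"
else
  echo "some tools missing; install them before continuing"
fi
```

`command -v` is POSIX and works across bash/zsh/dash, unlike `which` on some systems.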
Step 2: Authenticate to AWS and ROSA
Expected outcome: Your AWS CLI points to the correct account/region, and rosa whoami shows your ROSA user.
1) Configure AWS credentials (choose one approach):
– aws configure for a local profile, or
– AWS SSO / IAM Identity Center, or
– Assume-role credentials via environment variables.
Example:
aws configure
aws sts get-caller-identity
2) Log in to ROSA:
– ROSA typically requires an offline access token from Red Hat.
– Get the token from the official login flow referenced by ROSA docs, then:
rosa login
rosa whoami
If rosa login opens a browser, follow the prompts. If it requests a token, paste the token.
Step 3: Choose a region and plan basic parameters
Expected outcome: You have a chosen region and a unique cluster name.
Pick an AWS region that supports ROSA (verify region support in official docs). Set environment variables:
export AWS_REGION="us-east-1" # example, verify supported region
export CLUSTER_NAME="rosa-$(date +%m%d%H%M)"  # keep it short; cluster name length limits vary by cluster type (verify in docs)
Also ensure your account has sufficient quotas for the worker instance family you will use.
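Cluster names are typically restricted to lowercase alphanumerics and hyphens with a length cap (the 15-character limit below is an assumption for some cluster types; verify the current rule in the docs). A quick pre-check sketch before running `rosa create cluster`:

```shell
#!/bin/sh
# Sketch: validate a candidate ROSA cluster name before creating it.
# The 15-character cap is an ASSUMPTION for some cluster types; verify.
valid_cluster_name() {
  name="$1"
  # lowercase letters, digits, hyphens; must start with a letter; max 15 chars
  echo "$name" | grep -Eq '^[a-z][a-z0-9-]{0,14}$'
}

if valid_cluster_name "rosa-lab-01"; then
  echo "rosa-lab-01: ok"
fi
valid_cluster_name "Rosa_Lab" || echo "Rosa_Lab: invalid"
```

Catching an invalid name locally is cheaper than waiting for the CLI to reject it mid-provisioning.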
Step 4: Create ROSA account roles (STS mode)
Expected outcome: Required IAM roles exist in your AWS account for ROSA.
ROSA commonly uses a set of account-wide IAM roles (naming may vary). The ROSA CLI can create these.
Typical command pattern (verify flags in current docs/CLI help):
rosa create account-roles --mode auto --yes
Notes:
– --mode auto generally creates roles automatically in your AWS account.
– Some organizations require --mode manual and change control.
Confirm roles:
rosa list account-roles
If your organization uses permission boundaries or SCPs (AWS Organizations), role creation may fail; see troubleshooting.
Step 5: Create the cluster
Expected outcome: A cluster begins provisioning and appears in rosa list clusters.
Create the cluster with STS enabled. A common pattern is:
rosa create cluster \
--cluster-name "$CLUSTER_NAME" \
--sts \
--mode auto \
--region "$AWS_REGION"
What happens next:
– ROSA validates AWS prerequisites.
– It creates cluster-specific roles (“operator roles”) and an OIDC provider if required by your cluster type.
– It provisions OpenShift components and worker nodes.
Check status:
rosa list clusters
rosa describe cluster -c "$CLUSTER_NAME"
Provisioning can take significant time (often tens of minutes). Keep checking until state indicates “ready” (wording varies).
If you want a lower-cost or newer control plane model (for example, hosted control planes), ROSA supports additional cluster creation options in some regions. Run
rosa create cluster --help
and verify current flags and tradeoffs in official docs.
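Rather than re-running rosa describe cluster by hand, the polling can be scripted. The helper below is a generic sketch; the awk parsing of the State: line in the commented usage is an assumption about rosa describe cluster output, so verify it against your CLI version:

```shell
#!/usr/bin/env bash
# wait_for_state: poll a status command until it prints the target state.
# The status command is passed as trailing arguments, so any CLI works.
wait_for_state() {
  local target="$1" max_attempts="$2" pause="$3"
  shift 3
  local i state
  for ((i = 1; i <= max_attempts; i++)); do
    state="$("$@")"
    echo "attempt $i: state=$state"
    [ "$state" = "$target" ] && return 0
    sleep "$pause"
  done
  return 1
}

# Real usage (assumes a 'State:' line in 'rosa describe cluster'—verify):
#   wait_for_state ready 90 60 sh -c \
#     'rosa describe cluster -c "$CLUSTER_NAME" | awk "/^State:/ {print \$2}"'
```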
Step 6: Create an OpenShift admin user and log in
Expected outcome: You can access the OpenShift web console and run oc get nodes.
Create an admin user:
rosa create admin -c "$CLUSTER_NAME"
The output typically provides:
– Console URL
– oc login command
– Username/password (or instructions)
Log in using the provided command, for example:
oc login https://api.<cluster-domain>:6443 --username <user> --password '<pass>'
Verify:
oc whoami
oc get nodes
oc get clusterversion
Step 7: Deploy a sample application
Expected outcome: A Deployment/Pods are running in a new project.
Create a new project (namespace):
oc new-project rosa-lab
Deploy a simple sample app. One portable approach is to use a public container image. Note that the stock nginx image often fails under OpenShift’s default restricted security context (it tries to bind port 80 and write to root-owned paths), so an unprivileged variant that listens on port 8080 is a safer demo choice:
oc create deployment hello --image=nginxinc/nginx-unprivileged:stable
oc expose deployment hello --port=8080
Wait for the pod:
oc get pods -w
You should see the pod reach the Running state (press Ctrl+C to stop watching).
Step 8: Expose the app using a Route
Expected outcome: You get a public (or internal) URL and can fetch the page.
Create a Route:
oc expose service hello
oc get route hello
Get the URL:
export APP_HOST="$(oc get route hello -o jsonpath='{.spec.host}')"
echo "http://$APP_HOST"
Test it:
curl -I "http://$APP_HOST"
Expected: HTTP 200 or 301/302 depending on image/config.
If your cluster is private/internal-only, the route may not be reachable from the public internet; you may need VPN/Direct Connect or a bastion host inside the VPC.
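Routes can take a short while to start answering after creation, so a retry loop is more reliable than a single curl. A sketch (assumes curl is installed; the URL is whatever oc get route reported):

```shell
#!/usr/bin/env bash
# retry_http: poll a URL until it returns a 2xx/3xx status code.
retry_http() {
  local url="$1" attempts="${2:-10}" pause="${3:-5}"
  local i code
  for ((i = 1; i <= attempts; i++)); do
    # Print only the status code; '000' signals a connection failure.
    code="$(curl -s -o /dev/null -w '%{http_code}' "$url" || echo 000)"
    echo "attempt $i: HTTP $code"
    case "$code" in
      2*|3*) return 0 ;;
    esac
    sleep "$pause"
  done
  return 1
}

# Usage: retry_http "http://$APP_HOST" 12 10
```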
Validation
Run the following checks:
# Cluster health basics
oc get nodes
oc get co
# Workload health
oc -n rosa-lab get deploy,po,svc,route
# Confirm route responds
curl -s "http://$APP_HOST" | head
Expected results:
– Nodes show Ready
– ClusterOperators mostly Available=True (some may be progressing during upgrades)
– Deployment hello has AVAILABLE replicas
– Route points to the hello service and responds to HTTP requests
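The node check can be scripted so it fails loudly instead of relying on eyeballing output. The function below parses oc get nodes --no-headers supplied on stdin; the column layout (STATUS in column 2) matches common oc/kubectl output, but verify it for your version:

```shell
# all_nodes_ready: read 'oc get nodes --no-headers' output on stdin and
# succeed only if every node's STATUS column is exactly "Ready".
all_nodes_ready() {
  awk '$2 != "Ready" { bad = 1 } END { exit bad }'
}

# Usage:
#   oc get nodes --no-headers | all_nodes_ready && echo "all nodes Ready"
```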
Troubleshooting
Common issues and fixes:
1) ROSA role creation fails
– Symptoms: rosa create account-roles errors with access denied.
– Fix:
– Ensure your AWS identity has IAM admin (or ROSA-required) permissions.
– Check AWS Organizations SCPs and permission boundaries.
– Use --mode manual if your org requires explicit role creation.
2) Cluster create fails due to quotas
– Symptoms: errors referencing EC2 quotas or inability to launch instances.
– Fix:
– Increase EC2 quota for the required instance family in the selected region.
– Choose a different region or instance type (within supported worker types).
3) oc login fails
– Symptoms: certificate, DNS, or auth errors.
– Fix:
– Use the exact oc login command produced by rosa create admin.
– Confirm you can resolve cluster API DNS from your network.
– If private cluster, connect through appropriate private networking.
4) Route not reachable
– Symptoms: curl fails (timeout).
– Fix:
– Confirm whether cluster ingress is public or private.
– Verify Security Group and network access.
– Use internal connectivity path (VPN/bastion) if needed.
5) Pods stuck in ImagePullBackOff
– Symptoms: nodes can’t pull images.
– Fix:
– Ensure egress is available (NAT gateway or egress configuration).
– For private clusters, consider VPC endpoints for ECR and required services (if using ECR) or allow outbound to public registry.
Cleanup
Delete the sample project (optional; cluster deletion will remove it anyway):
oc delete project rosa-lab
Delete the cluster:
rosa delete cluster -c "$CLUSTER_NAME" --yes
Wait until it is deleted:
rosa list clusters
Then delete cluster-specific roles and OIDC provider if your cluster type created them. Commands vary by ROSA version/cluster type; verify with rosa help. Typical patterns include:
rosa delete operator-roles -c "$CLUSTER_NAME" --mode auto --yes
rosa delete oidc-provider -c "$CLUSTER_NAME" --mode auto --yes
Finally, if this AWS account will not host ROSA clusters again, delete account roles (optional; many orgs keep them for reuse):
rosa delete account-roles --mode auto --yes
Always confirm what will be deleted before running cleanup commands, especially in shared AWS accounts.
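One way to enforce that confirmation in scripted cleanup is a small guard function; the rosa commands shown in the comments are the same ones used above:

```shell
#!/usr/bin/env bash
# confirm: require an explicit "y"/"yes" before a destructive action.
confirm() {
  local reply
  printf '%s [y/N] ' "$1"
  read -r reply
  case "$reply" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}

# Example guarded cleanup:
#   confirm "Delete cluster $CLUSTER_NAME and its roles?" || exit 1
#   rosa delete cluster -c "$CLUSTER_NAME" --yes
#   rosa delete operator-roles -c "$CLUSTER_NAME" --mode auto --yes
#   rosa delete oidc-provider -c "$CLUSTER_NAME" --mode auto --yes
```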
11. Best Practices
Architecture best practices
- Prefer multi-AZ for production to tolerate AZ failures; use single-AZ only for dev/test or non-critical workloads.
- Design VPC and subnets with growth in mind: IP exhaustion is a common Kubernetes failure mode.
- Use separate clusters for prod vs non-prod when strong isolation is required; enforce promotion via CI/CD rather than manual changes.
- Externalize state when possible: managed databases (RDS/Aurora/DynamoDB) often reduce operational load versus in-cluster stateful sets.
IAM/security best practices
- Use STS/OIDC patterns to avoid static AWS keys in pods.
- Enforce least privilege for AWS IAM roles assumed by workloads.
- Integrate cluster authentication with a centralized identity provider (IdP) and enforce MFA (verify supported IdPs).
- Use separate AWS accounts (Organizations) for environment isolation where appropriate.
Cost best practices
- Start with right-sized worker nodes and enforce requests/limits.
- Reduce NAT gateway spend with VPC endpoints and controlled outbound access.
- Limit log volume and set retention policies.
- Use Savings Plans for baseline worker pools in steady-state production.
Performance best practices
- Use resource requests/limits for predictable scheduling and performance.
- Use HPA/VPA patterns where supported and appropriate (verify versions and policies).
- Choose the right EBS volume type for workload I/O patterns and verify storage class parameters.
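As a concrete sketch of the requests/limits point, here is an illustrative Deployment manifest written out and then applied with oc. The name, image, and resource values are examples to tune, not recommendations:

```shell
# Sketch: a Deployment with explicit requests/limits (all values illustrative).
cat > /tmp/web-limits.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-limits
  namespace: rosa-lab
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-limits
  template:
    metadata:
      labels:
        app: web-limits
    spec:
      containers:
      - name: web
        image: nginxinc/nginx-unprivileged:stable
        resources:
          requests:
            cpu: 250m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
EOF
# Apply with: oc apply -f /tmp/web-limits.yaml
```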
Reliability best practices
- Use PodDisruptionBudgets for critical services.
- Spread replicas across nodes/AZs using topology spread constraints.
- Implement backup/restore strategy for persistent data (EBS snapshots, app-level backups, or external data services).
- Regularly test disaster recovery (cluster recreation, GitOps rehydration).
Operations best practices
- Standardize cluster creation with Infrastructure as Code (Terraform, pipelines) once you validate manual steps.
- Define SLOs and alert routing; avoid alert fatigue.
- Maintain an upgrade playbook: test upgrades in staging first.
- Use GitOps for desired state and drift control.
Governance/tagging/naming best practices
- Use consistent cluster naming: <org>-<env>-<region>-<purpose>.
- Enforce AWS tags for cost allocation where possible (some managed resources may have limited tagging—verify).
- Apply namespace quotas and limit ranges to prevent runaway consumption.
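The quota point above can be made concrete with a ResourceQuota plus LimitRange pair for a project; all values are illustrative and should be sized for your teams:

```shell
# Sketch: per-project ResourceQuota and LimitRange (values illustrative).
cat > /tmp/rosa-lab-quota.yaml <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: rosa-lab-quota
  namespace: rosa-lab
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: rosa-lab-limits
  namespace: rosa-lab
spec:
  limits:
  - type: Container
    default:
      cpu: 500m
      memory: 256Mi
    defaultRequest:
      cpu: 100m
      memory: 128Mi
EOF
# Apply with: oc apply -f /tmp/rosa-lab-quota.yaml
```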
12. Security Considerations
Identity and access model
- AWS IAM: Controls who can create clusters and manage AWS infrastructure.
- OpenShift OAuth: Controls who can log into the cluster.
- OpenShift RBAC: Controls what authenticated users can do in Projects and cluster-wide.
- Workload IAM: Use service accounts mapped to AWS IAM roles via OIDC/STS for AWS API access.
Recommendations:
– Grant cluster-admin sparingly; use admin groups and break-glass accounts.
– Use separate roles for platform admins vs app operators.
– Audit privileged SCC usage and restrict where possible.
Encryption
- In transit: Use TLS for API and ingress. Manage certificates properly (OpenShift router and/or AWS load balancer termination patterns).
- At rest: EBS encryption can be enabled (often by default via account policy). Use KMS keys where required.
- Secrets: Kubernetes/OpenShift secrets are base64-encoded; treat them as sensitive and protect access. Consider external secrets management patterns (Secrets Manager/HashiCorp Vault) if required—verify supported integrations.
Network exposure
- Prefer private clusters for sensitive workloads, accessed via VPN/Direct Connect.
- Restrict inbound traffic to ingress endpoints; avoid exposing internal services.
- Use NetworkPolicy to limit pod-to-pod communications between namespaces.
- Control outbound egress (egress firewall/proxy patterns) for supply-chain risk reduction.
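A minimal NetworkPolicy starting point for the namespace-isolation advice above: deny all ingress by default, then allow same-namespace traffic. Note that on OpenShift, pods exposed via Routes typically also need an allow rule for the router’s ingress namespace (verify the correct selector for your cluster); all selectors here are illustrative:

```shell
# Sketch: default-deny ingress plus a same-namespace allow rule.
# Pods reached via Routes will also need an allow-from-ingress policy—verify.
cat > /tmp/rosa-lab-netpol.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: rosa-lab
spec:
  podSelector: {}
  policyTypes:
  - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: rosa-lab
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}
EOF
# Apply with: oc apply -f /tmp/rosa-lab-netpol.yaml
```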
Secrets handling
- Never store AWS keys in ConfigMaps or container images.
- Use STS/OIDC roles for service accounts.
- Rotate application secrets and integrate with CI/CD securely.
Audit/logging
- Enable and retain audit logs according to compliance needs (verify ROSA audit logging mechanisms and integration options).
- Centralize logs with immutable storage where required.
- Log access to cluster-admin actions and IAM role changes.
Compliance considerations
- Determine whether ROSA in your region meets your compliance framework needs (HIPAA, PCI, SOC, ISO, etc.). Compliance depends on your configuration and AWS/Red Hat attestations—verify official compliance docs.
- Document the shared responsibility model for auditors.
Common security mistakes
- Using cluster-admin for day-to-day operations
- Leaving default namespaces overly permissive
- No NetworkPolicy and broad east-west traffic
- Excessive outbound internet access from nodes
- Unrestricted image sources and no supply-chain controls
Secure deployment recommendations
- Use private networking, STS roles, least privilege, and GitOps.
- Enforce image policy: trusted registries, signed images (if your toolchain supports it).
- Regularly scan images and dependencies; patch base images.
13. Limitations and Gotchas
Limitations vary by ROSA offering, OpenShift version, and region. Validate these in official docs for your cluster type.
Known limitations / constraints (common patterns)
- Region availability: Not all AWS regions support ROSA.
- Cluster type differences: Capabilities differ between classic and hosted control plane models (verify).
- Minimum worker requirements: OpenShift has baseline requirements; you may not be able to run a “tiny” cluster cheaply.
- Networking complexity: Private clusters require NAT/VPN/Direct Connect and careful DNS planning.
- IP exhaustion: If your VPC CIDR/subnets are too small, scaling fails.
- Operator compatibility: Not all operators are supported or recommended on managed offerings.
- Ingress exposure: Public vs private ingress configuration can block access unexpectedly.
- Egress costs: NAT gateway and data transfer charges can surprise teams.
- Stateful workload complexity: Backups, AZ affinity, and storage tuning are non-trivial.
Quotas (examples to check)
- EC2 instance quotas per family
- Load balancer quotas
- EIP quotas
- VPC/subnet limits
- OpenShift object limits (depends on etcd sizing and platform constraints)
Compatibility issues
- Some Kubernetes add-ons expect full control-plane access; managed OpenShift may restrict certain changes.
- Privileged workloads and custom kernel features may be constrained.
- Storage driver feature sets vary; verify CSI driver options and supported parameters.
Migration challenges
- Moving from self-managed OpenShift to ROSA requires:
- Aligning cluster versions
- Rebuilding cluster-wide policies (RBAC, SCCs, admission)
- Reworking ingress domains and DNS
- Revalidating storage classes and backup strategy
- Reconfiguring identity provider integration
Vendor-specific nuances
- Support boundaries: clarify what Red Hat/AWS manages vs what you manage.
- Upgrade windows and processes: understand how upgrades are scheduled and tested in your organization.
14. Comparison with Alternatives
ROSA is not the only way to run containers and Kubernetes on AWS. The “best” option depends on how much platform you want and who operates it.
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Red Hat OpenShift Service on AWS (ROSA) | Teams wanting managed OpenShift on AWS | OpenShift developer platform + managed lifecycle; enterprise workflows; strong governance primitives | Higher baseline cost; opinionated platform; managed constraints | You want OpenShift specifically and managed operations on AWS |
| Amazon EKS | Teams wanting AWS-native Kubernetes | Deep AWS integration; large ecosystem; flexible add-ons | You assemble platform pieces (ingress, policies, CI/CD); more DIY ops than ROSA | You want Kubernetes without OpenShift and can manage add-ons |
| Amazon ECS | Teams wanting simple container orchestration without Kubernetes | Simpler than Kubernetes; strong AWS integration; cost-effective | Not Kubernetes; portability tradeoffs; platform feature differences | You want AWS-native container orchestration and don’t need Kubernetes APIs |
| Self-managed OpenShift on AWS | Teams needing maximum OpenShift control | Full control of config and lifecycle | Highest operational burden; upgrades/security are on you | You must control everything, or you need configurations the managed offering does not allow |
| Azure Red Hat OpenShift (ARO) | Teams on Azure wanting managed OpenShift | Managed OpenShift on Azure | Not AWS; different integrations and networking | Your core infrastructure is Azure and you want OpenShift |
| Google Cloud (GKE) / Anthos | Teams on GCP or multi-cloud governance | Strong Kubernetes platform; multi-cluster options | Not AWS; different cost and integration model | Your strategy is GCP-centric or Anthos-centric |
15. Real-World Example
Enterprise example (regulated industry)
- Problem: A financial services company must modernize dozens of internal apps, meet strict audit requirements, and reduce platform variance across teams.
- Proposed architecture:
- Multiple ROSA clusters across separate AWS accounts (dev/test/prod)
- Private networking with VPN/Direct Connect
- Centralized identity provider integration for OpenShift OAuth
- GitOps toolchain to deploy apps and policies
- Container images stored in Amazon ECR, scanned in CI
- Centralized logs forwarded to a SIEM; metrics exported to enterprise monitoring
- Why ROSA was chosen:
- Managed OpenShift lifecycle reduces operational burden and patch risk
- OpenShift RBAC/projects and policy patterns align with governance needs
- Runs on AWS where the company already has approved network and security tooling
- Expected outcomes:
- Faster app onboarding with standardized namespaces and pipelines
- Reduced cluster drift and improved audit readiness
- Predictable platform operations with clearer responsibility boundaries
Startup/small-team example
- Problem: A SaaS startup needs a reliable platform for microservices but has limited SRE bandwidth and wants OpenShift’s developer experience.
- Proposed architecture:
- One ROSA cluster for production (multi-AZ) and one smaller cluster for staging
- Managed database on RDS, object storage on S3
- Minimal in-cluster state; focus on stateless services
- Tight cost controls: right-sized worker nodes, VPC endpoints, logging limits
- Why ROSA was chosen:
- Avoids self-managing OpenShift upgrades and core platform components
- Provides a consistent developer workflow with console visibility and routing
- Expected outcomes:
- Faster releases with fewer platform distractions
- Easier hiring/training because OpenShift workflows are well documented
- Controlled scaling as customer demand grows
16. FAQ
1) What is Red Hat OpenShift Service on AWS (ROSA) in one sentence?
ROSA is a managed Red Hat OpenShift service that runs on AWS, combining OpenShift’s enterprise Kubernetes platform with a managed operations model.
2) Is ROSA the same as Amazon EKS?
No. EKS is AWS-managed Kubernetes; ROSA is managed OpenShift (which is Kubernetes plus OpenShift platform features and opinionated defaults).
3) Who manages the control plane in ROSA?
ROSA is managed, but the exact responsibility split depends on your ROSA offering and cluster type (classic vs hosted control planes). Verify the current shared responsibility model in official docs.
4) Does ROSA run inside my AWS account?
Worker nodes typically run in your AWS account and VPC. Control plane placement depends on cluster type; verify your selected architecture in official docs.
5) Do I need AWS Security Token Service (STS) for ROSA?
STS-based clusters are a common recommended pattern to avoid long-lived credentials. Verify current requirements and recommendations in ROSA docs.
6) Can I make a ROSA cluster private (no public ingress)?
Private cluster patterns are commonly supported, but exact options depend on region and cluster type. Verify the supported private/public API and ingress settings in ROSA docs.
7) How do applications access AWS services securely from ROSA?
Use an OIDC provider and map Kubernetes service accounts to AWS IAM roles (STS). This avoids embedding AWS keys in pods.
8) What’s the minimum cluster size?
OpenShift has baseline requirements for worker count and instance sizes. Minimums vary by ROSA offering/version—verify in official docs before provisioning.
9) Can I use Amazon ECR for images?
Yes, many teams use ECR. Configure authentication and network access appropriately (private clusters often require VPC endpoints or NAT).
10) How do upgrades work in ROSA?
Upgrades are managed under the ROSA service model, but scheduling/control options depend on offering. Always test upgrades in staging and follow official guidance.
11) Can I install any Kubernetes operator?
Not all operators are appropriate or supported in managed environments. Prefer supported operators and validate security posture and upgrade behavior.
12) Does ROSA support autoscaling?
Cluster autoscaling and HPA patterns are common in Kubernetes/OpenShift, but exact support depends on your configuration. Verify in ROSA docs for your cluster type.
13) How do I control costs?
Right-size worker nodes, reduce NAT gateway usage with VPC endpoints, minimize log ingestion, and use Savings Plans for steady compute.
14) Is ROSA suitable for stateful workloads?
It can be, with careful storage class selection, backups, and performance planning. Many teams prefer AWS managed databases for critical state.
15) How do I delete everything to avoid ongoing charges?
Delete the ROSA cluster, then delete cluster-specific IAM roles and OIDC provider created for it. Confirm cleanup steps with rosa CLI help and official docs.
16) Can I integrate ROSA with my enterprise IdP (SAML/OIDC/LDAP)?
OpenShift supports multiple identity providers, but supported configurations for ROSA should be validated in official docs.
17) Do I need to learn Kubernetes before OpenShift/ROSA?
It helps. OpenShift builds on Kubernetes; understanding pods, deployments, services, ingress concepts, and RBAC will make ROSA much easier.
17. Top Online Resources to Learn Red Hat OpenShift Service on AWS (ROSA)
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | ROSA docs (OpenShift docs site) – https://docs.openshift.com/rosa/ | Primary how-to guides, cluster creation, IAM/STS, networking, operations |
| Official documentation | Red Hat documentation portal – https://docs.redhat.com/ (search “Red Hat OpenShift Service on AWS”) | Central location for Red Hat’s official product docs and updates |
| Official pricing | AWS ROSA pricing – https://aws.amazon.com/rosa/pricing/ | Official pricing model and dimensions (subscription + infra) |
| Official calculator | AWS Pricing Calculator – https://calculator.aws/ | Estimate EC2, EBS, ELB, NAT, data transfer costs for your design |
| CLI reference | oc CLI docs – https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/getting-started-cli.html | Learn operational commands for deployments, routes, debugging |
| AWS docs | AWS VPC endpoints – https://docs.aws.amazon.com/vpc/latest/privatelink/what-is-privatelink.html | Reduce NAT costs and improve security for private clusters |
| AWS docs | IAM and STS – https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html | Understand temporary credentials model used by STS-based access |
| Architecture guidance | AWS Architecture Center – https://aws.amazon.com/architecture/ | Reference architectures for networking, security, multi-account patterns |
| Videos/webinars | AWS YouTube channel – https://www.youtube.com/@amazonwebservices | Sessions on containers, EKS/OpenShift ecosystem, landing zones (search ROSA) |
| Videos/webinars | Red Hat OpenShift YouTube – https://www.youtube.com/user/RedHatVideos | OpenShift concepts, operators, GitOps, security patterns |
| Samples (verify) | OpenShift / Red Hat GitHub orgs – https://github.com/openshift | Reference implementations and tooling (validate relevance to ROSA) |
18. Training and Certification Providers
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | DevOps engineers, SREs, platform teams, beginners to intermediate | DevOps fundamentals, Kubernetes/OpenShift, CI/CD, cloud operations | check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Students, engineers learning DevOps tooling | SCM, CI/CD, DevOps practices, automation | check website | https://www.scmgalaxy.com/ |
| CloudOpsNow.in | Cloud engineers, ops teams | Cloud operations, monitoring, reliability, automation | check website | https://cloudopsnow.in/ |
| SreSchool.com | SREs, operations teams, architects | SRE principles, SLOs, observability, incident response | check website | https://sreschool.com/ |
| AiOpsSchool.com | Ops teams exploring AIOps | AIOps concepts, monitoring automation, event correlation | check website | https://aiopsschool.com/ |
19. Top Trainers
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training content (verify offerings) | Beginners to intermediate DevOps learners | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training and coaching (verify offerings) | DevOps engineers, students | https://devopstrainer.in/ |
| devopsfreelancer.com | DevOps consulting/training resources (verify offerings) | Teams seeking external DevOps help | https://devopsfreelancer.com/ |
| devopssupport.in | DevOps support and training resources (verify offerings) | Operations teams needing practical support | https://devopssupport.in/ |
20. Top Consulting Companies
| Company | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify exact portfolio) | Platform engineering, automation, cloud adoption | Designing ROSA landing zone, CI/CD pipelines, observability setup | https://cotocus.com/ |
| DevOpsSchool.com | DevOps consulting and enablement (verify exact portfolio) | Training + implementation support | ROSA onboarding workshops, GitOps rollout, SRE practices | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting (verify exact portfolio) | DevOps transformation and operations | Kubernetes/OpenShift operational readiness, security reviews, cost optimization | https://devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before ROSA
1) Linux fundamentals: processes, networking, systemd, logs.
2) Containers: images, registries, Docker/OCI concepts.
3) Kubernetes basics: pods, deployments, services, ingress, configmaps/secrets, RBAC.
4) AWS fundamentals: IAM, VPC, EC2, security groups, load balancers, Route 53, CloudWatch.
5) Networking: CIDR/subnets, DNS, TLS, NAT gateways, VPC endpoints.
What to learn after ROSA
- Advanced OpenShift: Operators, SCCs, Routes/TLS patterns, cluster monitoring tuning
- GitOps at scale (Argo CD patterns, multi-cluster)
- Policy-as-code and governance (admission control, image policies—tooling varies)
- Observability engineering (metrics/logs/traces pipelines)
- Cost optimization for Kubernetes (requests/limits governance, chargeback/showback)
- Reliability engineering (SLOs, error budgets, chaos testing)
Job roles that use ROSA
- Platform Engineer / Internal Developer Platform Engineer
- DevOps Engineer
- Site Reliability Engineer (SRE)
- Cloud Engineer (Containers specialization)
- Solutions Architect (Containers and modernization)
- Security Engineer (container platform security)
Certification path (if available)
- Consider Red Hat OpenShift certifications (e.g., OpenShift administration/development tracks) and AWS certifications (Solutions Architect, DevOps Engineer).
- ROSA-specific certification paths may change—verify in official Red Hat and AWS training catalogs.
Project ideas for practice
- Build a multi-namespace platform with quotas + RBAC and deploy 3 microservices with Routes.
- Implement GitOps deployments to dev/stage/prod clusters.
- Configure workload IAM via STS/OIDC to access S3 securely.
- Create a cost-optimized private cluster design using VPC endpoints and minimal logging.
- Design a multi-account ROSA landing zone with centralized ingress and private DNS.
22. Glossary
- ROSA: Red Hat OpenShift Service on AWS; managed OpenShift clusters on AWS.
- OpenShift: Red Hat’s enterprise Kubernetes platform with additional features and opinionated defaults.
- Cluster: A Kubernetes/OpenShift environment consisting of control plane + worker nodes.
- Control plane: Components that expose the Kubernetes API and manage desired state (API server, etcd, controllers).
- Worker node: The machine (EC2 instance) where pods run.
- Pod: Smallest deployable unit in Kubernetes; one or more containers sharing networking/storage.
- Deployment: Kubernetes object that manages replicas of pods and rolling updates.
- Service: Stable virtual IP/DNS abstraction to reach pods.
- Route: OpenShift resource for exposing services via HTTP(S) routing.
- Operator: Kubernetes-native controller that manages the lifecycle of applications/platform components.
- STS (Security Token Service): AWS service that issues temporary credentials.
- OIDC provider: Identity provider used for federated authentication; in this context, enables mapping service accounts to IAM roles.
- RBAC: Role-Based Access Control; authorization model in Kubernetes/OpenShift.
- SCC (Security Context Constraints): OpenShift’s mechanism controlling pod security permissions (similar in spirit to Kubernetes Pod Security admission).
- VPC: Virtual Private Cloud; AWS network boundary for subnets, routing, and security groups.
- NAT gateway: Provides outbound internet access for private subnets; can be a major cost driver.
- VPC endpoint (PrivateLink): Private connectivity to AWS services without traversing the public internet/NAT.
- EBS/EFS: AWS block/file storage services commonly used via CSI drivers.
- Multi-AZ: Using multiple Availability Zones for higher availability.
23. Summary
Red Hat OpenShift Service on AWS (ROSA) is a managed OpenShift platform in AWS’s Containers ecosystem that helps teams run enterprise Kubernetes workloads with OpenShift’s developer experience and governance features—while reducing the operational burden of managing OpenShift itself.
It matters when you want standardized developer workflows, stronger platform governance, and a managed lifecycle model for OpenShift clusters running close to AWS services. Architecturally, ROSA typically places worker nodes in your AWS VPC, integrates tightly with AWS IAM using STS/OIDC patterns, and relies on AWS networking, load balancing, and storage services.
From a cost perspective, plan for both the ROSA subscription and AWS infrastructure, and watch for indirect cost drivers like NAT gateways, load balancers, data transfer, and logging volume. From a security perspective, prioritize least privilege, private networking where required, workload identity via STS, and strong RBAC/SCC practices.
Use ROSA when OpenShift is the platform you want and you value managed operations. Next step: follow the official ROSA documentation to validate region support, cluster types (classic vs hosted control planes), and the exact CLI flags for your target architecture, then automate your cluster provisioning using Infrastructure as Code once the manual lab is successful.