Category
Networking and content delivery
1. Introduction
AWS Transit Gateway is an AWS networking service that lets you connect multiple Amazon VPCs (and on-premises networks) through a central routing hub. Instead of building a large mesh of point-to-point connections, you attach networks to the transit gateway and control traffic flow using transit gateway route tables.
In simple terms: AWS Transit Gateway is a cloud router that AWS operates for you as a managed service. You connect VPCs, VPNs, and AWS Direct Connect to it, then add routes so those networks can communicate in a controlled, scalable way.
Technically, AWS Transit Gateway is a regional, highly available routing service that supports multiple attachment types (VPC, Site-to-Site VPN, AWS Direct Connect via Direct Connect Gateway association, transit gateway peering, and Transit Gateway Connect). It uses route tables, route propagation, and attachment associations to determine how packets flow between connected networks. It integrates with AWS Organizations / AWS Resource Access Manager (RAM) for multi-account architectures, with logging via transit gateway flow logs, and with monitoring via Amazon CloudWatch.
The problem it solves is network sprawl: as the number of VPCs, accounts, and on-premises connections grows, managing connectivity with VPC peering and individual VPNs becomes complex, expensive, and operationally risky. AWS Transit Gateway centralizes routing and scales better for hub-and-spoke architectures.
2. What is AWS Transit Gateway?
Official purpose (scope-aligned summary): AWS Transit Gateway is designed to simplify and scale network connectivity by acting as a hub that interconnects VPCs and on-premises networks. It reduces the number of point-to-point connections required and provides centralized routing control.
Core capabilities
- Central hub routing for many VPCs and networks.
- Multiple attachment types:
- VPC attachments
- Site-to-Site VPN attachments
- AWS Direct Connect connectivity (via Direct Connect Gateway association)
- Transit gateway peering (including inter-Region peering)
- Transit Gateway Connect (GRE/BGP for SD-WAN and appliance connectivity)
- Route tables and segmentation to control which attachments can communicate.
- Multi-account sharing using AWS Resource Access Manager (RAM) and AWS Organizations.
- Observability with transit gateway flow logs and CloudWatch metrics.
Major components
- Transit gateway (TGW): The central routing hub resource.
- Attachments: Connections between TGW and a VPC/VPN/peering/Connect/DXGW association.
- Transit gateway route tables: Determine where traffic goes; support association and propagation concepts.
- Association: Which route table an attachment uses for route lookups.
- Propagation: Automatic learning/installation of routes from an attachment into a route table (where supported).
- Route entries: Static or propagated routes (longest prefix match applies).
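To make the longest-prefix-match rule concrete, here is a minimal shell sketch that picks the most specific matching route from a small list. The route list, the att-* target names, and the helper functions are made up for illustration; the sketch models only the selection rule, not how AWS implements route lookup.

```bash
# Illustrative only: routes and attachment names below are hypothetical.
# Function bodies use ( ... ) so their variables stay local to a subshell.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Print the 32-bit netmask for a prefix length (0-32).
prefix_mask() (
  if [ "$1" -eq 0 ]; then echo 0
  else echo $(( (0xFFFFFFFF << (32 - $1)) & 0xFFFFFFFF ))
  fi
)

# Succeed if address $1 falls inside CIDR block $2.
in_cidr() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  mask=$(prefix_mask "${2#*/}")
  [ $(( ip & mask )) -eq $(( net & mask )) ]
)

# Print the target of the most specific "CIDR=target" route matching $1.
lookup() (
  dest=$1; shift
  best_len=-1; best_target=no-route
  for route in "$@"; do
    cidr=${route%%=*}; len=${cidr#*/}
    if in_cidr "$dest" "$cidr" && [ "$len" -gt "$best_len" ]; then
      best_len=$len; best_target=${route#*=}
    fi
  done
  echo "$best_target"
)

ROUTES="10.0.0.0/8=att-onprem 10.1.0.0/16=att-vpc-b 0.0.0.0/0=att-egress"
lookup 10.1.2.3 $ROUTES   # the /16 wins over the /8 and the /0: att-vpc-b
lookup 10.9.9.9 $ROUTES   # only the /8 and the /0 match: att-onprem
lookup 8.8.8.8 $ROUTES    # only the default route matches: att-egress
```

The same rule explains why a narrow static route can override a broader propagated one for part of an address range.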
Service type
- Managed networking service (control plane + data plane operated by AWS).
- You configure connectivity and routing; AWS runs the underlying infrastructure.
Regional / global / scope characteristics
- Regional service: A transit gateway is created in a specific AWS Region.
- Cross-Region connectivity: Supported via transit gateway peering (inter-Region peering attachments).
- Multi-account: A TGW can be shared across accounts using AWS RAM (common in hub-and-spoke organizations).
- Not AZ-scoped: It’s not “deployed into” a single Availability Zone, but VPC attachments are AZ-aware because you select subnets per AZ for the attachment.
How it fits into the AWS ecosystem
AWS Transit Gateway is typically the network backbone for multi-VPC and hybrid connectivity. It commonly integrates with:
- Amazon VPC (VPC attachments, routing)
- AWS Site-to-Site VPN (VPN attachments)
- AWS Direct Connect (via Direct Connect Gateway association)
- AWS Organizations & AWS RAM (sharing and governance)
- AWS Network Manager (visibility and management for global networks)
- Amazon CloudWatch (metrics) and Transit Gateway Flow Logs
- AWS CloudTrail (API auditing)
- AWS Firewall Manager / AWS Network Firewall (often deployed in inspection VPC patterns; routing controlled via TGW)
Service name status: “AWS Transit Gateway” is current and active. (Verify the latest feature set and quotas in the official docs for your Region.)
Official documentation: https://docs.aws.amazon.com/transitgateway/latest/ug/what-is-transit-gateway.html
3. Why use AWS Transit Gateway?
Business reasons
- Faster time to onboard new VPCs/accounts: Attach a new VPC to the hub instead of creating many peerings.
- Lower operational overhead: Centralizes routing and network changes.
- Supports organizational scale: Works well with AWS Organizations multi-account models.
Technical reasons
- Hub-and-spoke routing at scale: Avoids the complexity of a full mesh design.
- Segmentation with route tables: Build separate network domains (prod/dev/shared services/partner).
- Hybrid connectivity patterns: Centralize Site-to-Site VPN and Direct Connect connectivity.
- Inter-Region architectures: Use transit gateway peering for cross-Region traffic patterns (verify constraints for your design in official docs).
Operational reasons
- Repeatable patterns: Standard TGW + shared services VPC + inspection VPC + workload VPCs.
- Simplified troubleshooting: Central place to view attachments and routes (still requires good tooling and conventions).
- Integration with flow logs: Better visibility into traffic patterns when enabled.
Security/compliance reasons
- Central inspection and egress control patterns: Steer traffic through firewall/inspection VPCs.
- Account separation: Central networking account can host the TGW, shared to workload accounts via RAM.
- Auditable changes: Use CloudTrail to track TGW configuration changes and IAM for least privilege.
Scalability/performance reasons
- Fewer connections to manage: A full mesh of n VPCs needs n(n-1)/2 peering connections, so peering grows quadratically; a TGW hub needs one attachment per VPC.
- High availability: Designed to be highly available within a Region; attachments can be built across multiple AZs (for VPC attachments you select subnets in multiple AZs).
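As a rough illustration of the scaling difference, the sketch below compares the number of peering connections a full mesh needs, n(n-1)/2, with the one-attachment-per-VPC count of a hub. The function names are illustrative only.

```bash
# Peering connections needed for a full mesh of n VPCs: n*(n-1)/2.
mesh_links() { echo $(( $1 * ($1 - 1) / 2 )); }

# Attachments needed in a hub-and-spoke design: one per VPC.
hub_links() { echo "$1"; }

for n in 5 10 50 100; do
  printf '%3d VPCs: full mesh = %4d peerings, hub = %3d attachments\n' \
    "$n" "$(mesh_links "$n")" "$(hub_links "$n")"
done
```

At 10 VPCs the mesh already needs 45 peerings against 10 attachments, and at 100 VPCs it needs 4950.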
When teams should choose it
Choose AWS Transit Gateway when you have:
- Many VPCs across one or more accounts
- Multiple on-premises sites
- A need for centralized security inspection or shared services routing
- A desire for consistent network segmentation patterns at scale
When teams should not choose it
AWS Transit Gateway may be unnecessary or suboptimal when:
- You only need to connect a small number of VPCs with simple, static requirements (VPC peering might be simpler).
- You need service-level exposure rather than network-level connectivity (AWS PrivateLink is usually better for exposing specific services without broad routing).
- You cannot accept the ongoing hourly + data processing costs of a TGW (pricing is usage-based).
- Your application architecture can avoid L3 connectivity by using managed integration (API Gateway, event buses, managed services) instead of network routing.
4. Where is AWS Transit Gateway used?
Industries
- Financial services (segmented environments, strict controls, hybrid connectivity)
- Healthcare (compliance-driven segmentation)
- Retail and e-commerce (multi-account, multi-environment networks)
- SaaS providers (shared services, tenant segmentation patterns)
- Manufacturing and logistics (site connectivity, hybrid and SD-WAN)
- Media and gaming (multi-Region backends, centralized services)
Team types
- Platform engineering teams operating shared network foundations
- Cloud networking teams building landing zones
- Security teams implementing centralized inspection
- SRE/operations teams standardizing connectivity and troubleshooting
Workloads
- Shared services (Active Directory, DNS resolvers, CI/CD, artifact repositories)
- Hybrid apps with on-prem databases or services
- Microservices across multiple VPCs needing routed connectivity
- Centralized egress and inbound inspection (via firewall appliances)
Architectures
- Hub-and-spoke multi-VPC design (most common)
- Multi-account landing zone with a central “network account”
- Hybrid hub connecting multiple on-prem sites to multiple VPCs
- Inter-Region connected hubs (with transit gateway peering)
Real-world deployment contexts
- Production: Commonly used as the central backbone; requires strong governance, route control, and logging.
- Dev/test: Useful when dev/test spans many accounts/VPCs. Some teams avoid TGW in dev to reduce cost; others keep parity with production to avoid surprises.
5. Top Use Cases and Scenarios
Below are realistic scenarios where AWS Transit Gateway is a strong fit.
1) Multi-account hub-and-spoke VPC connectivity
- Problem: Hundreds of VPC peerings are hard to manage across many accounts.
- Why AWS Transit Gateway fits: Central hub with attach-and-route model; share TGW via AWS RAM.
- Example: A platform team shares a TGW from a networking account to 50 workload accounts; each account attaches its VPCs to the hub.
2) Centralized egress (internet) with inspection
- Problem: Each VPC managing its own NAT gateways, routes, and security tools leads to inconsistent controls.
- Why it fits: You can steer traffic to an egress VPC (often containing NAT gateways and inspection).
- Example: All workload VPCs route 0.0.0.0/0 to an inspection VPC through TGW (verify the exact routing design and return path requirements in official docs).
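One piece of such a design is a static default route in the transit gateway route table that the workload attachments use. A minimal sketch, assuming a TGW route table and an attachment for the inspection VPC already exist; the function name and the IDs in the usage comment are hypothetical, and the CLI's default Region is used.

```bash
# Add a static 0.0.0.0/0 route to a transit gateway route table.
#   $1 = transit gateway route table ID
#   $2 = attachment ID of the inspection/egress VPC
add_default_route() {
  aws ec2 create-transit-gateway-route \
    --transit-gateway-route-table-id "$1" \
    --destination-cidr-block 0.0.0.0/0 \
    --transit-gateway-attachment-id "$2"
}

# Usage (placeholder IDs, not real resources):
# add_default_route tgw-rtb-0123456789abcdef0 tgw-attach-0123456789abcdef0
```

This covers only the TGW side. The spoke VPC route tables still need their own 0.0.0.0/0 route pointing to the transit gateway, and the return path has to be routed as well.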
3) Centralized inbound inspection (north-south) before apps
- Problem: Inbound traffic needs consistent filtering and logging.
- Why it fits: Create an ingress VPC with firewalls/IDS; then route to app VPCs via TGW.
- Example: Internet-facing ALBs live in a dedicated ingress VPC; traffic to app tiers goes over TGW.
4) Shared services VPC (DNS, AD, patching, CI/CD)
- Problem: Every VPC recreates shared infrastructure and connectivity rules.
- Why it fits: TGW simplifies routing from workload VPCs to a shared services VPC.
- Example: A central Microsoft AD and Route 53 Resolver endpoints live in one VPC; all others route to it via TGW.
5) Hybrid connectivity hub (multiple sites to many VPCs)
- Problem: On-premises networks must reach multiple AWS VPCs without building a separate VPN to each VPC.
- Why it fits: Site-to-Site VPN attachment(s) to TGW, then routes to many VPC attachments.
- Example: Two data centers connect with VPN to TGW; dozens of VPCs are reachable with controlled routing.
6) Direct Connect hub for private hybrid routing
- Problem: You need predictable private connectivity from on-premises to multiple VPCs.
- Why it fits: Associate TGW with a Direct Connect Gateway and route to multiple VPCs.
- Example: A DX connection terminates into a DXGW; TGW association provides controlled access to multiple application VPCs.
7) Inter-Region connectivity for distributed systems
- Problem: Applications span Regions and need private routed connectivity.
- Why it fits: Transit gateway peering supports inter-Region TGW-to-TGW routing (verify supported combinations and constraints).
- Example: Active-active services run in two Regions; service-to-service traffic uses TGW peering.
8) SD-WAN integration using Transit Gateway Connect
- Problem: Branch connectivity uses SD-WAN appliances and needs dynamic routing into AWS.
- Why it fits: TGW Connect supports GRE/BGP-based connectivity for SD-WAN appliances.
- Example: A virtual SD-WAN appliance connects to TGW using GRE tunnels; BGP advertises branch prefixes.
9) Network segmentation by environment (prod/dev/shared)
- Problem: Hard to enforce separation when networks are highly connected.
- Why it fits: Multiple TGW route tables enable segmentation; attachments associate to different route tables.
- Example: Prod VPCs associate to a “prod” TGW route table that only routes to shared services and on-prem; dev VPCs are isolated.
10) Centralized traffic engineering through inspection appliances (appliance mode)
- Problem: Stateful inspection appliances require symmetric routing; asymmetry breaks flows.
- Why it fits: TGW “appliance mode” (for supported scenarios) helps maintain symmetry when steering traffic through appliances (verify details in docs).
- Example: Traffic from spoke VPCs is routed through a firewall VPC; return traffic follows the same path.
11) Mergers and acquisitions (M&A) network integration
- Problem: Two organizations need temporary connectivity while networks and identities integrate.
- Why it fits: TGW can connect multiple domains and restrict routes as needed.
- Example: A partner VPC set is attached to TGW with a limited route table that only exposes a few services.
12) Controlled partner connectivity (limited blast radius)
- Problem: Partners need access to a few internal services without full network access.
- Why it fits: Attach partner networks via VPN to TGW and selectively route to only required CIDRs.
- Example: Vendor VPN attachment routes only to a reporting subnet; no other routes are present.
6. Core Features
This section focuses on widely used, current capabilities. Always verify Region availability and feature support in official docs.
6.1 Transit gateway attachments (VPC, VPN, peering, Connect, DXGW association)
- What it does: Connects different networks to the transit gateway.
- Why it matters: Attachments are the fundamental building blocks of TGW connectivity.
- Practical benefit: Add or remove networks with minimal rework.
- Caveats: Each attachment has cost implications (hourly and/or data processing). Attachment quotas apply (see Service Quotas).
6.2 Transit gateway route tables (association + propagation)
- What it does: Controls routing decisions for traffic entering TGW from an attachment.
- Why it matters: Enables segmentation and policy-driven connectivity.
- Practical benefit: Isolate environments (prod/dev), build shared services routing, restrict partner access.
- Caveats: Misconfigurations can blackhole traffic or expose networks unintentionally. Treat route tables as critical infrastructure.
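Segmentation usually starts with a dedicated route table per domain and an explicit association per attachment. The sketch below wraps the two CLI calls involved; the function names are illustrative, the IDs you pass are your own, and the CLI's default Region is assumed.

```bash
# Create a TGW route table and print its ID.
#   $1 = transit gateway ID, $2 = value for the Name tag
create_tgw_route_table() {
  aws ec2 create-transit-gateway-route-table \
    --transit-gateway-id "$1" \
    --tag-specifications "ResourceType=transit-gateway-route-table,Tags=[{Key=Name,Value=$2}]" \
    --query 'TransitGatewayRouteTable.TransitGatewayRouteTableId' --output text
}

# Make an attachment use a given route table for its route lookups.
#   $1 = transit gateway route table ID, $2 = attachment ID
associate_attachment() {
  aws ec2 associate-transit-gateway-route-table \
    --transit-gateway-route-table-id "$1" \
    --transit-gateway-attachment-id "$2"
}

# Usage (variables are placeholders):
# PROD_RTB=$(create_tgw_route_table "$TGW_ID" prod)
# associate_attachment "$PROD_RTB" "$ATT_A_ID"
```

An attachment can be associated with only one route table at a time, so an attachment that is already associated with the default route table has to be disassociated first.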
6.3 Route propagation (where supported)
- What it does: Learns routes from attachments and installs them into route tables automatically.
- Why it matters: Reduces manual route management at scale.
- Practical benefit: VPN and VPC attachments can populate routes automatically depending on configuration.
- Caveats: Overuse can create overly permissive routing. Many enterprises prefer controlled propagation + explicit routes.
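For controlled propagation, you can enable it per attachment and per route table, then review what is propagating where. A minimal sketch with illustrative function names, assuming the CLI's default Region:

```bash
# Let routes from one attachment propagate into one TGW route table.
#   $1 = transit gateway route table ID, $2 = attachment ID
enable_propagation() {
  aws ec2 enable-transit-gateway-route-table-propagation \
    --transit-gateway-route-table-id "$1" \
    --transit-gateway-attachment-id "$2"
}

# List the attachments currently propagating into a route table.
#   $1 = transit gateway route table ID
list_propagations() {
  aws ec2 get-transit-gateway-route-table-propagations \
    --transit-gateway-route-table-id "$1" \
    --query 'TransitGatewayRouteTablePropagations[*].{Attachment:TransitGatewayAttachmentId,Resource:ResourceId,State:State}' \
    --output table
}

# Usage (variables are placeholders):
# enable_propagation "$SHARED_RTB" "$ATT_A_ID"
# list_propagations "$SHARED_RTB"
```

Reviewing this list periodically is one way to catch propagation that is broader than intended.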
6.4 Inter-Region transit gateway peering
- What it does: Connects transit gateways in different Regions via peering attachments.
- Why it matters: Enables private, routed cross-Region networking under your control.
- Practical benefit: Multi-Region architectures with predictable routing patterns.
- Caveats: Costs differ from in-Region routing; data transfer pricing and architecture constraints apply (verify in docs).
6.5 Multi-account sharing with AWS Resource Access Manager (RAM)
- What it does: Allows a central account to share a TGW with other AWS accounts.
- Why it matters: Enables “network account” patterns and reduces duplicated TGWs.
- Practical benefit: Central governance; consistent security and routing controls.
- Caveats: Requires careful IAM and organizational controls; ensure tagging and ownership are clear.
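Sharing is done by creating a RAM resource share that contains the transit gateway's ARN. The sketch below is an assumption-laden outline: the function names, share name, account ID, and TGW ID are placeholders, and the CLI's default Region is used.

```bash
# Build the ARN of a transit gateway.
#   $1 = Region, $2 = owner account ID, $3 = transit gateway ID
tgw_arn() {
  echo "arn:aws:ec2:$1:$2:transit-gateway/$3"
}

# Share a transit gateway with a principal through AWS RAM.
#   $1 = share name, $2 = transit gateway ARN
#   $3 = principal (an account ID, or an Organizations OU/organization ARN)
share_tgw() {
  aws ram create-resource-share \
    --name "$1" \
    --resource-arns "$2" \
    --principals "$3"
}

# Usage (placeholder values, not real resources):
# share_tgw network-hub "$(tgw_arn us-east-1 111122223333 tgw-0123456789abcdef0)" 444455556666
```

Attachments created from the other account may need to be accepted by the TGW owner, depending on the gateway's auto-accept setting; check the sharing documentation for your setup.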
6.6 Transit Gateway Connect (GRE/BGP)
- What it does: Provides a way to connect appliances (including SD-WAN) using GRE tunnels and BGP.
- Why it matters: Fits enterprise WAN patterns and dynamic routing into AWS.
- Practical benefit: Easier integration with third-party routing appliances than building many VPNs.
- Caveats: Requires routing expertise (BGP, ASN planning). Ensure failover and health monitoring are designed.
6.7 Multicast (for supported scenarios)
- What it does: Enables multicast traffic between VPCs via TGW multicast domains.
- Why it matters: Some enterprise applications require multicast (e.g., certain market data or discovery protocols).
- Practical benefit: Avoids complex overlay solutions for multicast.
- Caveats: Not all environments need it; multicast adds design and operational complexity.
6.8 Appliance mode (traffic symmetry support)
- What it does: Helps maintain symmetric routing when traffic is steered to network appliances.
- Why it matters: Stateful firewalls/IDS/IPS often require the return path to match the forward path.
- Practical benefit: More reliable centralized inspection designs.
- Caveats: Verify exact behavior, constraints, and supported attachment types in official docs before relying on it.
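Appliance mode is an option on a VPC attachment, typically set on the attachment of the inspection VPC. A minimal sketch, with an illustrative function name and the CLI's default Region assumed:

```bash
# Enable appliance mode on a transit gateway VPC attachment.
#   $1 = VPC attachment ID (usually the inspection VPC's attachment)
enable_appliance_mode() {
  aws ec2 modify-transit-gateway-vpc-attachment \
    --transit-gateway-attachment-id "$1" \
    --options ApplianceModeSupport=enable
}

# Usage (placeholder variable):
# enable_appliance_mode "$INSPECTION_ATT_ID"
```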
6.9 Transit Gateway Flow Logs
- What it does: Captures flow metadata for traffic traversing the transit gateway.
- Why it matters: Provides visibility for troubleshooting, security analytics, and auditing.
- Practical benefit: Identify top talkers, unexpected routes, or denied patterns (in conjunction with other logs).
- Caveats: Logging has cost (log ingestion/storage). Make sure retention and destinations (CloudWatch Logs / S3) are planned.
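As a starting point, flow logs for a transit gateway can be created with the same create-flow-logs call used for VPC flow logs. The sketch below sends them to S3; the function name and bucket name are placeholders, the bucket is assumed to exist with a policy that allows log delivery, and the 60-second aggregation interval reflects my understanding of the transit gateway requirement, so confirm it against the current CLI reference.

```bash
# Create flow logs for a transit gateway, delivered to an S3 bucket.
#   $1 = transit gateway ID, $2 = S3 bucket name (must already exist)
enable_tgw_flow_logs() {
  aws ec2 create-flow-logs \
    --resource-type TransitGateway \
    --resource-ids "$1" \
    --log-destination-type s3 \
    --log-destination "arn:aws:s3:::$2" \
    --max-aggregation-interval 60
}

# Usage (placeholder values):
# enable_tgw_flow_logs "$TGW_ID" my-tgw-flow-logs-bucket
```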
6.10 IPv6 support (where applicable)
- What it does: Supports IPv6 routing for attached networks (verify current IPv6 behavior per attachment type).
- Why it matters: Dual-stack and IPv6-only designs are increasingly common.
- Practical benefit: Consistent routing and segmentation for IPv6.
- Caveats: Ensure all attached components (instances, appliances, on-prem) support IPv6 and that route tables are correct.
7. Architecture and How It Works
High-level service architecture
At a high level, AWS Transit Gateway acts like a central router:
1. You create a transit gateway in a Region.
2. You create attachments from VPCs (and/or VPN, peering, Connect).
3. Each attachment is associated with a transit gateway route table (for route lookups).
4. Routes in the transit gateway route table tell TGW which attachment to forward traffic to.
5. You update VPC route tables so that traffic destined for remote CIDRs uses the TGW attachment.
Data plane flow (typical VPC-to-VPC)
- An EC2 instance in VPC-A sends a packet to an IP in VPC-B.
- VPC-A subnet route table has a route for VPC-B’s CIDR pointing to the transit gateway.
- Traffic enters the TGW through the VPC-A attachment.
- TGW looks up the destination in the route table associated with the VPC-A attachment and forwards traffic to the VPC-B attachment.
- VPC-B route table returns traffic back to TGW for VPC-A CIDR (symmetric routing required for many stateful scenarios).
Control plane flow
- You manage TGW resources via:
- AWS Management Console
- AWS CLI / SDKs
- Infrastructure as Code tools (AWS CloudFormation, Terraform, etc.—verify your chosen tool’s resource coverage)
- AWS CloudTrail records API activity for audit.
Integrations with related services
- Amazon VPC: Attachments, subnet selection, VPC route tables.
- Site-to-Site VPN: VPN attachments to TGW for hybrid connectivity.
- AWS Direct Connect: DXGW association to TGW for private connectivity.
- AWS RAM: Share TGW across accounts.
- AWS Network Manager: Network visibility and centralized management (especially in global networks).
- CloudWatch: Metrics; Flow Logs destinations may include CloudWatch Logs (verify destinations supported in your Region).
- CloudTrail: Audit for TGW changes.
- Security tooling: Often combined with AWS Network Firewall or third-party appliances using inspection VPC patterns.
Dependency services (common)
- VPC subnets and route tables
- EC2 (for test instances, or appliances)
- IAM roles/policies
- Systems Manager (optional, for access to test instances without SSH)
Security/authentication model
- Control plane access is governed by IAM policies.
- Resource sharing uses AWS RAM and can be constrained via Organizations.
- Data plane is controlled by:
- Transit gateway route tables
- VPC route tables
- Security groups and NACLs on resources inside VPCs
- Firewall/appliance policies if you steer through inspection VPCs
Networking model considerations
- AZ design matters for cost and availability: For VPC attachments you pick subnets in one or more AZs. If workloads exist in multiple AZs, attach subnets in each to avoid cross-AZ traffic inside the VPC (which can incur data transfer costs).
- CIDR planning is critical: Avoid overlapping address spaces where possible; it complicates routing and segmentation.
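A quick pre-attachment check for overlapping address space can be scripted. The sketch below is a plain shell illustration (helper names are made up): two IPv4 CIDR blocks overlap exactly when they agree on all bits covered by the shorter of the two prefixes.

```bash
# Function bodies use ( ... ) so their variables stay local to a subshell.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if the two CIDR blocks $1 and $2 share any addresses.
cidrs_overlap() (
  len_a=${1#*/}; len_b=${2#*/}
  # Compare both networks under the shorter (less specific) prefix.
  if [ "$len_a" -lt "$len_b" ]; then short=$len_a; else short=$len_b; fi
  if [ "$short" -eq 0 ]; then mask=0
  else mask=$(( (0xFFFFFFFF << (32 - short)) & 0xFFFFFFFF ))
  fi
  a=$(ip_to_int "${1%/*}"); b=$(ip_to_int "${2%/*}")
  [ $(( a & mask )) -eq $(( b & mask )) ]
)

for pair in "10.0.0.0/16 10.1.0.0/16" "10.0.0.0/16 10.0.128.0/17"; do
  set -- $pair
  if cidrs_overlap "$1" "$2"; then
    echo "$1 and $2 overlap"
  else
    echo "$1 and $2 are disjoint"
  fi
done
```

The first pair is disjoint and the second overlaps, because 10.0.128.0/17 sits inside 10.0.0.0/16.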
Monitoring/logging/governance
- Enable Flow Logs for TGW where appropriate for operational visibility.
- Use CloudTrail + IAM Access Analyzer (as applicable) to govern changes.
- Use consistent tags, naming conventions, and change management due to TGW’s blast radius.
Simple architecture diagram (Mermaid)
flowchart LR
A[VPC A\n10.0.0.0/16] -->|VPC attachment| TGW[(AWS Transit Gateway)]
B[VPC B\n10.1.0.0/16] -->|VPC attachment| TGW
TGW -->|Routes| A
TGW -->|Routes| B
Production-style architecture diagram (Mermaid)
flowchart TB
subgraph Org[AWS Organization]
subgraph NetAcct[Networking Account]
TGW[(Transit Gateway)]
RT1[Route Table: Prod]
RT2[Route Table: Shared]
RT3[Route Table: Partner]
TGW --- RT1
TGW --- RT2
TGW --- RT3
end
subgraph SharedAcct[Shared Services Account]
VPCSS[Shared Services VPC\nDNS/AD/CI]
end
subgraph SecAcct[Security Account]
VPCINSP[Inspection VPC\nNetwork Firewall / Appliances]
LOGS[CloudWatch / S3 Logs]
end
subgraph ProdAcct[Production Accounts]
VPCP1[Prod VPC 1]
VPCP2[Prod VPC 2]
end
end
OnPrem[On-Prem Data Center] --- VPN[Site-to-Site VPN]
VPN --> TGW
VPCSS -->|Attachment| TGW
VPCINSP -->|Attachment| TGW
VPCP1 -->|Attachment| TGW
VPCP2 -->|Attachment| TGW
TGW --> LOGS
8. Prerequisites
Account and billing
- An AWS account with billing enabled.
- AWS Transit Gateway is not covered by the AWS Free Tier; expect charges while resources exist.
Permissions (IAM)
You need permissions to create and manage:
- EC2 networking resources (VPCs, subnets, route tables, internet gateways)
- AWS Transit Gateway and related attachments
- EC2 instances (for testing)
- Optional: Systems Manager (for Session Manager access)
Common managed policies that might help in a lab (not least-privilege):
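- AmazonVPCFullAccess
- AmazonEC2FullAccess
- AWSTransitGatewayFullAccess (verify in IAM that this policy exists in your account; transit gateway API actions use the ec2: prefix, so AmazonEC2FullAccess already grants them)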
For production, create least-privilege IAM policies tailored to TGW actions.
Tools
Choose one:
- AWS Management Console (browser)
- AWS CLI v2 (recommended for repeatability): https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
If using CLI:
- Configure credentials: aws configure
- Ensure your default Region is set or pass --region.
Region availability
- AWS Transit Gateway is available in many AWS Regions, but feature availability can vary (for example, certain advanced capabilities may have Region constraints). Verify Region support in official docs and the console.
Quotas/limits
AWS Transit Gateway has quotas such as:
- Number of attachments per TGW
- Number of route tables per TGW
- Routes per route table
- Multicast-related limits (if used)
Check Service Quotas for “AWS Transit Gateway” in your Region: https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html
Prerequisite services/resources for this lab
- 2 VPCs with non-overlapping IPv4 CIDRs
- Subnets and route tables
- 2 EC2 instances (one in each VPC) for connectivity testing
9. Pricing / Cost
AWS Transit Gateway pricing is usage-based and varies by Region. Do not assume one Region’s pricing applies to another.
Official pricing page (start here): https://aws.amazon.com/transit-gateway/pricing/
Also use the AWS Pricing Calculator: https://calculator.aws/
Pricing dimensions (typical)
Common pricing dimensions include (verify current pricing categories on the official page):
- Attachment hourly charges: billed per attachment-hour for each attachment (rates per attachment type, such as VPC, VPN, peering, and Connect, are listed on the pricing page). The hourly cost of a transit gateway therefore scales with the number of attachments rather than being a flat fee per gateway.
- Data processing charges (per GB of data processed by the transit gateway).
- Inter-Region peering data transfer and processing (cross-Region traffic typically introduces additional data transfer costs).
Free tier
- No general free tier for AWS Transit Gateway usage. Some accounts may have promotional credits; verify your account.
Primary cost drivers
- Number of attachments (and how long they run).
- Amount of data processed through TGW.
- Cross-AZ data transfer inside VPCs if your VPC attachments don’t cover all AZs used by workloads.
- Cross-Region data transfer if using inter-Region TGW peering.
- Logging costs (CloudWatch Logs ingestion, S3 storage, analytics tools).
Hidden/indirect costs to plan for
- EC2 instances for appliances (firewalls, routers) and test clients.
- NAT Gateways in centralized egress patterns (significant hourly + per-GB costs).
- VPC endpoints (if you design private management access, e.g., SSM endpoints).
- Firewall services (AWS Network Firewall charges, third-party marketplace appliances).
- Data transfer out to internet or to on-premises (Direct Connect/VPN data transfer pricing differs).
Network/data transfer implications
- Traffic routed between VPCs via TGW is subject to:
- TGW data processing charges, and
- Potential inter-AZ transfer charges within a VPC if traffic must traverse AZ boundaries to reach the TGW attachment subnet.
- Cross-Region designs often combine:
- TGW processing charges, plus
- Inter-Region data transfer charges.
Cost optimization strategies
- Minimize unnecessary attachments: Don’t attach every dev sandbox to the production TGW if not needed.
- Use segmentation to reduce “chatty” east-west traffic across TGW when services can stay within a VPC.
- Attach subnets in each AZ used by workloads to reduce cross-AZ data transfer inside the VPC.
- Be deliberate with flow logs: Start with targeted logging and appropriate retention; export to S3 for cost-effective long-term storage if needed.
- Right-size inspection: Centralized inspection patterns can concentrate costs; validate throughput needs and appliance sizing early.
Example low-cost starter estimate (conceptual)
A minimal lab typically includes:
- 1 transit gateway running for a few hours
- 2 VPC attachments running for a few hours
- Minimal data (simple pings/tests)
Even with low data volume, hourly charges will apply. Use the AWS Pricing Calculator to estimate based on your Region and expected duration.
Example production cost considerations
In production, cost is dominated by:
- Many attachments across accounts
- Significant east-west traffic volumes (data processing)
- Cross-Region traffic
- Centralized inspection and egress design (NAT/firewall + data)
A best practice is to build a monthly unit-cost model:
- Cost per attached VPC per month
- Cost per GB routed via TGW
- Cost per GB cross-Region
Then allocate or charge back costs to accounts using tags and cost allocation reports.
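The unit-cost model above can be sketched as a small calculation. The rates in the example call are placeholders chosen for easy arithmetic, not AWS prices; substitute the figures for your Region from the pricing page. The function name is illustrative, and cross-Region transfer is left out for brevity.

```bash
# Estimate a monthly TGW bill from attachment-hours and data processed.
#   $1 = number of attachments      $2 = hours in the month
#   $3 = rate per attachment-hour   $4 = GB processed in the month
#   $5 = rate per GB processed
tgw_monthly_cost() {
  awk -v n="$1" -v h="$2" -v ar="$3" -v gb="$4" -v gr="$5" \
    'BEGIN { printf "%.2f\n", n * h * ar + gb * gr }'
}

# Placeholder rates (NOT real prices): 0.10 per attachment-hour, 0.01 per GB.
# 10 attachments for 730 hours plus 5000 GB processed:
tgw_monthly_cost 10 730 0.10 5000 0.01   # prints 780.00
```

Dividing the attachment portion by the number of attached VPCs gives the per-VPC figure for chargeback.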
10. Step-by-Step Hands-On Tutorial
Objective
Create an AWS Transit Gateway in one Region, attach two VPCs, configure routes, and verify private connectivity between EC2 instances across VPCs.
Lab Overview
You will build:
- VPC-A: 10.0.0.0/16 with one public subnet and one EC2 instance
- VPC-B: 10.1.0.0/16 with one public subnet and one EC2 instance
- One AWS Transit Gateway with two VPC attachments
- VPC route table entries pointing to the transit gateway
- A security group rule allowing ICMP between the instances (for testing)
Expected final result: You can ping the private IP of the EC2 instance in VPC-B from the EC2 instance in VPC-A (and vice versa) over AWS Transit Gateway.
Cost note: This lab incurs hourly charges for the two TGW attachments (and for the EC2 instances) for as long as they exist. Clean up immediately after validation.
The steps below use the AWS CLI for repeatability. You can do the same in the console, but CLI makes verification and cleanup easier.
Set variables (choose your Region)
export AWS_REGION="us-east-1" # change as needed
export PROJECT="tgw-lab"
Step 1: Create two VPCs and subnets
What you’ll do: Create VPC-A and VPC-B with one subnet each.
# Create VPC A
VPC_A_ID=$(aws ec2 create-vpc \
--region "$AWS_REGION" \
--cidr-block 10.0.0.0/16 \
--tag-specifications "ResourceType=vpc,Tags=[{Key=Name,Value=${PROJECT}-vpc-a}]" \
--query 'Vpc.VpcId' --output text)
aws ec2 modify-vpc-attribute --region "$AWS_REGION" --vpc-id "$VPC_A_ID" --enable-dns-support "{\"Value\":true}"
aws ec2 modify-vpc-attribute --region "$AWS_REGION" --vpc-id "$VPC_A_ID" --enable-dns-hostnames "{\"Value\":true}"
# Create VPC B
VPC_B_ID=$(aws ec2 create-vpc \
--region "$AWS_REGION" \
--cidr-block 10.1.0.0/16 \
--tag-specifications "ResourceType=vpc,Tags=[{Key=Name,Value=${PROJECT}-vpc-b}]" \
--query 'Vpc.VpcId' --output text)
aws ec2 modify-vpc-attribute --region "$AWS_REGION" --vpc-id "$VPC_B_ID" --enable-dns-support "{\"Value\":true}"
aws ec2 modify-vpc-attribute --region "$AWS_REGION" --vpc-id "$VPC_B_ID" --enable-dns-hostnames "{\"Value\":true}"
Pick one Availability Zone to keep the lab small:
AZ1=$(aws ec2 describe-availability-zones --region "$AWS_REGION" --query 'AvailabilityZones[0].ZoneName' --output text)
echo "Using AZ: $AZ1"
Create one subnet in each VPC:
SUBNET_A_ID=$(aws ec2 create-subnet \
--region "$AWS_REGION" \
--vpc-id "$VPC_A_ID" \
--availability-zone "$AZ1" \
--cidr-block 10.0.1.0/24 \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=${PROJECT}-subnet-a}]" \
--query 'Subnet.SubnetId' --output text)
SUBNET_B_ID=$(aws ec2 create-subnet \
--region "$AWS_REGION" \
--vpc-id "$VPC_B_ID" \
--availability-zone "$AZ1" \
--cidr-block 10.1.1.0/24 \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=${PROJECT}-subnet-b}]" \
--query 'Subnet.SubnetId' --output text)
Expected outcome: Two VPCs exist with one subnet each.
Verification:
aws ec2 describe-vpcs --region "$AWS_REGION" --vpc-ids "$VPC_A_ID" "$VPC_B_ID" \
--query 'Vpcs[*].{VpcId:VpcId,Cidr:CidrBlock,Name:Tags[?Key==`Name`]|[0].Value}'
Step 2: Add internet gateways (for easy instance access) and public routing
What you’ll do: Create and attach an internet gateway to each VPC and add a default route.
Create and attach IGWs:
IGW_A_ID=$(aws ec2 create-internet-gateway --region "$AWS_REGION" \
--tag-specifications "ResourceType=internet-gateway,Tags=[{Key=Name,Value=${PROJECT}-igw-a}]" \
--query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --region "$AWS_REGION" --internet-gateway-id "$IGW_A_ID" --vpc-id "$VPC_A_ID"
IGW_B_ID=$(aws ec2 create-internet-gateway --region "$AWS_REGION" \
--tag-specifications "ResourceType=internet-gateway,Tags=[{Key=Name,Value=${PROJECT}-igw-b}]" \
--query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --region "$AWS_REGION" --internet-gateway-id "$IGW_B_ID" --vpc-id "$VPC_B_ID"
Find the main route tables:
RT_A_ID=$(aws ec2 describe-route-tables --region "$AWS_REGION" \
--filters "Name=vpc-id,Values=$VPC_A_ID" "Name=association.main,Values=true" \
--query 'RouteTables[0].RouteTableId' --output text)
RT_B_ID=$(aws ec2 describe-route-tables --region "$AWS_REGION" \
--filters "Name=vpc-id,Values=$VPC_B_ID" "Name=association.main,Values=true" \
--query 'RouteTables[0].RouteTableId' --output text)
Add default routes to the internet gateways:
aws ec2 create-route --region "$AWS_REGION" --route-table-id "$RT_A_ID" --destination-cidr-block 0.0.0.0/0 --gateway-id "$IGW_A_ID" || true
aws ec2 create-route --region "$AWS_REGION" --route-table-id "$RT_B_ID" --destination-cidr-block 0.0.0.0/0 --gateway-id "$IGW_B_ID" || true
Make subnets auto-assign public IPv4 addresses:
aws ec2 modify-subnet-attribute --region "$AWS_REGION" --subnet-id "$SUBNET_A_ID" --map-public-ip-on-launch
aws ec2 modify-subnet-attribute --region "$AWS_REGION" --subnet-id "$SUBNET_B_ID" --map-public-ip-on-launch
Expected outcome: Instances launched in each subnet will get a public IPv4 address and can reach the internet (needed for patching and easy access in this lab).
Step 3: Create AWS Transit Gateway
What you’ll do: Create one transit gateway.
TGW_ID=$(aws ec2 create-transit-gateway \
--region "$AWS_REGION" \
--description "${PROJECT} transit gateway" \
--tag-specifications "ResourceType=transit-gateway,Tags=[{Key=Name,Value=${PROJECT}-tgw}]" \
--query 'TransitGateway.TransitGatewayId' --output text)
echo "TGW: $TGW_ID"
Check the state (repeat until it reports available):
aws ec2 describe-transit-gateways --region "$AWS_REGION" --transit-gateway-ids "$TGW_ID" \
--query 'TransitGateways[0].State' --output text
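The describe call above reports the state once. The AWS CLI may not offer a built-in waiter for transit gateways, so a small polling loop can stand in. This is a minimal sketch: the function name wait_for_tgw_state, the 60-attempt cap, and the 10-second interval are illustrative assumptions, not part of the original lab.

```shell
# Poll a transit gateway until it reaches the desired state.
# Usage: wait_for_tgw_state <region> <tgw-id> [desired-state] [max-attempts] [sleep-seconds]
# (hypothetical helper; defaults are assumptions)
wait_for_tgw_state() {
  region=$1
  tgw_id=$2
  desired=${3:-available}
  attempts=${4:-60}
  interval=${5:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state=$(aws ec2 describe-transit-gateways --region "$region" \
      --transit-gateway-ids "$tgw_id" \
      --query 'TransitGateways[0].State' --output text)
    echo "TGW $tgw_id state: $state"
    if [ "$state" = "$desired" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "Timed out waiting for $tgw_id to become $desired" >&2
  return 1
}

# Example (uses the lab's variables):
# wait_for_tgw_state "$AWS_REGION" "$TGW_ID" available
```

The same loop shape works for attachment state if you swap in the describe-transit-gateway-vpc-attachments call used in Step 4.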
Expected outcome: TGW state becomes available.
Step 4: Attach both VPCs to the transit gateway
What you’ll do: Create a VPC attachment from each VPC to the TGW. Each attachment needs at least one subnet, and you can specify at most one subnet per Availability Zone.
ATT_A_ID=$(aws ec2 create-transit-gateway-vpc-attachment \
--region "$AWS_REGION" \
--transit-gateway-id "$TGW_ID" \
--vpc-id "$VPC_A_ID" \
--subnet-ids "$SUBNET_A_ID" \
--tag-specifications "ResourceType=transit-gateway-attachment,Tags=[{Key=Name,Value=${PROJECT}-att-a}]" \
--query 'TransitGatewayVpcAttachment.TransitGatewayAttachmentId' --output text)
ATT_B_ID=$(aws ec2 create-transit-gateway-vpc-attachment \
--region "$AWS_REGION" \
--transit-gateway-id "$TGW_ID" \
--vpc-id "$VPC_B_ID" \
--subnet-ids "$SUBNET_B_ID" \
--tag-specifications "ResourceType=transit-gateway-attachment,Tags=[{Key=Name,Value=${PROJECT}-att-b}]" \
--query 'TransitGatewayVpcAttachment.TransitGatewayAttachmentId' --output text)
echo "Attachment A: $ATT_A_ID"
echo "Attachment B: $ATT_B_ID"
Check attachment state (repeat until both report available):
aws ec2 describe-transit-gateway-vpc-attachments --region "$AWS_REGION" \
--transit-gateway-attachment-ids "$ATT_A_ID" "$ATT_B_ID" \
--query 'TransitGatewayVpcAttachments[*].{Id:TransitGatewayAttachmentId,State:State,VpcId:VpcId}'
Expected outcome: Both attachments show available.
Step 5: Add VPC route table routes to reach the other VPC through TGW
What you’ll do: In each VPC’s route table, add a route for the other VPC’s CIDR block pointing to the transit gateway.
aws ec2 create-route --region "$AWS_REGION" \
--route-table-id "$RT_A_ID" \
--destination-cidr-block 10.1.0.0/16 \
--transit-gateway-id "$TGW_ID"
aws ec2 create-route --region "$AWS_REGION" \
--route-table-id "$RT_B_ID" \
--destination-cidr-block 10.0.0.0/16 \
--transit-gateway-id "$TGW_ID"
Expected outcome: Each VPC route table now forwards traffic to the other VPC CIDR via TGW.
Verification:
aws ec2 describe-route-tables --region "$AWS_REGION" --route-table-ids "$RT_A_ID" "$RT_B_ID" \
--query 'RouteTables[*].Routes[*].{Dest:DestinationCidrBlock,TargetTgw:TransitGatewayId,State:State}'
Step 6: Launch one EC2 instance in each VPC
What you’ll do: Create a security group allowing ICMP between the VPC CIDRs (for testing) and SSH from your IP (optional). Then launch Amazon Linux instances.
Get your public IP (manual step):
– Find your current public IPv4 address (for SSH). If you prefer not to open SSH, you can use AWS Systems Manager Session Manager instead; that requires additional setup/roles and is outside the minimal scope here.
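One way to look up the address from a shell is to query AWS's checkip.amazonaws.com endpoint with curl and turn the result into a /32 CIDR. This is a sketch: the helper name to_cidr32 is hypothetical, and its check is deliberately loose (it accepts any four groups of one to three digits).

```shell
# Turn a dotted-quad IPv4 address into a /32 CIDR, rejecting anything else.
# (hypothetical helper; loose validation only)
to_cidr32() {
  if printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    printf '%s/32\n' "$1"
  else
    echo "Not an IPv4 address: $1" >&2
    return 1
  fi
}

# Example (requires outbound internet access from your workstation):
# MY_IP=$(to_cidr32 "$(curl -s https://checkip.amazonaws.com)")
# echo "$MY_IP"
```

If you are behind a corporate proxy or VPN, the address returned may be the proxy's egress IP rather than your own, so check it before opening SSH.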
Create security groups:
SG_A_ID=$(aws ec2 create-security-group --region "$AWS_REGION" \
--group-name "${PROJECT}-sg-a" \
--description "SG for VPC A instance" \
--vpc-id "$VPC_A_ID" \
--query 'GroupId' --output text)
SG_B_ID=$(aws ec2 create-security-group --region "$AWS_REGION" \
--group-name "${PROJECT}-sg-b" \
--description "SG for VPC B instance" \
--vpc-id "$VPC_B_ID" \
--query 'GroupId' --output text)
Allow ICMP between the two VPC CIDRs:
aws ec2 authorize-security-group-ingress --region "$AWS_REGION" \
--group-id "$SG_A_ID" --ip-permissions "IpProtocol=icmp,FromPort=-1,ToPort=-1,IpRanges=[{CidrIp=10.1.0.0/16,Description='ICMP from VPC B'}]"
aws ec2 authorize-security-group-ingress --region "$AWS_REGION" \
--group-id "$SG_B_ID" --ip-permissions "IpProtocol=icmp,FromPort=-1,ToPort=-1,IpRanges=[{CidrIp=10.0.0.0/16,Description='ICMP from VPC A'}]"
(Optional) Allow SSH from your IP:
export MY_IP="x.x.x.x/32" # set this
aws ec2 authorize-security-group-ingress --region "$AWS_REGION" --group-id "$SG_A_ID" --protocol tcp --port 22 --cidr "$MY_IP"
aws ec2 authorize-security-group-ingress --region "$AWS_REGION" --group-id "$SG_B_ID" --protocol tcp --port 22 --cidr "$MY_IP"
Create or choose an EC2 key pair (optional if using SSH):
KEY_NAME="${PROJECT}-key"
aws ec2 create-key-pair --region "$AWS_REGION" --key-name "$KEY_NAME" --query 'KeyMaterial' --output text > "${KEY_NAME}.pem"
chmod 400 "${KEY_NAME}.pem"
Find a current Amazon Linux AMI ID using SSM public parameters (recommended approach):
AMI_ID=$(aws ssm get-parameter --region "$AWS_REGION" \
--name /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64 \
--query 'Parameter.Value' --output text)
echo "AMI: $AMI_ID"
Launch instances:
INST_A_ID=$(aws ec2 run-instances --region "$AWS_REGION" \
--image-id "$AMI_ID" \
--instance-type t3.micro \
--subnet-id "$SUBNET_A_ID" \
--security-group-ids "$SG_A_ID" \
--key-name "$KEY_NAME" \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=${PROJECT}-ec2-a}]" \
--query 'Instances[0].InstanceId' --output text)
INST_B_ID=$(aws ec2 run-instances --region "$AWS_REGION" \
--image-id "$AMI_ID" \
--instance-type t3.micro \
--subnet-id "$SUBNET_B_ID" \
--security-group-ids "$SG_B_ID" \
--key-name "$KEY_NAME" \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=${PROJECT}-ec2-b}]" \
--query 'Instances[0].InstanceId' --output text)
echo "Instance A: $INST_A_ID"
echo "Instance B: $INST_B_ID"
Wait until running:
aws ec2 wait instance-running --region "$AWS_REGION" --instance-ids "$INST_A_ID" "$INST_B_ID"
Get private and public IPs:
aws ec2 describe-instances --region "$AWS_REGION" --instance-ids "$INST_A_ID" "$INST_B_ID" \
--query 'Reservations[*].Instances[*].{Name:Tags[?Key==`Name`]|[0].Value,InstanceId:InstanceId,PrivateIp:PrivateIpAddress,PublicIp:PublicIpAddress,VpcId:VpcId}' \
--output table
Expected outcome: Two running instances with private IPs in different VPCs.
Step 7: Test connectivity over AWS Transit Gateway
What you’ll do: SSH to instance A and ping instance B’s private IP.
SSH to instance A:
PUB_A=$(aws ec2 describe-instances --region "$AWS_REGION" --instance-ids "$INST_A_ID" --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)
PRIV_B=$(aws ec2 describe-instances --region "$AWS_REGION" --instance-ids "$INST_B_ID" --query 'Reservations[0].Instances[0].PrivateIpAddress' --output text)
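echo "Instance B private IP: $PRIV_B"
ssh -i "${KEY_NAME}.pem" ec2-user@"$PUB_A"
From instance A, run the following, replacing the placeholder with the private IP printed above (the PRIV_B variable exists only on your workstation, not on the instance):
ping -c 3 <PRIVATE_IP_OF_INSTANCE_B>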
Expected outcome: Ping succeeds with replies.
To also test the reverse direction, SSH to instance B and ping instance A’s private IP.
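If you would rather not open an interactive session, the ping can be run as a single remote command so the private IP is expanded on your workstation. This is a sketch: the helper name ping_from and the 10-second connect timeout are assumptions, and it presumes the ec2-user login used elsewhere in this lab.

```shell
# Run a 3-packet ping on a remote instance over SSH.
# $1 = key file, $2 = public IP of the instance to log in to, $3 = private IP to ping
# (hypothetical helper)
ping_from() {
  ssh -i "$1" -o ConnectTimeout=10 "ec2-user@$2" "ping -c 3 $3"
}

# Example (uses the lab's variables):
# ping_from "${KEY_NAME}.pem" "$PUB_A" "$PRIV_B"
```

The function returns ping's exit status, so it can be used in a script to fail fast when connectivity is broken.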
Validation
Use these checks to confirm the network is correctly configured:
1) Confirm attachments are available:
aws ec2 describe-transit-gateway-vpc-attachments --region "$AWS_REGION" \
--transit-gateway-attachment-ids "$ATT_A_ID" "$ATT_B_ID" \
--query 'TransitGatewayVpcAttachments[*].{Id:TransitGatewayAttachmentId,State:State,VpcId:VpcId,SubnetIds:SubnetIds}'
2) Inspect transit gateway route tables and routes (what you see depends on default association/propagation settings):
aws ec2 describe-transit-gateway-route-tables --region "$AWS_REGION" \
--filters "Name=transit-gateway-id,Values=$TGW_ID" \
--query 'TransitGatewayRouteTables[*].{Id:TransitGatewayRouteTableId,DefaultAssociation:DefaultAssociationRouteTable,DefaultPropagation:DefaultPropagationRouteTable}'
3) Confirm VPC routes to TGW exist:
aws ec2 describe-route-tables --region "$AWS_REGION" --route-table-ids "$RT_A_ID" \
--query 'RouteTables[0].Routes[*].{Dest:DestinationCidrBlock,Tgw:TransitGatewayId,State:State}'
Troubleshooting
Common issues and fixes:
1) Ping fails (timeouts)
– Check security groups: ICMP must be allowed inbound on the destination instance security group.
– Check NACLs: default NACL typically allows; custom NACLs might block ICMP.
– Verify VPC route tables include the remote CIDR with target = transit gateway.
– Ensure instances use private IPs for the test, not public IPs.
2) Routes exist in VPC route tables but still no connectivity
– Confirm TGW attachments are available.
– Confirm TGW route table has routes back to each VPC CIDR and that attachments are associated/propagated as intended.
– If defaults didn’t propagate routes, you may need to add TGW static routes or enable propagation explicitly (console/CLI). Verify with official docs for the exact CLI commands in your case.
3) SSH fails
– Ensure your MY_IP is correct.
– Ensure the subnet is public (IGW attached + default route + public IP assignment).
– Check your corporate firewall/VPN restrictions.
4) Cross-AZ surprises
– In multi-AZ setups, make sure your VPC attachment includes a subnet in each AZ where workloads live. Otherwise traffic may traverse AZ boundaries inside the VPC.
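For issue 2 above, the TGW side can be inspected and repaired from the CLI. The sketch below wraps three EC2 commands (describe-transit-gateways, search-transit-gateway-routes, enable-transit-gateway-route-table-propagation). The function names are hypothetical, and the sketch assumes the lab's transit gateway still uses its default association route table; check the current CLI reference before relying on the exact options.

```shell
# Print the ID of the transit gateway's default association route table.
# $1 = region, $2 = transit gateway ID
tgw_default_route_table() {
  aws ec2 describe-transit-gateways --region "$1" --transit-gateway-ids "$2" \
    --query 'TransitGateways[0].Options.AssociationDefaultRouteTableId' --output text
}

# List the active routes in a TGW route table.
# $1 = region, $2 = TGW route table ID
tgw_active_routes() {
  aws ec2 search-transit-gateway-routes --region "$1" \
    --transit-gateway-route-table-id "$2" \
    --filters "Name=state,Values=active" \
    --query 'Routes[*].{Dest:DestinationCidrBlock,Type:Type,State:State}' --output table
}

# Propagate an attachment's routes into a TGW route table.
# $1 = region, $2 = TGW route table ID, $3 = attachment ID
tgw_enable_propagation() {
  aws ec2 enable-transit-gateway-route-table-propagation --region "$1" \
    --transit-gateway-route-table-id "$2" \
    --transit-gateway-attachment-id "$3"
}

# Example (uses the lab's variables):
# TGW_RT_ID=$(tgw_default_route_table "$AWS_REGION" "$TGW_ID")
# tgw_active_routes "$AWS_REGION" "$TGW_RT_ID"
# tgw_enable_propagation "$AWS_REGION" "$TGW_RT_ID" "$ATT_A_ID"
```

In this lab you would expect to see 10.0.0.0/16 and 10.1.0.0/16 as propagated routes; if either is missing, enabling propagation for that attachment is usually the smaller change compared with adding a static route.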
Cleanup
Clean up immediately to stop charges.
Terminate instances:
aws ec2 terminate-instances --region "$AWS_REGION" --instance-ids "$INST_A_ID" "$INST_B_ID"
aws ec2 wait instance-terminated --region "$AWS_REGION" --instance-ids "$INST_A_ID" "$INST_B_ID"
Delete key pair (optional) and local file:
aws ec2 delete-key-pair --region "$AWS_REGION" --key-name "$KEY_NAME" || true
rm -f "${KEY_NAME}.pem"
Delete transit gateway attachments:
aws ec2 delete-transit-gateway-vpc-attachment --region "$AWS_REGION" --transit-gateway-attachment-id "$ATT_A_ID"
aws ec2 delete-transit-gateway-vpc-attachment --region "$AWS_REGION" --transit-gateway-attachment-id "$ATT_B_ID"
Check attachment state; repeat until both report deleted before deleting the transit gateway:
aws ec2 describe-transit-gateway-vpc-attachments --region "$AWS_REGION" \
--transit-gateway-attachment-ids "$ATT_A_ID" "$ATT_B_ID" \
--query 'TransitGatewayVpcAttachments[*].State' --output text
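The check above can be wrapped in a loop so cleanup can proceed unattended. This is a sketch under assumptions: the function name and the MAX_ATTEMPTS variable are hypothetical, the 10-second interval is arbitrary, and it assumes deleted attachments remain visible to the describe call (in the deleted state) long enough to be observed.

```shell
# Wait until every listed VPC attachment reports the "deleted" state.
# Usage: wait_for_attachments_deleted <region> <attachment-id>...
# (hypothetical helper; MAX_ATTEMPTS is an optional override, default 60)
wait_for_attachments_deleted() {
  region=$1
  shift
  attempts=${MAX_ATTEMPTS:-60}
  while [ "$attempts" -gt 0 ]; do
    states=$(aws ec2 describe-transit-gateway-vpc-attachments --region "$region" \
      --transit-gateway-attachment-ids "$@" \
      --query 'TransitGatewayVpcAttachments[*].State' --output text)
    echo "Attachment states: $states"
    # Count attachments that are not yet deleted.
    pending=0
    for s in $states; do
      if [ "$s" != "deleted" ]; then
        pending=$((pending + 1))
      fi
    done
    if [ "$pending" -eq 0 ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 10
  done
  echo "Timed out waiting for attachments to be deleted" >&2
  return 1
}

# Example (uses the lab's variables):
# wait_for_attachments_deleted "$AWS_REGION" "$ATT_A_ID" "$ATT_B_ID"
```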
Delete the transit gateway:
aws ec2 delete-transit-gateway --region "$AWS_REGION" --transit-gateway-id "$TGW_ID"
Delete internet gateways:
aws ec2 detach-internet-gateway --region "$AWS_REGION" --internet-gateway-id "$IGW_A_ID" --vpc-id "$VPC_A_ID"
aws ec2 delete-internet-gateway --region "$AWS_REGION" --internet-gateway-id "$IGW_A_ID"
aws ec2 detach-internet-gateway --region "$AWS_REGION" --internet-gateway-id "$IGW_B_ID" --vpc-id "$VPC_B_ID"
aws ec2 delete-internet-gateway --region "$AWS_REGION" --internet-gateway-id "$IGW_B_ID"
Delete subnets, security groups, and VPCs:
aws ec2 delete-security-group --region "$AWS_REGION" --group-id "$SG_A_ID" || true
aws ec2 delete-security-group --region "$AWS_REGION" --group-id "$SG_B_ID" || true
aws ec2 delete-subnet --region "$AWS_REGION" --subnet-id "$SUBNET_A_ID"
aws ec2 delete-subnet --region "$AWS_REGION" --subnet-id "$SUBNET_B_ID"
aws ec2 delete-vpc --region "$AWS_REGION" --vpc-id "$VPC_A_ID"
aws ec2 delete-vpc --region "$AWS_REGION" --vpc-id "$VPC_B_ID"
11. Best Practices
Architecture best practices
- Use hub-and-spoke intentionally: Keep the TGW as the backbone; avoid ad-hoc peerings that bypass governance.
- Design for segmentation: Create separate TGW route tables for domains like prod, dev, shared, partner, egress, ingress.
- Plan CIDRs early: Use non-overlapping RFC1918 ranges across VPCs and on-prem. This reduces routing conflicts and simplifies propagation.
- Attach subnets per AZ: For any VPC with workloads in multiple AZs, include TGW attachment subnets in those AZs to reduce cross-AZ charges and improve resilience.
- Use inspection VPC patterns carefully: Ensure return routing symmetry and understand how appliances scale.
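The segmentation guidance above maps to two EC2 CLI commands: create-transit-gateway-route-table and associate-transit-gateway-route-table. The sketch below is illustrative only; the function names and the tgw-rt-prod name are assumptions rather than part of any reference design.

```shell
# Create a named TGW route table and print its ID.
# $1 = region, $2 = transit gateway ID, $3 = Name tag (for example tgw-rt-prod)
create_tgw_route_table() {
  aws ec2 create-transit-gateway-route-table --region "$1" \
    --transit-gateway-id "$2" \
    --tag-specifications "ResourceType=transit-gateway-route-table,Tags=[{Key=Name,Value=$3}]" \
    --query 'TransitGatewayRouteTable.TransitGatewayRouteTableId' --output text
}

# Associate an attachment with a TGW route table.
# $1 = region, $2 = TGW route table ID, $3 = attachment ID
associate_attachment() {
  aws ec2 associate-transit-gateway-route-table --region "$1" \
    --transit-gateway-route-table-id "$2" \
    --transit-gateway-attachment-id "$3"
}

# Example (hypothetical IDs):
# PROD_RT=$(create_tgw_route_table us-east-1 tgw-0123456789abcdef0 tgw-rt-prod)
# associate_attachment us-east-1 "$PROD_RT" tgw-attach-0123456789abcdef0
```

An attachment can be associated with only one TGW route table at a time, so an attachment that was auto-associated with the default route table must be disassociated (disassociate-transit-gateway-route-table) before it can join a segmented one.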
IAM/security best practices
- Least privilege IAM: Restrict who can create attachments, modify TGW route tables, and change propagation/association.
- Separate duties: Network platform team owns TGW and route tables; workload teams request attachments via controlled workflows.
- Use AWS Organizations SCPs (where appropriate) to prevent unapproved TGW changes in workload accounts.
Cost best practices
- Tag everything: TGW, attachments, route tables. Use cost allocation tags.
- Control traffic patterns: Route only what you need; avoid routing large volumes unnecessarily across TGW.
- Monitor “data processed” growth: It’s often the largest TGW cost driver at scale.
- Avoid accidental cross-AZ routing: Misplaced attachments can create recurring cross-AZ data charges.
Performance best practices
- Prefer local traffic: Keep chatty microservices within the same VPC when possible.
- Use multiple AZ attachment subnets for better path locality.
- Test throughput and latency for inspection appliances and hybrid links; TGW itself is not the only performance factor.
Reliability best practices
- Use redundant connectivity for hybrid: Multiple VPN connections and/or Direct Connect with backup (design depends on requirements).
- Multi-AZ design: Attach subnets in multiple AZs for key VPCs.
- Change management: Treat route changes like production deployments; use approvals and staged rollouts.
Operations best practices
- Enable flow logs selectively: Use them for troubleshooting and security analytics; manage retention.
- Standardize naming: Attachments should include VPC name, environment, account, and purpose.
- Document route intent: Keep a living routing matrix that maps “who can talk to whom” and why.
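Transit gateway flow logs are created with the same create-flow-logs command used for VPC flow logs. The sketch below sends them to S3; the function name and the bucket ARN in the example are hypothetical, and the sketch assumes the 60-second aggregation interval that transit gateway flow logs use. Confirm the options against the current CLI reference before use.

```shell
# Enable flow logs for a transit gateway, delivered to an S3 bucket.
# $1 = region, $2 = transit gateway ID, $3 = destination bucket ARN
# (hypothetical helper)
enable_tgw_flow_logs_s3() {
  aws ec2 create-flow-logs --region "$1" \
    --resource-type TransitGateway \
    --resource-ids "$2" \
    --log-destination-type s3 \
    --log-destination "$3" \
    --max-aggregation-interval 60
}

# Example (hypothetical bucket):
# enable_tgw_flow_logs_s3 us-east-1 tgw-0123456789abcdef0 arn:aws:s3:::example-tgw-flow-logs
```

Pair the bucket with a lifecycle rule so retention is bounded; delete-flow-logs turns collection off again once an investigation is finished.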
Governance/tagging/naming best practices
- Suggested tag keys:
- Name
- Environment (prod/dev/test)
- Owner
- CostCenter
- NetworkDomain
- DataClassification
- Use predictable names:
- tgw-core-prod
- tgw-rt-prod, tgw-rt-shared
- tgw-att-vpc-<app>-<env>
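Tags like those suggested above can be applied to existing transit gateways, attachments, and TGW route tables with aws ec2 create-tags. A minimal sketch: the helper name and the choice of three keys are illustrative, and the values shown in the example are made up.

```shell
# Apply a standard set of tags to a TGW-related resource.
# $1 = region, $2 = resource ID, $3 = environment, $4 = owner, $5 = cost center
# (hypothetical helper)
tag_network_resource() {
  aws ec2 create-tags --region "$1" --resources "$2" \
    --tags "Key=Environment,Value=$3" "Key=Owner,Value=$4" "Key=CostCenter,Value=$5"
}

# Example (hypothetical values):
# tag_network_resource us-east-1 tgw-attach-0123456789abcdef0 prod network-team cc-1234
```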
12. Security Considerations
Identity and access model
- TGW is controlled by IAM. Secure it like a high-impact shared service.
- Use:
- IAM roles for automation (CI/CD)
- MFA and privileged access management for admins
- CloudTrail for auditing changes
Encryption
- AWS Transit Gateway routes traffic at Layer 3; it is not an application-layer encryption service.
- For confidentiality:
- Use TLS at the application layer.
- For on-prem connectivity, use IPsec VPN or MACsec/other options where applicable (Direct Connect encryption options depend on design; verify in official docs).
- For logs:
- CloudWatch Logs and S3 support encryption options; enable encryption and manage KMS keys per policy.
Network exposure
- A TGW can unintentionally create broad connectivity if routes are permissive.
- Use route tables to enforce explicit connectivity:
- Only propagate or add routes that are required.
- Avoid “catch-all” routes to sensitive networks.
- Ensure security groups/NACLs still enforce least privilege inside each VPC.
Secrets handling
- TGW itself doesn’t store application secrets.
- If using VPNs, protect:
- Customer gateway configurations
- Pre-shared keys (where used)
- Router credentials
Use AWS Secrets Manager or your enterprise secrets tooling for sensitive configuration data.
Audit/logging
- Enable and centralize:
- AWS CloudTrail for TGW API calls
- Transit Gateway Flow Logs (as needed)
- VPC Flow Logs (for key subnets or ENIs)
- Correlate logs with:
- AWS Config (if enabled) for configuration drift
- SIEM tools for detection
Compliance considerations
- Treat TGW as part of your network boundary.
- Maintain documentation for:
- Routing/segmentation intent
- Change approvals
- Logging retention
- Access controls and break-glass procedures
Common security mistakes
- Allowing automatic propagation from all attachments into a shared route table without review.
- Mixing dev/test with prod in the same route domain.
- Not monitoring attachment creation (a new attachment can change reachability).
- Forgetting return routes, leading to ad-hoc “quick fixes” that expand access too broadly.
Secure deployment recommendations
- Use a dedicated networking account and share TGW via AWS RAM.
- Establish a routing policy model (who can talk to whom).
- Use inspection VPCs for centralized control when required, but validate symmetry and scaling.
- Implement continuous guardrails:
- SCPs
- Config rules (custom/managed as applicable)
- Alerts on route table changes
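Sharing a transit gateway from a dedicated networking account, as recommended above, comes down to one AWS RAM call. The sketch below is illustrative: the function names are hypothetical, and the ARN builder assumes the standard aws partition (it would need adjusting for GovCloud or China Regions).

```shell
# Build the ARN of a transit gateway (assumes the "aws" partition).
# $1 = region, $2 = owner account ID, $3 = transit gateway ID
tgw_arn() {
  printf 'arn:aws:ec2:%s:%s:transit-gateway/%s\n' "$1" "$2" "$3"
}

# Share a transit gateway through AWS RAM.
# $1 = region, $2 = owner account ID, $3 = transit gateway ID,
# $4 = share name, $5 = principal (account ID, OU ARN, or organization ARN)
share_tgw() {
  aws ram create-resource-share --region "$1" \
    --name "$4" \
    --resource-arns "$(tgw_arn "$1" "$2" "$3")" \
    --principals "$5"
}

# Example (hypothetical account IDs):
# share_tgw us-east-1 111122223333 tgw-0123456789abcdef0 tgw-core-share 444455556666
```

After the share is in place, workload accounts create their own VPC attachments against the shared transit gateway; unless the gateway auto-accepts shared attachments, the networking account must accept each one, which is a natural control point for the onboarding workflow.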
13. Limitations and Gotchas
Always confirm the latest limits in Service Quotas and official docs; the list below focuses on common real-world issues.
Known limitations / design constraints (common)
- Regional scope: A transit gateway is created per Region; cross-Region requires peering and introduces data transfer considerations.
- Complex routing interactions: You must manage both VPC route tables and TGW route tables; missing either side breaks connectivity.
- Overlapping CIDRs: Overlapping address spaces can create ambiguous routing and restrict connectivity options. Avoid when possible.
- Appliance/inspection patterns are non-trivial: Symmetric routing, scaling, and failure modes must be designed and tested.
- DNS is not automatically solved: TGW routes packets; it does not provide DNS resolution across VPCs by itself. Use Route 53 Resolver endpoints and proper rules for cross-VPC name resolution.
Quotas
- Attachments per TGW
- Routes per TGW route table
- Route tables per TGW
- Prefix list and propagation behaviors (as applicable)
Check Service Quotas: https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html
Regional constraints
- Some advanced features (for example, multicast or certain logging destinations) may have Region limitations. Verify in official docs for your Region.
Pricing surprises
- Data processing charges can grow quickly with chatty east-west traffic.
- Cross-AZ charges inside a VPC can occur if attachment subnets don’t align with workload AZs.
- Cross-Region transfer charges apply for inter-Region designs.
- Logging can become a meaningful cost if enabled broadly without retention controls.
Compatibility issues
- Third-party appliances and SD-WAN integrations require careful BGP/ASN planning and tunnel design.
- Hybrid routing (VPN + DX) introduces route preference and failover behavior that must be tested end-to-end.
Operational gotchas
- Changes to route propagation/association can have wide blast radius.
- Lack of a routing inventory leads to “mystery routes” and outages.
- Multi-account sharing demands consistent tagging and ownership models.
Migration challenges
- Moving from many VPC peerings to TGW requires:
- CIDR review
- staged cutovers
- careful route changes
- coordinated security group/NACL updates
- Expect to run mixed connectivity during migration.
14. Comparison with Alternatives
AWS Transit Gateway is not the only way to connect networks. The right tool depends on scale, security model, and connectivity type.
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| AWS Transit Gateway | Many VPCs, multi-account, hybrid hub-and-spoke | Central routing, segmentation with route tables, scalable attachments | Ongoing hourly + data processing cost; routing complexity | When you need a routed backbone and centralized control |
| VPC Peering (AWS) | Small number of VPC-to-VPC connections | Simple, low operational overhead for small meshes | Non-transitive; mesh complexity grows fast; harder segmentation | When connecting a few VPCs with minimal policy needs |
| AWS PrivateLink (AWS) | Exposing specific services privately | Service-level access, reduces lateral movement risk | Not for general routing; requires endpoint/service design | When consumers should access only a service, not a whole network |
| AWS Cloud WAN (AWS) | Large global networks with policy-based connectivity | Central policy model across Regions (built on AWS networking primitives) | Different operational model; may not fit smaller deployments | When you want managed global network policy and operations at scale |
| Self-managed routing appliances on EC2 | Highly custom routing, legacy protocols | Full control; can replicate legacy designs | More ops burden; HA complexity; patching; scaling | When you need features not supported natively and can operate routers safely |
| Azure Virtual WAN (Microsoft Azure) | Azure-centric hub-and-spoke / branch connectivity | Managed hub routing and connectivity model | Different cloud; not relevant unless multi-cloud | When building the equivalent architecture in Azure |
| GCP Network Connectivity Center (Google Cloud) | GCP hub connectivity | Central connectivity management in GCP | Different cloud; feature parity differs | When building the equivalent architecture in GCP |
Notes:
– AWS Cloud WAN may be an alternative for some global enterprise designs, but it’s a different abstraction and operating model. Evaluate alongside TGW based on your governance and global routing needs.
– PrivateLink is often a security-first alternative for service consumption, not a replacement for routed L3 connectivity.
15. Real-World Example
Enterprise example (regulated, multi-account, hybrid)
Problem
A regulated enterprise has:
– 200+ AWS accounts
– Separate prod/dev environments
– Two on-prem data centers
They need consistent segmentation, centralized inspection, and hybrid connectivity without managing thousands of VPC peerings.
Proposed architecture
– Networking account hosts:
– One AWS Transit Gateway per Region
– Multiple TGW route tables: prod, dev, shared, partner, egress
– Security account hosts an inspection VPC:
– AWS Network Firewall and/or third-party appliances
– Central log storage (S3 + SIEM integration)
– Shared services account hosts:
– Route 53 Resolver inbound/outbound endpoints
– AD / identity services
– On-prem connects via:
– Direct Connect (primary) to DXGW associated with TGW
– Site-to-Site VPN (backup) to TGW (design varies; verify preferred hybrid reference architecture)
Why AWS Transit Gateway was chosen
– Central hub for many VPCs across accounts
– Strong segmentation via route tables
– Fits hybrid design with VPN/DX connectivity
– Can standardize onboarding via RAM sharing and automation
Expected outcomes
– Faster onboarding of new VPCs/accounts
– Reduced risk of accidental full-mesh connectivity
– Centralized traffic inspection and logging
– More predictable operations and troubleshooting
Startup/small-team example (growing SaaS)
Problem
A startup begins with one VPC but grows to:
– Separate VPCs for prod and staging
– A shared tooling VPC (CI, monitoring)
– A partner integration network via VPN
VPC peering is still manageable but risks becoming messy as environments multiply.
Proposed architecture
– One TGW in the primary Region
– Route tables:
– prod routes to shared tooling and partner VPN (limited)
– staging routes only to shared tooling
– Add flow logs during incidents or early operations maturity
Why AWS Transit Gateway was chosen
– Prevents scaling issues with peering as more VPCs are added
– Provides clear segmentation as the team grows
– Supports adding VPN/SD-WAN later without redesign
Expected outcomes
– Clean separation between prod and non-prod
– Faster expansion to new VPCs
– A clear place to enforce network policy as security posture matures
16. FAQ
1) Is AWS Transit Gateway a router?
Conceptually yes: it provides managed Layer 3 routing between attached networks. You manage route tables and attachments; AWS manages the underlying infrastructure.
2) Is AWS Transit Gateway global?
A transit gateway is regional. You can connect Regions using transit gateway peering (inter-Region).
3) Do I still need VPC route table changes with TGW?
Yes. TGW routing alone is not enough; each VPC must route traffic destined for remote networks to the transit gateway.
4) What’s the difference between a TGW route table and a VPC route table?
- VPC route table: Controls how resources in a subnet send traffic out of the subnet (including to TGW).
- TGW route table: Controls how TGW forwards traffic between attachments.
5) Is routing transitive with AWS Transit Gateway?
Yes for attachments connected through the same TGW and permitted by TGW route tables and VPC routes. This is a key difference from VPC peering (which is non-transitive).
6) Can I share a transit gateway across AWS accounts?
Yes, using AWS Resource Access Manager (RAM). This is a common landing-zone pattern.
7) Can AWS Transit Gateway replace all VPC peering connections?
Often yes in large environments, but not always. Some teams keep peering for small, isolated connections or use PrivateLink for service exposure.
8) How does AWS Transit Gateway compare to AWS PrivateLink?
- TGW provides network routing (broad connectivity).
- PrivateLink provides service-level private access (more restrictive, often more secure for consumers).
9) Does TGW support IPv6?
Transit gateways can route IPv6 traffic, but behavior and support can vary by attachment type and Region. Verify in official docs for your exact scenario.
10) Can I centralize internet egress with TGW?
Yes. A common design is to route 0.0.0.0/0 from spoke VPCs to an egress/inspection VPC. Ensure return routing and appliance symmetry are correct.
11) What’s Transit Gateway Connect used for?
It is commonly used to connect SD-WAN or routing appliances using GRE tunnels with BGP to dynamically exchange routes.
12) Do I need a NAT Gateway with TGW?
TGW is not NAT. If instances in private subnets need outbound internet access, NAT (or another egress design) is still required.
13) How do I troubleshoot connectivity issues through TGW?
Start with:
– VPC route tables (source and destination)
– TGW attachment state
– TGW route table association/propagation and routes
– Security groups and NACLs
Then use logs (TGW flow logs, VPC flow logs) and reachability tools (VPC Reachability Analyzer).
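VPC Reachability Analyzer can be driven from the CLI with create-network-insights-path and start-network-insights-analysis. The sketch below is illustrative: the function names are hypothetical, it analyzes a TCP path between two instances, and the port you pass is your choice. Each analysis is billed, so check pricing before scripting it in a loop.

```shell
# Create a path between two instances and start an analysis; prints the analysis ID.
# $1 = region, $2 = source instance ID, $3 = destination instance ID, $4 = TCP port
# (hypothetical helper)
analyze_path() {
  path_id=$(aws ec2 create-network-insights-path --region "$1" \
    --source "$2" --destination "$3" --protocol tcp --destination-port "$4" \
    --query 'NetworkInsightsPath.NetworkInsightsPathId' --output text)
  analysis_id=$(aws ec2 start-network-insights-analysis --region "$1" \
    --network-insights-path-id "$path_id" \
    --query 'NetworkInsightsAnalysis.NetworkInsightsAnalysisId' --output text)
  echo "$analysis_id"
}

# Show whether a finished analysis found a path.
# $1 = region, $2 = analysis ID
show_analysis() {
  aws ec2 describe-network-insights-analyses --region "$1" \
    --network-insights-analysis-ids "$2" \
    --query 'NetworkInsightsAnalyses[0].{Status:Status,PathFound:NetworkPathFound}' --output table
}

# Example (hypothetical instance IDs):
# NIA_ID=$(analyze_path us-east-1 i-0aaaaaaaaaaaaaaaa i-0bbbbbbbbbbbbbbbb 22)
# show_analysis us-east-1 "$NIA_ID"
```

A result of "path not found" is itself useful: the analysis explanation usually names the component (route table, security group, NACL, or TGW route table) that blocked the path.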
14) Does TGW encrypt traffic between VPCs?
TGW does not provide application-layer encryption by itself. Use TLS at the application layer. For on-prem, use VPN IPsec or other encryption approaches as appropriate.
15) Is AWS Transit Gateway highly available?
AWS designs TGW to be highly available within a Region. For VPC attachments, you can improve resiliency and reduce cross-AZ routing by selecting subnets in multiple AZs.
16) What are the biggest cost drivers?
Typically:
– Hourly charges for TGW and attachments
– Data processing per GB
– Cross-AZ and cross-Region data transfer
– Centralized egress/inspection components (NAT, firewall, appliances, logging)
17) Can I connect overlapping CIDR VPCs to the same TGW?
Overlapping CIDRs create routing ambiguity. In general, avoid overlaps; if unavoidable, you may need NAT/translation patterns and careful routing design. Verify official guidance for your exact overlap scenario.
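Overlap is easy to check before a VPC is ever attached. The sketch below compares two IPv4 CIDRs by truncating both network addresses to the shorter prefix length; if the truncated values match, one range contains the other. The function names are hypothetical, and the code assumes well-formed IPv4 input without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  rest=$1
  o1=${rest%%.*}; rest=${rest#*.}
  o2=${rest%%.*}; rest=${rest#*.}
  o3=${rest%%.*}; o4=${rest#*.}
  echo $(( (o1 << 24) + (o2 << 16) + (o3 << 8) + o4 ))
}

# Succeed (exit 0) if two IPv4 CIDRs overlap, fail (exit 1) otherwise.
# Usage: cidrs_overlap 10.0.0.0/16 10.1.0.0/16
cidrs_overlap() {
  a_ip=${1%/*}; a_len=${1#*/}
  b_ip=${2%/*}; b_len=${2#*/}
  # Compare on the shorter (less specific) prefix.
  if [ "$a_len" -lt "$b_len" ]; then len=$a_len; else len=$b_len; fi
  a=$(ip_to_int "$a_ip")
  b=$(ip_to_int "$b_ip")
  shift_bits=$(( 32 - len ))
  [ $(( a >> shift_bits )) -eq $(( b >> shift_bits )) ]
}

# Example: the lab's two VPC ranges do not overlap.
# if cidrs_overlap 10.0.0.0/16 10.1.0.0/16; then echo "overlap"; else echo "ok"; fi
```

Running a check like this in the VPC onboarding pipeline catches conflicts before they become routing problems on the transit gateway.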
17. Top Online Resources to Learn AWS Transit Gateway
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | AWS Transit Gateway User Guide: https://docs.aws.amazon.com/transitgateway/latest/ug/ | Authoritative feature definitions, routing behavior, configuration options |
| Official docs (concepts) | What is AWS Transit Gateway?: https://docs.aws.amazon.com/transitgateway/latest/ug/what-is-transit-gateway.html | Clear overview of purpose, components, and concepts |
| Official pricing | AWS Transit Gateway Pricing: https://aws.amazon.com/transit-gateway/pricing/ | Current pricing dimensions by Region and attachment type |
| Cost estimation | AWS Pricing Calculator: https://calculator.aws/ | Build estimates for attachments + data processing + transfer |
| Governance/multi-account | AWS Resource Access Manager (RAM): https://docs.aws.amazon.com/ram/latest/userguide/what-is.html | Learn how sharing TGW across accounts works |
| Monitoring | Amazon CloudWatch: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html | Metrics, alarms, and operational monitoring fundamentals |
| Auditing | AWS CloudTrail: https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html | Audit TGW API actions and configuration changes |
| Architecture guidance | AWS Architecture Center: https://aws.amazon.com/architecture/ | Patterns and reference architectures, including networking foundations |
| Network visibility | AWS Network Manager: https://docs.aws.amazon.com/vpc/latest/tgw/what-is-network-manager.html (verify latest link in docs) | Higher-level network visibility and management for TGW-based networks |
| CLI reference | AWS CLI EC2 command reference: https://docs.aws.amazon.com/cli/latest/reference/ec2/ | Exact command syntax for TGW, VPC, route tables, and attachments |
18. Training and Certification Providers
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | Cloud/DevOps engineers, architects | AWS networking, DevOps practices, hands-on labs | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Students, engineers | DevOps, SCM, automation foundations that support cloud operations | Check website | https://www.scmgalaxy.com/ |
| CLoudOpsNow.in | Cloud ops practitioners | Cloud operations, monitoring, reliability, operations playbooks | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, platform engineers | SRE principles, reliability engineering, ops practices | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops teams exploring AIOps | Monitoring/analytics-oriented ops and AIOps concepts | Check website | https://www.aiopsschool.com/ |
19. Top Trainers
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training content (verify offerings) | Learners seeking guided coaching | https://www.rajeshkumar.xyz/ |
| devopstrainer.in | DevOps and cloud training (verify offerings) | Beginners to intermediate engineers | https://www.devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps help/training (verify offerings) | Teams needing flexible, short-term expertise | https://www.devopsfreelancer.com/ |
| devopssupport.in | DevOps support and enablement (verify offerings) | Operations teams and project support | https://www.devopssupport.in/ |
20. Top Consulting Companies
| Company | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify scope) | Architecture, migration planning, operations setup | TGW hub-and-spoke design; landing zone networking; hybrid connectivity planning | https://www.cotocus.com/ |
| DevOpsSchool.com | DevOps and cloud consulting/training (verify scope) | Delivery + enablement | Implement TGW with multi-account sharing; build IaC modules; operational runbooks | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting (verify scope) | DevOps processes and cloud operations | Network automation pipelines; logging/monitoring integration; cost governance workflows | https://www.devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before AWS Transit Gateway
- Networking fundamentals: CIDR, routing tables, NAT, VPN concepts, BGP basics (especially for hybrid).
- Amazon VPC essentials: subnets, route tables, IGW/NAT, security groups, NACLs.
- AWS IAM basics: roles, policies, least privilege.
- Basic observability: CloudWatch, CloudTrail, log retention.
What to learn after AWS Transit Gateway
- Hybrid networking: Direct Connect design, VPN redundancy, routing preference and failover testing.
- Centralized inspection: AWS Network Firewall patterns, third-party appliance HA/scaling.
- Multi-account governance: AWS Organizations, SCPs, RAM strategies, tagging and chargeback.
- Advanced visibility: TGW flow logs analysis, VPC Reachability Analyzer, centralized SIEM integration.
- Global network operations: AWS Network Manager, multi-Region routing strategies.
Job roles that use it
- Cloud Network Engineer
- Solutions Architect
- Platform Engineer
- SRE / Reliability Engineer (in orgs with strong network foundations)
- Security Engineer / Network Security Engineer
- Cloud Operations Engineer
Certification path (AWS)
AWS Transit Gateway aligns strongly with:
– AWS Certified Solutions Architect – Associate/Professional
– AWS Certified Advanced Networking – Specialty
Certification requirements and exam content change over time; verify on the official AWS Training and Certification site: https://aws.amazon.com/certification/
Project ideas for practice
- Build a multi-account TGW sharing lab with AWS RAM (one networking account, two workload accounts).
- Implement segmented TGW route tables for prod, dev, and shared.
- Add an inspection VPC and enforce egress through a firewall (validate symmetric routing).
- Create a Site-to-Site VPN attachment and exchange routes with BGP (requires a compatible on-prem router or lab appliance).
- Enable TGW flow logs and build a basic traffic report in Athena (if logs delivered to S3) or CloudWatch Logs Insights.
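Before building the segmentation project in a real account, it can help to model the intended routing on paper. The following is a minimal sketch in plain Python, not AWS code: the attachment names, CIDRs, and route table names (rt-prod, rt-dev, rt-shared) are hypothetical, and the model only assumes that each attachment is associated with one TGW route table, that propagation copies an attachment's CIDR into a table, and that the most specific matching route wins.

```python
import ipaddress

# Hypothetical lab layout: attachment name -> VPC CIDR (illustrative values).
ATTACHMENTS = {
    "prod": "10.10.0.0/16",
    "dev": "10.20.0.0/16",
    "shared": "10.30.0.0/16",
}

# Association: the single TGW route table each attachment uses for lookups.
ASSOCIATIONS = {"prod": "rt-prod", "dev": "rt-dev", "shared": "rt-shared"}

# Propagation: which attachments advertise their CIDR into each route table.
PROPAGATIONS = {
    "rt-prod": ["shared"],
    "rt-dev": ["shared"],
    "rt-shared": ["prod", "dev"],
}


def build_route_tables():
    """Expand propagations into {route_table: [(network, target_attachment)]}."""
    tables = {}
    for table, sources in PROPAGATIONS.items():
        tables[table] = [
            (ipaddress.ip_network(ATTACHMENTS[src]), src) for src in sources
        ]
    return tables


def next_hop(tables, source_attachment, dest_ip):
    """Return the attachment traffic is forwarded to, or None if no route exists."""
    table = tables[ASSOCIATIONS[source_attachment]]
    addr = ipaddress.ip_address(dest_ip)
    matches = [(net, target) for net, target in table if addr in net]
    if not matches:
        return None
    # Longest-prefix match: the most specific route wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


tables = build_route_tables()
print(next_hop(tables, "prod", "10.30.1.5"))    # shared
print(next_hop(tables, "prod", "10.20.1.5"))    # None (prod cannot reach dev)
print(next_hop(tables, "shared", "10.10.4.4"))  # prod
```

In this model, prod and dev each reach shared, and shared reaches both, but prod and dev have no route to each other. Return traffic needs its own route, so check both directions when validating a design like this.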
22. Glossary
- Attachment: A connection between AWS Transit Gateway and another network resource (VPC, VPN, peering, Connect).
- Association (TGW route table): Determines which TGW route table an attachment uses for route lookups.
- Propagation: Automatic insertion of routes learned from an attachment into a TGW route table.
- CIDR: Classless Inter-Domain Routing notation (e.g., 10.0.0.0/16) describing IP ranges.
- Hub-and-spoke: Network topology where spokes connect to a central hub rather than directly to each other.
- Route table: A set of rules that determines where network traffic is directed.
- TGW route table: Route table inside AWS Transit Gateway for forwarding decisions between attachments.
- VPC route table: Route table inside a VPC that controls subnet traffic routing.
- Site-to-Site VPN: AWS-managed IPsec VPN service that connects AWS to on-premises networks.
- Direct Connect (DX): Dedicated private connectivity from on-premises to AWS.
- Direct Connect Gateway (DXGW): A resource used to connect Direct Connect to multiple VPCs or TGWs (design-dependent).
- Inter-Region peering: Peering between transit gateways in different Regions.
- Flow logs: Metadata logs about network flows (source/destination, ports, bytes, etc.) used for troubleshooting and security analysis.
- Inspection VPC: A VPC dedicated to network security appliances or managed firewalls where traffic is inspected.
- BGP: Border Gateway Protocol used for dynamic route exchange (common in hybrid and Connect designs).
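To make the CIDR entry above concrete, here is a small sketch using Python's standard ipaddress module. The VPC names and ranges are hypothetical. It shows how many addresses a prefix covers, how to test whether an address falls inside a range, and how to spot overlapping ranges, which is worth checking before attaching VPCs to a shared hub because overlapping CIDRs cannot be routed unambiguously.

```python
import ipaddress
from itertools import combinations

# Hypothetical VPC ranges used only for illustration.
VPC_CIDRS = {
    "vpc-a": "10.0.0.0/16",
    "vpc-b": "10.0.128.0/17",  # sits inside vpc-a's range
    "vpc-c": "10.1.0.0/16",
}


def find_overlaps(cidrs):
    """Return sorted pairs of names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return sorted(
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    )


vpc_a = ipaddress.ip_network(VPC_CIDRS["vpc-a"])
print(vpc_a.num_addresses)                          # 65536
print(ipaddress.ip_address("10.0.200.7") in vpc_a)  # True
print(find_overlaps(VPC_CIDRS))                     # [('vpc-a', 'vpc-b')]
```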
23. Summary
AWS Transit Gateway, part of AWS’s Networking and content delivery category, is a managed hub router for architectures that need scalable, governed connectivity between multiple VPCs and on-premises networks. It matters because it replaces brittle, hard-to-operate point-to-point meshes with a centralized attachment-and-routing model, supports multi-account sharing via AWS RAM, and enables production-grade segmentation using transit gateway route tables.
Cost-wise, plan for hourly charges and per-GB data processing, plus indirect costs like cross-AZ transfer, cross-Region transfer, NAT/firewall appliances, and logging. Security-wise, treat route tables and attachment permissions as high-impact controls: apply least privilege IAM, segment with separate TGW route tables, and enable auditing with CloudTrail and flow logs where appropriate.
Use AWS Transit Gateway when you need a scalable network backbone and centralized routing control. Prefer simpler options like VPC peering for small environments, or PrivateLink when you want service-level exposure without broad network reachability.
Next step: read the official AWS Transit Gateway user guide and then extend the lab into a multi-account shared TGW design with segmentation and logging enabled.