Category
Networking and content delivery
1. Introduction
What this service is
Amazon VPC (Amazon Virtual Private Cloud) is the AWS service that lets you create an isolated virtual network inside AWS. You control IP address ranges, subnets, routing, and network security—similar to building a small data center network, but delivered as a managed cloud construct.
One-paragraph simple explanation
If you want to run AWS resources (like Amazon EC2, Amazon RDS, or Amazon EKS) in a private network where you decide what is public, what is private, and how traffic flows, you use Amazon VPC. You create a VPC, split it into subnets, attach internet access if needed, and use security controls to limit who can talk to what.
One-paragraph technical explanation
Technically, Amazon VPC is a regional, logically isolated network boundary in AWS where you define IPv4/IPv6 CIDR blocks, create subnets mapped to Availability Zones (AZs), and manage traffic using route tables, gateways (Internet Gateway, NAT Gateway, egress-only Internet Gateway), and private connectivity constructs (VPC endpoints/PrivateLink). You enforce network security with security groups (stateful) and network ACLs (stateless), and you can observe traffic with VPC Flow Logs and analyze connectivity with tools like Reachability Analyzer.
What problem it solves
Amazon VPC solves the problem of safely and predictably networking your cloud workloads:
- Isolation between environments (dev/test/prod) and between applications/tenants
- Controlled inbound and outbound access to the internet
- Private connectivity to AWS services without sending traffic over the public internet
- Consistent network governance (routing, segmentation, logging) at scale
2. What is Amazon VPC?
Official purpose
Amazon VPC provides a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define. (See official docs: https://docs.aws.amazon.com/vpc/)
Core capabilities
Amazon VPC enables you to:
- Define IP address space (IPv4 and optional IPv6)
- Create subnets per Availability Zone
- Control traffic routing (route tables, gateways)
- Control network security (security groups and network ACLs)
- Connect networks (VPC peering, AWS Transit Gateway integration, VPN, AWS Direct Connect integration)
- Access AWS services privately (VPC endpoints / AWS PrivateLink)
- Monitor and troubleshoot networking (VPC Flow Logs, Reachability Analyzer)
Major components
Common Amazon VPC building blocks include:
- VPC: Top-level network boundary within a region.
- Subnets: Segments of the VPC CIDR, each placed in exactly one AZ.
- Route tables: Control where traffic goes (local within VPC, internet, VPN, peering, endpoints, etc.).
- Internet Gateway (IGW): Enables internet connectivity for resources that have public IPs and proper routes.
- NAT Gateway: Enables outbound internet access for private subnets (without inbound internet exposure).
- Egress-only Internet Gateway: IPv6-only outbound internet gateway for private IPv6 resources.
- Security groups: Stateful instance/ENI-level firewall rules.
- Network ACLs (NACLs): Stateless subnet-level rules.
- VPC endpoints: Private connectivity to AWS services (gateway endpoints and interface endpoints/PrivateLink).
- Elastic Network Interfaces (ENIs): Virtual network cards attached to resources (not only EC2).
- DHCP options / DNS settings: Control DNS resolution and hostname behavior.
- Flow logs: Capture metadata about network traffic for auditing/troubleshooting.
Service type
Amazon VPC is a foundational networking control-plane service. You define virtual networking constructs; workloads (EC2, RDS, EKS, etc.) attach to them.
Scope (regional/global/zonal)
- VPCs are regional: A VPC exists within one AWS Region.
- Subnets are zonal: Each subnet maps to exactly one Availability Zone.
- Many VPC constructs are regional (e.g., route tables, IGWs), but are used to build multi-AZ architectures.
How it fits into the AWS ecosystem
Amazon VPC is the default network boundary for most AWS compute and data services:
- Amazon EC2, Amazon EKS, Amazon ECS, AWS Lambda (VPC-enabled) attach to subnets/ENIs.
- Amazon RDS, Amazon ElastiCache, Amazon OpenSearch Service typically live in private subnets.
- AWS PrivateLink (VPC endpoints) enables private access to AWS services (and many partner/SaaS services).
- AWS Transit Gateway and AWS Direct Connect integrate to connect multiple VPCs and on-premises networks.
3. Why use Amazon VPC?
Business reasons
- Environment isolation: Separate business units, applications, or environments with clear boundaries.
- Faster delivery with governance: Standardized VPC patterns enable teams to launch workloads safely and quickly.
- Hybrid readiness: VPC designs can accommodate VPN/Direct Connect connectivity to on-premises.
- Vendor ecosystem: Many enterprise network/security patterns are natively supported (firewalls, segmentation, logging).
Technical reasons
- Network segmentation: Use subnets and routing to separate public, private, and restricted tiers.
- Controlled internet access: Enable/disable inbound and outbound paths with gateways and routes.
- Private service access: Use endpoints/PrivateLink to keep traffic off the public internet.
- Scalable connectivity: Expand address space and connect multiple VPCs using hub-and-spoke patterns.
Operational reasons
- Repeatable architectures: Standard VPC modules (Terraform/CloudFormation) reduce drift.
- Observability: Flow logs and AWS CloudTrail provide auditability; Reachability Analyzer helps troubleshooting.
- Change management: Central network teams can manage shared VPCs and enforce policies.
Security/compliance reasons
- Least-privilege networking: Security groups and NACLs can restrict traffic to only what is needed.
- Reduced exposure: Keep databases and internal services in private subnets with no direct internet route.
- Audit trails: Track API changes via CloudTrail; record traffic metadata using Flow Logs.
- Regulatory alignment: Supports segmentation and logging patterns common in compliance frameworks (verify exact compliance requirements with your auditor and AWS Artifact docs).
Scalability/performance reasons
- Multi-AZ design: Spread workloads across AZs for resilience and throughput.
- High-bandwidth AWS backbone: Intra-region traffic can stay on AWS’s private network; PrivateLink can avoid internet paths.
- Elastic growth: Add subnets and routing as applications grow.
When teams should choose it
Choose Amazon VPC when you need:
- Any non-trivial AWS workload with network segmentation/security needs
- Private subnets for databases or internal services
- Hybrid connectivity (VPN/Direct Connect)
- Centralized governance over routing and access
When they should not choose it (or when to keep it minimal)
- If you only use AWS services that do not require a VPC (some managed services are accessed over public endpoints and can be secured with IAM and service-level controls), you may not need to design one yourself. Even so, most production environments end up using VPC.
- If your use case is purely edge delivery, consider Amazon CloudFront (same “Networking and content delivery” category family) and keep VPC only for origins/workloads.
- If you’re over-segmenting too early: overly complex VPC designs can slow delivery and increase operational risk.
4. Where is Amazon VPC used?
Industries
Amazon VPC is used broadly across:
- Financial services (segmentation, audit, hybrid connectivity)
- Healthcare/life sciences (access control, logging, data residency patterns)
- SaaS and internet companies (multi-tenant patterns, private service connectivity)
- Media/gaming (multi-AZ throughput, controlled ingress/egress)
- Manufacturing/IoT (hybrid connectivity to plants and devices)
Team types
- Platform engineering teams building standard “landing zones”
- Cloud/network engineering teams owning routing, IPAM, and connectivity
- DevOps/SRE teams deploying services into standardized VPCs
- Security teams defining segmentation, inspection, and logging requirements
Workloads
- 2-tier/3-tier web applications
- Microservices on EKS/ECS
- Databases (RDS/Aurora) in private subnets
- Batch and analytics workers with controlled egress
- Internal tools reachable only via VPN/SSO/bastion/zero-trust access
Architectures
- Public/private subnet architectures
- Hub-and-spoke (Transit Gateway)
- Shared services VPC (central DNS, directory services, CI/CD)
- Multi-account landing zone architectures (AWS Organizations + RAM sharing)
Real-world deployment contexts
- Single account: small teams, simpler governance
- Multi-account: enterprise governance, blast-radius reduction, delegated admin
- Hybrid: on-premises core network with AWS VPC extensions
Production vs dev/test usage
- Production: multi-AZ subnets, strict routing, centralized logging, least-privilege security groups, private endpoints, egress control, change control.
- Dev/test: smaller CIDRs, fewer subnets/AZs, simplified routing; still benefit from templates and guardrails to match production patterns.
5. Top Use Cases and Scenarios
Below are realistic scenarios where Amazon VPC is the central enabler.
1) Public web app with private database
- Problem: Expose a web tier to users but keep the database unreachable from the internet.
- Why Amazon VPC fits: Public subnets for load balancers/web; private subnets for DB; security groups restrict DB access.
- Example: ALB in public subnets → EC2/ECS in private subnets → RDS in isolated private subnets.
2) Microservices platform on Amazon EKS
- Problem: Run many services with controlled east-west traffic and safe egress.
- Why it fits: Subnets per AZ, security groups, endpoints, optional network firewall insertion patterns.
- Example: EKS nodes in private subnets, NAT for controlled outbound, VPC endpoints for ECR/S3/CloudWatch.
3) Hybrid connectivity to on-premises
- Problem: Connect AWS workloads to corporate data centers.
- Why it fits: Integrates with Site-to-Site VPN and Direct Connect; route propagation and segmentation.
- Example: On-prem → VPN/Direct Connect → Transit Gateway → multiple VPCs.
4) Multi-account shared network model
- Problem: Multiple teams need VPCs but networking must be centrally governed.
- Why it fits: VPC sharing via AWS Resource Access Manager (RAM) and centralized inspection.
- Example: Central network account shares subnets; workload accounts deploy EC2/EKS into shared subnets.
5) Private access to AWS services (no public internet)
- Problem: Prevent workloads from using the public internet while still accessing AWS APIs.
- Why it fits: VPC endpoints/PrivateLink; gateway endpoints for S3/DynamoDB; interface endpoints for many AWS services.
- Example: Private subnets without IGW/NAT; interface endpoints for SSM/CloudWatch; gateway endpoint for S3.
6) Multi-tier segmentation for compliance
- Problem: Separate regulated data systems from general application tiers.
- Why it fits: Layered subnets, route control, NACLs, security groups, centralized logging and inspection.
- Example: “Restricted” subnets with no egress except to specific inspection appliances/endpoints.
7) Controlled outbound (egress) with auditing
- Problem: Applications must reach the internet only to approved destinations and all egress must be logged.
- Why it fits: NAT Gateway + routing + security tools; Flow Logs for metadata.
- Example: Private subnets route 0.0.0.0/0 to NAT; combine with DNS filtering/proxy or firewall (verify exact product choices per requirement).
8) SaaS private connectivity using PrivateLink
- Problem: Connect to a SaaS provider without internet exposure.
- Why it fits: Interface endpoints support private connectivity to endpoint services.
- Example: Consumer VPC creates an interface endpoint to a partner’s PrivateLink endpoint service.
9) Blue/green network cutovers
- Problem: Migrate applications with minimal downtime.
- Why it fits: Parallel subnets/routes and controlled switching (often with load balancers and DNS).
- Example: Build new private subnets and route tables; gradually shift traffic via ALB target groups and DNS.
10) Centralized DNS and shared services
- Problem: Many VPCs need consistent DNS resolution for internal domains.
- Why it fits: Integrates with Route 53 Resolver endpoints and shared architectures.
- Example: Shared services VPC hosts Resolver endpoints; other VPCs forward queries to on-prem DNS.
11) IP address management at scale (IPAM)
- Problem: Avoid CIDR conflicts and manage IP allocations across multiple regions/accounts.
- Why it fits: VPC IP Address Manager (IPAM) provides governance, planning, and visibility.
- Example: Enterprise IPAM pools enforce standardized CIDR allocation.
12) Security inspection VPC with traffic steering
- Problem: Insert inspection (firewall/IDS) into network paths.
- Why it fits: Routing patterns with Transit Gateway and appliance subnets enable inspection architectures.
- Example: Spoke VPCs route egress to Transit Gateway → inspection VPC → NAT/IGW.
6. Core Features
VPC and CIDR management (IPv4/IPv6)
- What it does: Lets you define IPv4 address ranges (CIDR blocks) and optionally IPv6.
- Why it matters: Your CIDR plan drives scalability, segmentation, and hybrid compatibility.
- Practical benefit: Predictable IP allocation for subnets and workloads.
- Caveats: CIDR sizing mistakes are hard to unwind later. Plan for growth and mergers. Verify current CIDR association limits in Service Quotas.
Subnets across Availability Zones
- What it does: Divides your VPC into subnets, each mapped to a single AZ.
- Why it matters: AZ mapping is fundamental to high availability.
- Practical benefit: Place redundant instances across AZs while keeping routing/security consistent.
- Caveats: Subnets do not span AZs. Design per-AZ subnet pairs (public/private) for resilient architectures.
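The per-AZ pairing described above can be sketched with Python's standard `ipaddress` module. This is a planning aid only, not an AWS API: the function name `plan_subnet_pairs`, the /24 subnet size, and the AZ names are illustrative assumptions, and the sequential carving order is just one possible allocation scheme.

```python
import ipaddress


def plan_subnet_pairs(vpc_cidr, az_names, new_prefix=24):
    """Assign one public and one private subnet CIDR to each AZ.

    Blocks are carved sequentially from the VPC CIDR: the first two go to
    the first AZ, the next two to the second AZ, and so on.
    (Hypothetical helper for planning; it does not call AWS.)
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of equal-sized blocks
    plan = {}
    for az in az_names:
        try:
            plan[az] = {"public": str(next(blocks)), "private": str(next(blocks))}
        except StopIteration:
            raise ValueError(
                f"{vpc_cidr} is too small for {len(az_names)} AZ pairs"
            ) from None
    return plan


# Assumed inputs: an example VPC CIDR and two example AZ names.
plan = plan_subnet_pairs("10.20.0.0/16", ["us-east-1a", "us-east-1b"])
for az, tiers in plan.items():
    print(az, tiers)
```

Because each subnet lives in exactly one AZ, the plan keys subnets by AZ name; in a real design you would likely reserve spare blocks per AZ for later tiers rather than allocating strictly in sequence.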
Route tables and routing targets
- What it does: Controls where traffic goes: within VPC (“local”), to IGW, NAT, Transit Gateway, peering, endpoints, etc.
- Why it matters: Routing defines reachability and segmentation.
- Practical benefit: Build patterns like “private subnets with outbound-only internet”.
- Caveats: Overlapping routes and incorrect associations are a common cause of outages. Changes propagate quickly—use change control.
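When several routes match a destination, the most specific one (longest prefix) is used. The sketch below illustrates that selection rule with Python's `ipaddress` module. It is a simplified model, not the actual VPC implementation: it ignores prefix lists and propagated-route priorities, and the route targets (`pcx-EXAMPLE`, `nat-EXAMPLE`) are made-up identifiers.

```python
import ipaddress


def select_route(routes, destination_ip):
    """Return the (cidr, target) pair with the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The most specific route is the one with the largest prefix length.
    network, target = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), target


# Hypothetical route table for a private subnet (target IDs are invented).
routes = [
    ("10.20.0.0/16", "local"),
    ("10.50.0.0/16", "pcx-EXAMPLE"),
    ("0.0.0.0/0", "nat-EXAMPLE"),
]

print(select_route(routes, "10.20.1.15"))     # traffic inside the VPC
print(select_route(routes, "10.50.3.4"))      # traffic to the peered VPC
print(select_route(routes, "93.184.216.34"))  # everything else
```

Reading a route table this way makes it easier to spot an overlapping route that unintentionally captures traffic meant for a broader one.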
Internet Gateway (IGW)
- What it does: Enables internet connectivity for resources in public subnets (with public IPs and correct routes).
- Why it matters: It’s the simplest way to support public inbound/outbound.
- Practical benefit: Public-facing ALBs, bastions (if used), and some workloads need it.
- Caveats: Attaching an IGW doesn’t automatically make things public. You still need routes + public IP + security rules.
NAT Gateway
- What it does: Allows instances in private subnets to initiate outbound internet connections without being reachable from the internet.
- Why it matters: Many workloads need patching, package downloads, or external API calls without inbound exposure.
- Practical benefit: Keeps private subnets private while enabling outbound.
- Caveats: NAT Gateway typically has hourly and per-GB processing charges (region-dependent). Also requires an Elastic IP and is AZ-scoped; for HA, use one per AZ and route accordingly.
Egress-only Internet Gateway (IPv6)
- What it does: Provides outbound-only internet access for IPv6 resources.
- Why it matters: IPv6 addresses are globally routable; you need a way to prevent inbound while allowing outbound.
- Practical benefit: IPv6 adoption with controlled exposure.
- Caveats: IPv6 security posture needs careful security group/NACL design; verify current IPv6 feature behavior in your region.
Security groups (stateful)
- What it does: Firewall rules at the ENI/resource level; return traffic is automatically allowed.
- Why it matters: Primary control for workload-to-workload and inbound access.
- Practical benefit: Strong segmentation without complex subnet ACLs.
- Caveats: Misconfigured security groups are a top cause of outages. Keep rules minimal; use referencing (SG-to-SG) where possible.
Network ACLs (stateless)
- What it does: Subnet-level allow/deny rules; you must explicitly allow return traffic.
- Why it matters: Adds an extra layer of subnet boundary control.
- Practical benefit: Useful for coarse-grained controls (e.g., block known-bad ranges).
- Caveats: Operationally tricky at scale; easy to break return paths. Many teams keep NACLs simple and rely on SGs.
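The "easy to break return paths" caveat follows from how NACLs are evaluated: rules are checked in ascending rule-number order, the first match decides, and anything unmatched is denied. The sketch below models that behavior in plain Python. The `NaclRule` class and `evaluate` function are hypothetical names for illustration, protocol matching is omitted for brevity, and the ephemeral port range 1024-65535 is an assumed, deliberately broad choice.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int     # lower numbers are evaluated first
    cidr: str       # remote CIDR (source for inbound, destination for outbound)
    port_from: int
    port_to: int
    action: str     # "allow" or "deny"


def evaluate(rules, remote_ip, port):
    """First matching rule (by ascending rule number) wins; default is deny."""
    ip = ipaddress.ip_address(remote_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = ip in ipaddress.ip_network(rule.cidr)
        if in_cidr and rule.port_from <= port <= rule.port_to:
            return rule.action
    return "deny"


inbound = [
    NaclRule(90, "203.0.113.0/24", 0, 65535, "deny"),  # block one range first
    NaclRule(100, "0.0.0.0/0", 443, 443, "allow"),     # HTTPS from anywhere
]
# Stateless: replies to clients leave on ephemeral ports and need their own rule.
outbound = [NaclRule(100, "0.0.0.0/0", 1024, 65535, "allow")]

print(evaluate(inbound, "198.51.100.7", 443))      # allow
print(evaluate(inbound, "203.0.113.9", 443))       # deny (rule 90 matches first)
print(evaluate(outbound, "198.51.100.7", 50123))   # allow (return traffic)
print(evaluate([], "198.51.100.7", 50123))         # deny (no matching rule)
```

The last call shows the failure mode: with no outbound rule covering the client's ephemeral port, the inbound request is accepted but the reply is dropped.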
VPC endpoints (AWS PrivateLink and gateway endpoints)
- What it does: Provides private connectivity from your VPC to AWS services:
- Gateway endpoints: route-table targets for supported services (notably S3 and DynamoDB).
- Interface endpoints (PrivateLink): ENIs in your subnets that connect privately to a service.
- Why it matters: Reduce internet exposure and simplify compliance.
- Practical benefit: Private subnets can access AWS APIs without NAT/IGW.
- Caveats: Interface endpoints have hourly and per-GB charges (region-dependent). Also consider DNS, endpoint policies, and security groups.
VPC peering
- What it does: Private routing connectivity between two VPCs.
- Why it matters: Simple point-to-point connectivity for small-scale network graphs.
- Practical benefit: Low operational overhead for a few VPCs.
- Caveats: No transitive routing. For many VPCs, Transit Gateway is usually more scalable.
DHCP options and VPC DNS settings
- What it does: Controls DNS resolution/hostnames and DHCP-provided options.
- Why it matters: DNS is foundational to application reliability.
- Practical benefit: Consistent domain suffix and resolver behavior.
- Caveats: DNS settings impact many services. Test changes carefully.
Elastic Network Interfaces (ENIs) and secondary IPs
- What it does: ENIs attach networking to instances and some managed services; support multiple IPs.
- Why it matters: Enables advanced networking patterns (multiple NICs, appliances, IP failover designs).
- Practical benefit: Flexible architecture for inspection or multi-homed workloads.
- Caveats: ENI limits depend on instance type; check quotas/instance docs.
VPC Flow Logs
- What it does: Captures traffic metadata (source/destination, ports, action, etc.) for ENIs, subnets, or VPCs.
- Why it matters: Essential for troubleshooting and audit trails.
- Practical benefit: Detect unexpected traffic, confirm security group behavior, support incident response.
- Caveats: Flow logs are metadata, not packet payloads. Costs depend on destination (CloudWatch Logs/S3/Kinesis) and volume.
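To make the "metadata, not payloads" point concrete, here is a minimal parser for a flow log record in the default (version 2) space-separated format. The function name `parse_flow_log_record` is hypothetical, the sample record is invented (documentation IP ranges, fake account and ENI IDs), and custom log formats with additional fields are not handled.

```python
# Field order of the default (version 2) flow log format.
DEFAULT_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]
INT_FIELDS = {
    "version", "srcport", "dstport", "protocol",
    "packets", "bytes", "start", "end",
}


def parse_flow_log_record(line):
    """Split one default-format record into a dict, converting numeric fields."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError(f"expected {len(DEFAULT_FIELDS)} fields, got {len(values)}")
    record = dict(zip(DEFAULT_FIELDS, values))
    for name in INT_FIELDS:
        # NODATA/SKIPDATA records use "-" for fields with no data.
        record[name] = None if record[name] == "-" else int(record[name])
    return record


# Made-up sample record for illustration only.
sample = (
    "2 123456789012 eni-0123456789abcdef0 198.51.100.7 10.20.1.15 "
    "50123 443 6 12 3480 1700000000 1700000060 ACCEPT OK"
)
rec = parse_flow_log_record(sample)
print(rec["srcaddr"], "->", rec["dstaddr"], rec["dstport"], rec["action"])
```

Every field is an address, port, counter, timestamp, or status; nothing in the record contains application data.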
Traffic Mirroring
- What it does: Mirrors network traffic from ENIs to security appliances for deep inspection.
- Why it matters: Useful for IDS/IPS and advanced threat detection.
- Practical benefit: Packet-level visibility for selected traffic.
- Caveats: Additional cost and operational complexity; ensure privacy/compliance approvals.
Reachability Analyzer (connectivity troubleshooting)
- What it does: Analyzes network paths between source and destination and explains why traffic is or isn’t reachable.
- Why it matters: Faster troubleshooting than manual route/SG/NACL inspection.
- Practical benefit: Pinpoints misconfigured routes/security groups.
- Caveats: Scope and pricing/availability can evolve—verify current details in official docs for your region.
VPC IP Address Manager (IPAM)
- What it does: Helps plan, allocate, and track IP space across accounts/regions.
- Why it matters: Prevents CIDR overlap and enables structured growth.
- Practical benefit: Governance for large enterprises and multi-region deployments.
- Caveats: Typically has its own pricing. Adopt when IP sprawl becomes a real risk.
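Before adopting IPAM, a small script can already catch the overlap problem it is designed to prevent. The sketch below is not the IPAM API; it only checks a hand-maintained inventory using Python's `ipaddress` module. The function name `find_overlaps` and all network names and CIDRs are invented for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(allocations):
    """Return sorted pairs of names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in allocations.items()}
    return sorted(
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        # Only compare blocks of the same IP version.
        if nets[a].version == nets[b].version and nets[a].overlaps(nets[b])
    )


# Hypothetical inventory of address allocations.
allocations = {
    "prod-vpc": "10.20.0.0/16",
    "dev-vpc": "10.21.0.0/16",
    "acquired-co-vpc": "10.20.128.0/17",
    "on-prem-dc": "192.168.0.0/16",
}
print(find_overlaps(allocations))
```

Here the acquired company's range sits inside the production VPC's range, which is the kind of conflict that blocks peering and complicates hybrid routing later.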
7. Architecture and How It Works
Service architecture at a high level
Amazon VPC is primarily a control plane where you define:
- Address space (CIDR)
- Segmentation (subnets)
- Reachability (route tables and gateways)
- Security policy (security groups and NACLs)
- Observability (flow logs)
Your workloads attach to the VPC through ENIs. Data-plane traffic flows according to routes and security rules. Many managed AWS services create ENIs in your subnets (for example, RDS in a DB subnet group, interface endpoints, and some VPC-enabled Lambda functions).
Request/data/control flow
- Control plane: You use the AWS Management Console, AWS CLI, SDKs, CloudFormation, or Terraform to create and modify VPC resources. These API calls are logged in AWS CloudTrail.
- Data plane: Packets flow between ENIs according to:
  1. Local VPC routing
  2. Subnet route table rules (including gateway/endpoint targets)
  3. Security group and NACL evaluation
  4. Gateway/attachment behavior (IGW, NAT, peering, Transit Gateway, VPN)
Integrations with related services
Common integrations include:
- Amazon EC2: Instances and ENIs live in subnets.
- Elastic Load Balancing (ALB/NLB): Load balancers are deployed in subnets (typically public for internet-facing).
- Amazon RDS/Aurora: Database subnet groups use private subnets; access controlled via security groups.
- Amazon EKS/ECS: Nodes and tasks/pods use VPC networking.
- AWS Transit Gateway: Central routing hub for many VPCs and VPN/DX attachments.
- VPC endpoints/PrivateLink: Private access to AWS services and third parties.
- Amazon Route 53 Resolver: Inbound/outbound endpoints for hybrid DNS.
- AWS Network Firewall (separate service): Often used with VPC routing for inspection patterns.
- Amazon CloudWatch / CloudWatch Logs: Flow logs and operational telemetry.
- AWS Config: Configuration governance and drift detection.
- AWS Organizations / Control Tower: Multi-account governance (not required, but common in enterprises).
Dependency services
Amazon VPC itself is a base AWS service, but practical deployments often depend on:
- IAM (permissions, instance roles)
- CloudTrail (audit)
- CloudWatch Logs/S3 (for flow logs storage)
- EC2 (to test connectivity)
- Optional: Transit Gateway, Route 53, Network Firewall, Direct Connect, VPN
Security/authentication model
- API access: IAM policies govern who can create/modify VPC resources (ec2:* actions for VPC-related APIs).
- Network access: Enforced at multiple layers:
- Security groups (stateful)
- NACLs (stateless)
- Route tables (reachability)
- Optional centralized inspection (Network Firewall, appliances)
Networking model (how packets are decided)
At a high level:
1. A resource sends traffic from its ENI.
2. Security group egress rules must permit it.
3. Subnet route table determines next hop:
   - Another subnet (local)
   - IGW (internet)
   - NAT Gateway (private outbound)
   - VPC endpoint (private AWS service access)
   - Transit Gateway / peering / VPN attachment
4. NACL rules are evaluated at subnet boundary (inbound/outbound).
5. Return traffic is allowed automatically for security groups but must be explicitly allowed for NACLs.
Monitoring/logging/governance considerations
- CloudTrail: Logs VPC API changes (who changed routes, SGs, etc.).
- VPC Flow Logs: Logs traffic metadata to CloudWatch Logs/S3/Kinesis.
- Reachability Analyzer: Helps confirm why a path is reachable/unreachable.
- AWS Config: Track configuration changes and enforce rules (e.g., “no 0.0.0.0/0 to SSH”).
- Tagging: Essential for cost allocation and ops ownership (VPC, subnets, route tables, gateways, endpoints).
Simple architecture diagram (Mermaid)
flowchart TB
Internet((Internet))
IGW[Internet Gateway]
VPC["VPC (Region)"]
PubSubnet["Public Subnet (AZ-a)"]
EC2Web[EC2 Instance]
RT["Route Table: 0.0.0.0/0 -> IGW"]
SG[Security Group]
Internet --- IGW --- VPC
VPC --- PubSubnet --- EC2Web
PubSubnet --- RT
EC2Web --- SG
Production-style architecture diagram (Mermaid)
flowchart LR
Users((Users))
Internet((Internet))
ALB["ALB (Public)"]
VPC["VPC (Region)"]
PubA[Public Subnet AZ-a]
PubB[Public Subnet AZ-b]
PrivA[Private App Subnet AZ-a]
PrivB[Private App Subnet AZ-b]
DataA[Private DB Subnet AZ-a]
DataB[Private DB Subnet AZ-b]
NATa[NAT GW AZ-a]
NATb[NAT GW AZ-b]
IGW[Internet Gateway]
AppA[ECS/EKS/EC2 App]
AppB[ECS/EKS/EC2 App]
RDS[(RDS/Aurora)]
S3[S3]
VPCEndpoint["Gateway Endpoint (S3)"]
FlowLogs["VPC Flow Logs -> CloudWatch/S3"]
Users --> Internet --> ALB
Internet --> IGW --> VPC
VPC --- PubA --- ALB
VPC --- PubB --- ALB
PubA --> NATa --> IGW
PubB --> NATb --> IGW
ALB --> AppA
ALB --> AppB
AppA --> RDS
AppB --> RDS
AppA --> VPCEndpoint --> S3
AppB --> VPCEndpoint --> S3
VPC --> FlowLogs
VPC --- PrivA --- AppA
VPC --- PrivB --- AppB
VPC --- DataA --- RDS
VPC --- DataB --- RDS
8. Prerequisites
Account requirements
- An active AWS account with billing enabled.
- Ability to create IAM roles and networking resources.
Permissions / IAM roles
Minimum recommended permissions for the lab:
- VPC and EC2: ec2:CreateVpc, ec2:CreateSubnet, ec2:CreateRouteTable, ec2:CreateRoute, ec2:CreateInternetGateway, ec2:AttachInternetGateway, ec2:RunInstances, ec2:CreateSecurityGroup, ec2:AuthorizeSecurityGroupEgress, etc.
- IAM for instance role: iam:CreateRole, iam:AttachRolePolicy, iam:CreateInstanceProfile, iam:AddRoleToInstanceProfile, plus iam:PassRole so the instance can be launched with that role
- Flow logs to CloudWatch: permissions to create log groups and roles/policies, plus ec2:CreateFlowLogs (and iam:PassRole for the flow logs delivery role)
If you are in an enterprise environment, these may be controlled by platform teams. Use least privilege and follow your organization’s change process.
Billing requirements
- Amazon VPC itself has no “VPC creation” charge, but the lab uses:
- Amazon EC2 instance (compute)
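- Potential CloudWatch Logs ingestion/storage for flow logs
- A public IPv4 address on the instance (AWS charges hourly for public IPv4 addresses; verify the current rate on the VPC pricing page)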
- If you enable NAT gateways or interface endpoints, costs can increase quickly (we keep them optional).
CLI/SDK/tools needed
Choose one:
- AWS Management Console (web browser), or
- AWS CLI v2 (recommended for repeatability): https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
Optional but useful:
- AWS CloudShell (browser-based shell with AWS CLI preinstalled)
- SSH client (not required if using AWS Systems Manager Session Manager)
Region availability
Amazon VPC is available in all standard AWS regions. Specific features (or quotas) can vary by region—verify in official docs if you are using newer features (IPAM, certain endpoints, etc.).
Quotas/limits
Amazon VPC has multiple quotas (VPCs per region, subnets per VPC, route tables, routes, security group rules, etc.).
- Check Service Quotas in the AWS console for authoritative values in your account/region: Service Quotas → Amazon Virtual Private Cloud (VPC)
- Some quotas can be increased by request.
Prerequisite services
For the hands-on lab we will use:
- Amazon EC2
- AWS IAM
- Amazon CloudWatch Logs
- AWS Systems Manager (Session Manager) for instance access (uses SSM agent and IAM role)
9. Pricing / Cost
Pricing model (accurate, usage-based)
Amazon VPC has a mixed cost model:
- Many VPC constructs are free to create (VPC, subnets, route tables, security groups, NACLs, IGW attachment).
- Costs typically come from data processing, hourly-managed networking components, and logging.
Always confirm in the official pricing pages because pricing is region-dependent and changes over time.
Official pricing starting points:
- Amazon VPC pricing: https://aws.amazon.com/vpc/pricing/
- Data transfer pricing: https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer
- AWS Pricing Calculator: https://calculator.aws/
Key pricing dimensions
Common cost dimensions in VPC-centered architectures:
- NAT Gateway: Typically charged per hour and per GB processed. Often one NAT Gateway per AZ for high availability.
- VPC interface endpoints (PrivateLink): Typically charged per endpoint-hour and per GB processed. You may need multiple endpoints across AZs/subnets for resilience.
- VPC gateway endpoints: Commonly used for S3/DynamoDB and generally do not have the same hourly pricing model as interface endpoints; verify current pricing and any data processing considerations in official docs.
- Site-to-Site VPN: Typically charged per VPN connection-hour and data transfer.
- Traffic Mirroring: Charges can apply (data processing/collection). Verify current rates.
- VPC Flow Logs: The cost depends mainly on the destination:
  - CloudWatch Logs ingestion + storage + queries (Logs Insights)
  - S3 storage and retrieval
  - Kinesis Data Firehose ingestion/delivery
  - High-traffic environments can generate large log volumes.
- Inter-AZ and inter-region data transfer: Architectures that spread traffic across AZs can incur inter-AZ data charges (depending on traffic direction and service). Inter-region traffic is typically charged and can be significant.
Free tier considerations
- AWS Free Tier generally applies to certain EC2 usage and other services, not “Amazon VPC” itself.
- Some labs can fit in Free Tier, but networking add-ons (NAT, endpoints, logs) are often not fully covered. Verify Free Tier details for your account: https://aws.amazon.com/free/
Cost drivers (what usually surprises teams)
- NAT Gateway per-GB processing: frequent large downloads (patching, container pulls) can be costly.
- Interface endpoints sprawl: many endpoints across many VPCs/AZs becomes an hourly bill line item.
- CloudWatch Logs ingestion: flow logs at scale can be expensive without sampling/filters or S3-based pipelines.
- Cross-AZ traffic: chatty microservices or misbalanced load can create cross-AZ transfer costs.
- Egress to the internet: outbound traffic charges add up for media-heavy apps or data exports.
How to optimize cost (practical tactics)
- Minimize NAT usage:
- Prefer VPC endpoints for AWS services (S3, DynamoDB, ECR, CloudWatch, SSM endpoints as needed).
- Keep workloads that need heavy internet egress in public subnets only if security posture allows (often it doesn’t).
- Control flow logs:
- Enable where needed (critical subnets/ENIs), not everywhere by default.
- Consider S3 for long-term retention and use lifecycle policies.
- Design for AZ locality:
- Keep dependent tiers in the same AZ when possible and safe (while maintaining HA).
- Consolidate endpoints:
- In multi-account environments, consider shared endpoint patterns (when supported) and standardized endpoint sets.
- Tag everything:
- Use cost allocation tags for NAT, endpoints, flow logs, and gateways.
Example low-cost starter estimate (conceptual)
A minimal learning setup can be low cost if you:
- Create a VPC + subnet + IGW (no direct cost)
- Run a single small EC2 instance briefly (compute cost)
- Enable limited VPC Flow Logs for a short time (CloudWatch Logs ingestion/storage)
Because exact prices vary by region and change over time, use the AWS Pricing Calculator for your region and expected runtime.
Example production cost considerations
In production, typical recurring networking costs often include:
- NAT Gateways (per AZ) + data processing
- Interface endpoints (multiple services × AZs)
- Flow logs at VPC/subnet/ENI scope + retention
- VPN/Direct Connect port charges (for hybrid)
- Inter-AZ data transfer due to load balancing or service-to-service traffic
For a production estimate, model:
- Number of AZs
- Expected egress GB/month (internet, endpoints, NAT)
- Expected log GB/day
- Endpoint count per region
- Hybrid bandwidth and uptime
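The modeling inputs above can be turned into a rough calculator. The sketch below is only a starting point under loud assumptions: every rate passed in is a made-up placeholder (not an AWS price), a month is approximated as 730 hours, interface endpoints are counted once per service per AZ, and data transfer, VPN, and Direct Connect charges are left out. The names `Rates` and `estimate_monthly_cost` are hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation for monthly estimates


@dataclass
class Rates:
    """Unit prices in your currency. Fill in from the official pricing pages."""
    nat_hour: float
    nat_per_gb: float
    endpoint_hour: float
    endpoint_per_gb: float
    log_ingest_per_gb: float


def estimate_monthly_cost(rates, azs, nat_gb, endpoint_services,
                          endpoint_gb, log_gb_per_day):
    """Return a per-line-item monthly estimate plus a total."""
    items = {
        "nat_hourly": azs * HOURS_PER_MONTH * rates.nat_hour,
        "nat_processing": nat_gb * rates.nat_per_gb,
        # Assumes one interface endpoint per service in each AZ.
        "endpoint_hourly": endpoint_services * azs * HOURS_PER_MONTH * rates.endpoint_hour,
        "endpoint_processing": endpoint_gb * rates.endpoint_per_gb,
        "flow_log_ingestion": log_gb_per_day * 30 * rates.log_ingest_per_gb,
    }
    items["total"] = sum(items.values())
    return items


# PLACEHOLDER rates for illustration only. These are NOT real AWS prices.
placeholder = Rates(nat_hour=0.05, nat_per_gb=0.05, endpoint_hour=0.01,
                    endpoint_per_gb=0.01, log_ingest_per_gb=0.5)
estimate = estimate_monthly_cost(placeholder, azs=2, nat_gb=1000,
                                 endpoint_services=4, endpoint_gb=500,
                                 log_gb_per_day=10)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

Keeping the line items separate makes it easy to see which dimension dominates as you change the number of AZs or the log volume; use the AWS Pricing Calculator for actual figures.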
10. Step-by-Step Hands-On Tutorial
Objective
Build a basic Amazon VPC with a public subnet and internet routing, launch a small EC2 instance, connect securely using AWS Systems Manager Session Manager (no inbound SSH), enable VPC Flow Logs to CloudWatch Logs, generate traffic, and validate that logs are captured.
This lab teaches:
- VPC + subnet + route table + Internet Gateway fundamentals
- Security group basics (no inbound required for SSM)
- VPC Flow Logs for visibility
- Safe cleanup to avoid ongoing charges
Lab Overview
You will create:
- 1 VPC (IPv4 CIDR)
- 1 public subnet in one AZ
- 1 Internet Gateway and route table association
- 1 security group (minimal)
- 1 EC2 instance with an IAM role for SSM
- 1 CloudWatch log group and VPC Flow Log
You will verify:
- Instance has internet egress (via IGW route and public IP)
- You can connect through Session Manager
- Flow logs show traffic records
Cost note: This lab avoids NAT Gateway and interface endpoints to keep costs low. EC2 and CloudWatch Logs may still incur charges.
Step 1: Choose region, set naming, and confirm identity
If using AWS CLI (recommended), open CloudShell or your terminal and run:
aws sts get-caller-identity
aws configure get region
Set a region explicitly if needed:
export AWS_REGION="us-east-1" # change to your preferred region
Define a naming prefix for resources:
export PREFIX="vpc-lab"
Expected outcome – You see your AWS account ID and principal. – You have a region selected.
Step 2: Create the VPC and enable DNS
Create a VPC (example CIDR 10.20.0.0/16):
export VPC_CIDR="10.20.0.0/16"
VPC_ID=$(aws ec2 create-vpc \
--region "$AWS_REGION" \
--cidr-block "$VPC_CIDR" \
--tag-specifications "ResourceType=vpc,Tags=[{Key=Name,Value=${PREFIX}-vpc}]" \
--query "Vpc.VpcId" --output text)
echo "VPC_ID=$VPC_ID"
Enable DNS support and DNS hostnames (helps with instance DNS names):
aws ec2 modify-vpc-attribute --region "$AWS_REGION" --vpc-id "$VPC_ID" --enable-dns-support
aws ec2 modify-vpc-attribute --region "$AWS_REGION" --vpc-id "$VPC_ID" --enable-dns-hostnames
Expected outcome – A new VPC exists and has DNS enabled.
Verification
aws ec2 describe-vpcs --region "$AWS_REGION" --vpc-ids "$VPC_ID"
Step 3: Create a public subnet in one Availability Zone
Pick one AZ in your region:
AZ=$(aws ec2 describe-availability-zones --region "$AWS_REGION" \
--query "AvailabilityZones[0].ZoneName" --output text)
echo "AZ=$AZ"
Create a subnet (example 10.20.1.0/24):
SUBNET_ID=$(aws ec2 create-subnet \
--region "$AWS_REGION" \
--vpc-id "$VPC_ID" \
--availability-zone "$AZ" \
--cidr-block "10.20.1.0/24" \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=${PREFIX}-public-${AZ}}]" \
--query "Subnet.SubnetId" --output text)
echo "SUBNET_ID=$SUBNET_ID"
Enable auto-assign public IPv4 addresses for this subnet (so instances get public IPs by default):
aws ec2 modify-subnet-attribute \
--region "$AWS_REGION" \
--subnet-id "$SUBNET_ID" \
--map-public-ip-on-launch
Expected outcome – One subnet exists in a specific AZ and auto-assigns public IPs.
Verification
aws ec2 describe-subnets --region "$AWS_REGION" --subnet-ids "$SUBNET_ID"
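When sizing subnets, keep in mind that AWS reserves five addresses in every subnet (the first four and the last one), so a /24 offers 251 usable addresses rather than 256. A minimal sketch of that arithmetic follows; the function name `usable_ips` is made up for illustration.

```shell
# Usable IPv4 addresses in an AWS subnet of the given prefix length.
# AWS reserves 5 addresses per subnet (the first four and the last).
usable_ips() {
  prefix_len=$1
  echo $(( (1 << (32 - prefix_len)) - 5 ))
}

usable_ips 24   # the lab subnet 10.20.1.0/24 -> prints 251
usable_ips 28   # the smallest subnet size AWS allows -> prints 11
```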
Step 4: Create and attach an Internet Gateway, then add routes
Create an Internet Gateway:
IGW_ID=$(aws ec2 create-internet-gateway \
--region "$AWS_REGION" \
--tag-specifications "ResourceType=internet-gateway,Tags=[{Key=Name,Value=${PREFIX}-igw}]" \
--query "InternetGateway.InternetGatewayId" --output text)
echo "IGW_ID=$IGW_ID"
Attach it to the VPC:
aws ec2 attach-internet-gateway \
--region "$AWS_REGION" \
--internet-gateway-id "$IGW_ID" \
--vpc-id "$VPC_ID"
Create a route table:
RTB_ID=$(aws ec2 create-route-table \
--region "$AWS_REGION" \
--vpc-id "$VPC_ID" \
--tag-specifications "ResourceType=route-table,Tags=[{Key=Name,Value=${PREFIX}-public-rtb}]" \
--query "RouteTable.RouteTableId" --output text)
echo "RTB_ID=$RTB_ID"
Add a default route (0.0.0.0/0) to the IGW:
aws ec2 create-route \
--region "$AWS_REGION" \
--route-table-id "$RTB_ID" \
--destination-cidr-block "0.0.0.0/0" \
--gateway-id "$IGW_ID"
Associate the route table with the subnet:
aws ec2 associate-route-table \
--region "$AWS_REGION" \
--route-table-id "$RTB_ID" \
--subnet-id "$SUBNET_ID"
Expected outcome – The subnet has a route to the internet through the IGW.
Verification
aws ec2 describe-route-tables --region "$AWS_REGION" --route-table-ids "$RTB_ID"
You should see:
– a local route for 10.20.0.0/16
– a 0.0.0.0/0 route to the Internet Gateway
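The same check can be scripted instead of read by eye. The sketch below wraps `describe-route-tables` with a JMESPath query that extracts the target of the `0.0.0.0/0` route; the function names are hypothetical helpers, not part of the AWS CLI.

```shell
# Print the gateway ID that the 0.0.0.0/0 route points to
# (prints "None" if there is no such route or it targets something other than a gateway).
# Usage: default_route_target <region> <route-table-id>
default_route_target() {
  aws ec2 describe-route-tables \
    --region "$1" \
    --route-table-ids "$2" \
    --query "RouteTables[0].Routes[?DestinationCidrBlock=='0.0.0.0/0'].GatewayId | [0]" \
    --output text
}

# Succeed only if the default route targets an Internet Gateway (igw-...).
is_public_route_table() {
  case "$(default_route_target "$1" "$2")" in
    igw-*) return 0 ;;
    *)     return 1 ;;
  esac
}
```

For this lab, `is_public_route_table "$AWS_REGION" "$RTB_ID" && echo "default route goes to an IGW"` should print the message once Step 4 is complete.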
Step 5: Create a security group (minimal inbound)
Create a security group with no inbound rules (SSM does not require inbound access):
SG_ID=$(aws ec2 create-security-group \
--region "$AWS_REGION" \
--group-name "${PREFIX}-sg" \
--description "Minimal SG for SSM lab" \
--vpc-id "$VPC_ID" \
--tag-specifications "ResourceType=security-group,Tags=[{Key=Name,Value=${PREFIX}-sg}]" \
--query "GroupId" --output text)
echo "SG_ID=$SG_ID"
By default, security groups allow all outbound. That’s fine for this lab. In production you often restrict egress.
Expected outcome – You have a security group ready to attach to the instance.
Verification
aws ec2 describe-security-groups --region "$AWS_REGION" --group-ids "$SG_ID"
Step 6: Create an IAM role for Session Manager (SSM)
Create a trust policy file:
cat > /tmp/ec2-trust-policy.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com" },
"Action": "sts:AssumeRole"
}
]
}
EOF
Create the role:
ROLE_NAME="${PREFIX}-ec2-ssm-role"
aws iam create-role \
--role-name "$ROLE_NAME" \
--assume-role-policy-document file:///tmp/ec2-trust-policy.json
Attach the AWS-managed policy required for SSM:
aws iam attach-role-policy \
--role-name "$ROLE_NAME" \
--policy-arn "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
Create an instance profile and add the role:
PROFILE_NAME="${PREFIX}-ec2-profile"
aws iam create-instance-profile --instance-profile-name "$PROFILE_NAME"
aws iam add-role-to-instance-profile --instance-profile-name "$PROFILE_NAME" --role-name "$ROLE_NAME"
Wait briefly for IAM propagation (often needed):
sleep 15
Expected outcome – An EC2 instance profile exists that grants SSM access.
Verification
aws iam get-instance-profile --instance-profile-name "$PROFILE_NAME"
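A fixed sleep is a guess at how long IAM propagation takes. One alternative is a bounded retry around the command that depends on the new profile. The helper below is a generic sketch (the name `retry` and its argument order are assumptions of this example), not something the AWS CLI provides.

```shell
# Retry a command up to <attempts> times, sleeping <delay> seconds between tries.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$(( n + 1 ))
    sleep "$delay"
  done
}
```

A typical place to use it is around the `run-instances` call in Step 7, which is usually where a not-yet-propagated instance profile surfaces as an error.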
Step 7: Launch an EC2 instance in the public subnet
Find a current Amazon Linux AMI via SSM Parameter Store (recommended approach because AMI IDs vary by region). For Amazon Linux 2023, AWS publishes parameters; verify the exact parameter name in official docs if needed.
Common approach:
AMI_ID=$(aws ssm get-parameter \
--region "$AWS_REGION" \
--name "/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64" \
--query "Parameter.Value" --output text)
echo "AMI_ID=$AMI_ID"
Launch a small instance (choose an instance type available to you, e.g., t3.micro):
INSTANCE_ID=$(aws ec2 run-instances \
--region "$AWS_REGION" \
--image-id "$AMI_ID" \
--instance-type "t3.micro" \
--subnet-id "$SUBNET_ID" \
--security-group-ids "$SG_ID" \
--iam-instance-profile Name="$PROFILE_NAME" \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=${PREFIX}-ec2}]" \
--query "Instances[0].InstanceId" --output text)
echo "INSTANCE_ID=$INSTANCE_ID"
Wait for it to become running:
aws ec2 wait instance-running --region "$AWS_REGION" --instance-ids "$INSTANCE_ID"
Expected outcome – EC2 instance is running with a public IP and can reach AWS SSM endpoints via the internet route.
Verification
aws ec2 describe-instances --region "$AWS_REGION" --instance-ids "$INSTANCE_ID" \
--query "Reservations[0].Instances[0].{State:State.Name,PublicIp:PublicIpAddress,Subnet:SubnetId,Vpc:VpcId}"
Step 8: Connect using Session Manager and generate traffic
In the AWS Console:
1. Go to EC2 → Instances
2. Select the instance ${PREFIX}-ec2
3. Click Connect
4. Choose Session Manager
5. Click Connect
Run a few commands to generate outbound traffic:
curl -I https://example.com
curl -I https://aws.amazon.com
(Optional) Confirm your instance has DNS working:
getent hosts example.com
Expected outcome
– You successfully open a shell without SSH.
– curl requests succeed (HTTP headers returned).
If Session Manager doesn’t work, see Troubleshooting below.
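Before digging into the console, a quick CLI check can show whether the instance has registered with Systems Manager at all. The sketch below wraps `aws ssm describe-instance-information`; the function name `ssm_ping_status` is a hypothetical helper.

```shell
# Print the SSM agent ping status for an instance.
# "Online" means the agent is registered and reachable; "None" means the
# instance has not registered with Systems Manager.
# Usage: ssm_ping_status <region> <instance-id>
ssm_ping_status() {
  aws ssm describe-instance-information \
    --region "$1" \
    --filters "Key=InstanceIds,Values=$2" \
    --query "InstanceInformationList[0].PingStatus" \
    --output text
}
```

Run it as `ssm_ping_status "$AWS_REGION" "$INSTANCE_ID"`. Registration can take a few minutes after launch, so a result of `None` right after Step 7 is not necessarily a failure.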
Step 9: Enable VPC Flow Logs to CloudWatch Logs
Create a CloudWatch log group:
LOG_GROUP="/aws/vpc/${PREFIX}-flowlogs"
aws logs create-log-group --region "$AWS_REGION" --log-group-name "$LOG_GROUP" || true
Create an IAM role for flow logs to publish to CloudWatch Logs.
Trust policy:
cat > /tmp/flowlogs-trust-policy.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": { "Service": "vpc-flow-logs.amazonaws.com" },
"Action": "sts:AssumeRole"
}
]
}
EOF
Create the role:
FLOWLOGS_ROLE="${PREFIX}-flowlogs-role"
aws iam create-role \
--role-name "$FLOWLOGS_ROLE" \
--assume-role-policy-document file:///tmp/flowlogs-trust-policy.json
Attach permissions. AWS provides guidance on the required permissions; the policy typically allows logs:CreateLogStream and logs:PutLogEvents on the log group. Create an inline policy:
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
cat > /tmp/flowlogs-policy.json << EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:DescribeLogGroups",
"logs:DescribeLogStreams"
],
"Resource": [
"arn:aws:logs:${AWS_REGION}:${ACCOUNT_ID}:log-group:${LOG_GROUP}:*",
"arn:aws:logs:${AWS_REGION}:${ACCOUNT_ID}:log-group:${LOG_GROUP}"
]
}
]
}
EOF
aws iam put-role-policy \
--role-name "$FLOWLOGS_ROLE" \
--policy-name "${PREFIX}-flowlogs-to-cw" \
--policy-document file:///tmp/flowlogs-policy.json
Create the flow log for the VPC (capture ALL traffic; in production you may choose ACCEPT/REJECT or specific subnets/ENIs):
FLOW_LOG_ID=$(aws ec2 create-flow-logs \
--region "$AWS_REGION" \
--resource-type VPC \
--resource-ids "$VPC_ID" \
--traffic-type ALL \
--log-destination-type cloud-watch-logs \
--log-group-name "$LOG_GROUP" \
--deliver-logs-permission-arn "arn:aws:iam::${ACCOUNT_ID}:role/${FLOWLOGS_ROLE}" \
--query "FlowLogIds[0]" --output text)
echo "FLOW_LOG_ID=$FLOW_LOG_ID"
Expected outcome – Flow logs are enabled and will start delivering records to CloudWatch Logs.
Verification
aws ec2 describe-flow-logs --region "$AWS_REGION" --flow-log-ids "$FLOW_LOG_ID"
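Now generate a bit more traffic from your instance (repeat curl), then wait before checking logs. Flow log records are aggregated before delivery (the default maximum aggregation interval is 10 minutes), so the first records can take 10 minutes or more to appear.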
Step 10: Query Flow Logs in CloudWatch Logs Insights
In the AWS Console:
1. Go to CloudWatch → Logs → Log groups
2. Open the log group: /aws/vpc/vpc-lab-flowlogs (or your prefix)
3. Click Logs Insights
4. Use a query similar to:
fields @timestamp, srcAddr, dstAddr, srcPort, dstPort, protocol, action
| sort @timestamp desc
| limit 50
The exact fields depend on the flow log format/version used by AWS. If fields differ, inspect a raw log event and adjust the query accordingly.
Expected outcome – You can see ACCEPT/REJECT records showing your instance’s outbound connections.
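When you inspect a raw log event, it helps to know the field order. In the default (version 2) format, each record is one space-separated line: version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status. The sketch below summarizes one such line with awk; the function name is hypothetical, the sample values are illustrative, and it assumes the default format (custom formats reorder or add fields).

```shell
# Summarize one flow log record in the default (version 2) format:
# version account-id interface-id srcaddr dstaddr srcport dstport protocol
# packets bytes start end action log-status
summarize_flow_record() {
  echo "$1" | awk '{
    printf "%s:%s -> %s:%s proto=%s bytes=%s %s\n", $4, $6, $5, $7, $8, $10, $13
  }'
}

# Illustrative record: TCP (protocol 6) traffic to port 22 that was accepted
summarize_flow_record "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
# prints: 172.31.16.139:20641 -> 172.31.16.21:22 proto=6 bytes=4249 ACCEPT
```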
Validation
Use this checklist:
- Routing
– Public subnet route table has 0.0.0.0/0 → IGW
- Instance connectivity
– EC2 has a public IP
– curl https://example.com works inside Session Manager
- Flow logs
– CloudWatch log group contains flow log events
– Logs Insights query returns recent traffic entries
Optional CLI checks:
aws ec2 describe-internet-gateways --region "$AWS_REGION" --internet-gateway-ids "$IGW_ID"
aws logs describe-log-streams --region "$AWS_REGION" --log-group-name "$LOG_GROUP" --max-items 5
Troubleshooting
Common issues and fixes:
- Session Manager “Not connected” / instance not showing as managed
– Ensure the instance has the IAM role with AmazonSSMManagedInstanceCore.
– Ensure the instance can reach the internet (IGW route + public IP + outbound SG).
– Confirm SSM Agent is installed/running (Amazon Linux usually includes it).
– Verify your user has permissions for SSM Session Manager (ssm:StartSession, etc.).
- No internet from instance
– Check subnet route table association is correct.
– Confirm 0.0.0.0/0 route targets the IGW.
– Confirm instance has a public IPv4 address (or Elastic IP), and the subnet maps public IPs on launch.
– Check security group outbound rules allow traffic.
– Check NACL isn’t blocking outbound/return traffic.
- Flow logs created but no events
– Wait a few minutes; delivery is not instant.
– Generate traffic again (curl, yum/dnf update metadata).
– Ensure the flow logs role policy includes the correct log group ARN.
– Confirm the log group exists in the same region.
– Check describe-flow-logs for errors in delivery status (if shown).
- CLI AMI parameter not found
– Parameter names can differ by architecture/region.
– Use the AWS console to pick an Amazon Linux AMI, or verify the correct SSM public parameter in official docs.
Cleanup
To avoid ongoing charges, delete resources in reverse order:
1) Terminate EC2 instance:
aws ec2 terminate-instances --region "$AWS_REGION" --instance-ids "$INSTANCE_ID"
aws ec2 wait instance-terminated --region "$AWS_REGION" --instance-ids "$INSTANCE_ID"
2) Delete flow logs:
aws ec2 delete-flow-logs --region "$AWS_REGION" --flow-log-ids "$FLOW_LOG_ID"
3) Delete CloudWatch log group (optional; do this to avoid log storage costs):
aws logs delete-log-group --region "$AWS_REGION" --log-group-name "$LOG_GROUP"
4) Detach and delete Internet Gateway:
aws ec2 detach-internet-gateway --region "$AWS_REGION" --internet-gateway-id "$IGW_ID" --vpc-id "$VPC_ID"
aws ec2 delete-internet-gateway --region "$AWS_REGION" --internet-gateway-id "$IGW_ID"
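5) Delete route table (remove the subnet association created in Step 4 first; a route table that still has associations cannot be deleted):
# Look up the association ID, then disassociate before deleting
ASSOC_ID=$(aws ec2 describe-route-tables --region "$AWS_REGION" --route-table-ids "$RTB_ID" \
--query "RouteTables[0].Associations[0].RouteTableAssociationId" --output text)
aws ec2 disassociate-route-table --region "$AWS_REGION" --association-id "$ASSOC_ID"
aws ec2 delete-route-table --region "$AWS_REGION" --route-table-id "$RTB_ID"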
6) Delete subnet:
aws ec2 delete-subnet --region "$AWS_REGION" --subnet-id "$SUBNET_ID"
7) Delete security group:
aws ec2 delete-security-group --region "$AWS_REGION" --group-id "$SG_ID"
8) Delete VPC:
aws ec2 delete-vpc --region "$AWS_REGION" --vpc-id "$VPC_ID"
9) Delete IAM instance profile and roles:
aws iam remove-role-from-instance-profile --instance-profile-name "$PROFILE_NAME" --role-name "$ROLE_NAME"
aws iam delete-instance-profile --instance-profile-name "$PROFILE_NAME"
aws iam detach-role-policy --role-name "$ROLE_NAME" --policy-arn "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
aws iam delete-role --role-name "$ROLE_NAME"
aws iam delete-role-policy --role-name "$FLOWLOGS_ROLE" --policy-name "${PREFIX}-flowlogs-to-cw"
aws iam delete-role --role-name "$FLOWLOGS_ROLE"
Expected outcome – All lab resources are removed, minimizing the chance of ongoing charges.
11. Best Practices
Architecture best practices
- Use a standard VPC pattern: Most orgs use:
- Public subnets (ingress/egress components like ALB, NAT)
- Private application subnets
- Private data subnets (more restricted)
- Design multi-AZ from the start for production:
- At least two AZs
- Duplicate subnets per AZ
- AZ-aligned NAT Gateways if using NAT
- Avoid overly large blast radius:
- Use multiple VPCs or accounts for strong isolation (especially for prod vs dev).
- Plan IP addressing:
- Avoid overlapping CIDRs if hybrid or multi-VPC connectivity is expected.
- Reserve space for future subnets and expansion.
- Consider IPAM if you operate at enterprise scale.
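The overlap check mentioned above is simple enough to script before allocating a new CIDR. Two IPv4 blocks overlap exactly when they share the same network bits under the shorter of the two prefixes. The sketch below implements that in POSIX shell; both function names are hypothetical, and it assumes well-formed input without leading zeros in octets.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 (true) if two CIDR blocks overlap, 1 otherwise.
# Usage: cidrs_overlap 10.20.0.0/16 10.20.1.0/24
cidrs_overlap() {
  a_ip=${1%/*}; a_len=${1#*/}
  b_ip=${2%/*}; b_len=${2#*/}
  # Compare both networks under the shorter (less specific) prefix.
  len=$a_len
  [ "$b_len" -lt "$len" ] && len=$b_len
  shift_bits=$(( 32 - len ))
  a_net=$(( $(ip_to_int "$a_ip") >> shift_bits ))
  b_net=$(( $(ip_to_int "$b_ip") >> shift_bits ))
  [ "$a_net" -eq "$b_net" ]
}
```

For example, `cidrs_overlap 10.20.0.0/16 10.20.1.0/24` succeeds (the /24 sits inside the /16), while `cidrs_overlap 10.20.0.0/16 10.21.0.0/16` fails, so those two VPCs could be peered.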
IAM/security best practices
- Least privilege for VPC changes:
- Restrict who can modify route tables, IGWs, peering, and security groups.
- Use permission boundaries or separate “network admin” roles.
- Use AWS Config and SCPs (in AWS Organizations) for guardrails:
- Prevent public exposure patterns when not allowed (e.g., restrict IGW creation in certain accounts).
- Security groups over NACL complexity:
- Keep NACLs simple unless you have a specific requirement.
Cost best practices
- Reduce NAT reliance via VPC endpoints where appropriate.
- Right-size logging:
- Enable Flow Logs strategically (critical subnets/ENIs).
- Use retention policies in CloudWatch Logs or lifecycle rules in S3.
- Watch cross-AZ traffic:
- Keep chatty services AZ-local when possible.
- Tag NAT gateways, endpoints, and log groups for cost attribution.
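The retention point above can be applied from the CLI. The sketch below wraps `aws logs put-retention-policy`; the function name `set_log_retention` is a hypothetical helper. CloudWatch Logs accepts only specific retention values (7, 14, and 30 days are among them), so check the allowed list before choosing a number.

```shell
# Set a retention period (in days) on a CloudWatch Logs log group so that
# flow log data expires instead of accumulating indefinitely.
# Usage: set_log_retention <region> <log-group-name> <days>
set_log_retention() {
  aws logs put-retention-policy \
    --region "$1" \
    --log-group-name "$2" \
    --retention-in-days "$3"
}
```

For the lab log group this would be `set_log_retention "$AWS_REGION" "$LOG_GROUP" 7`.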
Performance best practices
- Keep traffic inside the AWS network where possible:
- Use VPC endpoints/PrivateLink instead of internet routes for AWS service access.
- Avoid unnecessary hops:
- Overly complex routing (multiple appliances) can add latency and failure modes.
- Use correct load balancer placement:
- Internet-facing ALBs in public subnets; internal ALBs in private subnets.
Reliability best practices
- No single AZ dependencies:
- If you use NAT, deploy one per AZ and configure route tables accordingly.
- Ensure both subnets and routing exist per AZ.
- Change control for routing:
- Route changes can cause instant widespread outages. Use staged rollouts and review.
Operations best practices
- Centralize observability:
- Use Flow Logs + CloudTrail; aggregate to a security/observability account when feasible.
- Use Reachability Analyzer for troubleshooting:
- Make it part of incident response runbooks.
- Document subnet intent:
- Clearly label (and tag) subnets:
public,private-app,private-db,inspection, etc.
Governance/tagging/naming best practices
Adopt consistent tags:
– Name
– Environment (dev/test/prod)
– Owner or Team
– CostCenter
– DataClassification
– Application
Use consistent names:
– vpc-prod-us-east-1
– subnet-prod-public-use1a
– rtb-prod-private-use1a
12. Security Considerations
Identity and access model
- Amazon VPC resources are controlled by IAM permissions on EC2/VPC APIs.
- Use:
- Separate roles for network admins vs application deployers
- CloudTrail to monitor changes
- AWS Config rules to detect drift and risky configurations
Encryption
- VPC networking itself is about routing and segmentation, not data-at-rest encryption.
- Use:
- TLS for data in transit at application level
- Service-level encryption for storage (EBS, RDS, S3, etc.)
- Site-to-Site VPN (IPsec) or Direct Connect MACsec for hybrid links where applicable (verify service capability and region)
Network exposure
Key security levers:
– Route tables: If a subnet has a default route to IGW and instances have public IPs, it is effectively “public.”
– Security groups: Avoid 0.0.0.0/0 inbound to administrative ports.
– NACLs: Use carefully; they can block or permit broad ranges.
Secrets handling
- Do not store secrets in user data or AMIs.
- Use AWS Secrets Manager or SSM Parameter Store (with encryption) and tight IAM policies.
Audit/logging
- CloudTrail for control-plane changes (routes, SGs, IGW attachments, endpoints).
- VPC Flow Logs for traffic metadata.
- Consider central log retention and access controls.
Compliance considerations
- Segmentation and logging requirements often map to common compliance controls, but the exact mapping depends on your framework.
- Verify required configurations and retention with your compliance team and AWS documentation.
Common security mistakes
- Public subnets accidentally used for databases
- Overly permissive security groups (“allow all” inbound)
- No egress control (malware can call home)
- No flow logs (harder to investigate incidents)
- CIDR overlap preventing future connectivity or forcing risky workarounds
Secure deployment recommendations
- Default to private subnets for compute where possible.
- Use inbound entry points (ALB/API Gateway) rather than exposing instances.
- Prefer SSM Session Manager over SSH/bastions where feasible.
- Use VPC endpoints for AWS services to reduce internet dependency.
- Implement guardrails with Config/SCPs and continuous monitoring.
13. Limitations and Gotchas
Known limitations / quotas
- Amazon VPC has many quotas: VPCs per region, subnets per VPC, routes per route table, SG rules, ENIs per instance type, etc.
- Quotas vary and can change—use Service Quotas as the source of truth.
Regional constraints
- VPC is regional; subnet is AZ-specific.
- Some advanced networking capabilities (or specific endpoint availability) may vary by region—verify in official docs.
Pricing surprises
- NAT Gateway hourly + per-GB processing is a frequent surprise.
- Interface endpoints (PrivateLink) add hourly costs that scale with AZ count and number of services.
- Flow logs to CloudWatch Logs can generate significant ingestion costs at scale.
- Inter-AZ data transfer can become significant in microservices architectures.
Compatibility issues
- CIDR overlap can block:
- VPC peering
- Transit Gateway routing
- Hybrid connectivity
- Some managed services require subnets in multiple AZs for high availability (service-specific).
Operational gotchas
- Security group changes are immediate—good for response, risky for mistakes.
- Route table changes can instantly cut off connectivity for many resources.
- NACL stateless behavior causes “it should work but it doesn’t” scenarios (return traffic blocked).
- DNS settings affect many services; misconfiguration can mimic “network outages.”
Migration challenges
- Expanding or changing CIDR plans after the fact is complex.
- Migrating between VPCs often requires:
- Re-IP or dual-stack changes
- DNS migration
- Load balancer and endpoint updates
- Connectivity changes (peering/TGW)
Vendor-specific nuances
- AWS networking is built around ENIs, security groups, and route tables—similar concepts exist in other clouds, but behavior differs. Avoid assuming Azure/GCP semantics are identical.
14. Comparison with Alternatives
Amazon VPC is the core AWS network boundary, but there are adjacent options for different scopes.
Comparison table
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Amazon VPC | Isolated networking for AWS workloads in a region | Mature primitives (subnets, routes, SGs), deep AWS integration, strong patterns | Requires good IP/routing/security design; can become complex at scale | Any workload needing controlled networking in AWS |
| AWS Transit Gateway | Connecting many VPCs and on-prem networks | Scalable hub-and-spoke routing, centralized control | Added cost and design complexity | When VPC peering becomes unmanageable or you need transitive routing |
| VPC endpoints (PrivateLink) | Private access to AWS services/SaaS | Reduce internet exposure, strong security posture | Interface endpoint costs; endpoint sprawl | For private subnets and compliance-focused architectures |
| Site-to-Site VPN | Encrypted connectivity to on-prem | Fast to set up, good for dev/backup | Internet-based variability, throughput limits | Quick hybrid connectivity or backup to Direct Connect |
| AWS Direct Connect | Dedicated connectivity to AWS | More consistent performance, private connectivity | Lead time, port costs, operational coordination | Enterprise hybrid with predictable bandwidth/latency needs |
| Amazon CloudFront | Content delivery and edge caching | Global edge network, DDoS protection integration | Not a replacement for VPC networking | When serving content globally; VPC hosts origins |
| Azure Virtual Network (VNet) | Networking in Microsoft Azure | Similar constructs, deep Azure integration | Different routing/security semantics | When your workloads are on Azure |
| Google Cloud VPC | Networking in Google Cloud | Global VPC design (provider-specific), GCP integration | Different model than AWS regional VPC | When workloads are on GCP |
| On-prem VLAN/VRF + firewalls | Traditional data center segmentation | Full control of hardware | CapEx, slower provisioning, scaling limits | When workloads stay on-prem or require specialized hardware control |
| OpenStack Neutron (self-managed) | Private cloud networking | Customization, open ecosystem | High operational burden | When running private cloud and you accept the ops cost |
15. Real-World Example
Enterprise example: Multi-account regulated platform with centralized inspection
Problem
A regulated enterprise runs dozens of applications with strict separation between prod and non-prod, requires centralized logging, controlled egress, and hybrid connectivity to on-premises services (identity, SIEM, legacy databases).
Proposed architecture
- AWS Organizations with separate accounts:
– Network account (shared services and connectivity)
– Security/logging account
– Workload accounts per application/team
- Amazon VPC per environment (prod, non-prod), multi-AZ:
– Public subnets: ALBs, NAT Gateways (per AZ)
– Private app subnets: EKS nodes / ECS tasks
– Private data subnets: RDS/Aurora, internal datastores
- Connectivity:
– Transit Gateway as hub
– VPN and/or Direct Connect attachments
- Private access:
– VPC endpoints for S3, ECR, CloudWatch, SSM, etc. (service set depends on workloads)
- Observability/governance:
– VPC Flow Logs to centralized S3/CloudWatch
– CloudTrail org trail
– AWS Config rules + SCP guardrails
Why Amazon VPC was chosen
– Strong isolation boundary and mature primitives
– Integrates with Transit Gateway, endpoints/PrivateLink, and enterprise governance tooling
– Supports security segmentation patterns and auditability
Expected outcomes
– Reduced internet exposure (more private traffic paths)
– Faster onboarding of new workloads via standardized VPC templates
– Easier audits with centralized logs and controlled network change permissions
Startup/small-team example: Two-tier SaaS with minimal ops
Problem
A startup needs a simple, secure network for an API service and a managed database, with minimal administrative overhead and a clear path to scale.
Proposed architecture
- One Amazon VPC in a single region, two AZs
- Public subnets:
– Internet-facing ALB
- Private subnets:
– App tier (ECS on Fargate or EC2)
– Database (RDS) in DB subnet group
- Security:
– Security groups tightly scoped (ALB → app; app → DB)
– No direct instance exposure; use SSM Session Manager if EC2 is used
- Observability:
– Basic VPC Flow Logs on critical subnets
- Cost management:
– Avoid NAT if possible by using endpoints for AWS services and limiting internet egress needs (or use NAT carefully)
Why Amazon VPC was chosen
– It’s the default secure networking boundary for AWS workloads
– Simple public/private subnet architecture supports secure-by-default designs
Expected outcomes
– Clear separation of public ingress from private data
– Straightforward scaling by adding services and AZ capacity
– Reduced operational risk by avoiding ad-hoc networking changes
16. FAQ
1) Is Amazon VPC free?
Creating a VPC, subnets, route tables, security groups, and attaching an Internet Gateway is generally not charged directly. Costs usually come from NAT Gateways, interface endpoints, VPN, traffic mirroring, flow logs destinations, and data transfer. Confirm on the official pricing page: https://aws.amazon.com/vpc/pricing/
2) Is a VPC global?
No. A VPC is regional. Subnets are Availability Zone–specific.
3) What makes a subnet “public”?
A subnet is typically considered public if:
– Its route table has a route to an Internet Gateway (e.g., 0.0.0.0/0 -> igw-...), and
– Instances have public IPs (or Elastic IPs), and
– Security rules allow the traffic.
4) What makes a subnet “private”?
A private subnet has no direct route to an Internet Gateway. It may still have outbound access via NAT Gateway, VPN, Direct Connect, or private endpoints.
5) Do security groups block outbound traffic by default?
No. By default, a new security group allows all outbound traffic and has no inbound rules, so all inbound traffic is denied. You can (and often should) restrict outbound in production.
6) What’s the difference between a security group and a network ACL?
- Security groups: stateful, applied at ENI/resource level.
- NACLs: stateless, applied at subnet level; must allow return traffic explicitly.
7) When should I use NAT Gateway?
Use NAT Gateway when private subnet workloads must reach the public internet (patching, external APIs) but must not accept inbound internet connections. Be mindful of cost and AZ design.
8) How can private subnets access S3 without NAT?
Use a VPC gateway endpoint for S3 and update route tables and policies accordingly. This keeps S3 traffic private within AWS networking.
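As a sketch of what that looks like from the CLI, the function below creates an S3 gateway endpoint and associates it with one route table. The function name is a hypothetical helper, and it assumes the standard `com.amazonaws.<region>.s3` service name used in commercial regions; verify the service name for your partition.

```shell
# Create a gateway endpoint for S3 and associate it with one route table.
# Prints the new endpoint ID.
# Usage: create_s3_gateway_endpoint <region> <vpc-id> <route-table-id>
create_s3_gateway_endpoint() {
  aws ec2 create-vpc-endpoint \
    --region "$1" \
    --vpc-id "$2" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.$1.s3" \
    --route-table-ids "$3" \
    --query "VpcEndpoint.VpcEndpointId" --output text
}
```

AWS adds the S3 prefix-list route to the associated route table automatically; an endpoint policy can then narrow which buckets are reachable.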
9) What is AWS PrivateLink?
AWS PrivateLink uses interface VPC endpoints to provide private connectivity to supported AWS services and third-party endpoint services without exposing traffic to the public internet.
10) Can I connect two VPCs together?
Yes. Common options:
– VPC peering (simple, non-transitive)
– Transit Gateway (scalable, transitive)
– VPN between VPCs (less common)
Choose based on scale and routing needs.
11) What happens if my VPC CIDR overlaps with another network?
Overlapping CIDRs can prevent peering, Transit Gateway routing, and hybrid connectivity. Avoid overlap by planning your IP space (or use IPAM at scale).
12) Are VPC Flow Logs packet captures?
No. Flow logs capture traffic metadata (source and destination addresses and ports, protocol, packet and byte counts, and the ACCEPT/REJECT action), not payload. For packet-level visibility, consider Traffic Mirroring (with cost and compliance review).
13) How do I see who changed a route table or security group?
Use AWS CloudTrail to audit API calls and see the actor, time, and change.
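For a quick look without opening the console, `aws cloudtrail lookup-events` can filter recent management events by API call name. The sketch below is a hypothetical helper built on that command; event names such as `CreateRoute` or `AuthorizeSecurityGroupIngress` are the EC2 API actions you would typically search for.

```shell
# List recent CloudTrail management events for one API call name,
# showing event time, actor, and event name.
# Usage: recent_changes <region> <event-name>
recent_changes() {
  aws cloudtrail lookup-events \
    --region "$1" \
    --lookup-attributes "AttributeKey=EventName,AttributeValue=$2" \
    --max-items 10 \
    --query "Events[].[EventTime,Username,EventName]" \
    --output table
}
```

For example, `recent_changes "$AWS_REGION" CreateRoute` lists who added routes recently. Event history lookup covers roughly the last 90 days; older changes need a trail delivering to S3 or CloudWatch Logs.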
14) Do I need a VPC for AWS Lambda?
Not always. Lambda can run without VPC attachment. If you attach a Lambda function to a VPC, it gets ENIs in your subnets, and you must consider subnet routing and egress carefully.
15) How do I design VPCs for multiple environments?
Common approaches:
– Separate VPCs per environment (and often separate AWS accounts)
– Standardized CIDR blocks and subnet layouts
– Centralized connectivity and logging (for enterprises)
The right approach depends on governance and blast-radius requirements.
16) What is the safest way to access EC2 instances?
Prefer AWS Systems Manager Session Manager (no inbound ports), plus IAM/MFA controls, logging, and least privilege. If SSH is required, restrict by source IP and consider using a dedicated access path.
17) How do I prevent accidental public exposure?
Combine:
– Guardrails (SCPs/Config rules)
– Minimal IAM permissions for route/IGW changes
– Security group standards (no 0.0.0.0/0 to admin ports)
– Automated scanning and continuous monitoring
17. Top Online Resources to Learn Amazon VPC
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | Amazon VPC Docs — https://docs.aws.amazon.com/vpc/ | Authoritative reference for concepts, APIs, and features |
| Official pricing | Amazon VPC Pricing — https://aws.amazon.com/vpc/pricing/ | Explains cost model for endpoints, NAT, VPN, etc. |
| Pricing tool | AWS Pricing Calculator — https://calculator.aws/ | Build region-specific cost estimates |
| Official data transfer pricing | EC2 On-Demand Pricing (Data Transfer section) — https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer | Understand network transfer charges that affect VPC architectures |
| Getting started | Getting started with Amazon VPC — https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html | Good entry point and navigation into subtopics |
| Official workshops | AWS Workshops (search VPC) — https://workshops.aws/ | Hands-on labs maintained by AWS and the community; validate lab freshness |
| Architecture guidance | AWS Architecture Center — https://aws.amazon.com/architecture/ | Reference architectures and best practices (search for networking/VPC patterns) |
| Security best practices | AWS Security Documentation — https://docs.aws.amazon.com/security/ | Broader security patterns including network controls |
| Observability | VPC Flow Logs Docs — https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs.html | Deep dive into flow logs formats, destinations, and use |
| Troubleshooting tool | Reachability Analyzer Docs — https://docs.aws.amazon.com/vpc/latest/reachability/ | Connectivity analysis concepts and workflows |
| Video learning | AWS YouTube Channel — https://www.youtube.com/user/AmazonWebServices | Talks and re:Invent sessions on VPC, PrivateLink, TGW, etc. |
| Reference implementations | AWS Samples on GitHub — https://github.com/aws-samples | Look for VPC, PrivateLink, and networking IaC examples (verify relevance) |
| Official framework | AWS Well-Architected Framework — https://docs.aws.amazon.com/wellarchitected/latest/framework/ | Principles and reviews that influence VPC design decisions |
18. Training and Certification Providers
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | Beginners to experienced engineers | DevOps/cloud fundamentals, AWS networking basics, hands-on practices | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Students and early-career professionals | Software engineering/DevOps learning paths including cloud basics | Check website | https://www.scmgalaxy.com/ |
| CloudOpsNow.in | Cloud/ops practitioners | Cloud operations, infrastructure, and operational readiness | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, DevOps, platform teams | Reliability engineering, operations, observability, incident response | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops and platform teams | AIOps concepts, automation, monitoring-oriented practices | Check website | https://www.aiopsschool.com/ |
19. Top Trainers
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training content (verify current offerings) | Individuals seeking guided training | https://www.rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training and mentoring (verify course catalog) | Beginners to intermediate practitioners | https://www.devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps guidance and services (verify scope) | Teams or individuals needing practical help | https://www.devopsfreelancer.com/ |
| devopssupport.in | DevOps support/training resources (verify scope) | Engineers needing troubleshooting and enablement | https://www.devopssupport.in/ |
20. Top Consulting Companies
| Company | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify exact services) | Platform setup, automation, cloud operations | VPC landing zone design, IaC modules, migration planning | https://cotocus.com/ |
| DevOpsSchool.com | DevOps and cloud consulting/training | Enablement, process, and implementation support | Standard VPC patterns, security group baselines, CI/CD for IaC | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting (verify current scope) | DevOps transformation and ops support | Network governance workflows, observability setup, cost controls | https://www.devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before Amazon VPC
- Basic networking:
  - IPv4 CIDR, subnets, routing, NAT
  - DNS fundamentals
  - TCP/UDP and common ports
- AWS foundations:
  - IAM basics (users/roles/policies)
  - AWS Regions and Availability Zones
  - EC2 basics (instances, security groups)
What to learn after Amazon VPC
- Advanced AWS networking:
  - VPC endpoints/PrivateLink design
  - Transit Gateway architectures
  - Hybrid connectivity (VPN, Direct Connect)
  - Route 53 Resolver and hybrid DNS
- Security:
  - AWS Network Firewall (where appropriate)
  - Zero-trust access patterns (SSM, identity-aware proxies)
  - Centralized logging and threat detection workflows
- Infrastructure as Code:
  - CloudFormation/CDK or Terraform modules for standardized VPCs
- Operations:
  - Flow logs pipelines, CloudWatch Logs Insights, incident runbooks
  - AWS Config compliance-as-code
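As a first step toward the flow logs pipelines mentioned above, the sketch below parses records in the default (version 2) VPC Flow Logs format and counts rejected flows per destination port. The helper names (`parse_flow_log`, `rejected_by_port`) and the sample records are illustrative assumptions, not part of any AWS SDK; custom log formats would need a different field list.

```python
from collections import Counter

# Field order of the default (version 2) flow log record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_flow_log(line):
    """Split one space-separated default-format record into a dict of strings."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))

def rejected_by_port(lines):
    """Count REJECT records per destination port (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        record = parse_flow_log(line)
        if record["action"] == "REJECT":
            counts[record["dstport"]] += 1
    return counts

# Illustrative sample records; the account and ENI IDs are placeholders.
sample = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-1235b8ca123456789 - - - - - - - 1431280876 1431280934 - NODATA",
]
rejected = rejected_by_port(sample)
print(dict(rejected))
```

Records with a `NODATA` status carry `-` in most fields, which is why the parser keeps every value as a string rather than converting ports and byte counts eagerly.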
Job roles that use it
- Cloud Engineer / Cloud Administrator
- DevOps Engineer / Platform Engineer
- Site Reliability Engineer (SRE)
- Network Engineer (Cloud)
- Security Engineer / Cloud Security Engineer
- Solutions Architect
Certification path (AWS)
Amazon VPC appears heavily in AWS certifications. Common paths:
- AWS Certified Cloud Practitioner (foundation)
- AWS Certified Solutions Architect – Associate (strong VPC coverage)
- AWS Certified SysOps Administrator – Associate
- AWS Certified Advanced Networking – Specialty (deep networking focus)
Verify current certification names and exam guides on the official AWS certification site: https://aws.amazon.com/certification/
Project ideas for practice
- Build a 2-AZ VPC with public/private subnets and ALB → app → RDS
- Add VPC endpoints for S3 and SSM; remove NAT and validate private operation
- Implement Flow Logs to S3 with lifecycle policies and query with Athena (requires additional services)
- Build hub-and-spoke networking using Transit Gateway (multi-VPC lab)
- Create a shared VPC model using RAM (multi-account lab)
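Before building the first project, it can help to plan the address layout on paper. The sketch below uses Python's standard `ipaddress` module to carve one public and one private subnet per AZ out of a VPC CIDR. The function name `plan_subnets`, the /24 subnet size, and the AZ names are assumptions chosen for illustration; the "usable" figure reflects the five addresses AWS reserves in every subnet.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses plus the last one

def plan_subnets(vpc_cidr, az_names, new_prefix=24):
    """Assign one public and one private subnet per AZ from vpc_cidr."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of equal-size blocks
    plan = []
    for tier in ("public", "private"):
        for az in az_names:
            net = next(blocks)
            plan.append({
                "tier": tier,
                "az": az,
                "cidr": str(net),
                "usable": net.num_addresses - AWS_RESERVED_PER_SUBNET,
            })
    return plan

# Placeholder AZ names; substitute the AZs available in your Region.
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
for row in plan:
    print(row["tier"], row["az"], row["cidr"], row["usable"])
```

Allocating blocks sequentially keeps the example short; a real design might instead leave gaps between tiers so that each tier can grow without renumbering.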
22. Glossary
- Amazon VPC: AWS service for creating isolated virtual networks in a region.
- CIDR: Classless Inter-Domain Routing; notation for IP ranges (e.g., 10.0.0.0/16).
- Subnet: A slice of a VPC CIDR in one Availability Zone.
- Availability Zone (AZ): A physically separate location within an AWS region.
- Route table: Set of rules that determines where network traffic is directed.
- Internet Gateway (IGW): Gateway that enables internet connectivity for a VPC.
- NAT Gateway: Managed service enabling outbound internet for private subnets.
- Egress-only Internet Gateway: IPv6-only outbound internet gateway.
- Security group (SG): Stateful firewall rules applied to ENIs/resources.
- Network ACL (NACL): Stateless subnet-level network rules.
- ENI: Elastic Network Interface; virtual network card in a VPC.
- VPC endpoint: Private connection to AWS services; includes gateway endpoints and interface endpoints (PrivateLink).
- PrivateLink: AWS technology for private service connectivity using interface endpoints.
- VPC Flow Logs: Traffic metadata logs for VPC/subnet/ENI.
- Reachability Analyzer: Tool to analyze network paths and explain reachability.
- Transit Gateway (TGW): Service for connecting multiple VPCs and networks with central routing.
- Site-to-Site VPN: Encrypted tunnel between AWS and on-premises or other networks.
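To make the route table entry above more concrete, the following sketch shows how a destination address can be matched against a set of routes by picking the most specific (longest-prefix) match, which is the basic rule VPC route tables apply. It is a simplified model under stated assumptions: the function `select_route` and the target names (`tgw-example`, `nat-example`) are hypothetical, and it ignores propagated routes, prefix lists, and other priority rules.

```python
import ipaddress

def select_route(routes, destination_ip):
    """Return the target of the most specific route containing destination_ip."""
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: traffic would be dropped
    # The longest prefix is the most specific match.
    return max(matches, key=lambda m: m[0])[1]

# Hypothetical route table for a private subnet.
routes = [
    ("10.0.0.0/16", "local"),        # traffic inside the VPC
    ("10.1.0.0/16", "tgw-example"),  # another VPC via a Transit Gateway
    ("0.0.0.0/0", "nat-example"),    # everything else via a NAT Gateway
]

for dest in ("10.0.3.7", "10.1.5.5", "8.8.8.8"):
    print(dest, "->", select_route(routes, dest))
```

The default route `0.0.0.0/0` matches every IPv4 address but has a prefix length of zero, so it only wins when nothing more specific applies.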
23. Summary
Amazon VPC is AWS’s foundational service in the Networking and content delivery category for building private, controlled networks in the cloud. It provides regional network isolation with subnets per AZ, routing primitives, and layered security controls (security groups and NACLs). It matters because nearly every production AWS workload depends on predictable segmentation, safe internet access, and private connectivity to AWS services and other networks.
Cost is typically driven not by “having a VPC,” but by add-ons and traffic patterns—especially NAT Gateways, interface endpoints/PrivateLink, flow logs storage/ingestion, and data transfer (including cross-AZ and internet egress). Security success depends on strong IAM control of network changes, careful route design, least-privilege security groups, and logging (CloudTrail + Flow Logs).
Use Amazon VPC whenever you need secure, auditable networking for AWS workloads. As a next step, learn VPC endpoints/PrivateLink and multi-VPC connectivity (Transit Gateway), and implement your VPC patterns with infrastructure as code for repeatability and governance.