AWS Certified Solutions Architect - Professional
Deep Dive - AWS Direct Connect and VPNs:
AWS Direct Connect:
Creating a VPC
Deploying and configuring a NAT instance
Configuring a NAT Gateway
Route 53 Private Hosted Zone
Extend On-premises DNS to EC2 Instances
HPC used by oil & gas, pharmaceuticals, research, automotive, and other industries
Batch processing of compute intensive workloads
Requires high performance CPU, network, and storage
Jumbo Frames are typically required
HPC workloads typically need access to a shared filesystem and generate heavy disk I/O
Jumbo Frames help significantly because they can carry up to 9,000 bytes of data per frame
Supported on AWS through enhanced networking
Enhanced networking is enabled through single root I/O virtualization (SR-IOV) on supported instance types
Enhanced networking is only supported on Hardware Virtual Machine (HVM) instances; it is not supported on paravirtualized (PV) instances
Enabling Enhanced Networking on Linux Instances in a VPC:
Enabling Enhanced Networking on Windows Instances in a VPC:
Placement Groups and supported instances:
AWS Storage Options:
Amazon RDS for SQL server now supports Windows Authentication
A set of properties that guarantee that database transactions are processed reliably.
The ACID concept is described in ISO/IEC 10026-1:1992 Section 4
Requires that each transaction be "all or nothing": if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged.
Ensures that any transaction will bring the database from one valid state to another.
Ensures that the concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially.
i.e., one after the other
Ensures that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors
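The atomicity property can be demonstrated with Python's built-in sqlite3 module. This is a minimal sketch; the accounts table and transfer logic are hypothetical, not tied to any AWS service:

```python
import sqlite3

# In-memory database; the 'accounts' table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move funds atomically: either both updates apply, or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            cur = conn.execute("SELECT balance FROM accounts WHERE name = ?", (src,))
            if cur.fetchone()[0] < 0:
                raise ValueError("insufficient funds")  # aborts the transaction
    except ValueError:
        pass  # rollback already happened; database state is unchanged

transfer(conn, "alice", "bob", 500)  # fails: alice only has 100
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # → {'alice': 100, 'bob': 50}
```

The failed transfer leaves both rows exactly as they were, which is the "all or nothing" guarantee in practice.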
DynamoDB is a fully managed, highly available and scalable NoSQL database
Automatically and synchronously replicates data across three Availability Zones
SSDs and limiting indexing on attributes provides high throughput and low latency
ElastiCache can be used in front of DynamoDB to offload high volumes of reads for infrequently changing data
Ideal for existing or new applications that need:
Stores structured data in tables, indexed by a primary key
Tables are collections of items, and items are made up of attributes (columns)
Primary key can be a partition (hash) key alone, or a composite of partition and sort (range) keys
Simple AD is a stand-alone, managed directory that is powered by Samba 4 Active Directory Compatible Server
Setting up Simple AD
Setting up AD Connector
Managed Microsoft Active Directory, powered by Windows Server 2012 R2
Designed to support up to 50,000 users (approximately 200,000 directory objects including users, groups and computers)
Run directory-aware Windows workloads
Create trust relationships between Microsoft AD domains in the AWS cloud, and on-premises
Microsoft AD is deployed across multiple AZs
Automatic monitoring detects and replaces domain controllers (DCs) that fail
Data replication and automated daily snapshots are configured for you
No software to install
AWS handles all of the patching and software updates
Setting up Microsoft AD
Federated temporary access to AWS resources
Enterprise identity federation
Web identity federation:
Cross Account Access
A web service that records AWS API calls for your account and delivers log files to you
A history of API calls for your AWS account
API history enables security analysis, resource change tracking, and compliance auditing
Logs API calls made via:
A monitoring service for AWS cloud resources and the applications you run on AWS
Monitor AWS resources such as:
Gain system-wide visibility into resource utilization
By default, CloudWatch Logs will store your log data indefinitely
Alarm history is stored for 14 days
CloudTrail logs can be sent to CloudWatch Logs for real-time monitoring
CloudWatch Logs metric filters can evaluate CloudTrail logs for specific terms, phrases, or values
You can assign CloudWatch metrics to the metric filters
You can create CloudWatch alarms
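The metric-filter flow above can be emulated in a few lines. This is a simplified stand-in for the real service; the sample events and field names are illustrative, loosely modeled on CloudTrail record fields:

```python
# Simplified emulation of a CloudWatch Logs metric filter: count CloudTrail-like
# events that contain specific field values. Sample events are hypothetical.
events = [
    {"eventName": "ConsoleLogin", "errorMessage": "Failed authentication"},
    {"eventName": "ConsoleLogin"},
    {"eventName": "StopInstances"},
]

def metric_filter(log_events, **pattern):
    """Return how many events contain every key/value pair in the pattern."""
    return sum(
        1 for e in log_events
        if all(e.get(k) == v for k, v in pattern.items())
    )

failed_logins = metric_filter(events, eventName="ConsoleLogin",
                              errorMessage="Failed authentication")
print(failed_logins)  # → 1
```

In the real service, the resulting count becomes a CloudWatch metric that an alarm can fire on.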
Do not store logs on non-persistent (ephemeral) disks:
Best practice is to store logs in CloudWatch Logs or S3
CloudTrail can be used across multiple AWS accounts while being pointed to a single S3 bucket (requires cross account access)
CloudWatch Logs subscription can be used across multiple AWS accounts (requires cross account access)
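A multi-account trail needs an S3 bucket policy that lets CloudTrail write each account's logs. A sketch of the standard policy shape, built as a Python dict; the bucket name and account IDs are placeholders:

```python
import json

# Sketch of an S3 bucket policy allowing CloudTrail from two accounts to
# write to a single bucket. Bucket name and account IDs are placeholders.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AWSCloudTrailAclCheck",
            "Effect": "Allow",
            "Principal": {"Service": "cloudtrail.amazonaws.com"},
            "Action": "s3:GetBucketAcl",
            "Resource": "arn:aws:s3:::example-trail-bucket",
        },
        {
            "Sid": "AWSCloudTrailWrite",
            "Effect": "Allow",
            "Principal": {"Service": "cloudtrail.amazonaws.com"},
            "Action": "s3:PutObject",
            "Resource": [
                "arn:aws:s3:::example-trail-bucket/AWSLogs/111111111111/*",
                "arn:aws:s3:::example-trail-bucket/AWSLogs/222222222222/*",
            ],
            "Condition": {
                "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
            },
        },
    ],
}
print(len(policy["Statement"]))  # → 2
```

Each account's trail writes under its own AWSLogs/<account-id>/ prefix, so one bucket can hold every account's history.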
A physical computing device that safeguards and manages digital keys for strong authentication and provides cryptoprocessing.
HSMs can be used in any application that uses digital keys
Used to protect high value keys
HSM uses are as follows:
HSMs are also deployed to manage Transparent Data Encryption (TDE) keys for databases
TDE automatically encrypts data before it is written to the underlying storage device and decrypts it when it is read back
Oracle requires key storage outside of KMS and integrates with CloudHSM
SQL Server requires a key, but it is managed by RDS once TDE is enabled
A type of denial-of-service attack in which multiple compromised systems are used to target a single system
Reduce the number of necessary Internet entry points
Eliminate non-critical Internet entry points
Separate end user traffic from management traffic
Obfuscate necessary Internet entry points to the level that untrusted end users cannot access them
Decouple Internet entry points to minimize the effects of attacks
Design your infrastructure to scale out and scale up
Attackers have to expend more resources to scale up the attack
Attack is spread over a larger area
Scaling buys you time to analyze the attack and respond
Scaling provides you more redundancy
Validate the architecture and select the techniques that work for your infrastructure and application
Evaluate the costs for increased resiliency and understand the goals of your defense
Know who to contact when an attack happens
An intrusion detection system (IDS) inspects all inbound and outbound network activity and identifies suspicious patterns that may indicate a network or system attack from someone attempting to break into or compromise a system.
An Intrusion Prevention System (IPS) is a network security/threat prevention technology that examines network traffic flows to detect and prevent vulnerability exploits.
AWS Best Practices for DDoS Resiliency:
Switch Availability Zones within the same region
Change the instance size within the same instance type
Instance type modifications are supported only for Linux. Due to licensing differences, Linux RIs cannot be modified to Red Hat or SUSE
You cannot change the instance size of Windows Reserved Instances
Move between AZs in the same Region
Are available for Multi-AZ deployments
Can be applied to Read Replicas provided the DB Instance class and Region are the same
Enabled per account per region
You can consolidate all logs into a single S3 bucket:
A resource group is a collection of resources that share one or more tags or portions of tags, and can be managed as a single group rather than moving from one AWS service to another for each task.
Use a single page to view and manage your resources
Combines information about multiple resources, such as metrics, alarms, and configuration details
Create a custom console that organizes and consolidates the information you need based on your project and the resources you use
Quickly identify resources that are not tagged
Can be shared among users in the same AWS account by sharing a URL
Users of the same AWS account can have different resource groups
Creating Resource Groups
Working with Resource Groups
Gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion.
You don’t need to figure out the order for provisioning AWS services
You don’t need to worry about making dependencies work
Modify and update templates in a controlled and predictable way
Visualize your templates as diagrams and edit them using a drag-and-drop interface with the AWS CloudFormation Designer
Provides several built-in functions that help you manage your stacks
Assign values to properties that are not available until runtime
Declaration: "Fn::GetAtt" : [ "logicalNameOfResource", "attributeName" ]
Example: "Fn::GetAtt" : [ "ELB", "DNSName" ]
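A minimal sketch of Fn::GetAtt in context: an Outputs entry exposing a load balancer's DNSName, which only exists once the stack is running. The resource properties here are illustrative, not a complete deployable template:

```python
import json

# Template fragment showing Fn::GetAtt resolving a runtime-only attribute.
# Resource names and properties are illustrative placeholders.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "ELB": {
            "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
            "Properties": {
                "AvailabilityZones": {"Fn::GetAZs": ""},
                "Listeners": [{"LoadBalancerPort": "80",
                               "InstancePort": "80",
                               "Protocol": "HTTP"}],
            },
        }
    },
    "Outputs": {
        # DNSName is assigned by AWS at creation time; Fn::GetAtt fetches it.
        "LoadBalancerDNS": {"Value": {"Fn::GetAtt": ["ELB", "DNSName"]}}
    },
}
print(json.dumps(template["Outputs"]["LoadBalancerDNS"]["Value"]))
```

The first element of the Fn::GetAtt list is the logical resource name ("ELB" here), not the physical name AWS assigns.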
Automatic rollback on error is enabled by default
You will be charged for resources provisioned even if there is an error
CloudFormation is free
A service for deploying and scaling web applications and services. Upload your code and Elastic Beanstalk automatically handles the deployment, from capacity provisioning, load balancing, and auto scaling to application health monitoring.
CloudFormation supports Elastic Beanstalk
Elastic Beanstalk does not provision CloudFormation templates
Elastic Beanstalk is ideal for developers with limited cloud experience who need to deploy environments fast
Elastic Beanstalk is ideal if you have a standard PHP, Java, Python, Ruby, Node.js, .NET, Go, or Docker application that can run on an app server with a database.
A configuration management service that helps you automate operational tasks like software configurations, package installations, database setups, server scaling, and code deployment using Chef.
Use the AWS Management Console
Consists of two elements: Stack and Layers
Stacks are containers of resources (EC2, RDS, ELB) that you want to manage collectively
Every Stack contains one or more layers:
Layers automate the deployment of packages for you
A global CDN service. It integrates with other AWS products to give developers and businesses an easy way to distribute content to end users with low latency, high data transfer speeds, and no minimum usage commitments.
Used to deliver an entire website using a global network of edge locations
Requests for content are automatically routed to the nearest edge location for the best possible performance
Optimized to work with other Amazon Web Services
Open-source in-memory caching engines
Master/slave replication and Multi-AZ
Performance at Scale with Amazon ElastiCache:
Enables you to build custom applications that process or analyze streaming data for specialized needs. It can continuously capture and store terabytes of data per hour from thousands of sources such as website clickstreams, financial transactions, social media feeds, IT logs, and location-tracking events.
By default, data is stored for 24 hours, but retention can be increased to 7 days
A uniquely identified group of data records in a stream
A stream is composed of one or more shards, each of which provides a fixed unit of capacity
Can support up to 5 transactions per second for reads
Max total data read rate of 2 MB/s
Up to 1,000 records per second for writes
Max total data write rate of 1 MB/s (including partition keys)
If your data rate increases, add more shards to increase the size of your stream. Remove shards if the data rate decreases.
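The per-shard limits above give a simple sizing rule: take the maximum over the read and write constraints. A sketch using those limits; the workload figures are hypothetical:

```python
import math

# Shard sizing from the per-shard limits: 1 MB/s and 1,000 records/s in,
# 2 MB/s out. Workload numbers below are hypothetical examples.
def shards_needed(write_mb_s, write_records_s, read_mb_s):
    return max(
        math.ceil(write_mb_s / 1.0),        # 1 MB/s write per shard
        math.ceil(write_records_s / 1000),  # 1,000 records/s write per shard
        math.ceil(read_mb_s / 2.0),         # 2 MB/s read per shard
    )

# Here the read rate (12 MB/s / 2 MB/s per shard) is the binding constraint.
print(shards_needed(write_mb_s=5, write_records_s=3500, read_mb_s=12))  # → 6
```

Resharding (splitting or merging shards) is how you track the data rate up or down over time.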
Used to group data by shard within a stream
Stream service segregates data records belonging to a stream into multiple shards
Use partition keys associated with each data record to determine which shard a given data record belongs to
Specified by the applications putting the data into a stream
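Routing can be sketched from the documented behavior: Kinesis takes the MD5 hash of the partition key and maps it into a 128-bit hash-key space divided among the shards. This is a simplified model (real shards carry explicit hash-key ranges that can be uneven after resharding):

```python
import hashlib

# Simplified model of Kinesis record routing: MD5 of the partition key
# selects a position in a 128-bit hash-key space split evenly across shards.
def shard_for(partition_key, num_shards):
    hash_key = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    shard_width = 2 ** 128 // num_shards
    return min(hash_key // shard_width, num_shards - 1)

# Records with the same partition key always land on the same shard,
# which is what preserves per-key ordering within a stream.
print(shard_for("user-42", 4))
```

This is why a skewed partition-key distribution can create "hot" shards: the key, not the record, decides where data lands.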
The data your producer adds to a stream. The maximum size of a data blob (the data payload after Base64-decoding) is 1 megabyte (MB).
Consumers get records from Amazon Kinesis Streams and process them. These consumers are known as Amazon Kinesis Streams Applications.
The ability to send push notification messages directly to apps on mobile devices.
The time it takes after a disruption to restore a business process to its service level, as defined by the operational level agreement (OLA). For example, if a disaster occurs at 12:00 PM and the RTO is eight hours, the DR process should restore the business process to the acceptable service level by 8:00 PM.
The acceptable amount of data loss measured in time. For example, if a disaster occurs at 12:00 PM and the RPO is one hour, the system should recover all data that was in the system before 11:00 AM. Data loss will span only one hour, between 11:00 AM and 12:00 PM.
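The two worked examples above, expressed as date arithmetic (the date itself is a hypothetical placeholder):

```python
from datetime import datetime, timedelta

# Worked example of the RTO/RPO definitions above.
disaster = datetime(2024, 1, 1, 12, 0)  # disaster strikes at 12:00 PM
rto = timedelta(hours=8)                # RTO of eight hours
rpo = timedelta(hours=1)                # RPO of one hour

restore_deadline = disaster + rto   # service must be restored by this time
last_safe_data = disaster - rpo     # all data before this point is recovered

print(restore_deadline.strftime("%H:%M"))  # → 20:00
print(last_safe_data.strftime("%H:%M"))    # → 11:00
```

RTO bounds how long the outage may last; RPO bounds how much recent data you are allowed to lose.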
Different levels of off-site duplication of data and infrastructure
Critical business services are set up and maintained on this infrastructure and tested at regular intervals
DR environment’s location and the production infrastructure should be a significant physical distance apart
AWS Storage Gateway enables snapshots of your on-premises data volumes to be transparently copied into Amazon S3 for backup. You can subsequently create local volumes or Amazon EBS volumes from these snapshots.
Gateway-cached volumes allow you to store your primary data in Amazon S3 while keeping your frequently accessed data local for low-latency access. You can snapshot the data volumes for highly durable backups. In the event of DR, you can restore the cached volumes either to a second site running a storage gateway or to Amazon EC2.
You can use the gateway-VTL configuration as a backup target for your existing backup management software. This can be used as a replacement for traditional magnetic tape backup.
Set up Amazon EC2 instances to replicate or mirror data.
Ensure that you have all supporting custom software packages available in AWS.
Create and maintain AMIs of key servers where fast recovery is required.
Regularly run these servers, test them, and apply any software updates and configuration changes.
Consider automating the provisioning of AWS resources.
Start your application Amazon EC2 instances from your custom AMIs.
Resize existing database/data store instances to process the increased traffic.
Add additional database/data store instances to give the DR site resilience in the data tier. If you are using Amazon RDS, turn on Multi-AZ to improve resilience.
Change DNS to point at the Amazon EC2 servers.
Install and configure any non-AMI based systems, ideally in an automated way
Set up Amazon EC2 instances to replicate or mirror data.
Create and maintain AMIs.
Run your application using a minimal footprint of Amazon EC2 instances or AWS infrastructure.
Patch and update software and configuration files in line with your live environment.
Increase the size of the Amazon EC2 fleets in service with the load balancer (horizontal scaling).
Start applications on larger Amazon EC2 instance types as needed (vertical scaling).
Either manually change the DNS records, or use Amazon Route 53 automated health checks so that all traffic is routed to the AWS environment.
Consider using Auto Scaling to right-size the fleet or accommodate the increased load.
Add resilience or scale up your database.
Set up your AWS environment to duplicate your production environment.
Set up DNS weighting, or similar traffic routing technology to distribute incoming requests to both sites.
Configure automated failover to re-route traffic away from the affected site
Either manually or by using DNS failover, change the DNS weighting so that all requests are sent to the AWS site.
Have application logic for failover to use the local AWS database servers for all queries.
Consider using Auto Scaling to automatically right-size the AWS fleet.
Using Amazon Web Services for Disaster Recovery:
Large data sets
Can be used to import to:
Can only export from:
If bucket versioning is enabled, only the most recent version will be exported
A web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals.
Create, access, and manage using:
Supported compute services:
Supported services to store data:
Define the location and type of data that a pipeline activity uses as input or output
A pipeline component that defines the work to perform
AWS Data Pipeline provides pre-packaged activities such as moving data from one location to another or running Hive queries
Custom scripts support endless combinations
Pipeline component containing conditional statements that must be true before an activity can run
Support for Pre-packaged preconditions and custom scripts
Two types of preconditions:
Steps that a pipeline component takes when certain events occur, such as success, failure, or late activities
Following actions are supported:
AWS Data Pipeline relies on Amazon SNS notifications as the primary way to indicate the status of pipelines and their components
A schedule defines the timing of a scheduled event, such as when an activity runs. AWS Data Pipeline exposes this functionality through the Schedule pipeline component.