Architecting on AWS (associate certification)

This tutorial summarizes the topics covered by this certification. It also collects important links, questions, and examples.

It is based on the 6.5.2 training course book.

To support the lab exercises, this repo provides CloudFormation templates that avoid manual work, so architectures can be created and destroyed quickly and efficiently.

Tag Name: aws-certification-architect-<cfn-template>

Introduction
AWS Global Architecture
The AWS Well-Architected Framework
S3
EC2
RDMS
VPC
IAM
Automation: CloudFormation

Introduction

AWS started selling its services in 2006. They are built on top of AWS's own infrastructure and provide a highly available (HA), scalable, and reliable architecture.

Advantages:

Programmable Resources
Dynamic Abilities
Pay as you go. Benefit from massive economies of scale
Stop guessing about capacity
Increase speed and agility
Focus on what matters
Go global in minutes

AWS Global Architecture

This section gives an overview of the AWS global architecture.

AWS Data Centers

AWS data centers are built in clusters in various global regions. Larger data centers are undesirable; all data centers are online and serving customers. No data center is “cold”: in case of failure, automated processes move customer data traffic away from the affected area.

Core applications are deployed in an N+1 configuration, so that in the event of a data center failure, there is sufficient capacity to enable traffic to be load-balanced to the remaining sites.
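The N+1 idea above can be checked with simple arithmetic. The following is a minimal sketch, assuming every site has the same capacity; the function names and the numbers in the usage note are illustrative and are not AWS figures.

```python
import math


def sites_needed(peak_load: float, site_capacity: float) -> int:
    """Return the N+1 site count: N sites to carry the peak load, plus one spare."""
    if peak_load <= 0 or site_capacity <= 0:
        raise ValueError("peak_load and site_capacity must be positive")
    n = math.ceil(peak_load / site_capacity)
    return n + 1


def survives_one_failure(sites: int, peak_load: float, site_capacity: float) -> bool:
    """True if the remaining sites can absorb the peak load after losing one site."""
    return (sites - 1) * site_capacity >= peak_load
```

For example, a hypothetical peak load of 1000 units on sites that each handle 300 units needs four sites for the load itself, so five sites in an N+1 layout.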

AWS Availability Zones

AWS data centers are organized into Availability Zones. Each Availability Zone comprises one or more data centers, with some Availability Zones having as many as six data centers. However, no data center can be part of two Availability Zones.

Each Availability Zone is designed as an independent failure zone. This means that Availability Zones are physically separated within a typical metropolitan region and are located in lower-risk flood plains (specific flood-zone categorization varies by region).

In addition to having discrete uninterruptible power supply and onsite backup generation facilities, they are each fed via different grids from independent utilities to further reduce single points of failure.

Availability Zones are all redundantly connected to multiple tier-1 transit providers. You are responsible for selecting the Availability Zones where your systems will reside. Systems can span multiple Availability Zones. You should design your systems to survive temporary or prolonged failure of an Availability Zone if a disaster occurs.

Distributing applications across multiple Availability Zones allows them to remain resilient in most failure situations, including natural disasters or system failures.

Summarizing, each Availability Zone is:

Made up of one or more data centers
Designed for fault isolation
Interconnected with other AZs using high-speed private links

AWS recommends replicating across different AZs for resiliency
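Spreading instances evenly over several AZs is one simple way to follow this recommendation. The sketch below is only an illustration of round-robin placement; the function name is hypothetical and the AZ names in the usage note are examples.

```python
from itertools import cycle
from typing import Dict, List


def spread_across_azs(instance_count: int, azs: List[str]) -> Dict[str, int]:
    """Assign instances to AZs round-robin and return the count per AZ."""
    if not azs:
        raise ValueError("at least one Availability Zone is required")
    placement = {az: 0 for az in azs}
    # zip stops after instance_count items, so cycle() never loops forever.
    for _, az in zip(range(instance_count), cycle(azs)):
        placement[az] += 1
    return placement
```

With five instances and three zones such as eu-west-1a, eu-west-1b and eu-west-1c, the result is a 2/2/1 split, so losing any single zone leaves at least three instances running.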

AWS Regions

AZs are further grouped into AWS Regions. Each region contains two or more AZs.

When you distribute applications across multiple AZs, be aware of location-dependent privacy and compliance requirements, such as the EU Data Privacy Directive.

When you store data in a specific region, it is not replicated outside that region. AWS never moves your data out of the region you put it in. It is your responsibility to replicate data across regions, if your business needs require that.

AWS provides information about the country, and—where applicable—the state where each region resides; you are responsible for selecting the region to store data in based on your compliance and network latency requirements.
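The selection described above (compliance first, then latency) can be sketched as a filter followed by a minimum. This is an illustrative helper, not an AWS API; the latency figures you feed it would come from your own measurements.

```python
from typing import Dict, Set


def choose_region(latency_ms: Dict[str, float], compliant_regions: Set[str]) -> str:
    """Pick the lowest-latency region among those that satisfy compliance."""
    candidates = {
        region: ms for region, ms in latency_ms.items() if region in compliant_regions
    }
    if not candidates:
        raise ValueError("no region satisfies the compliance requirements")
    return min(candidates, key=candidates.get)
```

Compliance acts as a hard filter here, so a faster but non-compliant region is never chosen.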

AWS Regions are connected to multiple Internet Service Providers (ISPs) as well as to a private global network backbone, which provides lower cost and more consistent cross-region network latency when compared with the public Internet.

At the time of writing, AWS has 18 regions worldwide, interconnected through the AWS backbone network infrastructure.

One region in Japan has only one AZ.

Edge Locations

To deliver content to end users with lower latency, Amazon CloudFront uses a global network of 150 Points of Presence (139 Edge Locations and 11 Regional Edge Caches) in 65 cities across 29 countries.

Edge locations are located in:

North America
Europe
Asia
Australia
South America

and support AWS services like Amazon Route 53 and Amazon CloudFront.

Regional Edge Caches

Regional edge caches, used by default with Amazon CloudFront, are utilized when you have content that is not accessed frequently enough to remain in an edge location. They absorb this content and provide an alternative to that content having to be fetched from the origin server.

The AWS Well-Architected Framework

This framework provides best practices for building a solid AWS architecture. It was developed to help architects build secure, highly available, resilient, and efficient application infrastructure.

The five pillars are:

Security
Reliability
Cost Optimization
Performance Efficiency
Operational Excellence

Security

Security deals with protecting information and mitigating possible damage. Your architecture will present a much stronger security presence by implementing some basic security measures, like:

Strong identity foundation
Enabling traceability
Applying security at all layers
Automating security best practices
Protecting data in transit and at rest

Reliability

Ensuring reliability can be difficult in a traditional environment. Issues arise from single points of failure, lack of automation, and lack of elasticity. When applying these ideas, you will be able to prevent many of these issues.

Properly designing your architecture with respect to high availability, fault tolerance, and overall redundancy will be helpful for you and your customer.

Your architecture will be reliable by implementing some basics like:

Dynamically acquire computing resources to meet demand
Recover quickly from infrastructure or services failures
Mitigate disruptions such as:
- Misconfigurations
- Transient Network Issues
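Retrying with exponential backoff is a common way to ride out transient network issues. It is not prescribed by the text above, so treat the following as one possible sketch; the function name, the default delays, and the choice of ConnectionError as the "transient" error are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on ConnectionError with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            # Give up after the last attempt and let the caller see the error.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The sleep parameter is injectable so the delays can be observed in tests without actually waiting.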

Cost Optimization

Cost optimization is an ongoing requirement of any good architectural design. The process is iterative and should be refined and improved throughout your production lifetime. Understanding how efficient your current architecture is in relation to your goals will ultimately help with removing unneeded expense.

Consider using managed services as they operate at cloud scale and can offer a lower cost per transaction or service.

Your architecture will optimize costs if you:

Measure efficiency
Eliminate unneeded expense
Consider using managed services (author's note: this advice is not entirely trustworthy)

Operational Excellence

When creating a design or architecture, you must be aware of how it will be deployed, updated, and operated. It is imperative that you work towards defect reductions and safe fixes and enable observation with logging instrumentation.

In AWS, you can view your entire workload (applications, infrastructure, policy, governance, and operations) as code. It can all be defined in and updated using code. This means you can apply the same engineering discipline that you use for application code to every element of your stack.

Your architecture will be operationally excellent if it:

Provides the ability to run and monitor systems
Continually improves supporting processes and procedures

Performance Efficiency

When considering performance, you want to maximize your performance by using computation resources efficiently and maintain that efficiency as the demand changes.

It is also important to democratize advanced technologies. In situations where technology is difficult to implement yourself, consider using a vendor. In implementing the technology for you, the vendor takes on the complexity and knowledge, freeing your team to focus on more value-added work.

Mechanical sympathy: Use the technology approach that aligns best to what you are trying to achieve. For example, consider data access patterns when you select database or storage approaches.

Your architecture will be efficient and performant if you:

Choose efficient resources
Maintain efficiency as demand changes
Democratize advanced technologies
Apply mechanical sympathy

S3

Amazon S3 is object-level storage, which means that if you want to change a part of a file, you have to make the change and then re-upload the entire modified file.

Amazon S3 allows you to store as much data as you want. Individual objects cannot be larger than 5 TB; however, you can store as much total data as you need.

By default, data in Amazon S3 is stored redundantly across multiple facilities and multiple devices in each facility.

Amazon S3 can be accessed via the web-based AWS Management Console, programmatically via the API and SDKs, or with third-party solutions (which use the API/SDKs).

Amazon S3 includes event notifications that allow you to set up automatic notifications when certain events occur, such as an object being uploaded to or deleted from a specific bucket. Those notifications can be sent to you, or they can be used to trigger other processes, such as AWS Lambda functions.
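As an illustration, a Lambda function triggered by such a notification receives a JSON event with a list of records. The sketch below assumes the usual S3 event layout (Records, eventName, s3.bucket.name, s3.object.key); the handler only collects the names of newly created objects and is not a complete function.

```python
from typing import Any, Dict, List
from urllib.parse import unquote_plus


def handler(event: Dict[str, Any], context: Any = None) -> List[str]:
    """Return 'bucket/key' for every object-created record in an S3 event."""
    created = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification payload.
        key = unquote_plus(record["s3"]["object"]["key"])
        created.append(f"{bucket}/{key}")
    return created
```

Delete events are skipped here on purpose; a real handler would branch on the event name and do its work per record.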

By default, all Amazon S3 resources (buckets, objects, and related sub-resources such as lifecycle configuration and website configuration) are private: only the resource owner, the AWS account that created the resource, can access it. The resource owner can grant access permissions to others by writing an access policy.
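An access policy is a JSON document. A minimal sketch of a bucket policy that lets another AWS account read objects could be built like this; the bucket name and account ID in the example call are placeholders, and the helper name is hypothetical.

```python
import json
from typing import Any, Dict


def read_only_policy(bucket: str, account_id: str) -> Dict[str, Any]:
    """Build a bucket policy granting s3:GetObject to another AWS account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCrossAccountRead",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Placeholder bucket name and account ID, for illustration only.
print(json.dumps(read_only_policy("example-bucket", "111122223333"), indent=2))
```

The Resource ends in /* because s3:GetObject applies to objects, not to the bucket itself.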

Amazon S3 buckets are protected by default. The only entities with access to a newly created, unmodified bucket are the account administrator and root user. Modifications to bucket policies can enable additional access, and AWS provides a number of different tools to enable developers to configure buckets for a wide variety of workloads. S3 includes a block public access feature, which acts as an additional layer of protection to prevent accidental exposure of customer data.

In the public access settings for a bucket, customers can specify the following four options:

Block new public ACLs and uploading public objects.
Remove public access granted through public ACLs.
Block new public bucket policies.
Block public and cross-account access to buckets that have public policies.
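In the S3 API and in CloudFormation, these four options correspond to four boolean settings in a PublicAccessBlockConfiguration. The sketch below lists them in the same order as above; the helper function is hypothetical and only checks that every setting is enabled.

```python
from typing import Dict

# Same order as the four console options listed above.
PUBLIC_ACCESS_BLOCK: Dict[str, bool] = {
    "BlockPublicAcls": True,        # block new public ACLs and public object uploads
    "IgnorePublicAcls": True,       # ignore access granted through public ACLs
    "BlockPublicPolicy": True,      # block new public bucket policies
    "RestrictPublicBuckets": True,  # restrict access to buckets with public policies
}


def is_fully_blocked(config: Dict[str, bool]) -> bool:
    """True only if all four block-public-access settings are enabled."""
    return all(config.get(name, False) for name in PUBLIC_ACCESS_BLOCK)
```

A dictionary of this shape can be placed under the PublicAccessBlockConfiguration property of an AWS::S3::Bucket resource in a template.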

Features

Serving static content
Versioning
CORS
Data store for large-scale analytics
Backup
Multipart upload
Transfer Acceleration with Edge Location
Special Service for moving data: AWS Snowball
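For multipart upload, a large object is split into parts that are uploaded independently. The sketch below plans part sizes under the limits documented for S3 multipart upload at the time of writing (parts of 5 MiB to 5 GiB, at most 10,000 parts) together with the 5 TB object limit quoted above, treated here as 5 TiB. The function name and the 64 MiB default are assumptions.

```python
import math
from typing import Tuple

MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_OBJECT_SIZE = 5 * TIB  # object size limit quoted in this document
MIN_PART_SIZE = 5 * MIB    # every part except the last must be at least this big
MAX_PART_SIZE = 5 * GIB
MAX_PARTS = 10_000


def plan_parts(object_size: int, preferred_part_size: int = 64 * MIB) -> Tuple[int, int]:
    """Return (part_size, part_count) for a multipart upload of object_size bytes."""
    if not 0 < object_size <= MAX_OBJECT_SIZE:
        raise ValueError("object_size must be between 1 byte and 5 TiB")
    # Grow the part size if the preferred size would need more than MAX_PARTS parts.
    part_size = max(preferred_part_size, MIN_PART_SIZE, math.ceil(object_size / MAX_PARTS))
    part_size = min(part_size, MAX_PART_SIZE)
    return part_size, math.ceil(object_size / part_size)
```

The actual upload would go through the S3 API or an SDK; this only shows the arithmetic behind choosing a part size.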

When to use

Write once, read many
Spiky data access
Growing dataset
Large number of users and diverse amounts of content

When not to use

Block store requirements
Frequently changing data
Long term archival storage

Cost

Specific costs may vary depending on region and the specific requests made. As a general rule, you only pay for transfers that cross the boundary of your region, which means you do not pay for transfers to Amazon CloudFront edge locations within that same region.

Pay only for what you use

GBs per month
Transfer out to other regions or the internet
Any API request

Free for

Transfer IN to S3
Transfer out to EC2 in the same region, or to CloudFront
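The billing rules above can be turned into a rough monthly estimate. This is a sketch only: the function name is hypothetical and all prices are parameters you must supply, since real prices vary by region, storage class, and request type.

```python
def monthly_s3_cost(
    stored_gb: float,
    transfer_out_gb: float,
    requests: int,
    transfer_in_gb: float = 0.0,
    *,
    storage_price_per_gb: float,
    transfer_out_price_per_gb: float,
    price_per_1000_requests: float,
) -> float:
    """Estimate a monthly S3 bill from storage, outbound transfer, and requests."""
    storage = stored_gb * storage_price_per_gb
    transfer_out = transfer_out_gb * transfer_out_price_per_gb
    request_cost = requests / 1000 * price_per_1000_requests
    transfer_in = transfer_in_gb * 0.0  # inbound transfer is free
    return storage + transfer_out + request_cost + transfer_in
```

Only billable outbound traffic (to other regions or the internet) should be counted in transfer_out_gb; transfers to EC2 in the same region or to CloudFront are left out.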

Types of S3 storage

S3 Standard
S3 Standard-IA (Infrequent Access)
S3 One Zone-IA
Glacier: Archival data, cheapest available storage tier.

To keep storage costs effective, lifecycle policies allow you to move objects between storage classes based on their age.
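A lifecycle configuration is a set of rules with transitions keyed by object age. The sketch below uses the shape accepted by the S3 lifecycle API; the rule ID and the 30/90 day thresholds are illustrative choices, and storage_class_for_age is a hypothetical local helper that mimics what the rule would do.

```python
from typing import Any, Dict

LIFECYCLE_CONFIGURATION: Dict[str, Any] = {
    "Rules": [
        {
            "ID": "archive-by-age",      # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},    # empty prefix: apply to the whole bucket
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def storage_class_for_age(age_days: int, rule: Dict[str, Any]) -> str:
    """Return the storage class an object of the given age would be in."""
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current
```

The helper only simulates the rule locally; S3 applies the real transitions asynchronously once the configuration is attached to a bucket.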

Choose a region based on

data compliance
user latency
cost effectiveness

Storage Class Analysis

You can analyze storage access patterns and transition the right data to the right storage class.

This new S3 Analytics feature automatically identifies the optimal lifecycle policy to transition less frequently accessed storage to S3 Standard-Infrequent Access.

Once an infrequent access pattern is observed, you can easily create a new lifecycle age policy based on the results.

You can configure a storage class analysis policy to monitor an entire bucket, a prefix, or object tag.

Storage class analysis also provides daily visualizations of your storage usage in the AWS Management Console. You can export these to an S3 bucket to analyze using the business intelligence tools of your choice, such as Amazon QuickSight.

EC2

Amazon EC2 is just like your traditional on-premises server, but it is available in the cloud. It can support workloads such as web hosting, applications, databases, authentication services, and anything else a server can do.

An Amazon Machine Image (AMI) provides the information required to launch an instance, which is a virtual server in the cloud. You must specify a source AMI when you launch an instance. You can launch multiple instances from a single AMI when you need multiple instances with the same configuration. For example, you can use a single AMI to launch a cluster of instances (identical except for their IP address) to be placed beneath a load balancer.

Where to get an AMI

Pre built by AWS
Marketplace
Self created
Community

Types

In m5.large, m is the instance family, 5 is the generation, and large is the size.

T instances are burstable performance instances that provide a baseline level of CPU performance with the ability to burst above the baseline.

C instances are optimized for compute-intensive workloads and deliver very cost-effective high performance at a low price per compute ratio.

R instances are optimized for memory-intensive applications.

P instances are intended for general-purpose GPU compute applications.

H instances feature up to 16 TB of HDD-based local storage, deliver high disk throughput, and a balance of compute and memory.
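The naming scheme can be parsed mechanically. The sketch below handles the simple family, generation, and size form described above, plus optional attribute letters after the generation (as in c5n.18xlarge); it is not a complete parser for every instance type name, and the family notes only repeat what this section says.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    family: str
    generation: int
    attributes: str
    size: str


_PATTERN = re.compile(r"^([a-z]+?)(\d+)([a-z]*)\.([a-z0-9]+)$")

# Family descriptions taken from the text above.
FAMILY_NOTES = {
    "t": "burstable performance",
    "c": "compute optimized",
    "r": "memory optimized",
    "p": "general-purpose GPU compute",
    "h": "HDD-based local storage, high disk throughput",
}


def parse_instance_type(name: str) -> InstanceType:
    """Split a name such as 'm5.large' into family, generation, attributes, size."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    family, generation, attributes, size = match.groups()
    return InstanceType(family, int(generation), attributes, size)
```

Names that do not follow this pattern raise a ValueError rather than being guessed at.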

User Data

User data can automate the completion of the instance launch. For example, it might patch and update the instance AMI, fetch and install software license keys, or install additional software.

User data is implemented as a shell script or cloud-init directive that executes with root or Administrator privilege after the instance starts but before it becomes accessible on the network.
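As an illustration, a user data shell script is passed to the EC2 API base64-encoded (SDKs and CloudFormation's Fn::Base64 can do the encoding for you). The script contents below are hypothetical bootstrap steps, and the 16 KB check reflects the documented raw size limit.

```python
import base64

# Hypothetical bootstrap steps: patch the instance and install a web server.
USER_DATA_SCRIPT = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
"""

MAX_USER_DATA_BYTES = 16 * 1024  # raw size limit, before base64 encoding


def encode_user_data(script: str) -> str:
    """Validate a user data script and return it base64-encoded."""
    if not (script.startswith("#!") or script.startswith("#cloud-config")):
        raise ValueError("user data must start with a shebang or #cloud-config")
    raw = script.encode("utf-8")
    if len(raw) > MAX_USER_DATA_BYTES:
        raise ValueError("user data exceeds 16 KB")
    return base64.b64encode(raw).decode("ascii")
```

The first-line check matters because the instance decides how to run the user data from that line: a shebang for a shell script, #cloud-config for cloud-init directives.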

In order for User Data to complete the launch of a new EC2 instance, it may need to look up information about the instance itself: The Instance Metadata Service can provide that information.
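The Instance Metadata Service is reachable from the instance at a link-local address. The sketch below builds the two requests used by the token-based variant (IMDSv2), which is not covered in the text above; fetch_metadata only works when run on an EC2 instance, so it is defined but not called.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Build the PUT request that obtains a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Build the GET request for one metadata item, e.g. 'instance-id'."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Fetch a metadata value. Only works from inside an EC2 instance."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as response:
        token = response.read().decode()
    with urllib.request.urlopen(metadata_request(path, token), timeout=timeout) as response:
        return response.read().decode()
```

A user data script could call fetch_metadata("instance-id") or fetch_metadata("placement/availability-zone") to learn where it is running.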

EBS

Amazon EBS (Elastic Block Store) volumes provide durable, detachable, block-level storage (like an external hard drive) for your EC2 instances. Because they are directly attached to the instances, they can provide extremely low latency between where the data is stored and where it might be used on the instance. For this reason, they can be used to run a database with an Amazon EC2 instance. EBS volumes can also be used to back up your instances into AMIs, which are stored in S3 and can be reused to create new EC2 instances later.

An instance store provides temporary block-level storage for your instance. This storage is located on disks that are physically attached to the host computer. Instance store is ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.

Shared Filesystem: S3, EFS & FSx

S3 is one option, but what if you need the performance and read-write consistency of a network file system? Amazon Elastic File System (EFS) may be your best option.

EFS for Linux can be shared between accounts, VPCs, regions, and AZs, while FSx for Windows can only be shared between AZs.

Note the following restrictions:

You can mount an Amazon EFS file system on instances in only one VPC at a time.
Both the file system and VPC must be in the same AWS Region.

Pricing

There are four options:

On-Demand Instances: Pay per second
Reserved Instances: Pre-pay capacity with a significant discount
Spot Instances: Purchase unused capacity. Can be terminated with a two-minute warning
Dedicated: Physical EC2 server
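Whether a Reserved Instance pays off depends on how many hours the instance actually runs. The following is a rough sketch under a simplified pricing model (an upfront fee plus an hourly rate charged for the whole term); the function names are hypothetical and the rates are inputs, not real prices.

```python
def on_demand_total(used_hours: float, hourly_rate: float) -> float:
    """On-Demand cost: you pay only for the hours you use."""
    return used_hours * hourly_rate


def reserved_total(term_hours: float, upfront: float, hourly_rate: float) -> float:
    """Reserved cost: upfront fee plus the hourly rate for the whole term."""
    return upfront + term_hours * hourly_rate


def break_even_hours(
    term_hours: float, on_demand_rate: float, upfront: float, reserved_rate: float
) -> float:
    """Hours of use above which the reservation is cheaper than On-Demand."""
    if on_demand_rate <= 0:
        raise ValueError("on_demand_rate must be positive")
    return reserved_total(term_hours, upfront, reserved_rate) / on_demand_rate
```

If the expected usage over the term is below the break-even figure, On-Demand (or Spot, for interruptible workloads) is the cheaper choice under this model.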

RDMS

Bullshit for software engineers.

VPC

@TODO

Subnet

Internet Gateway

Route Table

NAT Gateway

IAM

@TODO

user

group

role

Automation: CloudFormation

Building a large-scale computing environment through long manual processes takes significant time and energy.

CloudFormation provisions resources in a safe, repeatable manner, allowing you to build and rebuild your infrastructure and applications without having to perform manual actions or write custom scripts. Using CloudFormation (CFN) allows you to treat your infrastructure as code.
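To make "infrastructure as code" concrete, here is a minimal sketch of a template built as a Python dictionary and rendered to JSON. The resource name LabBucket and the tag value are hypothetical; the tag follows the aws-certification-architect-<cfn-template> convention used in this repo.

```python
import json
from typing import Any, Dict

TEMPLATE: Dict[str, Any] = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal lab template: one versioned S3 bucket",
    "Resources": {
        "LabBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {"Status": "Enabled"},
                "Tags": [
                    {"Key": "Name", "Value": "aws-certification-architect-s3-lab"}
                ],
            },
        }
    },
    "Outputs": {"BucketName": {"Value": {"Ref": "LabBucket"}}},
}


def render(template: Dict[str, Any]) -> str:
    """Serialize a template to the JSON text CloudFormation accepts."""
    return json.dumps(template, indent=2, sort_keys=True)
```

The rendered text can be saved to a file, kept under version control, and used to create or delete the stack repeatedly.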

Here are some questions to consider:

Where do you want to put your efforts: the design or the implementation?
What are the risks of manual implementations?
How do you update production servers?
How are you going to roll out deployments across multiple geographic regions?
When things break, and they will, how do you manage the rollback?
What about debugging deployments?
How do you find what’s wrong and then fix it so that it stays fixed?
How will you manage dependencies on the various systems and subsystems in your organization?
Finally, can you do all of this by hand?

Risk from manual processes

Does not scale
No version control
Lack of audit trails
Inconsistent data management

Cross-stack references are useful when you separate AWS infrastructure into logical components grouped by stack (e.g. a network stack, an application stack, etc.) and need a way to loosely couple the stacks together as an alternative to nested stacks.
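A cross-stack reference pairs an Export in one stack's Outputs with an Fn::ImportValue in another. The sketch below shows two template fragments as Python dictionaries; the export name lab-network-VpcId and the resource names are hypothetical, and the two helper functions are local checks, not CloudFormation features.

```python
from typing import Any, Dict, Set

NETWORK_STACK: Dict[str, Any] = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": "10.0.0.0/16"}},
    },
    "Outputs": {
        "VpcId": {"Value": {"Ref": "Vpc"}, "Export": {"Name": "lab-network-VpcId"}},
    },
}

APPLICATION_STACK: Dict[str, Any] = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "WebSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "Web tier",
                "VpcId": {"Fn::ImportValue": "lab-network-VpcId"},
            },
        },
    },
}


def exported_names(template: Dict[str, Any]) -> Set[str]:
    """Collect the export names declared in a template's Outputs."""
    outputs = template.get("Outputs", {}).values()
    return {o["Export"]["Name"] for o in outputs if "Export" in o}


def imported_names(node: Any) -> Set[str]:
    """Recursively collect every name used with Fn::ImportValue."""
    found: Set[str] = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Fn::ImportValue" and isinstance(value, str):
                found.add(value)
            else:
                found |= imported_names(value)
    elif isinstance(node, list):
        for item in node:
            found |= imported_names(item)
    return found
```

The application stack depends only on the export name, not on how the network stack creates the VPC, which is what keeps the coupling loose.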

CFN Conditions

Your production environment and development environment must be built from the same stack. This ensures that your application works in production the way it was designed and developed. Additionally, your development environment and testing environment must use the same stack.

All environments will have identical applications and configurations.

You might need several testing environments for functional testing, user acceptance testing, and load testing.

Creating those environments manually comes with great risk.

You can use a Conditions statement in CloudFormation templates to ensure that, while different in size and scope, the development, QA, and production environments are configured identically.
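As a sketch of how this looks in a template, the fragment below uses one parameter, one condition, and Fn::If to vary only the instance size per environment. The parameter, condition, and resource names and the two instance types are illustrative; instance_type_for is a hypothetical local helper that mimics how the condition would resolve.

```python
from typing import Any, Dict

TEMPLATE: Dict[str, Any] = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        "EnvType": {
            "Type": "String",
            "AllowedValues": ["dev", "qa", "prod"],
            "Default": "dev",
        },
        "ImageId": {"Type": "AWS::EC2::Image::Id"},
    },
    "Conditions": {
        "IsProd": {"Fn::Equals": [{"Ref": "EnvType"}, "prod"]},
    },
    "Resources": {
        "AppServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "ImageId"},
                # Same resource in every environment; only the size differs.
                "InstanceType": {"Fn::If": ["IsProd", "m5.large", "t3.micro"]},
            },
        },
    },
}


def instance_type_for(env_type: str) -> str:
    """Mimic locally how Fn::If would resolve for a given EnvType value."""
    expected = TEMPLATE["Conditions"]["IsProd"]["Fn::Equals"][1]
    _, if_true, if_false = TEMPLATE["Resources"]["AppServer"]["Properties"][
        "InstanceType"
    ]["Fn::If"]
    return if_true if env_type == expected else if_false
```

Because every environment is created from this one template, differences between them are limited to what the conditions explicitly allow.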

If you need additional insight about the changes CloudFormation is planning to perform when it updates a stack, you can use a change set. Change sets will let you preview the changes, allow you to verify they are in line with expectations, and then approve the update before proceeding.

This is the basic workflow for using change sets:

Create a change set by submitting changes for the stack that you want to update.
View the change set to see which stack settings and resources will change.
If you want to consider other changes before you decide which changes to make, create additional change sets.
Execute the change set. CloudFormation updates your stack with those changes.
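To give a feel for what a change set preview contains, the toy function below compares the Resources sections of two templates and labels each difference as Add, Modify, or Remove. It is only a local illustration: it is not how CloudFormation computes change sets, and it ignores parameters, replacement behaviour, and dependencies.

```python
from typing import Any, Dict, List, Tuple


def preview_changes(
    current: Dict[str, Any], proposed: Dict[str, Any]
) -> List[Tuple[str, str]]:
    """Return (action, logical_id) pairs describing resource-level differences."""
    old = current.get("Resources", {})
    new = proposed.get("Resources", {})
    changes = []
    for logical_id in sorted(set(old) | set(new)):
        if logical_id not in old:
            changes.append(("Add", logical_id))
        elif logical_id not in new:
            changes.append(("Remove", logical_id))
        elif old[logical_id] != new[logical_id]:
            changes.append(("Modify", logical_id))
    return changes
```

In the real workflow you review the equivalent list in the console or API output before executing the change set.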

Layered Architecture

A common layered architecture pattern applies here too:

Front End
Back End
Internal: PaaS, databases, security groups, load balancers
Networking: VPC, subnets, internet gateway, VPN, NAT
Identity: IAM users, groups, roles

Quick Start

The Quick Start is made of a CloudFormation template and associated scripts to create the environment in your AWS account.

It deals with all the bootstrapping and deploying on your behalf.

There is also a deployment guide that will tell you how everything was created.

You will be charged for the resources used to create and run this environment.

Elastic Beanstalk

An AWS product that provisions and operates the infrastructure and manages the application stack for you.

It is completely transparent and automatically scales your application.

Caching

@TODO

Elasticity, HA and Monitoring

@TODO

Decoupled Architectures

@TODO

Serverless

@TODO

RTO/RPO

@TODO

Optimization

@TODO

Exercises

https://github.com/awslabs/aws-cloudformation-templates/tree/master/community/solutions/StaticWebSiteWithPipeline

How to run

Create an IAM group with at least EC2, S3, VPC, and Route 53 permissions.
Add your user to that group.

References

Author

javigs82

License

MIT
