@ross-humphrey
Last active January 30, 2020 08:33
AWS Well Architected Framework Notes
Help identify best practices and core strategies for architecting systems in the cloud.
The well architected framework is a set of foundational questions to understand if an architecture
aligns well with cloud best practices.
AWS offers the AWS Well-Architected Tool (AWS WA Tool), which can be used to review and measure your architecture
using the Well-Architected Framework. The AWS Well-Architected Labs provide a repository of code and
documentation to give hands-on experience implementing best practices.
The Five Pillars of the AWS Well-Architected Framework
Operational Excellence:
> Ability to run and monitor systems to deliver business value and continually improve supporting processes
and procedures
Security:
> The ability to protect information, systems and assets - whilst delivering business value through risk
assessments and mitigation strategies
Reliability:
> Ability of a system to recover from infrastructure disruptions, dynamically acquire compute resources
to meet demand and mitigate disruptions such as misconfigurations or transient network issues.
Performance Efficiency:
> Ability to use compute efficiently to meet requirements and maintain efficiency as demand changes
and technology evolves
Cost Optimization:
> Ability to run systems to deliver business value at lowest price point.
Capabilities are distributed amongst teams, without a central core team. To mitigate standards issues:
> Practices - focused on enabling each team to have capability + experts on teams
> Mechanisms - carry out automated checks to ensure standards are being met.
"Customer obsessed teams build products in response to customer need"
General Design Principles:
> Stop guessing your capacity needs
> Test systems at production scale
> Automate to make architectural experimentation easier
> Allow for evolutionary architecture
> Drive architectures using data
> Improve through game days (simulating events in production)
Five Pillars:
Operational Excellence: (Key service is AWS CloudFormation)
> Perform operations as code
> Annotate documentation
> Make frequent, small, reversible changes
> Refine operations procedures frequently
> Anticipate failure (pre-mortem)
> Learn from all operational failures
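As a rough illustration of "perform operations as code", the sketch below builds a minimal CloudFormation template as a Python dict and serializes it to JSON so it can live in version control. The logical ID `NotesBucket` and the helper `build_template` are hypothetical names chosen for this example; nothing here is deployed, and a real template would need whatever properties your workload requires.

```python
import json


def build_template(bucket_logical_id: str = "NotesBucket") -> dict:
    """Return a minimal CloudFormation template as a plain dict.

    bucket_logical_id is a hypothetical name used only for this example.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template: one versioned S3 bucket",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Versioning keeps changes to objects reversible.
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


template = build_template()
# Sorted keys give stable diffs when the file is committed.
template_json = json.dumps(template, indent=2, sort_keys=True)
print(template_json)
```

Keeping the template as reviewable text is what makes frequent, small, reversible changes practical: each change is a diff that can be inspected and rolled back.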
Three best practice areas for operational excellence:
> Prepare (AWS Config + Config rules = Create Standards)
> Operate (Amazon CloudWatch)
> Evolve (Amazon Elasticsearch Service) - analyze log data to gain insights
Security:
> Implement a strong identity foundation (principle of least privilege)
> Enable traceability
> Apply security at all layers (defense in depth i.e - edge, VPC, subnet, load balancer, os and app)
> Automate security best practice (managed as code in version-controlled templates)
> Protect data in transit and at rest
> Keep people away from data
> Prepare for security events
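To make the least-privilege principle concrete, here is a minimal sketch that writes an IAM policy document as data and runs a simple automated check over it. The bucket name, the `Sid`, and both helper functions are assumptions made for illustration; the check only looks for the bare `*` wildcard and is far from a complete policy linter.

```python
def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build a policy that allows reading objects from one bucket only.

    bucket_name is a placeholder supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadObjectsOnly",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        ],
    }


def overly_broad_statements(policy: dict) -> list:
    """Return Allow statements whose Action or Resource is the bare '*'."""
    flagged = []
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        # IAM accepts either a single string or a list for these fields.
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        if "*" in actions or "*" in resources:
            flagged.append(stmt)
    return flagged
```

A check like this could run as one of the automated mechanisms mentioned earlier, for example before a template is merged.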
Five best practice areas for security in cloud
> Identity and access management (Define users, groups, services and roles and take action in account)
> IAM
> AWS Organizations
> Detective Controls - Identify a potential threat or security incident; make sure you're doing good log management
> AWS CloudTrail
> AWS Config (Inventory)
> Amazon GuardDuty - threat detection service
> Amazon CloudWatch
> Infrastructure protection - Use Amazon VPC to create a private, secured and scalable environment
> VPC
> CloudFront (Content Delivery Network)
> AWS Shield (DDoS mitigation)
> AWS WAF - web application firewall on CloudFront or Application Load Balancer
> Data protection - Classify and protect data + data in transit.
> ELB
> EBS
> S3
> RDS
(all include encryption capabilities)
> AWS KMS - create and control keys
> Incident Response - Logging and trigger tools that automate responses through AWS APIs. 'Clean rooms' can be
created using AWS CloudFormation
> IAM
> CloudFormation + CloudWatch Events to trigger automated responses in Lambda
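The automated-response idea can be sketched as a Lambda-style handler that receives an event and decides what to do. This is only an outline under assumptions: the event is expected to carry `source` and `detail` keys in the usual CloudWatch Events envelope, the `severity` field and the threshold value are chosen for this example, and the handler returns a decision instead of calling any AWS API.

```python
SEVERITY_THRESHOLD = 7.0  # hypothetical cut-off chosen for this example


def handler(event, context):
    """Lambda-style entry point: decide how to respond to a finding event."""
    if event.get("source") != "aws.guardduty":
        return {"action": "ignore", "reason": "unexpected source"}

    detail = event.get("detail", {})
    severity = float(detail.get("severity", 0))

    if severity >= SEVERITY_THRESHOLD:
        # A real function would call AWS APIs here, for example to
        # isolate the affected resource. This sketch only reports intent.
        return {"action": "isolate", "severity": severity}
    return {"action": "log_only", "severity": severity}
```

Returning a plain decision keeps the logic testable locally; the side effects (tagging, isolating, notifying) would be added around it.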
Reliability:
Ability for a system to recover from infrastructure or service disruptions, acquire compute and mitigate disruption
Design Principles:
> Test recovery procedures
> Automatically recover from failure
> Scale horizontally to increase aggregate system availability
> Stop guessing capacity
> Manage change in automation
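One common way to recover automatically from transient failures, such as the network issues mentioned in the reliability definition, is to retry with exponential backoff and jitter. The sketch below is a generic pattern, not an AWS API; `TransientError`, `call_with_retries`, and the default values are hypothetical names and numbers picked for the example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_retries(operation, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on TransientError with backoff and jitter.

    sleep is injectable so the function can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Exponential backoff with full jitter: wait a random time
            # between 0 and base_delay * 2^(attempt - 1).
            delay = random.uniform(0, base_delay * 2 ** (attempt - 1))
            sleep(delay)
```

Jitter spreads retries out so that many clients recovering at once do not hit the dependency in lockstep.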
Best Practice areas for reliability in cloud:
Key service is CloudWatch
> Foundations (AWS covers most of these, but things like data connection bandwidth are examples you still need to plan for)
> IAM
> VPC
> AWS Cloud
> AWS Trusted Advisor
> AWS Shield (managed DDoS protection service)
> Change Management
> CloudTrail
> AWS Config
> AWS Auto Scaling
> Failure Management
> CloudFormation
> S3 (highly durable)
> Glacier (durable archives)
> KMS
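The Auto Scaling entry under Change Management, together with the "stop guessing capacity" principle, can be illustrated with a simplified proportional scaling rule. This is an assumption-laden sketch and not the algorithm AWS Auto Scaling actually uses: `desired_capacity` and its parameters are hypothetical, and the rule simply scales the fleet so the per-instance metric moves toward a target.

```python
import math


def desired_capacity(current, metric_value, target_value, min_size, max_size):
    """Propose a fleet size that brings the average metric toward the target.

    Simplified proportional rule for illustration only.
    """
    if current <= 0 or target_value <= 0:
        raise ValueError("current and target_value must be positive")
    # If each instance is at 90% and the target is 60%, we need 1.5x as many.
    proposed = math.ceil(current * metric_value / target_value)
    # Never go outside the configured bounds of the group.
    return max(min_size, min(max_size, proposed))
```

For example, four instances averaging 90% CPU against a 60% target would be scaled to six, while the same fleet at 15% would shrink only as far as the configured minimum.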
Performance Efficiency:
> Meet system requirements by using compute efficiently
Design Principles:
> Democratize advanced technologies (technology as a service)
> Go global in minutes
> Use serverless architectures (storage as a web server for example)
> Experiment more often - virtual and automatable resources
Four best practice areas for performance efficiency in the cloud
Take a data driven approach to selecting a high performance architecture - AWS Cloudwatch is key
> Selection - data driven approach for picking the services and compute types required
> Compute: Auto scaling
> Storage: EBS, S3
> Database: RDS, DynamoDB
> Network - Route 53 = latency-based routing. VPC endpoints and AWS Direct Connect reduce network distance/jitter
> Review - Once implemented evolve the workload with the new technologies offered by AWS
> AWS blog/ What's new
> Monitoring - use metrics and alarms. Automate responses to issues using triggers through Kinesis, SQS and Lambda
> CloudWatch
> Lambda to trigger actions
> Tradeoffs
> Amazon ElastiCache, Amazon CloudFront, AWS Snowball, Read replicas in Amazon RDS to scale heavy workloads
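The Monitoring area above relies on metrics and alarms. As a rough model of how a threshold alarm might be evaluated, the sketch below checks how many of the most recent datapoints breach a threshold. The function and parameter names are assumptions for this example, the state names mirror the ones CloudWatch alarms report, and real alarm evaluation has more options (missing-data handling, other comparison operators) than shown here.

```python
def alarm_state(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Return 'ALARM', 'OK' or 'INSUFFICIENT_DATA' for a greater-than alarm.

    datapoints is ordered oldest to newest, one value per period.
    """
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    # Only the most recent evaluation_periods values are considered.
    recent = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in recent if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"
```

Requiring several breaching datapoints, not just one, is a tradeoff in itself: it reduces noise from short spikes at the cost of slower detection.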
Cost Optimization
Design Principles:
> Adopt a consumption model: Pay only for what you require, and scale usage up or down with business requirements
> Measure overall efficiency: Measure outputs of workload and cost
> Stop spending money on data center ops (AWS does this for you)
> Analyze and attribute expenditure
> Use managed and app level services to reduce cost of ownership
Four best practice areas for cost optimization:
> Expenditure awareness
> AWS Cost Explorer
> AWS Budgets - alerts if actual or forecasted spend exceeds the budgeted amount
> Cost-effective resources: Use time vs cost as a marker on what to use
> Cost Explorer - Reserved Instance recommendations
> Matching supply and demand: Auto scaling is best way of achieving this
> Auto scaling
> Optimizing over time - Does new service offer significant cost savings?
> AWS Blog and What's new
> Trusted Advisor
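The expenditure-awareness idea behind AWS Budgets, alerting on actual or forecasted spend, can be approximated in a few lines. This sketch does not call AWS Budgets; the function names are hypothetical and the forecast is a naive linear projection, which is an assumption of this example and much cruder than a real forecasting model.

```python
def forecast_month_end(spend_to_date, day_of_month, days_in_month):
    """Naive linear projection of month-end spend from spend so far."""
    if day_of_month <= 0:
        raise ValueError("day_of_month must be positive")
    return spend_to_date / day_of_month * days_in_month


def budget_alerts(budget, spend_to_date, day_of_month, days_in_month):
    """Return which alert types would fire: 'ACTUAL' and/or 'FORECASTED'."""
    alerts = []
    if spend_to_date > budget:
        alerts.append("ACTUAL")
    if forecast_month_end(spend_to_date, day_of_month, days_in_month) > budget:
        alerts.append("FORECASTED")
    return alerts
```

A forecasted alert is the more useful of the two for cost optimization, since it fires while there is still time to change course.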