Kubernetes Well Architected Review (WAR)

Customer: Midas Touch

Date: February 02, 2020

tags: `meeting` `kubernetes` `EKS` `WAR`

Intent:

Increase awareness of architectural best practices
Address foundational areas that are often neglected
Provide a consistent approach to evaluating architectures
Influence future architectures

Areas:

Application checklist for Kubernetes
Cluster ready checklist for Kubernetes
Operational considerations for Kubernetes

Standard WAR pillars:

Security
Cost Optimization
Operational Excellence
Performance
Reliability

Application Checklist for Kubernetes

Cluster Ready Checklist for Kubernetes

Operational Considerations for Kubernetes

Security

Includes the ability to protect information, systems, and assets while delivering business value through risk assessments and mitigation strategies

Apply security at all layers
Enable traceability
Implement a principle of least privilege
Focus on securing your system
AWS Shared Responsibility Model
Automate security best practices
- Detective Controls
- Infrastructure Protection
- Data Protection
- Incident Response

Reliability

The ability of a system to recover from infrastructure or service failures, dynamically acquire computing resources to meet demand, and mitigate disruptions such as misconfigurations or transient network issues

Test recovery procedures
Automatically recover from failure
Scale horizontally to increase aggregate system availability
Stop guessing capacity
Manage change using automation
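
For discussion, "scale horizontally" and "stop guessing capacity" can be made concrete with the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas * current metric / target metric), skipped when the ratio is within a tolerance (0.1 by default). The sketch below is illustrative only; the function name and parameters are hypothetical and it is not the autoscaler's actual code.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Approximate the documented HPA scaling rule (illustrative sketch).

    All names here are hypothetical; the real autoscaler also handles
    missing metrics, pod readiness, and min/max replica bounds.
    """
    if current_replicas <= 0 or target_metric <= 0:
        raise ValueError("current_replicas and target_metric must be positive")

    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up.
    return math.ceil(current_replicas * ratio)
```

For example, 4 replicas averaging 200m CPU against a 100m target would scale to 8, while the same 4 replicas at 105m would stay at 4 because the ratio falls inside the tolerance.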

Performance

The ability to use computing resources efficiently to meet system requirements, and to maintain that efficiency as demand changes and technologies evolve

Democratize advanced technologies
Go global in minutes
Use serverless architectures
Experiment more often
Mechanical sympathy

Cost Optimization

The ability to avoid or eliminate unneeded cost or suboptimal resources while meeting your functional requirements

Cost-effective resources
Matching supply with demand
Expenditure awareness
Optimizing over time
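
One way to ground "matching supply with demand" during the review is to compare what each workload requests with what it actually uses. The sketch below assumes usage figures have already been collected from a metrics source; the WorkloadUsage type, the overprovisioned function, the 50% threshold, and the sample numbers are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WorkloadUsage:
    # Hypothetical record: CPU values are in millicores.
    name: str
    cpu_requested_millicores: int
    cpu_used_millicores: int


def overprovisioned(workloads: List[WorkloadUsage],
                    min_utilization: float = 0.5) -> List[Tuple[str, float]]:
    """Return (name, utilization) for workloads using less than
    min_utilization of their requested CPU. The threshold is an assumption."""
    flagged = []
    for w in workloads:
        if w.cpu_requested_millicores <= 0:
            continue  # no request set, so there is nothing to compare against
        utilization = w.cpu_used_millicores / w.cpu_requested_millicores
        if utilization < min_utilization:
            flagged.append((w.name, round(utilization, 2)))
    return flagged


# Illustrative values only, not measurements from the customer's cluster.
sample = [
    WorkloadUsage("api", 1000, 200),
    WorkloadUsage("worker", 500, 400),
]
candidates = overprovisioned(sample)
```

Workloads flagged this way are candidates for smaller requests, which ties into the "Optimizing over time" item above.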

Operational Excellence

The ability to run and monitor systems to deliver business value and to continually improve supporting processes and procedures

Preparation
Operation
Response

What best practices for cloud operations are you using?
How are you doing configuration management for your workload?
How are you evolving your workload while minimizing the impact of change?
How do you monitor your workload to ensure it is operating as expected?
How do you respond to unplanned operational events?
How is escalation managed when responding to unplanned operational events?
