Foundations of EC2 (Elastic Compute Cloud)

Amazon EC2 (Elastic Compute Cloud)

Credits

Video: https://www.youtube.com/watch?v=kMMybKqC2Y0

Overall layout of EC2

  • Each AWS region contains multiple availability zones

  • Each availability zone has its own networking domain, power domain, and cooling subsystems

  • Each availability zone consists of multiple datacenter buildings

  • The racks of compute servers deployed in these datacenters often hold dozens of servers each

  • Each server has 4 critical hardware resources that are provisioned as instances

    • Compute
    • Memory
    • Storage
    • Networking
  • A hypervisor then slices up these resources and provides them to you as virtual machines, which we call instances (see the sketch below)
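
The same layout is visible through the EC2 API. Below is a minimal sketch, assuming boto3 (the AWS SDK for Python) is installed and credentials are configured; the region name is only an example.

```python
import boto3

# List the availability zones this account can use in one region.
ec2 = boto3.client("ec2", region_name="us-east-1")  # example region

for zone in ec2.describe_availability_zones()["AvailabilityZones"]:
    print(zone["ZoneName"], "-", zone["State"])
```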

EC2 Instance Characteristics

  • CPU
  • Memory
  • Storage
  • Network Performance

m5d.xlarge

  • M: Instance family
  • 5: Instance generation
  • d: Additional capabilities: variations on the core instance, e.g. local instance storage or processor options (Intel vs. AMD)
  • xlarge: Instance size
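
The four characteristics above can be read back from the EC2 API for any instance type. A minimal sketch, assuming boto3 is installed and credentials/region are configured; m5d.xlarge is used purely as an example.

```python
import boto3

ec2 = boto3.client("ec2")
info = ec2.describe_instance_types(InstanceTypes=["m5d.xlarge"])["InstanceTypes"][0]

# CPU, memory, local storage and network performance for this type.
print("vCPUs:  ", info["VCpuInfo"]["DefaultVCpus"])
print("Memory: ", info["MemoryInfo"]["SizeInMiB"], "MiB")
# m5d types carry local NVMe storage; EBS-only types omit InstanceStorageInfo.
print("Storage:", info["InstanceStorageInfo"]["TotalSizeInGB"], "GB local NVMe")
print("Network:", info["NetworkInfo"]["NetworkPerformance"])
```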

Amazon Machine Images (AMIs)

What runs on these EC2 instances? That's defined by the AMI.

AMI contains:

  • Host operating system
  • Hardware drivers needed for the EC2 instance to function at optimal performance

AMIs can be:

  • Amazon maintained

    • Broad set of Linux and Windows images
    • Kept up to date by Amazon in each region
    • Amazon Linux 2 with five years of long-term support (see the sketch after this list)
  • Marketplace maintained

    • Managed and maintained by AWS marketplace partners
  • Your own machine images

    • AMIs you have created
    • Can keep private, share with other accounts, or publish to the community
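
For the Amazon-maintained images mentioned above, AWS publishes public SSM parameters that resolve to the latest AMI ID per region. A hedged sketch, assuming boto3 is installed, credentials/region are configured, and the Amazon Linux 2 parameter path below (taken from AWS documentation) is still current.

```python
import boto3

ssm = boto3.client("ssm")

# Public parameter that tracks the latest Amazon Linux 2 AMI in this region.
param = ssm.get_parameter(
    Name="/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"
)
print("Latest Amazon Linux 2 AMI:", param["Parameter"]["Value"])
```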

EC2 instances

General Purpose workloads

Examples:

  • Web/App servers
  • Enterprise apps
  • Gaming servers
  • Caching fleets
  • Analytics applications
  • Dev/Test environments

Types:

  • M5 instances

    • Balance of compute, memory and network resources
    • "4:1" memory to vCPU ratio
  • Realisation: Most instances aren't very busy, which gave birth to T3 instances

  • T3 instances

    • Baseline level of CPU performance with the ability to burst above the baseline for workloads that don't require sustained performance
    • T3 instances accumulate CPU credits while a workload operates below the baseline. Each earned credit lets the instance burst with the performance of a full CPU core for one minute when needed. In Unlimited mode, a T3 instance can burst at any time for as long as required (see the sketch after this list).
  • A1 instances

    • Workloads that can scale out across multiple cores, fit within the available memory, and run on the Arm instruction set
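
As a concrete illustration of Unlimited mode, here is a hedged sketch that would launch a T3 instance with unlimited CPU credits. The AMI ID is a placeholder, boto3 and credentials/region are assumed to be configured, and running this for real creates (and bills for) an instance.

```python
import boto3

ec2 = boto3.client("ec2")

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI ID, replace with a real one
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    # "unlimited" lets the instance burst past its baseline even with no credit balance;
    # "standard" caps bursting at the accumulated credits.
    CreditSpecification={"CpuCredits": "unlimited"},
)
print("Launched:", resp["Instances"][0]["InstanceId"])
```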

Memory Intensive workloads

Examples:

  • In-memory caches
  • High performance databases
  • Big data analytics

Types:

  • R5 instances

    • Accelerate performance for workloads that process large data sets in memory
    • "8:1" memory to vCPU ratio
  • X1 / X1e instances

    • For memory-intensive workloads and very large in-memory workloads
    • "16:1" and "32:1" memory to vCPU ratio
  • High memory instances

    • Extreme memory needs
    • Certified to run SAP HANA
    • From 6 to 24 TB of memory
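
The memory-to-vCPU ratios quoted above can be checked directly from the API. A small sketch, assuming boto3 and credentials/region are configured; the two sizes are examples only.

```python
import boto3

ec2 = boto3.client("ec2")
for itype in ["m5.xlarge", "r5.xlarge"]:
    info = ec2.describe_instance_types(InstanceTypes=[itype])["InstanceTypes"][0]
    gib = info["MemoryInfo"]["SizeInMiB"] / 1024
    vcpus = info["VCpuInfo"]["DefaultVCpus"]
    # m5.xlarge works out to 4:1, r5.xlarge to 8:1
    print(f"{itype}: {gib:.0f} GiB / {vcpus} vCPUs = {gib / vcpus:.0f}:1")
```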

Compute Intensive workloads

These workloads are sensitive to how fast the CPU can run, so the instances are optimized for the frequency the CPU runs at.

Examples:

  • Batch processing
  • Distributed analytics
  • High performance computing (HPC)
  • Ad serving
  • Multiplayer gaming
  • Video encoding

Types:

  • C5 instances

    • High performance at a low price per vCPU
    • 2:1 memory-to-vCPU ratio
  • z1d instances

    • High single thread performance
    • Fastest processor in the cloud at 4.0 GHz
    • 8:1 memory-to-vCPU ratio

Storage-intensive workloads

Examples:

  • High IO

    • High performance databases
    • Real-time analytics
    • Transactional workloads
    • NoSQL databases
  • Dense storage

    • Big data
    • Data warehousing
    • Kafka
    • HDFS
    • MapReduce
    • Log processing

Types:

  • I3/I3en instances

    • I/O optimized for high-transaction, low-latency workloads
  • D2 instances

    • Lowest cost per GB of storage ($/GB)
    • Supports high sequential disk throughput (see the sketch after this list)
  • H1 instances

    • Designed for applications that require low cost, high disk throughput, and high sequential disk I/O access to very large data sets
    • More vCPUs and memory per TB of disk than D2
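
The difference between I/O-optimized and dense-storage types shows up in their local disk layout. A small sketch, assuming boto3 and credentials/region are configured; the two sizes are examples only.

```python
import boto3

ec2 = boto3.client("ec2")
for itype in ["i3.xlarge", "d2.xlarge"]:
    info = ec2.describe_instance_types(InstanceTypes=[itype])["InstanceTypes"][0]
    storage = info["InstanceStorageInfo"]
    disk = storage["Disks"][0]
    # I3 exposes NVMe SSDs for IOPS; D2 exposes large HDDs for sequential throughput.
    print(f"{itype}: {storage['TotalSizeInGB']} GB total "
          f"({disk['Count']} x {disk['SizeInGB']} GB {disk['Type']})")
```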

Accelerated Computing workloads

Applications that benefit from hardware acceleration.

  • What is hardware acceleration? Hardware acceleration refers to the process by which an application will offload certain computing tasks onto specialized hardware components within the system, enabling greater efficiency than is possible in software running on a general-purpose CPU alone.

Examples:

  • Machine learning / AI

    • Image and Video recognition
    • NLP
    • Autonomous vehicle systems
    • Personalization & Recommendation
  • High performance computing

    • Computational Fluid Dynamics
    • Financial and Data Analytics
    • Genomics
    • Computational chemistry
  • Graphics

    • Virtual Graphic workstation
    • 3D Modeling and Rendering
    • Video Encoding
    • AR / VR

CPUs vs GPUs vs FPGAs vs ASICs for compute acceleration

  • CPU

    • 10s - 100s of processing cores
    • Pre-defined instruction set and datapath widths
    • Optimized for general purpose computing
  • GPU

    • 1,000s of processing cores
    • Pre-defined instruction set and datapath widths
    • Highly effective at parallel execution
  • FPGA (Field Programmable Gate Arrays)

    • Millions of programmable digital logic cells
    • Blank silicon that is user programmable
    • No predefined instruction set or datapath widths
    • Hardware timed execution
  • ASICs (Application Specific Integrated Circuits)

    • Chips that are not general purpose but have been custom-designed for a particular application type
    • Optimized & custom design for particular use / function
    • Predefined software experience exposed through API

Types:

  • P-series (P2/P3 instances)

    • GPU compute instances for use cases including deep learning training, HPC simulations, financial computing, and batch rendering (see the sketch after this list)
    • Feature the latest NVIDIA high-end GPUs, including the Volta V100
  • G-series (G3/G4 instances)

    • GPU graphics instances designed for workloads such as 3D rendering, remote graphics workstations, video encoding, and AR/VR
    • Feature NVIDIA mid-range GPUs such as the Turing T4, with GRID Virtual Workstation features and licensing
  • FPGA instances (F1 instances)

    • Customer programmable FPGAs that provide dramatic performance improvements for applications such as financial computing, genomics, accelerated search, and image processing
    • Feature Xilinx Virtex UltraScale+ VU9P FPGAs in a single instance
    • Programmable via VHDL, Verilog, or OpenCL
  • Inf1 instances

    • 40% lower cost per inference than any Amazon EC2 GPU instance
    • 2x higher inference throughput with up to 2,000 TOPS at sub-millisecond latency
    • Integration with popular ML frameworks such as TensorFlow, PyTorch and MXNet
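
For the GPU-backed P- and G-series, the attached accelerator can also be inspected via the API. A small sketch, assuming boto3 and credentials/region are configured; the two instance types are examples only.

```python
import boto3

ec2 = boto3.client("ec2")
for itype in ["p3.2xlarge", "g4dn.xlarge"]:
    info = ec2.describe_instance_types(InstanceTypes=[itype])["InstanceTypes"][0]
    gpu = info["GpuInfo"]["Gpus"][0]
    print(f"{itype}: {gpu['Count']} x {gpu['Manufacturer']} {gpu['Name']}, "
          f"{gpu['MemoryInfo']['SizeInMiB']} MiB GPU memory")
```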

EC2 Bare Metal

  • No hypervisor, access to the entire server
  • You can load your own hypervisor, if that is what you need
  • Run bare metal workloads on EC2 with all the elasticity, security, scale and services of AWS
  • Designed for workloads that are not virtualized, require specific types of hypervisors, or have licensing models that restrict virtualization
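
To see which bare metal instance types a region offers, the instance-type API can be filtered on the documented "bare-metal" attribute. A hedged sketch, assuming boto3 and credentials/region are configured.

```python
import boto3

ec2 = boto3.client("ec2")
paginator = ec2.get_paginator("describe_instance_types")

# List every instance type in the current region that runs without the hypervisor.
for page in paginator.paginate(Filters=[{"Name": "bare-metal", "Values": ["true"]}]):
    for itype in page["InstanceTypes"]:
        print(itype["InstanceType"])
```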

It all started with the Nitro platform

Modular building blocks for rapid design and delivery of EC2 instances:

  • Nitro Card
    • Local NVMe storage
    • Amazon Elastic Block Store (EBS)
    • Networking, monitoring and security
  • Nitro Security Chip
    • Integrated into motherboard
    • Protects hardware resources
  • Nitro Hypervisor
    • Lightweight hypervisor
    • Memory and CPU allocation
    • Bare metal-like performance

Purchase Options

  • On demand

    • Pay for compute capacity by the second with no long term commitments
  • Reserved Instances

    • Make a 1- or 3-year commitment and receive a significant discount off On-Demand prices
    • In effect you are saying, "I'm committing to using M5 for the next year"
  • Savings Plan

    • Same great discounts as EC2 RIs with more flexibility
    • You commit to an amount of spend ($/hour) rather than to specific instances (see the sketch below)
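
A back-of-the-envelope comparison of the three options. Every number below is a hypothetical placeholder, not a real AWS rate; substitute the figures from the AWS pricing pages for your region and instance type.

```python
HOURS_PER_YEAR = 24 * 365

on_demand_rate = 0.20        # $/hour, hypothetical On-Demand rate
ri_discount = 0.40           # hypothetical 1-year Reserved Instance discount
sp_commitment = 0.13         # $/hour of committed spend, hypothetical Savings Plan

print(f"On-Demand:     ${on_demand_rate * HOURS_PER_YEAR:,.0f} per year")
print(f"Reserved:      ${on_demand_rate * (1 - ri_discount) * HOURS_PER_YEAR:,.0f} per year")
print(f"Savings Plan:  ${sp_commitment * HOURS_PER_YEAR:,.0f} per year of committed spend")
```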