Created
January 19, 2017 15:26
Notes from my AWS Training for the AWS Certified Developer Associate Exam
===== | |
===== | |
Intro | |
===== | |
===== | |
========== | |
Networking | |
========== | |
VPC - Virtual Private Cloud | |
- Think of it as your own data centre in AWS | |
Direct Connect | |
- Connect into AWS environment without internet connection. | |
- A dedicated connection from your data centre to AWS | |
Route53 | |
- Amazon's DNS | |
======= | |
Compute | |
======= | |
EC2 | |
- Virtual server | |
EC2 Container Service | |
- EC2 with Docker | |
Elastic Beanstalk | |
- Upload code and Amazon will provision servers etc | |
Lambda | |
- Run code without managing servers | |
- Only pay for compute time | |
======= | |
Storage | |
======= | |
S3 | |
- Store files | |
- Object based storage | |
- Heavily features on exam | |
Cloud Front | |
- Amazon CDN | |
- Links in with edge locations | |
- Heavily features on exam | |
Glacier | |
- Archival solution for data | |
EFS (Elastic File System) | |
- NAS in the cloud | |
- Centralise EC2 storage | |
- File storage (block storage is EBS) | |
- Not necessarily in exam | |
Snowball | |
- Amazon ships you a storage appliance, you load your data and send it back | |
- Storage box for petabyte level stuff | |
Storage Gateway | |
- On premises and cloud integration | |
- A VM you run in your data centre to copy | |
- Can be hot exam topic | |
========== | |
Databases | |
========== | |
RDS | |
- MySQL, Postgres etc | |
DynamoDB | |
- Most important for exam | |
- NoSQL database service | |
- The one to understand best | |
Elasticache | |
- In-memory cache for frequently run queries | |
- Redis and Memcached | |
Redshift | |
- Business intelligence service | |
DMS | |
- Database Migration Service | |
- migrate DB from legacy places to open source DBs (e.g. Oracle -> RDS) | |
========= | |
Analytics | |
========= | |
EMR | |
- Elastic Map Reduce | |
- Process big data | |
Data Pipeline | |
- Move data from one area to another | |
Elastic Search | |
- Elastic Search in the cloud | |
- Not in the exam | |
Kinesis | |
- Streaming data | |
- connected devices etc | |
- Has started appearing in exams | |
Machine Learning | |
- E.g. You bought x what about y | |
Quick Sight | |
- E.g. Like Cognos | |
- Cloud powered BI service | |
- Not currently in exam | |
====================== | |
Security and Identity | |
====================== | |
IAM | |
- Identity and Access Management | |
- Big part of exam | |
- Users and groups etc | |
- Password policies | |
Directory Service | |
- Diff types | |
Inspector | |
- Install agents on your systems to check them for vulnerabilities | |
- e.g. flags that remote root login is allowed | |
- Very new so not on exam | |
WAF | |
- New | |
Cloud HSM | |
- Hardware security module | |
- Not on Associate exam | |
KMS | |
- Key Management Service | |
================ | |
Management Tools | |
================ | |
Cloud Watch | |
- Monitoring AWS env | |
- Performance monitoring (CPU etc) | |
Cloud Formation | |
- Script your infrastructure | |
- Solution Architect and DevOps - likely to come up | |
Cloud Trail | |
- For auditing | |
- Logs changes to your env | |
Opsworks | |
- Config management service | |
- Uses chef | |
- Comes up in exam | |
Config | |
- New | |
- Not in exam | |
- AWS resource inventory | |
- Config rules that check your AWS resources (e.g. EBS volumes are encrypted) | |
- Traffic light system | |
Service Catalog | |
- Not on exam | |
- Keep track of approved IT systems for use on AWS | |
Trusted Advisor | |
- Can come up in the exam | |
- Automated service that scans your environment and advises you: | |
- Save money on x | |
- Secure using y | |
==================== | |
Application Services | |
==================== | |
API Gateway | |
- Manage / Create and monitor APIs at any level | |
- Could appear in exam | |
AppStream | |
- Stream Windows apps from the cloud | |
- Fairly new | |
CloudSearch | |
- Managed service | |
- Search solution for your site | |
- Not on exam | |
Elastic Transcoder | |
- Media transcoding service in the cloud | |
- Transcode to formats for mobiles and PCs etc | |
SES | |
- Send transactional emails | |
- Can receive emails | |
- Can come up in the exam | |
SQS | |
- Good for decoupling | |
- Comes up on exam a lot | |
SWF | |
- Background tasks | |
- Amazon use it to collect your order and send it to you | |
- Comes up in exam | |
- Don't confuse with SQS | |
=============== | |
Developer Tools | |
=============== | |
CodeCommit | |
- Source control service | |
- Git managed for you | |
CodeDeploy | |
- Automates deployments | |
- Can deploy to cloud EC2 or on premise | |
CodePipeline | |
- Continuous deployments | |
- Like Jenkins? | |
=============== | |
Mobile Services | |
=============== | |
Most not on exam | |
Mobile Hub | |
- Building and testing and monitoring your apps | |
Cognito | |
- Save mobile user data | |
- No need to manage back end | |
Device Farm | |
- Test against real smartphones | |
Mobile Analytics | |
- New / old users | |
- Track | |
SNS | |
- Will be on the exam | |
- Send notifications from the cloud | |
======================= | |
Enterprise Applications | |
======================= | |
Workspaces | |
- Know what it is at a high level | |
- VDI | |
- Desktop in cloud | |
WorkDocs | |
- Secure enterprise storage services | |
- Dropbox for enterprise | |
- Not on exam | |
WorkMail | |
- Exchange equivalent | |
- Not on exam | |
Internet of Things | |
- Very new and broad | |
GameLift | |
- Managed service for deploying and operating session-based multiplayer games | |
Elastic Load Balancing | |
- Is this needed? Not in course materials? | |
============ | |
============ | |
Main Content | |
============ | |
============ | |
=== | |
IAM | |
=== | |
Intro | |
- Centralised control of AWS account | |
- Shared access to your AWS account | |
- Granular rights and permissions | |
- Identity federation using third parties like FB or Active Directory | |
- Multi factor authentication | |
- Provide temporary access to users and services | |
- Password policies | |
- Integrates with AWS tools | |
- User: End user | |
- Group: collection of users under one set of permissions (e.g. HR) | |
- Roles: created and assigned to AWS resources so they can call other services (e.g. a role that lets EC2 read from S3) | |
- Policies: JSON document that defines one or more permissions. Can attach to a user, group or role | |
- PCI DSS compliant | |
- By default new users don't have any permissions | |
- Not recommended to access via root account normally | |
- Roles allow one area of AWS to access resources in another area of AWS | |
Active Directory Federation | |
- ADFS server inside corporate network | |
- You get a SAML assertion | |
- Point to AWS to log into AWS Console | |
- Can authenticate with Active directory using SAML | |
- Remember you authenticate against AD first then given a temporary login | |
Web Identity Federation with Mobile Applications | |
- Can authenticate against 3rd parties | |
- Outside scope of exam | |
- Web Identity Facebook playground available | |
- ARN - Amazon Resource Name | |
- Call FB | |
- You get temp security credentials | |
- Call AssumeRoleWithWebIdentity | |
- Don't need to know how to do this using programming | |
What have we learnt so far? | |
- IAM is universal and is not region specific | |
- Root account is the account created when you set up AWS account | |
- New users have no permissions | |
- Access Key ID and Secret Access Key can also be created for new users | |
- Used for API calls, not for logging into the console | |
- Shown only once at creation; if lost, generate new ones | |
- Always set up MFA on root account | |
- Can create custom rotation policies | |
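To make the "policy is a document of permissions" point concrete, here is a minimal sketch of what such a document can look like, built in Python and printed as JSON. The bucket name `example-notes-bucket` and the `Sid` value are hypothetical placeholders, not taken from the course.

```python
import json

# A small identity-based policy: read-only access to a single S3 bucket.
# The bucket name and Sid below are made-up placeholders for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOnlyNotesBucket",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-notes-bucket",
                "arn:aws:s3:::example-notes-bucket/*",
            ],
        }
    ],
}

# Serialise to the JSON text you would paste into the policy editor.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The same document could be attached to a user, a group or a role, which is why policies are kept separate from the identities that use them.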
=== | |
EC2 | |
=== | |
What is it? | |
- Quick servers online | |
- No need to manage physical hardware | |
- Provision as and when you need | |
Pricing Options | |
- On Demand | |
- Pay fixed rate by hour with no commitment | |
- Users that want low cost and flexibility with no up front costs | |
- Applications with short term or spiky or unpredictable workloads that cannot be interrupted | |
- Applications being developed or tested on Amazon EC2 for the first time | |
- Reserved | |
- Reserve an instance for a one to three year term | |
- Massive discounts | |
- Applications with steady state and predictable usage | |
- Able to make up front payment costs | |
- Spot | |
- Price moves around | |
- You set a bid price: the most you are willing to pay per hour | |
- For flexible start and end times | |
- Good for Hadoop style processing | |
- Can check spot prices online | |
- Scenarios will be given | |
- Flexible start and end times | |
- Users with urgent computing needs for large amounts of additional capacity | |
- If spot is terminated by you, you will be charged for partial hour of usage. If Amazon kill the instance you won't be charged for that partial hour. | |
- What is the most commercially feasible: Spot | |
- No down time: On Demand | |
Instance Types | |
- D for density | |
- I for IOPS | |
- R for RAM | |
- T for cheap general purpose (T2 micro) | |
- M for main choice for general purpose apps | |
- C for compute | |
- G for Graphics | |
- + 3 more from latest Keynote: | |
- X - Extreme memory (memory optimised) | |
- F - FPGA | |
- P - GPU compute (parallel processing) | |
=== | |
EBS | |
=== | |
- Elastic block storage | |
- Can attach these to EC2 instances | |
- Disk in the cloud | |
- Operating system etc on it | |
- Can attach multiple EBS to one EC2 | |
- Can't attach one EBS to multiple EC2 | |
- Types: | |
- General purpose SSD(GP2) | |
- Designed for 99.999% availability | |
- 3 IOPS per Gig | |
- Can burst to 3000 IOPS for short periods | |
- Provisioned IOPS SSD (IO1) | |
- Designed for I/O intensive applications | |
- If you need more than 10,000 IOPS | |
- Magnetic (Standard) | |
- Lowest cost per Gig | |
- Workload where data is accessed infrequently | |
- Where storage cost is important | |
- Good for file servers | |
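The GP2 numbers above (3 IOPS per Gig, burst to 3000) can be turned into a quick calculation. This is only a sketch of the rule as written in these notes; the function names are made up, and any service-side minimums or maximums are deliberately left out because the notes do not mention them.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a General Purpose SSD volume at 3 IOPS per GiB."""
    return 3 * size_gib


def gp2_can_burst(size_gib: int, burst_ceiling: int = 3000) -> bool:
    """Bursting only helps while the baseline is below the burst ceiling."""
    return gp2_baseline_iops(size_gib) < burst_ceiling


# Print size, baseline IOPS and whether bursting would add anything.
for size in (100, 500, 1000):
    print(size, gp2_baseline_iops(size), gp2_can_burst(size))
```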
========= | |
Exam Tips | |
========= | |
- Know differences between pricing models | |
- Know EBS types | |
======================== | |
======================== | |
Storage | |
======================== | |
======================== | |
<Some sections missing here, sorry> (???) | |
=============== | |
Storage Gateway | |
=============== | |
- Asynchronous sync | |
- Copies data from your on premise data centre to Storage Gateway | |
- Storage Gateway software is a virtual machine image that is installed on a host in on premise data centre | |
- Supports VMware ESXi or Microsoft Hyper-V | |
- Connect to your AWS account and then use AWS Management Console to create the storage gateway option suited to you | |
- 4 types of storage gateways | |
- File Gateway (NFS) - Flat files (pdf, word, etc.) for S3 | |
- Volume Gateway (iSCSI) - Block based storage, the kind of disk you would run an OS on | |
- Stored volumes - where you store entire data set on premise | |
- Cached volumes - Only store frequently accessed data on premise and rest goes to Amazon | |
- Tape Gateway (VTL) - Backup and archive solution (virtual tapes for backup) | |
- File Gateway | |
- Files stored in S3 buckets | |
- Accessed through NFS mount point | |
- Flat files | |
- Volume Gateway | |
- Block based storage | |
- Uses iSCSI block protocol | |
- Data written to these volumes can be asynchronously backed up as point in time snapshots | |
- Snapshots stored as Amazon EBS snapshots | |
- Snapshots are incremental (capture changed blocks) | |
- Snapshot storage is compressed to save you money | |
- Essentially virtual hard disks | |
- Two types: | |
- Stored volumes | |
- Entire data set stored locally | |
- Asynchronously back up to AWS | |
- Data written to the virtual hard disk is stored locally and to AWS | |
- 1GB - 16TB in size | |
- Cached volumes | |
- S3 is main store of data | |
- Only recently read data on premises | |
- 1GB - 32TB in size | |
- Tape Gateway | |
- Backups | |
- Virtual Tape Library | |
- Physical tapes replaced with virtual tape and you use same software | |
======== | |
Snowball | |
======== | |
- Data into AWS | |
- Used to be called import/export | |
- Import to S3 or export | |
- If data is in Glacier, will need to be moved to S3 | |
- Snowball | |
- 80TB snowballs in all regions | |
- Tamper resistant enclosures | |
- AES 256 | |
- Come with Kindles that track your snowball | |
- Securely erased after | |
- Snowball Edge | |
- 100TB of data | |
- Come with on-board storage and compute ability | |
- Can run Lambda functions | |
- Use case: Aeroplane - Data collection with lambda functions and then sent to AWS to be put on S3 | |
- Mini data centre in a box | |
- Snowmobile | |
- Peta and Exabyte levels | |
- Literally a truck | |
- 100PB of data transfer | |
======================== | |
S3 Transfer Acceleration | |
======================== | |
- Think reverse Cloudfront | |
- People upload to nearest edge location and then data gets transferred to your s3 bucket | |
- You get a distinct url: | |
- muhammed.s3-accelerate.amazonaws.com | |
=============== | |
Storage Summary | |
=============== | |
- S3 | |
- Object based (not for O/S) | |
- 0 Bytes to 5 TB | |
- Universal bucket name space | |
- s3-<region>.amazonaws.com | |
- Read after write consistency for PUTs of new objects | |
- Eventual consistency for overwrite PUTs and DELETEs | |
- Storage Classes: | |
- Standard - Immediate and durable, frequently accessed | |
- Infrequent Access (IA) - Durable, immediately available, for data accessed less often (retrieval fee applies) | |
- Reduced Redundancy Storage - Data that can be recreated | |
- Glacier - Archival (3 - 5 hour wait) | |
- Core: | |
- Key (name) | |
- Value (data) | |
- Version ID | |
- Metadata | |
- Access control lists | |
- Versioning | |
- Good backup | |
- Charged for all versions as using space | |
- Once enabled can only be suspended and cannot be disabled | |
- Integrates with lifecycle rules | |
- MFA on delete available | |
- Cross region replication requires versioning to be enabled on both the source and destination buckets | |
- Lifecycle management | |
- Can be used with or without versioning | |
- Transition to S3 IA | |
- Only files larger than 128KB | |
- Only after 30 days from creation | |
- Archive to Glacier | |
- 30 days after S3 IA | |
- Next day if not using IA | |
- Permanently delete | |
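The lifecycle timings above can be written down as a rule. The sketch below uses a dictionary shaped like an S3 lifecycle configuration as I understand it; the rule ID and the 365 day expiry are assumptions for illustration, and the `check_rule` helper is hypothetical (it only re-checks the 30 day gaps from these notes).

```python
import json

# One rule for the whole bucket: IA after 30 days, Glacier 30 days later,
# then delete. The rule ID and the 365 day expiry are made-up values.
lifecycle = {
    "Rules": [
        {
            "ID": "archive-then-delete",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = every object
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 60, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}


def check_rule(rule: dict) -> bool:
    """Check the timing constraints listed in the notes above."""
    days = {t["StorageClass"]: t["Days"] for t in rule["Transitions"]}
    ia_ok = days["STANDARD_IA"] >= 30                        # 30 days from creation
    glacier_ok = days["GLACIER"] - days["STANDARD_IA"] >= 30  # 30 days after IA
    return ia_ok and glacier_ok


print(json.dumps(lifecycle, indent=2))
print(check_rule(lifecycle["Rules"][0]))
```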
- Cloudfront | |
- Edge locations cache locations | |
- 78 edge locations as of today (19/01/17) | |
- Origin | |
- Source of files distributed by CDN | |
- S3 bucket | |
- EC2 instance | |
- Elastic Load Balancer | |
- Route53 | |
- Distribution | |
- Name given to collection of Edge locations that will do the distributing of files | |
- Two types: | |
- Web distribution - Typically used for websites | |
- RTMP - Used for media streaming | |
- Edge locations can be written to | |
- Objects cached for TTL (in seconds) | |
- Default is 24 hours | |
- Can invalidate cached objects on a distribution but you will be charged | |
- Security | |
- Buckets are private on creation | |
- Can set up access control to buckets using: | |
- Bucket policies (JSON policy applied to the bucket as a whole) | |
- Access control lists (properties in AWS console like viewable by everyone) | |
- Access logs can be written by S3 bucket to itself or another bucket | |
- Encryption | |
- In transit: | |
- SSL/ TLS (https) | |
- At rest: | |
- Server side: S3 Managed Keys - SSE-S3 | |
- AES 256 | |
- S3 managed keys | |
- Each object is encrypted using a unique key | |
- Key is encrypted using Master key | |
- Amazon manages this | |
- Server side: AWS KMS managed keys - SSE-KMS | |
- Similar to SSE-S3 | |
- Uses an envelope key (a key that protects your data's encryption key) | |
- Audit trail of who has used your keys and when | |
- Server side encryption with customer provided keys - SSE-C | |
- You manage keys and AWS just uses them | |
- Client side encryption | |
- Encrypted before hits AWS | |
- Storage Gateway: | |
- Gateway Stored: | |
- All cached locally | |
- Gateway Cached: | |
- All on S3 and recently accessed cached locally | |
- Gateway Virtual Tape Library (VTL): | |
- Backup using existing software | |
- Import / Export | |
- Into EBS | |
- Into S3 | |
- Into Glacier | |
- ONLY export from S3 | |
- Snowball | |
- Import to S3 | |
- Export from S3 | |
- Misc | |
- Elastic load balancers are not free (per hour and per Gb) | |
- Cloudformation, Elastic beanstalk, Autoscaling, Opsworks free but resources provisioned not | |
- Http | |
- 2xx - success | |
- 3xx - redirection | |
- 4xx - client errors | |
- 5xx - server errors | |
- S3 Links | |
- s3-<region>.amazonaws.com/mybucket | |
- mybucket.s3-website-<region>.amazonaws.com (static website hosting endpoint) | |
- S3 Data storage | |
- Objects are stored in alphabetical (lexicographical) key order | |
- Avoid starting many keys with the same prefix so load is spread out | |
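One common way to act on the "different starting key" advice is to put a short hash in front of each key. This is a sketch of that idea only; the notes do not say which technique the course had in mind, and the `spread_key` helper and the sample key names are hypothetical.

```python
import hashlib


def spread_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short hash so similar keys no longer share a leading prefix."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{key}"


# Date-based names would otherwise all begin with the same characters.
keys = [f"2017-01-19/log-{n:03d}.txt" for n in range(3)]
for k in keys:
    print(spread_key(k))
```

The hash is deterministic, so the same original name always maps to the same stored key and can be recomputed when reading the object back.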
========= | |
========= | |
Databases | |
========= | |
========= | |
- RDS - Managed relational database | |
- DynamoDB - Managed NoSQL database | |
- Elasticache - In memory database | |
- Redshift - Data warehousing | |
- DMS - Managed Database Migration Service | |
=== | |
RDS | |
=== | |
- For OLTP | |
- Relational Databases available on RDS: | |
- TODO Check this list on AWS | |
- SQL Server | |
- Oracle | |
- MySQL | |
- PostgreSQL | |
- Aurora | |
- MariaDB | |
======== | |
DynamoDB | |
======== | |
- Is in the exam! | |
- Document orientated | |
- Database: | |
- Collection - Table | |
- Document - Row | |
- Key Value Pairs - Fields | |
================ | |
Data Warehousing | |
================ | |
- Amazon Redshift | |
- For OLAP | |
- Used for BI | |
- For large and complex data sets | |
- OLTP (Online Transaction Processing) | |
e.g. Get Order 1234 | |
- OLAP (Online Analytical Processing) | |
e.g. Statistics such as # sold etc. | |
- Warehousing Databases use a totally different architecture | |
=========== | |
Elasticache | |
=========== | |
- Web services for in-memory cache in the cloud | |
- Fast caches for your web apps | |
- Supports: | |
- Memcached | |
- Redis | |
- Caches the most queried part of your DB | |
=== | |
DMS | |
=== | |
- Allows migration of prod database onto AWS | |
- AWS manages it all for you | |
- Sorts things like type transformation | |
- Even looks at changes that happened to your database whilst migration happening and copies them over | |
- Sorts parallel loading of prod database to AWS | |
- Can handle schema conversion including custom views and stored procedures | |
============ | |
DynamoDB 101 | |
============ | |
- Consistent performance | |
- Single digit millisecond latency | |
- Fully managed | |
- Supports: | |
- Document | |
- Key/Value | |
- SSD storage | |
- 3 geographically distinct data centres | |
- Written to one location and then replicated | |
- Two models for consistency: | |
- Eventual consistency (default) | |
- Consistency across all copies is usually reached within a second, so repeating a read after a short delay should return the updated data | |
- Strongly consistent reads | |
- Returns result that reflects all writes that received a successful response prior to the read | |
- Tables (???) | |
- Items (Rows) | |
- Attributes (Columns) | |
- Start with Primary Key - 32 levels of nesting allowed | |
- First 25 GB is free | |
- Read and write capacity units: | |
- Read capacity unit (item up to 4KB): | |
- Eventually consistent: 2 reads per second | |
- Strongly consistent: 1 read per second | |
- Write capacity unit: | |
- Item up to 1KB: 1 write per second | |
- Free tier | |
- 25 read units | |
- 25 write units | |
- Primary keys: | |
- Single attribute (think unique id) | |
- Called partition key / hash key | |
- one attribute | |
- Composite (multiple attributes) | |
- Made up of partition key & sort key (2 attributes) | |
- Partition key used internally by DynamoDB to figure out where on disk stored | |
- Partition Key and sort key | |
- Two items can have the same partition key but must have a different sort key | |
- All items with same partition key are stored together and sorted using sort key value | |
- Local secondary index | |
- Same partition key | |
- Different sort key | |
- Local secondary index cannot be created later, needs to be added on table creation | |
- Global secondary index | |
- Different partition key | |
- Different sort key | |
- Can be added later if needed | |
- Dynamo DB Streams | |
- Capture any kind of modification to DynamoDB | |
- e.g. - New item added | |
- Before/after attributes | |
- item deleted, catches item before item deleted | |
- Stored for 24 hours | |
- Can define triggers to use lambda functions to do whatever | |
- replicate data to another region | |
- Send email | |
- Query | |
- Finds items in a table using only primary key attribute | |
- You provide partition attribute name and a distinct value to search for | |
- Optionally provide a sort key and use comparison operators | |
- By default a query returns all of the data attributes (columns) | |
- Can use ProjectionExpression to only return some attributes | |
- Items are sorted by the sort key ASC by default (set ScanIndexForward to false for DESC) | |
- Scan | |
- Examines every item in the table | |
- Returns all the attributes similarly to a query | |
- Can use ProjectionExpression | |
- Queries are generally more efficient | |
- Scans read the whole table looking for the requested values and can use up the provisioned throughput in a single operation | |
- Avoid scans on large tables | |
- Design tables so you can use Query, GetItem or BatchGetItem instead | |
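The difference between Query and Scan is easier to see with a toy model. The sketch below is an in-memory stand-in, not the DynamoDB API: `ToyTable` and its methods are hypothetical, and only the parameter names `scan_index_forward` and `projection` are chosen to echo ScanIndexForward and ProjectionExpression from the notes.

```python
class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        self._partitions = {}  # partition key -> {sort key: item}

    def put_item(self, pk, sk, **attrs):
        self._partitions.setdefault(pk, {})[sk] = {"pk": pk, "sk": sk, **attrs}

    def query(self, pk, scan_index_forward=True, projection=None):
        """Touch a single partition; results come back ordered by sort key."""
        partition = self._partitions.get(pk, {})
        ordered = sorted(partition, reverse=not scan_index_forward)
        return [self._project(partition[sk], projection) for sk in ordered]

    def scan(self, predicate, projection=None):
        """Examine every item in every partition, keeping those that match."""
        examined, matches = 0, []
        for partition in self._partitions.values():
            for item in partition.values():
                examined += 1
                if predicate(item):
                    matches.append(self._project(item, projection))
        return matches, examined

    @staticmethod
    def _project(item, projection):
        if projection is None:
            return dict(item)
        return {name: item[name] for name in projection if name in item}


table = ToyTable()
table.put_item("thread-1", 3, author="carol")
table.put_item("thread-1", 1, author="alice")
table.put_item("thread-2", 2, author="bob")

# Query: one partition only, newest first, two attributes returned.
newest_first = table.query("thread-1", scan_index_forward=False,
                           projection=["sk", "author"])
# Scan: every item is examined even though only one matches.
matches, examined = table.scan(lambda item: item["author"] == "bob")
print(newest_first, matches, examined)
```

The scan had to look at all three items to find one match, which is the cost the notes warn about on large tables.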
- Provisioned throughput | |
- Read and write perspective | |
- Will definitely appear on exam | |
- Some allowed to use pen and paper | |
- Read provisioned throughput | |
- Read sizes are rounded up to the next 4KB | |
- Eventually consistent = 2 reads per second per unit | |
- Strongly consistent = 1 read per second per unit | |
- Write provisioned throughput | |
- Write sizes are rounded up to the next 1KB | |
- Each unit gives 1 write per second | |
- Read throughput formula: | |
- (item size rounded up to the next 4KB / 4KB) * no of items = read throughput | |
- If eventually consistent, divide by 2 and round up | |
- e.g. | |
A) 10 items of 6KB means: | |
(8 / 4) * 10 = 20 | |
20 for strongly consistent | |
10 for eventually consistent | |
B) 5 items of 10KB: | |
(12 / 4) * 5 = 15 | |
15 for strongly consistent | |
7.5 -> 8 for eventually consistent | |
- Write throughput: | |
no of items * item size rounded up to the next 1KB | |
- Question: Are throughputs averaged out // TODO | |
- If exceeded provisioned throughput: | |
- ProvisionedThroughputExceededException | |
- Means you exceeded the maximum allowed provisioned throughput for a table or | |
- You exceeded provisioned throughput for one or more global secondary indexes // TODO DynamoDB Provisioned Throughput | |
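The throughput formulas above can be checked with a few lines of Python. This is a sketch of the arithmetic as these notes describe it; the function names are made up and the item counts are taken to be per second.

```python
import math


def read_capacity_units(item_size_kb, items_per_second, strongly_consistent=True):
    """Reads: round the item up to 4KB chunks, multiply by items per second."""
    chunks = math.ceil(item_size_kb / 4)
    units = chunks * items_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half, rounded up to a whole unit.
        units = math.ceil(units / 2)
    return units


def write_capacity_units(item_size_kb, items_per_second):
    """Writes: round the item up to whole KB, multiply by items per second."""
    return math.ceil(item_size_kb) * items_per_second


# Example A: 10 items of 6KB
print(read_capacity_units(6, 10), read_capacity_units(6, 10, strongly_consistent=False))
# Example B: 5 items of 10KB
print(read_capacity_units(10, 5), read_capacity_units(10, 5, strongly_consistent=False))
```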
- Web Identity Providers | |
- Facebook, etc. OpenID suppliers | |
- Web Identity API | |
- Flow: | |
- User authenticates with OpenID supplier | |
- Gets a web identity token | |
- Uses AssumeRoleWithWebIdentity request with AWS | |
- AWS issues temporary security credentials (1 hour default) - 15 min to 1 hour | |
- AccessKeyID, SecretAccessKey, SessionToken | |
- Expiry time | |
- AssumedRoleId | |
- SubjectFromWebIdentityToken (i.e. the unique user ID issued by the identity provider) | |
- Remember to add permissions to allow the connection from Facebook etc | |
- Misc Facts: | |
- Conditional updates are idempotent | |
- Question: Do we have to define these? | |
- Atomic counters | |
- UpdateItem (inc / decrement) | |
- Regardless of current value | |
- Counters are not idempotent | |
- Banking - atomic counters not suitable | |
- BatchGetItem | |
- Can retrieve up to 1MB of data | |
- Can contain up to 100 items | |
- Single BatchGetItem request can get items from multiple tables | |
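Why conditional updates are idempotent and atomic counters are not is easiest to see by retrying each one. The sketch below uses a plain dictionary in place of a table item; both helper functions are hypothetical and only imitate the behaviour described above.

```python
def atomic_increment(item, attr, amount=1):
    """Unconditional add: every call changes the value, so a retry double counts."""
    item[attr] += amount
    return item[attr]


def conditional_update(item, attr, expected, new_value):
    """Write only if the current value is what the caller last saw."""
    if item[attr] != expected:
        return False  # a real service would reject the write with an error
    item[attr] = new_value
    return True


# Atomic counter: the same request sent twice (e.g. after a timeout) adds twice.
counter = {"views": 10}
atomic_increment(counter, "views")
atomic_increment(counter, "views")

# Conditional update: the retry is rejected because the value already moved on.
balance = {"amount": 10}
first_try = conditional_update(balance, "amount", expected=10, new_value=11)
retry = conditional_update(balance, "amount", expected=10, new_value=11)

print(counter, balance, first_try, retry)
```

This is the reason the notes say counters are a poor fit for banking: an accidental retry changes the total.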
- Summary | |
- Most important topic in Developer exam | |
- Fast flexible NoSQL database service | |
- Single digit latency at any scale | |
- Fully managed | |
- Document and key/value stores | |
- Suitable for many use cases | |
- SSD storage | |
- 3 different data centres | |
- Two read models: | |
- Eventual consistent reads (default) | |
- Consistency reached within 1 second normally | |
- Strongly consistent reads | |
- Tables | |
- Items (row of data) | |
- Attributes (cols) | |
- Primary Keys | |
- Single attribute: | |
- Partition key / hash key | |
- Composite keys: | |
- Partition key + sort/range key | |
- composed of two attributes | |
- Think forum comments model | |
- DDB uses partition key as value in internal hash function | |
- Determines location on disk | |
- With a partition key only, no two items can have the same partition key value | |
- Partition key + sort key | |
- Hash and then all sort keys for given partition key stored in same location sorted by sort key values | |
- Indexes | |
- Local secondary index | |
- Same partition key but using a different sort key | |
- Can only be created at time of table creation and can't be deleted | |
- Global secondary index: | |
- Different partition key and different sort key | |
- Can be added later and deleted | |
- DDB Streams | |
- Old / new captured (inserts / updates / deletes) | |
- Query | |
- Finds items in table using only primary key attribute values | |
- Scan | |
- Examines every single item in the table | |
- Returns all the attributes by default in the table for all items | |
- Can use projections to return only some of the attributes | |
- More efficient to use a query | |
- Provisioned throughput calculations | |
- Reads: (size rounded up to next 4KB / 4KB) * no of items (x1 - strongly consistent, /2 - eventually consistent) | |
- Writes: (size rounded up to next 1KB) * no of items per second | |
- Error code: 400 ProvisionedThroughputExceededException | |
- Auth with Web Identity Provider: | |
- Auth with FB and get token | |
- AssumeRoleWithWebIdentity API with ARN for IAM role | |
- Temporary credentials are returned (1 hour default) | |
- Conditional writes | |
- If x is y then update otherwise don't | |
- Idempotent | |
- Atomic counters | |
- Not idempotent | |
- Inc / Dec values | |
- Applied in order received | |
- Batch operations: | |
- BatchGetItem NOT BatchGetItems <-Notice you don't add the 's' at the end | |
- 100 items limit | |
- 1MB limit | |
- Span multiple tables | |
- TODO Read the Dynamo FAQ!!! | |
- Using the AWS portal, you are trying to scale DynamoDB past its preconfigured maximums. Which service limit can you increase by raising a ticket to AWS support? = Provisioned throughput limits | |
==================== | |
==================== | |
Simple Queue Service | |
==================== | |
==================== | |
- Message queue waiting to be processed | |
- Distributed buffer | |
- Good for decoupling and autoscaling | |
- 256KB of text in any format (billed at 64KB chunks) | |
- Queue is good for different speeds and connectivity of producer / consumer | |
- SQS offers at least once delivery | |
- Single queue can be used by many distributed application components | |
- Always available | |
- Does not guarantee FIFO on standard queues | |
- You would need to add sequencing yourself | |
- You PULL messages from queues | |
- 12 hours maximum visibility timeout | |
- 120,000 inflight messages per queue | |
- For FIFO queues, there can be a maximum of 20,000 inflight messages per queue | |
- Normally received once but your app should be able to handle receiving the same message twice or more | |
- 1 million SQS requests free | |
- Single request can have between 1 to 10 messages up to total payload of 256KB | |
- Decouple = SQS | |
- Priority - Two queues (low/high) if you need that | |
- 30 seconds default visibility timeout | |
- Long poll saves you money - long poll max timeout is 20 seconds | |
- Queues can subscribe to SNS topics | |
- Message sent to SNS topic will be fanned out to subscribing SQS queues for that topic | |
- ChangeMessageVisibility to increase length of time to process the job | |
- What is the maximum retention period for an SQS message? 14 days | |
- Default retention period is 4 days | |
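The visibility timeout and "at least once" behaviour can be sketched with a toy queue. This is an in-memory illustration only, not the SQS API: `ToyQueue` and its methods are hypothetical, and time is passed in as a plain number of seconds so the example needs no waiting.

```python
import itertools


class ToyQueue:
    """Toy pull queue with a visibility timeout (default 30 seconds)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}              # message id -> [body, visible_at]
        self._ids = itertools.count(1)

    def send(self, body):
        self._messages[next(self._ids)] = [body, 0]

    def receive(self, now):
        """Return (id, body) of the first visible message and hide it, else None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def change_visibility(self, msg_id, now, timeout):
        """Give the consumer more time before the message reappears."""
        self._messages[msg_id][1] = now + timeout

    def delete(self, msg_id):
        """Consumers delete a message once it has been processed."""
        self._messages.pop(msg_id, None)


queue = ToyQueue()
queue.send("resize image 42")

first = queue.receive(now=0)        # consumer A picks the message up
hidden = queue.receive(now=10)      # still invisible to everyone else
redelivered = queue.receive(now=30) # A never deleted it, so it comes back

queue.change_visibility(redelivered[0], now=30, timeout=120)
still_hidden = queue.receive(now=100)
queue.delete(redelivered[0])
print(first, hidden, redelivered, still_hidden)
```

The redelivery at 30 seconds is why a consumer has to cope with seeing the same message more than once.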
=========================== | |
=========================== | |
Simple Notification Service | |
=========================== | |
=========================== | |
- Send notifications from the cloud | |
- Pub / sub paradigm using push | |
- No polling needed | |
- Pushes messages to: | |
- SMS | |
- To SQS (Fan out etc.) | |
- Any http end point | |
- SNS messages are stored across availability zones | |
- Topics | |
- Allows you to group multiple recipients using topics | |
- Topic is an "access point" to allow multiple recipients to receive identical copies of messages | |
- One topic can support multiple endpoints: e.g. iOS, Android and SMS | |
- You publish once and SNS delivers correctly formatted message to each subscriber | |
- Benefits | |
- Instantaneous push delivery | |
- Simple API | |
- Flexible message delivery over multiple transport protocols | |
- Inexpensive pay as you go | |
- AWS console allows GUI interaction with service | |
- SNS vs SQS | |
- SNS pushes | |
- SQS pulls | |
- Both messaging systems | |
- Pricing depends on delivery mechanism used (SNS, HTTP, SMS, Email) | |
- Data format of messages is JSON | |
- Subject | |
- Message | |
- Timestamp | |
- Signature | |
- Topic | |
- Message ID | |
- Signatures | |
- Unsubscribe URL | |
- Summary | |
- Topic + Subscribers = SNS | |
- Push | |
- No Polling | |
- Protocols: | |
- Https | |
- Http | |
- Email-JSON | |
- SQS | |
- Application | |
- Messages can be customised for each protocol | |
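The "publish once, deliver to every subscriber" idea, including fan out to queues, can be sketched in a few lines. This is a toy in-memory model and not the SNS API: `ToyTopic` and `format_for` are hypothetical, and the per-protocol formatting is a simplified illustration.

```python
import json


def format_for(protocol, subject, message):
    """Shape the same message differently depending on the subscriber protocol."""
    if protocol == "email-json":
        return json.dumps({"Subject": subject, "Message": message})
    return message


class ToyTopic:
    """Toy topic that pushes each published message to every subscriber."""

    def __init__(self):
        self._subscribers = []  # list of (protocol, delivery callback)

    def subscribe(self, protocol, deliver):
        self._subscribers.append((protocol, deliver))

    def publish(self, subject, message):
        # Push model: subscribers do not poll, the topic calls them.
        for protocol, deliver in self._subscribers:
            deliver(format_for(protocol, subject, message))


# Two lists stand in for SQS queues, one for an email inbox.
queue_a, queue_b, inbox = [], [], []
topic = ToyTopic()
topic.subscribe("sqs", queue_a.append)
topic.subscribe("sqs", queue_b.append)
topic.subscribe("email-json", inbox.append)

topic.publish("Order update", "order 1234 shipped")
print(queue_a, queue_b, inbox)
```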
======================= | |
======================= | |
Simple Workflow Service | |
======================= | |
======================= | |
- Web service to coordinate across distributed components | |
- e.g. media processing, business processing flows | |
- Tasks - Processing steps that can be performed by executable code, web calls, people, etc. | |
- Workers | |
- Programs that interact with SWF to get tasks, | |
- process received tasks | |
- return results | |
- Decider | |
- Controls co-ordination of tasks | |
- Ordering of tasks | |
- Concurrency | |
- Scheduling based on application logic | |
- SWF brokers the interaction between workers and deciders | |
- Keeps view of what tasks are where and which workers are free | |
- Ensures that a task is assigned once and only once | |
- Task is never duplicated | |
- Maintains state so workers/deciders don't need to keep track of state and just get on with what they're doing | |
- Domain | |
- A scope | |
- Think of a container | |
- Help isolate types, executions and task lists from others in your account | |
- Can create a domain using AWS console | |
- Can use API RegisterDomain action to create a domain as well | |
name, description, workflowExecutionRetentionPeriodInDays | |
- Maximum workflow execution time is one year | |
- Max time is measured in seconds | |
- SWF vs SQS | |
- SWF: | |
- Task oriented API | |
- Keeps track of state | |
- SQS: | |
- Message orientated | |
- Have to handle duplication | |
=============== | |
=============== | |
Cloud Formation | |
=============== | |
=============== | |
- There is a lot of content here, these notes only cover Cloud Formation at a high level | |
- You deploy CF in stacks | |
- Can use sample templates to speed things up | |
- Templates are JSON (YAML is also supported) | |
- Params | |
- DB setup | |
- You fill in parameters that scripts need | |
- Can create a SNS topic for when it's done | |
- Can do things like Rollback on failure | |
- Cloudformation is free but underlying resources provisioned are NOT free | |
- In console: | |
- Events tab will show us what's being created | |
- Outputs tab will show us the items you define in the output section of your template | |
- e.g. instance/balancer urls | |
- Fn::GetAtt returns the value of an attribute from a resource in the template | |
- If you have rollback enabled and there's an error at one point, it deletes resources it's created | |
- Don't have to enable rollback | |
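A small template shows how parameters, resources and outputs fit together. This is a sketch only: the logical names `WebServer` and `WebServerDns` and the parameter names are made up, and the template is built as a Python dictionary purely so it can be printed as JSON.

```python
import json

# Minimal single-instance stack. Logical names and parameters are illustrative.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative single-instance stack",
    "Parameters": {
        # Values you fill in when creating the stack.
        "ImageId": {"Type": "String", "Description": "AMI to launch"},
        "InstanceType": {"Type": "String", "Default": "t2.micro"},
    },
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "ImageId"},
                "InstanceType": {"Ref": "InstanceType"},
            },
        }
    },
    "Outputs": {
        # Shown on the Outputs tab once the stack has been created.
        "WebServerDns": {
            "Description": "Public DNS name of the instance",
            "Value": {"Fn::GetAtt": ["WebServer", "PublicDnsName"]},
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

Fn::GetAtt takes the logical name of a resource and one of its attributes, so the output can only be resolved after the instance exists.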
================= | |
================= | |
Elastic Beanstalk | |
================= | |
================= | |
- Again free, but the resources it provisions are not | |
- Predefined configs: | |
- Node.js | |
- PHP | |
- Python | |
- Ruby | |
- Tomcat | |
- IIS (.net) | |
- Java | |
- Go | |
- Preconfigured - Docker configs: | |
- GlassFish
- Python | |
- Go | |
- Generic configs: | |
- Docker | |
- Multi container docker | |
- Can auto create load balancers | |
- Allows you to configure what instance types etc | |
- Can have an application health check url | |
- Can configure root volumes etc | |
- Create Cloudwatch alarms so you can keep an eye on your beanstalk | |
- Can provide params such as AWS Access Key ID and Secret Key so provisioned resources can be accessed by another one of your applications | |
============================ | |
============================ | |
Virtual Private Clouds (VPC) | |
============================ | |
============================ | |
- Comes up in all 3 exams | |
- Need to know inside out | |
- VPC is a logical datacentre | |
- VPCs do not span regions | |
- VPCs can span availability zones | |
- Launch AWS resources in your own virtual network environment | |
- You control subnets and IPs | |
- Routing tables | |
- Network gateways | |
- You can also create hardware VPN links between your corporate datacenter and your VPC to use AWS cloud | |
- i.e. hybrid cloud | |
- Private IP address ranges: | |
- 10.0.0.0 - 10.255.255.255 (10/8 prefix) | |
- 172.16.0.0 - 172.31.255.255 (172.16/12 prefix) | |
- 192.168.0.0 - 192.168.255.255 (192.168/16 prefix) | |
- Amazon gives us max /16 networks so that's what we will use | |
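The private ranges and the /16 limit above can be checked with Python's standard ipaddress module. A minimal sketch; is_rfc1918 and PRIVATE_RANGES are names made up for this example.

```python
import ipaddress

# The three private ranges listed above.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if address falls inside one of the private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


# A /16 block holds 2**16 addresses.
vpc_block = ipaddress.ip_network("10.0.0.0/16")
print(vpc_block.num_addresses)       # 65536
print(is_rfc1918("172.31.255.255"))  # True
print(is_rfc1918("172.32.0.1"))      # False
```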
- Internet Gateway - how the web reaches our servers | |
- Virtual Private Gateway - The termination point for the VPN connection from our data centre | |
- Traffic is then routed according to the route tables
- Then go into Network ACL | |
- Then into subnets that have security groups | |
- Public - Internet visible | |
- Private - Hidden from internet | |
- IMPORTANT 1 subnet = 1 availability zone | |
- Can't span a subnet across availability zones | |
- Can span multiple subnets: | |
- Route table | |
- Network ACL | |
- Security groups | |
- You can launch instances into a subnet of your choosing | |
- You assign custom IP address ranges in each subnet | |
- Configure route tables between subnets (i.e. set if public or private) | |
- We can create internet gateways and attach them to our VPC | |
- You can ONLY have 1 internet gateway per VPC | |
- Allows you to have better security over your resources | |
- Can create instance security groups | |
- Stateful. If you allow HTTP in, the response is allowed back out by default
- Subnet ACLs (Access Control Lists) | |
- Stateless. If you allow http in you have to create a rule specifically out | |
- Default VPC | |
- You get a default VPC in every region around the world when you create an AWS account | |
- Makes it easy to deploy your EC2 instances straight away | |
- All subnets in the default VPC have a route out to the internet automatically | |
- Each Instance in default VPC will have both public and private IP address | |
- If you delete your default VPC you will need to raise a ticket to get it back | |
- You can have multiple VPC connected to each other (VPC Peering) | |
- Can have multiple VPCs in a region | |
- Uses direct network route using private IP addresses | |
- Doesn't go via internet, does it privately | |
- Instances behave as if they're on the same private network | |
- Can peer VPCs with other AWS accounts as well as others in your own account | |
- Peering is always done in a star configuration | |
- e.g. 1 central VPC peers with 4 others | |
- No transitive peering takes place. | |
- e.g. A is connected to B, B is connected to C. You can't send packets from A -> C | |
- Have to set up peering between A and C as long as no overlapping CIDR blocks | |
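The two peering rules above (no overlapping CIDR blocks, no transitive routing) can be sketched in a few lines. This is a toy model, not an AWS API: the VPC names A, B, C, their CIDR blocks, and the helpers can_peer and can_reach are assumptions for illustration.

```python
import ipaddress


def can_peer(cidr_a, cidr_b):
    """Peering requires that the two VPC CIDR blocks do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def can_reach(peerings, source, destination):
    """Traffic only flows over a direct peering, never via a third VPC."""
    return (source, destination) in peerings or (destination, source) in peerings


# Hypothetical VPCs with example CIDR blocks.
vpcs = {"A": "10.0.0.0/16", "B": "10.1.0.0/16", "C": "10.2.0.0/16"}
peerings = {("A", "B"), ("B", "C")}

a_to_c_before = can_reach(peerings, "A", "C")  # A-B and B-C do not give A-C

# A and C need their own peering, allowed only if their CIDRs do not overlap.
if can_peer(vpcs["A"], vpcs["C"]):
    peerings.add(("A", "C"))

a_to_c_after = can_reach(peerings, "A", "C")
print(a_to_c_before, a_to_c_after)
```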
- Summary | |
- VPC is a logical data centre in AWS | |
- VPCs consist of: | |
- Virtual Private Gateways | |
- Internet Gateways (1) | |
- Route Tables | |
- Network ACLs | |
- Subnets | |
- Security groups | |
- 1 subnet = 1 availability zone | |
- Security groups are stateful | |
- Network Access Control Lists are stateless | |
- No transitive peering | |
======= | |
VPC Lab | |
======= | |
- Start | |
- Name it | |
- CIDR block - The ip address range | |
- Tenancy | |
- On creation it: | |
- creates a new route table for your VPC | |
- creates a new security group | |
- creates a new network ACL | |
- doesn't create new subnets by default | |
- doesn't create new gateways | |
- Subnets: | |
- Subnets needed to deploy anything in a VPC | |
- CIDR block needed: e.g. 10.0.1.0/24 | |
- Some special IP addresses (shown here for a 10.0.0.0/24 subnet):
- 10.0.0.0 network address | |
- 10.0.0.1 AWS reserved for VPC router | |
- 10.0.0.2 AWS reserved for DNS server (always network address + 2) | |
- 10.0.0.3 AWS reserved for future AWS use | |
- 10.0.0.255 network broadcast address | |
- Second subnet would be e.g. 10.0.2.0/24 | |
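The reserved addresses follow the same pattern in every subnet: the first four addresses and the last one. A minimal sketch using the standard ipaddress module; reserved_addresses and usable_count are hypothetical helper names.

```python
import ipaddress


def reserved_addresses(cidr):
    """Return the five addresses reserved in a subnet, per the list above."""
    net = ipaddress.ip_network(cidr)
    return {
        "network": str(net[0]),
        "vpc_router": str(net[1]),
        "dns": str(net[2]),
        "future_use": str(net[3]),
        "broadcast": str(net[-1]),
    }


def usable_count(cidr):
    """Addresses left for instances once the five reserved ones are removed."""
    return ipaddress.ip_network(cidr).num_addresses - 5


first_subnet = reserved_addresses("10.0.1.0/24")
print(first_subnet["vpc_router"])   # 10.0.1.1
print(first_subnet["broadcast"])    # 10.0.1.255
print(usable_count("10.0.1.0/24"))  # 251
```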
- Internet gateways | |
- can be attached to the VPC | |
- Only one internet gateway per VPC | |
- Route tables | |
- Best to keep auto generated route table | |
- make the main route table private | |
- Create new ones and associate with your vpc | |
- Can create routes that allow outbound internet connections: | |
0.0.0.0/0 nameOfYourInternetGateway | |
- When you create a new subnet it will be auto associated with your main route table hence you don't want to add rules to it such as allowing all access to internet as any new subnets will get that by default | |
- Subnet associations: | |
- pick and add | |
- Subnet action: | |
- Enable auto-assign public IP so that instances launched in that subnet can be reached from the internet
- If you forget to assign an IP address to an instance you can create a new elastic ip address and associate it later | |
- ICMP traffic is things like pinging a server | |
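The route table advice in this lab (keep the main route table private, add the 0.0.0.0/0 route only to a separate public table) can be sketched as a lookup where the most specific matching route wins. A toy model only: pick_route, the table contents and the target name igw-example are assumptions for illustration.

```python
import ipaddress


def pick_route(route_table, destination):
    """Return the target of the most specific route matching destination."""
    ip = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: the traffic goes nowhere
    # Longest prefix (most specific route) wins.
    return max(matches)[1]


# Main route table stays private: only the local VPC route.
main_route_table = {"10.0.0.0/16": "local"}

# Separate public route table adds a default route to the internet gateway.
public_route_table = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"}

print(pick_route(main_route_table, "8.8.8.8"))      # None
print(pick_route(public_route_table, "8.8.8.8"))    # igw-example
print(pick_route(public_route_table, "10.0.2.15"))  # local
```

A new subnet associated with main_route_table therefore has no path to the internet until it is explicitly associated with the public table.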
========================== | |
NAT Instances and Gateways | |
========================== | |
- A way to let servers in the private subnet connect out to the internet, e.g. to get updates they might require
- NAT instance | |
- Deploy to your public subnet | |
- Just an EC2 instance | |
- Need to be behind a security group unlike NAT gateways | |
- By default an EC2 instance must be the source or destination of any traffic it handles. For our NAT instance we want to disable the source/destination check in the EC2 console
- Add route to the main Route Table to allow connections to the NAT instance | |
- NAT Gateways: | |
- Have to specify which subnet
- Placed in public subnet | |
- Auto allocate an elastic ip to your NAT gateway | |
- Again need to update main route table for VPC to allow connections to the NAT gateway | |
- No need to disable source/destination check | |
- No need to place behind a security group | |
- You don't have to maintain a NAT gateway yourself | |
- Amazon manages updates to the NAT | |
- Supports bursts of up to 10Gbps | |
- Summary | |
- NAT instances are very old, from the early EC2 days
- Disable src/dest checks | |
- Must be in a public subnet | |
- Must be a route out from the private subnet to the NAT | |
- Needs a public IP | |
- Amount of traffic depends on instance size | |
- Can create high availability by using autoscaling groups and multiple subnets in different availability zones and scripts to automate failover | |
- NAT gateways | |
- New and preferred | |
- Scale to 10Gbps automatically | |
- No need to patch | |
- No security groups needed | |
- Auto assigned a public IP | |
- Still have to update your routes | |
- No need to disable src/dest checks | |
============================ | |
Network Access Control Lists | |
============================ | |
- Security groups act at instance level (1st layer of defence) | |
- Network ACLs act at the subnet level (2nd layer of defence) | |
- NACLs are more fine grained (e.g. can block a specific ip) | |
- In security groups, everything is denied by default and you open up ports | |
- In NACLs you can allow or deny things. e.g. deny user from ip address x via ssh | |
- Security groups are stateful (return traffic allowed automatically) | |
- NACLs are stateless (have to open in and out yourself) | |
- In security groups all rules checked but in NACLs they are evaluated in numerical order | |
- A security group only applies to the instances it is attached to
- NACLs apply to anything in a subnet | |
- Default network ACL allows inbound and outbound traffic automatically | |
- When you create a custom ACL all inbound and outbound traffic is blocked | |
- 1 subnet = 1 availability zone and also 1 ACL max | |
- If you change what ACL a subnet is associated with the old one will be disconnected | |
- Increments of 100 is best practice | |
- Ephemeral ports to allow connections | |
- 1024 - 65535 | |
- Numbers smallest to highest | |
- E.g. 100 is checked first then 101 | |
- Summary | |
- VPC comes with a default ACL | |
- Default ACL allows all outbound and inbound traffic | |
- Can create custom network ACL | |
- By default this blocks all inbound and outbound traffic | |
- Each subnet in a VPC can only be associated with one ACL | |
- A subnet is automatically associated with the default ACL for that VPC unless a new one is assigned to it | |
- 1 ACL can be associated with multiple subnets across availability zones | |
- ACLs contain a numbered list of rules, evaluated lowest number first
- ACLs have different inbound and outbound rules | |
- Exam Tip | |
- Blocking ip address -> ACL | |
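The rule ordering and the stateless behaviour described above can be sketched as a small evaluator: rules are checked in ascending number and the first match decides. This is a simplified model, not how AWS implements it: real rules match on CIDR blocks and protocol, while this sketch matches a single IP or "any". evaluate_nacl and the example rules are hypothetical.

```python
def evaluate_nacl(rules, port, peer_ip):
    """Check rules in ascending rule number; the first match decides."""
    for number in sorted(rules):
        rule = rules[number]
        low, high = rule["ports"]
        ip_matches = rule["peer"] in ("any", peer_ip)
        if low <= port <= high and ip_matches:
            return rule["action"]
    return "deny"  # nothing matched: traffic is blocked


# Inbound rules in increments of 100. Rule 100 blocks one IP from SSH
# before rule 200 allows SSH for everyone else.
inbound = {
    100: {"ports": (22, 22), "peer": "203.0.113.7", "action": "deny"},
    200: {"ports": (22, 22), "peer": "any", "action": "allow"},
    300: {"ports": (80, 80), "peer": "any", "action": "allow"},
}

# Stateless: return traffic needs its own outbound rule on ephemeral ports.
outbound = {
    100: {"ports": (1024, 65535), "peer": "any", "action": "allow"},
}

print(evaluate_nacl(inbound, 22, "203.0.113.7"))     # deny
print(evaluate_nacl(inbound, 22, "198.51.100.1"))    # allow
print(evaluate_nacl(outbound, 50000, "198.51.100.1"))  # allow
```

Swapping the numbers of rules 100 and 200 would let the blocked IP through, which is why the ordering matters.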
==================== | |
Custom VPCs and ELBs | |
==================== | |
- Two types of load balancers: | |
- Application load balancers | |
- Classic load balancers | |
- When creating a load balancer in a custom VPC you're reminded to have a minimum of two public subnets in two availability zones for resilience
- You don't want to be reliant on a single availability zone | |
============== | |
NAT vs Bastion | |
============== | |
- Bastion server - Your public facing server (jump server) | |
- Instead of having to secure every EC2 server (since someone could reach the private subnet through any of them), you have one hardened server that you use to jump into your private subnet
- E.g. You set the bastion server to only allow access from your ip address etc | |
- High reliability with bastions: | |
- Multiple public subnets in diff availability zones | |
- Have an autoscaling group of 1 so if it goes down it's stood back up again | |
- Route53 can be doing checks on the server | |
- Exam tips | |
- NAT gives an EC2 instance in a private subnet outbound access to the internet
- A bastion is used to securely administer the EC2 instances in the private subnet | |
=================================== | |
VPC Flow Logs - Only in SysOps Exam | |
=================================== | |
- You can create flow logs | |
- Capture ip traffic and log to cloudwatch | |
- Any traffic going in will be logged | |
============= | |
VPC Misc Info | |
============= | |
- Can't delete VPCs with active EC2 instances | |
- Load balancers cost money | |
=========== | |
VPC Summary | |
=========== | |
- VPC is a logical datacentre | |
- You can create them in any region | |
- 5 max per region (default limit)
- They can't span regions | |
- They can span multiple availability zones in a region though | |
- 1 subnet = 1 availability zone | |
- Security groups are stateful (in/out both open) | |
- NACLs are stateless (in/out individual) | |
- Peer VPCs within same AWS account and external ones too | |
- Can't do transitive peering | |
- NAT instances | |
- Disable the source/destination checks | |
- Very old | |
- Must be in public subnet | |
- Must have an elastic ip | |
- Must be route from the private subnet to the NAT instance | |
- Traffic supported depends on instance size | |
- Behind security groups | |
- Can script it for a resilient architecture | |
- NAT Gateways | |
- Preferred | |
- Auto scale - 10Gbps | |
- No patching | |
- Auto assigned IP | |
- Just need to add route to NAT Gateway from private subnet | |
- ACLs | |
- VPC comes with a default one | |
- Allows all in and out | |
- Can create new ones: | |
- Blocks all in and out by default | |
- Each subnet in VPC must be associated with an ACL | |
- Each subnet is auto associated with the main ACL | |
- 1 ACL per subnet (old one gets removed)
- Lowest ordered number first | |
- Separate inbound / outbound rules | |
- Blocking IPs: ACLs are the only option for this
- NAT vs Bastions | |
- NAT allows EC2 instances in private subnet to access the internet | |
- Bastions are to securely administer the EC2 instance (over SSH/RDP) | |
- Resiliency | |
- Always need 2 public subnets and 2 private subnets | |
- Both subnet pairs in different availability zones | |
- Elastic load balancers need to be in 2 public subnets in 2 different availability zones | |
- Bastion hosts - Behind an autoscaling group with a minimum size of 2. | |
- Route 53 (round robin or health check) to auto fail over | |
- NAT Instance - 1 in each public subnet with their own ip | |
- Need to write a script to auto fail over | |
- Where possible use NAT gateways | |
- VPC Flow Logs | |
- Logs to Cloudwatch | |
=========================== | |
=========================== | |
Shared Responsibility Model | |
=========================== | |
=========================== | |
- Infrastructure services (e.g. EC2) | |
- Amazon: | |
- Essentially responsible up to the hypervisor level | |
- Infrastructure | |
- Core services | |
- Customer: | |
- Encryption of data | |
- Configs | |
- App management | |
- Securing the info in transit | |
- Container services: | |
- Amazon: | |
- Takes on more responsibility including O/S and patching it | |
- AWS abstracted services: | |
- S3/Dynamo/Lambda | |
- Amazon manages everything other than: | |
- customer data | |
- client-side encryption
=================== | |
=================== | |
Exam Practicalities | |
=================== | |
=================== | |
- 55 questions | |
- Arrive 15 min early | |
- Need authorisation code with you | |
- 2 forms of ID (driving licence / company ID / bank card) | |
- 65% -> Min pass mark generally but moves around