AWS Pathways

Certification Prep: https://aws.amazon.com/certification/certification-prep/

Solutions Architect:
https://aws.amazon.com/training/course-descriptions/architect/
https://aws.amazon.com/certification/certified-solutions-architect-associate/

Reading Materials:

Exam Tips

Interesting links:
  • Breaking the monolith: https://aws.amazon.com/getting-started/container-microservices-tutorial/
  • Make diagrams: https://draw.io

S3 FAQ: https://aws.amazon.com/s3/faqs/


cemeng commented Oct 5, 2017

06/10

Continuing S3 Lifecycle Management

Lifecycle rules help you manage storage costs by controlling the lifecycle of S3 objects: rules can automatically transition objects from Standard -> IA -> Glacier, and finally remove them permanently.
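
A minimal sketch of such a rule via the CLI (bucket name and timings are made up):

# Transition to IA after 30 days, Glacier after 90, delete after a year
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-example-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "archive-then-expire",
      "Filter": {"Prefix": ""},
      "Status": "Enabled",
      "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"}
      ],
      "Expiration": {"Days": 365}
    }]
  }'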

A little detour due to my fragmented mind

A question springs to mind - if an object is marked as deleted, i.e. has a delete marker against it, do we still have to pay for it? According to http://docs.aws.amazon.com/AmazonS3/latest/dev/DeleteMarker.html - yes:

Delete markers accrue a nominal charge for storage in Amazon S3. The storage size of a delete marker is equal to the size of the key name of the delete marker

What if I want to list the objects with delete markers, i.e. "deleted" objects?

The only way to list delete markers (and other versions of an object) is by using the versions subresource in a GET Bucket versions request
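
From the CLI, that looks something like this (bucket name is hypothetical):

# Lists object versions and delete markers side by side
aws s3api list-object-versions --bucket my-example-bucket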

Ok, now back to Lifecycle!

Objects in Glacier incur cost for at least 90 days - it doesn't matter if you delete the object earlier.
It's interesting that the transition path is only Standard -> IA -> Glacier (no Reduced Redundancy Storage option? - it doesn't really make sense there, I guess).

Tip: the old UI is easier to understand, as it has a diagram with the timings.

My question: what is the difference between expiring the current version of an object and permanently deleting previous versions?
Expiring puts a delete marker on the current version. For permanent deletion, combine it with "permanently delete previous versions".

By the way, it might also be a good idea to read the official S3 documentation here -> http://docs.aws.amazon.com/AmazonS3/latest/dev/Welcome.html

Exam tips

  • Lifecycle rules can be used in conjunction with versioning - they can also be used without it
  • they can be applied to current and previous versions (though lifecycle rules are configured per bucket, not per individual object, right?)
  • the main driver for lifecycle is saving storage cost


cemeng commented Oct 12, 2017

12/10

S3 - Security and Encryption

Encryption can be done in transit, server side and client side.

13/10

Lecture 19 - CloudFront Lab
A CloudFront distribution has to be tied to an origin such as an S3 bucket or an Elastic Load Balancer.

Origin path is the directory under the bucket.
You can have multiple origins in the same distribution - that's why you supply your own origin ID.
The restrict bucket access option is quite interesting - when set to yes, the object's S3 URL will no longer work (not sure?), which forces users to go through the CloudFront URL.
Remember the concepts of edge (CloudFront) and origin (S3 bucket / Elastic Load Balancer).

You can restrict access to your CloudFront URLs using signed URLs / signed cookies - not sure how that works yet.
Alternate domain name / CNAME - you can put a friendlier URL here - will be covered under Route 53.

23/10

Creating static content with S3

Pretty easy to do. The example use case Ryan gave was interesting - a major movie release with 15M traffic, served as static content from S3 - it is very easy to do, no need to worry about elastic load balancers etc. - it just scales automatically and at low cost.

Smugmug -> https://news.ycombinator.com/item?id=422225

  • Our biggest win is simply that it's easy. We have a simpler architecture, a lot less people, and a lot less worry. We get to focus on our product (sharing photos) rather than the necessary evils of doing so (managing storage). We have two ops guys for a Top 500 website with over a petabyte of storage. That's pretty awesome.

Things to look out in the exam - the URL format -> http://[bucket-name].s3-website-[region-location].amazonaws.com

Summary

  • Bucket URL: s3-[region].amazonaws.com/[bucketname] (path style) or [bucket].s3-[region].amazonaws.com (vhost style) - except for North Virginia, which doesn't have the region in the URL.
  • bucket names must be globally unique
  • recap the consistency model - eventual consistency (overwrite PUTs and DELETEs) vs read-after-write consistency (PUTs of new objects)
  • storage classes - S3, S3 IA, S3 RRS, Glacier. RRS is being phased out - there is no reason to use it, as its pricing is actually higher than S3 Standard.
  • It offers versioning - IMPORTANT to remember that you pay for all versions of an object. IMPORTANT: versioning cannot be disabled (only suspended) - if you want it gone, you need to delete the bucket and re-create it.
  • Lifecycle management - remember the minimum days before you can transition an object: from S3 Standard to IA, at least 30 days from the creation date; from IA to Glacier, at least another 30 days.
  • CloudFront: origin -> S3, EC2 instance, Elastic Load Balancer or Route 53. Edge locations are not just read-only - you can write to them too.
  • objects are cached for 24 hours by default - this can be changed, of course. You can invalidate cached objects, but you will be charged.
  • security of buckets - secure using bucket policies and access control lists (per object).
  • encryption (see the sketch below):
    ** in transit - SSL/TLS is used
    ** at rest: S3-managed keys (SSE-S3); AWS Key Management Service (SSE-KMS) - similar to SSE-S3 but with more features, e.g. an audit trail of key usage (and more expensive); SSE-C - SSE with customer-provided keys. And lastly client-side encryption, which means you encrypt data before sending it to S3 - use a library for this; Amazon has one too, the Amazon S3 Encryption Client.
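
A sketch of requesting server-side encryption on upload from the CLI (bucket and file names are made up):

# SSE-S3: AES-256 with S3-managed keys
aws s3 cp secret.txt s3://my-example-bucket/secret.txt --sse AES256
# SSE-KMS: uses the default aws/s3 KMS key unless you specify one
aws s3 cp secret.txt s3://my-example-bucket/secret.txt --sse aws:kms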


cemeng commented Nov 1, 2017

02/11

EC2 intro 2

Re-reviewed EC2 intro 1 - it covers the different purchasing options for EC2 instances, which I had forgotten, for example spot vs dedicated etc.

EBS is a storage volume that you can attach to an EC2 instance - it's block based.

Exam tips:

Know the difference between on-demand, spot, reserved and dedicated hosts - pricing etc. - and which is better when.
Remember, with a spot instance you pay for the hour if you terminate it yourself - but if AWS terminates it, you get that hour for free.
EBS types:

  • SSD General Purpose - high IOPS
  • SSD Provisioned IOPS - higher IOPS (I/O operations) than general purpose -> good for database
  • HDD Throughput optimised -> good for data warehousing - not bootable
  • HDD Cold SC1 - infrequently accessed data, e.g: file server - not bootable
  • HDD Magnetic standard - cheap, infrequently accessed - bootable - good for experimenting etc. because it's cheap -> actually I can't see this in AWS anymore, perhaps it is no longer offered?

You cannot mount 1 EBS volume to multiple EC2 instances, instead use EFS (Elastic File System).

EC2 instance types: DR. McGift PX -> 10 different instance families - oh dear.

Lec 29 - EC 2 - Lab 1

When configuring / launching an EC2 instance, under advanced details you have user data - this is the boot script, useful for telling the instance to install software, for example (kind of like Ansible, eh?) - can I use Ansible to provision an EC2? Apparently you can -> http://docs.ansible.com/ansible/latest/guide_aws.html

Could be an important exam topic - if you terminate an EC2 instance, what happens to the EBS volume? If you ticked the "delete on termination" checkbox when you set it up, then it will be deleted (duh) - that is the default. The root volume is where the OS boots from.

Ryan says to tag everything in AWS - it's useful for cost control. Don't just tag the name - tag department, person etc. - super useful for tracking down costs.

Lec 30 - EC2 - Lab 2

Oh, just worked out that reserved instances are pretty cool - they can be quite cost effective actually.

Things to remember in the exam:

  • termination protection is off by default
  • on an EBS-backed instance, the default action is for the root EBS volume to be deleted when the instance is terminated (question -> is it possible to have a non-EBS-backed instance? Yes - instance store-backed, covered later)
  • the EBS root volume of a DEFAULT AMI (the ones Amazon provides) cannot be encrypted directly. What you can do is create a copy and encrypt that; you can also use third-party software.


cemeng commented Nov 4, 2017

04/11

Lec 32 - Security Group Lab

Security groups are stateful - whatever you allow in an inbound rule is implicitly allowed back out as well (you won't see this in the outbound rules, but it is done implicitly).

However, Network Access Control Lists are stateless, which means you need to specify both inbound and outbound rules.

Remember: you can only allow, not deny, in a security group - but you can deny in a NACL.

All inbound traffic is blocked by default.
All outbound traffic is allowed by default.
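
A sketch of adding an inbound rule from the CLI (the group ID is made up); because security groups are stateful, the reply traffic is allowed automatically:

# Allow inbound HTTP from anywhere
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 80 --cidr 0.0.0.0/0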

Lec 33 - Upgrading EBS Volume Types

It is interesting that an additional (non-root) volume is not automatically mounted, so you need to do that manually. SSH into the instance, then:

sudo su                      # become root
lsblk                        # list block devices / partitions
mkfs -t ext4 /dev/xvdb       # create a filesystem on the new volume
mkdir /acloudguru            # create a mount point
mount /dev/xvdb /acloudguru  # mount the volume

You can easily modify a volume on the fly, e.g. changing the volume type from SSD GP2 to SSD Provisioned IOPS to HDD. No downtime required.
HDD Magnetic (standard) volumes cannot be modified, though.

How do we go about on moving EBS from one AZ to another:

  • Create snapshot of the volume
  • Then on the snapshots list - you create volume from the snapshot and set the AZ

How do we go about moving EBS from one region to another:

  • Create snapshot
  • Do a copy snapshot - in here you can choose a destination region

How do you copy an EC2 instance from one region to another (see the CLI sketch after this list):

  • Create snapshot
  • Do a copy snapshot
  • Create Image from the snapshot
  • Create instance from the image
    OR
  • create snapshot
  • create image (AMI)
  • copy AMI specify region
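
A rough CLI version of the second path (all IDs and regions are made up):

# Snapshot the root volume (optional if imaging the instance directly)
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0
# Create an AMI from the instance
aws ec2 create-image --instance-id i-0123456789abcdef0 --name my-server
# Copy the AMI to another region
aws ec2 copy-image --source-image-id ami-0123456789abcdef0 \
  --source-region us-east-1 --region eu-west-2 --name my-server-copy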


cemeng commented Nov 5, 2017

05/11

Did a practice exam (25 questions) from Whizlabs -> https://www.whizlabs.com/aws-solutions-architect-associate/free-test/ - scored 75%, which is not bad, but I do think the questions are rather easy.

09/11

AMI lab

Why would you want to create AMI? It is an image with your pre-loaded server, software etc - easy to spin up, no need to re-configure.

How to create an encrypted root volume? Create a snapshot of the volume and there will be an option to encrypt the copy. I wonder why AWS doesn't provide this option when creating an EC2... this snapshot business is a bit tedious.
There is no option for sharing an encrypted AMI in AWS, as encryption is tied to your account's key.

AMI -> Amazon Machine Image.


cemeng commented Nov 17, 2017

17/11

Lec 40 - IAM Roles & EC2

To better manage permissions to resources, use roles; you could create users instead, but Ryan said that becomes an administrative nightmare.
Ryan made a point about access keys as well.
Also important: IAM roles are global, so you don't need to create them per region. Actually, all IAM stuff is global.
In this example, we created an S3 admin role for an EC2, which means the EC2 instance will have access to S3.
Felix observation: it is interesting that you can only attach one role to an EC2 instance. What if you want the EC2 to access multiple things, e.g. RDS and S3? In that case you create a role that has both the S3 policy and the RDS policy attached - simple!
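
A sketch of wiring this up from the CLI (names, ARNs and IDs are hypothetical; the console does several of these steps for you):

# Create a role EC2 can assume, attach an S3 policy, and expose it
# to the instance via an instance profile
aws iam create-role --role-name s3-admin \
  --assume-role-policy-document file://ec2-trust-policy.json
aws iam attach-role-policy --role-name s3-admin \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
aws iam create-instance-profile --instance-profile-name s3-admin
aws iam add-role-to-instance-profile \
  --instance-profile-name s3-admin --role-name s3-admin
aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
  --iam-instance-profile Name=s3-admin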

Lec 41 - S3 CLI & Regions

I haven't been note-taking as diligently, but I am up to lecture 41 now, which is good progress.
Previously you could only attach an IAM role to an EC2 instance when creating the instance, but that's no longer the case; Ryan is unsure whether the exam has been updated with this fact.
Tip: get in the habit of using the region flag when dealing with the S3 CLI, e.g.: aws s3 cp --recursive s3://bla /home --region eu-west-2


cemeng commented Nov 17, 2017

18/11 - Saturday, 34% completed - this weekend looks free, no more bloody coding tests to do, yay.

Lec 42 - Using bootstrap scripts

In this lab, we created an S3 bucket in the us-east-1 region (North Virginia).
When creating an EC2, on step 3 (Configure Instance Details > Advanced Details) you can enter User Data.
Here you can enter a bash script - it will be run as root.
An example:

#!/bin/bash
yum update -y        # apply pending updates
yum install httpd -y # install Apache
service httpd start  # start Apache now
chkconfig httpd on   # start Apache on every boot
aws s3 cp s3://mywebsitebucket-felix/index.html /var/www/html

Real-life tip from Ryan: he test-drives the bash script on an EC2 instance that is already launched, for easy debugging. For example, that s3 command might fail if the bucket is in a different region, in which case you need to pass the --region option.

Next, destroy that test EC2 and create one with the bash init script - magic!

Lec 43 - Instance Metadata

Magic url - that you can invoke inside EC2 instance:

 curl http://169.254.169.254/latest/meta-data/

will give you instance metadata, duh!
For example, if you want to get the public IP address:

curl http://169.254.169.254/latest/meta-data/public-ipv4
52.87.228.220

Real-life tip: combine this metadata utility with a bash init script - then you can do wonderful things like writing the metadata into a text file and sending it to an S3 bucket, which could then trigger a Lambda function to update the DNS entry in Route 53, for example.
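
A minimal sketch of that idea (bucket name is made up):

# Capture this instance's public IP and ship it to S3
IP=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)
echo "$IP" > /tmp/public-ip.txt
aws s3 cp /tmp/public-ip.txt s3://my-example-bucket/public-ip.txt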

QUESTION: why not making this as part of the aws CLI??

Lec 44 - Autoscaling 101

Create a load balancer / ELB - ELB is part of EC2, so it's one of the submenus inside EC2. Note that AWS has added a third kind of ELB, the Network Load Balancer - it is not in the course yet, and probably won't be on the exam.

But just in case:
Choose a Network Load Balancer when you need ultra-high performance and static IP addresses for your application. Operating at the connection level, Network Load Balancers are capable of handling millions of requests per second while maintaining ultra-low latencies. -> https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html

Remember: the Application Load Balancer works at layer 7 of the OSI model (application layer), whereas the Classic Load Balancer works at layer 4 (transport layer).

I didn't do this lab, but I think it's worth doing to see how it works. In this lab Ryan created an auto scaling group with 3 EC2 instances - he took out 2 of them and auto scaling replaced them - pretty cool. Ryan mentioned combining this with Route 53 to extend reliability across regions.

My question: can you add an existing EC2 into an autoscaling group? Yes, you can -> http://docs.aws.amazon.com/autoscaling/latest/userguide/attach-instance-asg.html
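
For reference, the CLI version of attaching an existing instance (names are hypothetical):

aws autoscaling attach-instances \
  --instance-ids i-0123456789abcdef0 \
  --auto-scaling-group-name my-asg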


cemeng commented Nov 18, 2017

18/11

Lec 37 - Load Balancers

ELB classic

I agree this lecture is outdated, so I went to the AWS documentation and created my own summary:

Three kinds of load balancers:

  1. Application - HTTP/HTTPS: OSI layer 7, round robin
  2. Network - TCP: OSI layer 4, selects a target using a flow hash algorithm
  3. Classic - Prev Gen HTTP/HTTPS/TCP: OSI layer 4, round robin

Terminology

  • load balancer - single entry point for clients to increase availability
  • listener - forwards requests from client to target based on rules
  • rule - specifies a target group, condition, and priority
  • target group - routes requests to one or more registered targets
  • health checks - monitor registered instances for health so requests only go to healthy instances
  • cross-zone load balancing - enable to distribute traffic across all availability zones

Comparison Chart:
https://aws.amazon.com/elasticloadbalancing/details/#compare

When you would use classic instead of application:

  • Support for EC2-Classic
  • Support for TCP and SSL listeners
  • Support for sticky sessions using application-generated cookies

When you would use application instead of classic:

  • Support for path-based routing.
  • Support for host-based routing.
  • Support for routing requests to multiple applications on a single EC2 instance.
  • Support for registering targets by IP address, including targets outside the VPC for the load balancer.
  • Support for containerized applications.
  • Support for monitoring the health of each service independently, as health checks are defined at the target group level and many CloudWatch metrics are reported at the target group level.
  • Access logs contain additional information and are stored in compressed format.
  • Improved load balancer performance.

When you would use network instead of classic:

  • Ability to handle volatile workloads and scale to millions of requests per second.
  • Support for static IP addresses for the load balancer.
  • Support for registering targets by IP address, including targets outside the VPC for the load balancer.
  • Support for routing requests to multiple applications on a single EC2 instance.
  • Support for containerized applications.
  • Support for monitoring the health of each service independently, as health checks are defined at the target group level and many Amazon CloudWatch metrics are reported at the target group level.

Architecture - I guess architecture wise you would do this then:

Outside world -> Route 53 -> ELB -> EC2
When there is a failure with your EC2, you can configure the ELB to do different things, I suppose - what are some of the options? I would like to get notified, and I would like a static page in S3 to be displayed; when the EC2 is healthy again, serve from EC2.

With an ELB, note that you don't get an IP address but a DNS name - when we use Route 53, we will point at that DNS name.
On the surface, it looks like the only difference between the application ELB and the classic ELB is that with the app ELB you can specify the HTTP status code for the health check.

Interestingly, when creating a 2nd ELB, it says no EC2 instance is available.


cemeng commented Nov 20, 2017

20/11

Lec 45 - EC2 Placement Group

  • 10 Gbps network throughput
  • It can only be created within one Availability Zone - it won't work across AZs - makes sense, as you really need the network speed for this feature
  • only certain instance types can be launched into a placement group - for example, you can't put a micro instance in one
  • use homogeneous instances within placement groups
  • you can't move an existing instance into a placement group, and you can't merge placement groups

Lec 47 - Lambda

Interesting to look at the evolution of the data centre for devs: data centre, Infrastructure as a Service (EC2), Platform as a Service (Elastic Beanstalk), Lambda.

Lambda scales automatically - it scales out, so you don't need to set up auto scaling or an ELB. Scaling up (increasing RAM) vs scaling out (adding more instances): lambda scales out automatically; for example, if you have 1 million people hitting your lambda, a million lambda instances will be provisioned automatically for you.

The important thing for the exam, though, is lambda triggers. You will be asked what can trigger lambda and what cannot; the core services that can (among others): SNS, S3, Kinesis, DynamoDB, Alexa, IoT, CloudFront, CloudWatch, API Gateway.

The one thing you need to understand is the difference between Lambda and EC2 with a load balancer:
for example, if 2 users call an API backed by the same lambda function, 2 lambda instances are spun up (see the 1 million user example above - the key here is scaling out), whereas with EC2 a single instance, or a couple of them, handles multiple calls distributed by the load balancer. Note: lambda does not scale up.

Node.js, Python, C# and Java are the languages supported by Lambda.

How is it priced? By number of requests - the first million requests are free, then $0.20 per million requests after that.
Also by duration - $0.00001667 for every GB-second.
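
A worked example with made-up numbers, using the rates above and ignoring the compute free tier: a 128 MB (0.125 GB) function running 200 ms per call, invoked 5 million times a month:

requests: (5,000,000 - 1,000,000 free) x $0.20 per million = $0.80
compute:  5,000,000 x 0.2 s x 0.125 GB = 125,000 GB-seconds x $0.00001667 ≈ $2.08
total:    ≈ $2.88 per month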

Felix comment - price-wise this is a game changer; remember the EC2 small instance for MEC costs us about $30 p/m. Although I'm not sure we could build an app with an all-lambda approach? acloud.guru did, though?

There is a 5-minute execution limit for a lambda. If your function runs longer than 5 minutes, use something else or break it down.

Know which AWS services are serverless: S3, DynamoDB etc. - EC2 is not serverless, as you need to maintain the server.
Architectures can get extremely complicated, and extremely hard to debug, if you are using a serverless architecture - use the AWS X-Ray service for tracing (woot?).


cemeng commented Nov 28, 2017

4/12

Databases

Probably won't feature much in the exam.
RDS - for OLTP - SQL Server, MySQL etc.
DynamoDB - NoSQL
Redshift - OLAP
Elasticache - in-memory - Memcached / Redis
RDS Multi-AZ - can be turned on (off by default); in the event of the primary DB going down, it will fail over to another instance - pretty cool.
I remember having to manage this sort of thing ourselves at Gruden - painful.
read replicas
Aurora scaling - 6 copies of the data (2 copies in each of a minimum of 3 AZs), automatically - but remember this is copies of the data, NOT instances; you'd want multi-AZ for instances, and that's gonna cost you money.

Learn about storage sizes and stuffs -> https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html

FOCUS on RDS - read the RDS FAQ, says Ryan.


cemeng commented Dec 4, 2017

When you have deployed an RDS database into multiple availability zones, can you use the secondary database as an independent read node?

So what's the primary difference between a read replica and multi-AZ, then?
Multi-AZ gives you something to fail over to; read replicas are for speed / read performance.
So I guess the answer is no -> YAY, correct.


cemeng commented Dec 7, 2017

07/12

Lecture ? Exam Tips EC2

Know the difference between on-demand, spot, reserved and dedicated hosts.
For example: spot for minimising cost, when you don't care if your instance gets terminated.
A steady-state website whose traffic doesn't spike much, where you want to minimise cost -> use reserved.
Black Friday traffic spike -> use on demand.
Regulatory requirement that you cannot use multi-tenant compute -> dedicated host.

Spot instances -> if you terminate the instance you pay for the hour, but if AWS terminates it, you get the hour it was terminated in for free.

Remember DR. MC GIFT PX - you don't need to know the details.

  • D -> dense storage, R -> memory optimised (RAM)
  • M -> general purpose -> the all-rounder; this is what you should use in production - don't use T2 (tiny), they are just toys
  • C -> compute optimised
  • Two graphics-specific family types, G2 and P2: G is for encoding, P (general purpose GPU) for machine learning and bitcoin mining

EBS consists of:

  • SSD General Purpose GP2 - up to 10K IOPS
  • SSD Provisioned IOPS IO1 - more than 10K IOPS (I/O operations per second)
  • HDD Throughput Optimised - used for data warehousing / transaction logging -> frequently accessed data
  • HDD Cold SC1 - less frequently accessed data
  • HDD Magnetic standard - cheap, infrequently accessed storage -> this one can be used as a boot volume; the HDD types above cannot

Termination protection - off by default (as noted earlier).


cemeng commented Dec 8, 2017

08/12

Route 53: Weighted Routing Lab

I was a bit confused between ELB and routing with Route 53. With Route 53, you have more granular control, e.g. different routing strategies. You might even put routing logic in place that directs traffic to more than one ELB.

Some useful comments from: https://acloud.guru/forums/aws-certified-solutions-architect-associate/discussion/-KTWtu_Y5HscAAS8NCyc/elb-vs-route-53-routing

The ELB's duty is to distribute traffic to instances while making sure the instances are healthy, so that your application is always available. Route 53 has several routing capabilities, called policies: simple, weighted, latency, failover and geolocation. Unlike ELB, with the Route 53 weighted policy you can manually set the traffic distribution for your applications, e.g. 20% of traffic routed to instance A and 80% to instance B. For failover, Route 53 and ELB have similar functionality, routing traffic only to healthy applications or instances; with Route 53, you use failover for active-passive failover.

In summary, I believe ELBs are intended to load balance across EC2 instances in a 'single' region. Whereas DNS load-balancing (Route 53) is intended to help balance traffic 'across' regions. Route53 policies like geolocation may help direct traffic to preferred regions, then ELBs route between instances within one region.

Functionally, another difference is that DNS-based routing (e.g. Route 53) only changes the address that your clients' requests resolve to. On the other hand, an ELB actually reroutes traffic.

One analogy is: if you ask for the closest WalMart, you may get an address based on your location, but you could choose to go to another Walmart if you know one. That's Route 53; it just switches the address resolved based on some context. On the other hand, a policeman redirecting traffic because of construction, is more like an ELB, he/she is actually changing the traffic flow, not just suggesting.

A/B testing is a good use case for weighted routing with Route 53.
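
A sketch of what a weighted record looks like via the CLI (zone ID, domain, IP and weight are all made up; the console does this for you):

aws route53 change-resource-record-sets --hosted-zone-id Z1EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "www.example.com",
        "Type": "A",
        "SetIdentifier": "variant-b",
        "Weight": 20,
        "TTL": 60,
        "ResourceRecords": [{"Value": "52.1.2.3"}]
      }
    }]
  }'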

Route 53: Latency Based Routing

Latency-based routing with Route 53 responds to DNS queries with the resource that provides the best latency, i.e. Route 53 answers the query based on which data centre gives your user the lowest latency.

Route 53: Failover Routing Policy

So I think this is similar to ELB failover. Here you specify passive/active routes - just like Hagrid has production and DR instances.
In case of failure, the passive becomes active.

So, just like ELB, you need to set up health checks too. Setting this up feels a little complicated. First you create the health checks, which are a separate section in Route 53. Then you go to Hosted Zones and set up an A record with the failover routing policy - here you associate the record with a health check.
Ok - so say I have a domain felixt.org - I can then create multiple A records (hm, interesting?), each with its own routing policy - ah okay, I kind of get it - set these records as aliases and set the alias target, which can be an EC2, S3 or ELB - what else?

There are only primary and secondary options for failover - production and DR, I guess. So you create a health check attached to the primary A record; the secondary record has no need for a health check, I guess? But what if you want a cascading failover, i.e. more than primary and secondary - can you do that?

Geolocation routing is when you want Amazon Route 53 to respond to DNS queries based on the location of your users. Geolocation also gives you the ability to tailor content to the user, e.g. language.


cemeng commented Dec 8, 2017

What is an MX record? It's the DNS record that specifies the mail servers accepting email for a domain.
There is a limit of 50 domains you can register with Route 53 - if you want more, contact Amazon sales.


cemeng commented Dec 10, 2017

10/12

VPC Lab

Create the VPC manually - don't use the wizard - you won't learn by doing that.
When a VPC is created, the following are created automatically:

  • route table
  • NACL
  • security group

Subnets and the internet gateway you need to create yourself.

When creating a subnet, Ryan likes the following naming convention: 10.0.1.0-us-east-1-a
Remember you get 5 fewer usable addresses per subnet - the first 4 and the last are reserved by AWS.
An internet gateway has a one-to-one relationship with a VPC: once a VPC is attached to a gateway, you cannot attach it to another one.

Uhm what is route table and internet gateway?

Then Ryan created a separate route table that allows connection to the internet - he said this is best practice: you don't want your default route table to be internet accessible.
Ok, so create a route table, and under the Routes tab add an entry with:
destination: 0.0.0.0/0
target: igw-420a753b -> this is the internet gateway created in the previous step
That's for IPv4 - add another entry below it for IPv6 internet connectivity:
::/0
Next in the lab: creating an EC2 in the public subnet of this VPC.
Note that security groups do not span VPCs - the webDMZ one was created for the default VPC, so you won't be able to select it here. Create a new one for this VPC with the same webDMZ rules.


cemeng commented Dec 11, 2017

11/12

NAT Instance and NAT Gateway

NAT - Network Address Translation - what is it? From AWS:
You can use a NAT device to enable instances in a private subnet to connect to the Internet (for example, for software updates) or other AWS services, but prevent the Internet from initiating connections with the instances.
Oh ok - I get it - NAT is what a router/modem does, for example: the modem has 1 public IP address but can have multiple private IPs behind it, and it rewrites the private IP info in the packet headers so it knows how to connect the public side to the private side.

NAT devices are not supported for IPv6 traffic - use an egress-only internet gateway instead. - uhm, what?

So how do you use a NAT instance? A NAT instance is basically an EC2 instance - go to EC2 and pick a NAT AMI from the community AMIs.

NAT Gateway is the preferred method of NAT-ing - it is essentially managed NAT - it scales to 10 Gbps automatically, while with a NAT instance you need to worry about scaling, redundancy etc. yourself - I suspect NAT Gateway is more expensive, though.

NACL and security groups

OK so what is NACL?
A network access control list (ACL) is an optional layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets. You might set up network ACLs with rules similar to your security groups in order to add an additional layer of security to your VPC.

So it's basically a firewall!

Important: one subnet can only be associated with one NACL. Also, NACL is a security feature - duh! - it is listed under the Security heading in VPC, along with security groups.

By default, a new NACL will deny everything.

Ok, I didn't get this at first - Ryan said NACLs are stateless - what does that mean? It means you need to specify both inbound and outbound rules, while with a security group you only need to specify one direction.

Let's talk about ephemeral ports - uhm, what are those?
An ephemeral port is a short-lived endpoint that is created by the operating system when a program requests any available user port. The operating system selects the port number from a predefined range, typically between 1024 and 65535, and releases the port after the related TCP connection terminates.
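
This is why NACLs usually need a rule covering the ephemeral range for return traffic - a sketch (the ACL ID is made up):

# Allow return traffic out of the subnet on ephemeral ports
aws ec2 create-network-acl-entry --network-acl-id acl-0abc123 \
  --rule-number 100 --protocol tcp --port-range From=1024,To=65535 \
  --egress --rule-action allow --cidr-block 0.0.0.0/0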


cemeng commented Dec 15, 2017

Exams

24/11

Scored 58% - answered 35 of 60 questions correctly in 59:19 - but that reflects the fact that I'm only halfway through the course.

Design 58.3%
Data security 50%
Implementation / Deployment 83.3%
Troubleshooting 50%

29/11

Scored 62%

Design 69.4%
Data Security 50%
Implementation/Deployment 66.7%
Troubleshooting 33%

15/12

Scored 60% - discouraging result.

Design 61.1%
Data Security 66.7%
Implementation/Deployment 83.3%
Troubleshooting 16.7%

22/12

Scored 75% - aw yisss

Designing highly available, cost-efficient, fault-tolerant, scalable systems 80.6%
Data Security 66.7%
Implementation/Deployment 66.7%
Troubleshooting 66.7%

09/01

Scored 82% - awww yess

Designing highly available, cost-efficient, fault-tolerant, scalable systems 80.6%
Data Security 75%
Implementation/Deployment 83.3%
Troubleshooting 100.0%


cemeng commented Dec 16, 2017

16/12 - Saturday

Reviewing S3 FAQ

  • Individual Amazon S3 objects can range in size from a minimum of 0 bytes to a maximum of 5 terabytes. The largest object that can be uploaded in a single PUT is 5 gigabytes. For objects larger than 100 megabytes, customers should consider using the Multipart Upload capability.
  • S3 Standard is designed for 99.99% availability and Standard-IA for 99.9% availability. Also, IA has a higher cost to retrieve data, but lower storage cost.
  • Any publicly available data in Amazon S3 can be downloaded via the BitTorrent protocol, in addition to the default client/server delivery mechanism
  • By default, customers can provision up to 100 buckets per AWS account.
  • You can limit access to your bucket from a specific Amazon VPC Endpoint or a set of endpoints using Amazon S3 bucket policies. S3 bucket policies now support a condition, aws:sourceVpce, that you can use to restrict access. For more details and example policies, read Using VPC Endpoints.
  • There are two ways to get data into Standard – IA from within S3. You can directly PUT into Standard – IA by specifying STANDARD_IA in the x-amz-storage-class header. You can also set lifecycle policies to transition objects from Standard to Standard - IA.
  • Standard - IA is designed for larger objects and has a minimum object size of 128KB. Objects smaller than 128KB in size will incur storage charges as if the object were 128KB.
  • To retrieve Amazon S3 data stored in Amazon Glacier, initiate a retrieval request using the Amazon S3 APIs or the Amazon S3 Management Console.
  • Deleting data that is archived to Amazon Glacier is free if the objects being deleted have been archived in Amazon Glacier for three months or longer. If an object archived in Amazon Glacier is deleted or overwritten within three months of being archived then there will be an early deletion fee. This fee is prorated.
  • How should I choose between Transfer Acceleration and Amazon CloudFront’s PUT/POST? Transfer Acceleration optimizes the TCP protocol and adds additional intelligence between the client and the S3 bucket, making Transfer Acceleration a better choice if a higher throughput is desired. If you have objects that are smaller than 1GB or if the data set is less than 1GB in size, you should consider using Amazon CloudFront's PUT/POST commands for optimal performance.

The following are not mentioned in acloud guru course - worth knowing what they are:

  • Query in Place - Amazon S3 allows customers to run sophisticated queries against data stored without the need to extract, transform, and load (ETL) into a separate analytics platform. S3 offers multiple query in place options, including S3 Select (currently in preview), Amazon Athena, and Amazon Redshift Spectrum, allowing you to choose one that best fits your use case. You can even use Amazon S3 Select with AWS Lambda to build serverless apps that can take advantage of the in-place processing capabilities provided by S3 Select.
  • S3 Analytics - Storage Class Analysis - provides analysis to your S3 usage. With storage class analysis, you can analyze storage access patterns and transition the right data to the right storage class.


cemeng commented Dec 17, 2017

17/12 - Sunday

VPC Summary and exam tips

  • NAT instance - disable the source/destination check on the instance; it must be in a public subnet. The traffic it can handle depends on the instance size. Create high availability using autoscaling groups, multiple subnets in different AZs and a script to automate failover (hm?). It sits behind a security group.
  • NAT gateways - remember to update route table - may take 15 mins to provision.

19/12

Overview of security processes (part 2)

Make sure you read the security whitepaper - it's quite long at 95 pages, though.
Encrypting data is generally good practice; you can encrypt EBS volumes and their snapshots with AES-256.
This means data moving between EC2 instances and EBS storage is secure.
However, this feature is only available on the more powerful instance types, such as M3, C3, R3, G2.

ELB - SSL termination on the load balancer is supported -> why is this good? Because your web servers then don't need to do decryption, which saves them processing power. The ELB does pass the originating IP address through to your web server.

Direct Connect - bypasses the public internet in your network path. You can buy rack space at an AWS Direct Connect location and deploy your equipment nearby.

You can conduct vulnerability scans on your own instances, but you must tell Amazon beforehand - failing to do so is a violation of the terms. AWS conducts scans on its own systems, not on customers' instances.

Compliance - AWS complies with a lot of standards, one of them being PCI DSS Level 1 - but this is compliance at the infrastructure level; you need to make sure your app is compliant too.

Storage options in cloud whitepaper

S3, Glacier, EBS, EC2 instance storage -> the last is ephemeral (gone when you terminate the instance).
AWS Import/Export -> a service where you send your data to an AWS Import/Export centre not via the internet but by shipping your physical storage. Apparently Snowball is now preferred.
AWS Storage Gateway -> connects on-prem software with cloud storage; the idea is that you use the cloud for data storage. Interestingly, it is essentially a VM that you install on-premises. Once installed, you can create gateway-cached or gateway-stored volumes that can be mounted as iSCSI devices by your on-prem apps.
Gateway-cached -> uses S3 for the primary data while retaining frequently accessed data in a local cache;
you can create storage volumes up to 32 TB in size.
Gateway stored ->


cemeng commented Dec 21, 2017

21/12

Kinesis Firehose vs Kinesis Streams

Kinesis Streams - you must manually provision the appropriate number of shards for your stream to handle the volume of data you expect to process. Amazon helpfully provides a shard calculator when creating a stream to determine this number. Once created, it is possible to dynamically scale the number of shards up or down to meet demand, but only via the Streams API at this time. With Kinesis Streams, you build and run the consumers yourself.

Kinesis Firehose is Amazon’s data-ingestion product offering for Kinesis. It is used to capture and load streaming data into other Amazon services such as S3 and Redshift. From there, you can load the streams into data processing and analysis tools like Elastic Map Reduce, and Amazon Elasticsearch Service. It is also possible to load the same data into S3 and Redshift at the same time using Firehose.
With Firehose, you don't have to worry about consumers.

Check out this re:Invent recording on Kinesis: https://www.youtube.com/watch?v=SmcgiweeviY. The key difference I found is that when you flow data through Firehose, it doesn't store the data; under the covers it streams the data through to the configured destination, where it is stored persistently, i.e. S3, Redshift or ES.


cemeng commented Dec 22, 2017

22/12

OK - so I have finished all the major components in acloudguru - progress is 76% now, except for:

  • ch 10: hands on lab
  • ch 12: well architected frameworks
  • ch 13: additional exam tips

24/12

Wordpress lab

draw.io -> website for creating diagrams.
The architecture for this lab -> ELB, EC2 with auto scaling (2 instances), RDS multi-AZ too (2 RDS instances).
The EC2s & ELB are inside a VPC in the webDMZ security group; the RDS is kept private inside the VPC.
IAM roles -> create a role that allows EC2 full access to S3.
Created the security group for the EC2 by going to VPC - so remember, security groups are part of VPC.

At this stage I am a bit hazy about EC2 auto scaling, so I'm taking a detour here to re-read how to set it up.

When it got to setting up the security group, I stopped the lecture and tried to remember for myself how to set up a VPC for my EC2 - the VPC should allow:

  • public access to my EC2 / ELB (ingress) on ports 80, 443 and 22; egress on ports 80 and 443, for fetching updates? I don't think it needs egress on port 22.

Steps:
Security groups and VPC

  • create a VPC with CIDR 10.0.0.0/16 -> wordpress VPC - so I guess a VPC is per app; in this case, if I had two apps / websites for different clients, I'd create different VPCs to isolate them.
  • create a security group - web DMZ - with inbound rule:
    HTTP (80) | TCP (6) | 80 | 0.0.0.0/0 |  
    SSH (22) | TCP (6) | 22 | 0.0.0.0/0 |  
    HTTPS (443) | TCP (6) | 443 | 0.0.0.0/0 |  
  • create a security group for Aurora with 3306 in the inbound rule - but hang on, you can have multiple security groups in one VPC?
    See the inbound rule:
    MySQL/Aurora (3306) | TCP (6) | 3306 | sg-2133d255 -> the source is my web security group (kinda weird specifying a security group as the allowed source - and we haven't even touched subnets yet). What it means is: allow MySQL connections from any instance that has the web security group attached - for instance, you may have multiple EC2s in the web security group, and they will all be allowed to connect to this resource.
  • created 2 subnets for the wordpress VPC - specifying the CIDR ranges was a bit tricky - used both cidr.xyz and the example from the default VPC -> 10.0.0.0/20 and 10.0.16.0/20. What's confusing is that these subnets also have route tables - I'm not sure what they are for - I forgot.
  • created an internet gateway, attached it to the VPC, and added the internet gateway to the route table
  • this route table seems to have been created automatically?

ELB

Created an application ELB - I'm not sure, should I let it listen on port 80 only?
For the ELB you need to specify a VPC - when I chose the wordpress VPC, no AZ was shown!!! Which means I was missing some steps - the error message said 2 subnets must be specified - OK, so I needed to create at least 2 subnets, obviously in 2 different AZs? Added the subnets above.
Ok, added 2 subnets - OMG - now it's complaining: You are creating an internet-facing Load Balancer, but there is no Internet Gateway attached to these subnets you have selected: subnet-5a3d7d3e, subnet-25f87f0a

Side notes:

are you sure you want to delete this vpc? - the following will be deleted too: subnets, security groups, network acls, vpn attachments, internet gateways, route tables, network interfaces, vpc peering connections.


cemeng commented Dec 27, 2017

27/12

Took a few days' break - Christmas, and spending time with the kids is kinda important.

WordPress lab - setting up EC2

Now on to setting up the EC2 - after I set it up, I found there was no public IP address on the EC2 - woot! Turns out I needed to turn on the auto-assign public IP address setting on my subnet!

30/12

Adding resilience and cloudfront lab

Ryan backs up the wordpress code inside /var/www/html into an S3 bucket. Felix note: I would probably use git for this.

aws s3 sync --delete /var/www/html/wp-content/uploads s3://my-little-pony (add --dry-run to test first) -> ah, pretty cool - rsync for S3 - I could use this for my blog later.

Then we do some URL-rewriting magic in WP so the files are served from CloudFront instead of EC2 / WP.
The next step is to automate the sync with cron, which is basically pasting the command above into the crontab.
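
The crontab entry might look something like this (the schedule is made up):

# Sync WP uploads to S3 every 5 minutes
*/5 * * * * aws s3 sync --delete /var/www/html/wp-content/uploads s3://my-little-pony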

Setting up AMIs lab

Why bother creating an ELB for one EC2 instance? Because of the public IP address: when the EC2 is restarted, it gets a new IP address.
Note to Felix: can't Route 53 automatically connect to that EC2 based on its ARN? No -> https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resource-record-sets-values-alias.html - you can only set an ELB, Elastic Beanstalk environment, CloudFront distribution or S3 bucket as an alias record target.
According to Ryan, you can avoid this issue by using an ELB or an Elastic IP.

In this lab, Ryan split the wordpress site in two - the production WP and the writer WP - and built an AMI for each. I don't really understand the practicality of this lab in the real world, so I didn't do it.

Autoscaling and Load Testing

didn't do ..

Exam tips based on students feedback

Kinesis - the way to consume big data / streaming data and bring it into the cloud - e.g. social media feeds
business intelligence -> Redshift
big data processing -> EMR

EC2 - EBS-backed vs instance store - know the difference. EBS stores data long term.

OpsWorks -> orchestration service that uses Chef.

SWF Actors: workflow starters (initiate workflow), deciders (control flow), activity workers (carry out activity tasks)

AWS Organisations & consolidated billing

This is a feature for larger organisations which may have a lot of AWS accounts. Consolidated billing is sort of superseded by AWS Organizations?
What is AWS Organizations?
One root account, with multiple organizational units below it. You can then apply policies at a granular level.

Ok - on to consolidated billing. In this setup, the root account is the paying account, and we have several linked accounts under it, such as test, production and back office.
You receive one bill, with a breakdown per linked account. The accounts are independent, though - they cannot access each other's resources. The limit for consolidated billing is 20 linked accounts.
The good thing about this: you get volume pricing discounts across the accounts.

Best practice -> the paying account should be used for billing purposes only - don't deploy your resources there.


cemeng commented Jan 1, 2018

01/18

Happy new year! :)

Cross Account Access

Cross account access - what is it? From a post in the internet:
Today, we made it possible for you to enable a user to switch roles directly in the AWS Management Console to access resources across multiple AWS accounts—while using only one set of credentials.

I have actually experienced this at FFX - using my login, and then switch to developer role and then I was able to access devs specific resources.

Not doing the lab - but taking the idea an apply it to MEC and my own account scenario - almost got it working.

Also reading IAM documentation on AWS as well as best practice for IAM - this has solidified my understanding of user, group, policy and role.
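
The CLI equivalent of that console role switch is roughly this (the ARN is hypothetical):

aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/developer \
  --role-session-name felix-dev
# returns temporary AccessKeyId / SecretAccessKey / SessionToken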


cemeng commented Jan 2, 2018

02/18

acloud guru 88% completed - the end is near, I think I can finish this course before the holiday ends - I am stoked!

Learnt a bit about Docker and Docker in AWS.
ECS - Elastic Container Service - is the managed Docker service in AWS.
ECR - Elastic Container Registry - is the Docker image registry in AWS; the AWS version of DockerHub.

I had to read additional resources to wrap my head around the Docker stuff again.
A docker image -> a template for creating a docker container (in my own words).
I am still a bit fuzzy on Task Definitions and Clusters.
In my own words again - a Task Definition defines how to run a docker container in AWS. It is the container configuration.
A Cluster is region-specific, and it is the place where task definitions are deployed. An ECS cluster is basically autoscaling for docker: it provisions the required number of EC2 instances to run the docker image.

ECS quick tutorial from youtube not from acloudguru https://www.youtube.com/watch?v=kQBGbmrdYO4:

  • push an image to ECR
  • create a task definition - here you specify the image URL from ECR, then configure the container by specifying CPU requirements etc.
  • then create a cluster - here you specify what EC2 instance type you want, the VPC config etc.
  • and then you create a service - uhm, what? I don't really get that yet.

When you finish this, if you go to EC2 you'll see the instance that ECS created for the container. SSH into that box and you'll see docker installed and provisioned for you; running docker images shows 2 images - one is the ECS agent, the other is your image.


cemeng commented Jan 3, 2018

03/18

Did the Whizlabs diagnostic exam and scored 85% (51 out of 60) - pretty stoked!
Area to improve:

  • the details of things, I guess - for example: which DB engine doesn't support read replicas in RDS? Answer: Oracle
  • how long can a message stay in SQS? Max is 14 days, default is 4 days
  • autoscaling - what to do if you want to change the instance type in your autoscaling group. Answer: create a new launch configuration and replace the autoscaling group's existing launch config with it.
  • Direct Connect is not a VPN.


cemeng commented Jan 29, 2018

29/01

Few more days before the exam

Doing a cloud guru final exam, few things to review:

  • Site-to-site VPN vs Direct Connect - what's required -> you need to ensure that the application in your custom VPC can communicate back to the on-premise data centre. You can do this with either a site-to-site VPN or Direct Connect. It will be using an internal IP address range, so you must make sure your internal IP addresses do not overlap.
  • cname vs a record
  • what is AWS WAF - what filters are available
  • in auto scaling - how is it determined which instance gets terminated?
  • what services are offered by trusted advisor?
  • ECS - especially with regard to permissions; permissions can be applied to tasks and to the instances themselves?
  • SQS - what does DelaySeconds mean?
  • To establish a successful site-to-site VPN connection from your on-premise network to an AWS Virtual Private Cloud, which of the following must be configured? (Choose 3)
    You must have a VPC with Hardware VPN Access, an on-premise Customer Gateway, and a Virtual Private Gateway to make the VPN connection work.
  • what is Virtual Private Gateway and Customer Gateway?

Got 72%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment