Created
January 16, 2019 12:50
Google Certified Cloud Architect Part 1 Notes from Linux Academy
Google Certified Professional Cloud Architect - Part 1
======================================================
GCP Overview
Google's suite of cloud computing services; runs on the same infrastructure and network as Google's own products
Compute -> App Engine, Container Engine, Compute Engine
Storage -> Bigtable, Cloud Storage, Cloud SQL, Cloud Datastore
Big Data -> BigQuery, Pub/Sub, Dataflow, Dataproc, Datalab
Machine Learning -> Vision API, Machine Learning, Speech API, Translation API
https://cloud.google.com/pricing
Per-second pricing for instances; private global fiber network; live migration of VMs; better performance; industry-leading security; access to innovative resources (Big Data, ML)
Datacenter Infrastructure
https://cloud.google.com/about/locations
Datacenter -> 9 Regions and 27 Zones -> 17 Regions and 52 Zones
Backbone -> High-speed private fiber network
Point of presence -> 90+ edge points in 33 countries
Commitment to social responsibility
Network Infra - Three elements: 1) Core data centers 2) Edge Points of Presence (PoPs) 3) Edge Nodes
Edge Nodes - Tier of Google's infra closest to end users; YouTube and Play Store content is cached in Edge Nodes
GCP Free Trial Limitations:
Compute Engine - 8 cores running at once, 100GB SSD, 2TB persistent standard disk space
Always Free tier limits - US regions only; 5GB Cloud Storage per region; 1 f1-micro instance per month (US regions); 30GB HDD; 5GB of snapshot storage per month
https://cloud.google.com/free/docs/always-free-usage-limits
Organization:
Cloud Resource Hierarchy
Hierarchy Overview: 1) Organization (N/A for individual accounts) 2) Projects 3) Resources
Projects - Core organizational component of GCP; control access to resources; required for creating, enabling, and using all Cloud Platform services
Project attributes -> Project Name, Project ID (unique application ID), Project Number (used by service accounts)
IAM:
Importance of resource access and management - who can do what on which resource -> provide granular access, prevent unwanted access to other resources; least privilege
Cloud IAM - Members are granted permissions and roles to GCP services using the principle of least privilege.
Member - Person or Service Account -> People (Google account, Google group, G Suite domain, Cloud Identity domain) -> Service Account (application access)
Service Account - special type of Google account; identity for carrying out server-to-server interactions in a project; identified by an email such as project_id@developer.gserviceaccount.com
Role - Collection of permissions that gives access to a given resource; permissions take the form <service>.<resource>.<verb> Ex: compute.instances.delete
Users are assigned roles (not directly assigned permissions); two kinds: 1) primitive 2) predefined
Primitive - Historically available GCP roles; applied at project level; broad roles (Viewer/Editor/Owner)
Primitive roles are suitable for small teams; they grant only broad permissions across the project
Predefined or Curated - Granular access, granted at resource level. Ex: App Engine Admin; multiple predefined roles can be given on individual resources
https://cloud.google.com/iam/docs/understanding-roles#predefined_roles
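The <service>.<resource>.<verb> permission format noted above can be sketched in a few lines of Python. This is only an illustration: the role name INSTANCE_CLEANER_ROLE and both helper functions are hypothetical, and a role is modeled as a plain set of permission strings rather than anything from a Google client library.

```python
from typing import NamedTuple


class Permission(NamedTuple):
    service: str
    resource: str
    verb: str


def parse_permission(permission: str) -> Permission:
    """Split a '<service>.<resource>.<verb>' string into its three parts."""
    parts = permission.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <service>.<resource>.<verb>, got {permission!r}")
    return Permission(*parts)


# Hypothetical role, modeled as a set of permission strings.
INSTANCE_CLEANER_ROLE = {"compute.instances.list", "compute.instances.delete"}


def role_allows(role, permission: str) -> bool:
    """A member holding the role may perform exactly the permissions it contains."""
    return permission in role


print(parse_permission("compute.instances.delete"))
print(role_allows(INSTANCE_CLEANER_ROLE, "compute.instances.create"))
```

Members never receive a permission string directly; they receive a role, and the check is made against the permissions that role contains.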
IAM policy - Collection of role bindings; the full list of roles granted to members on a resource
Policy Hierarchy - Org -> Project -> Resources in parent/child format; each child has exactly one parent; children inherit parent roles; a less restrictive parent policy overrides a more restrictive child policy
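The inheritance rule above (children inherit parent roles, and a child cannot take away what a parent grants) can be sketched as a union of grants walking up the hierarchy. All node names, members, and the PARENT/POLICIES tables below are hypothetical illustration data, not output from any GCP API.

```python
from collections import defaultdict

# Hypothetical hierarchy: each node maps to its single parent (None at the top).
PARENT = {
    "org/example": None,
    "project/bookshelf": "org/example",
    "instance/linux-instance": "project/bookshelf",
}

# Hypothetical policies: roles granted to each member directly at each level.
POLICIES = {
    "org/example": {"alice@example.com": {"roles/viewer"}},
    "project/bookshelf": {
        "alice@example.com": {"roles/editor"},
        "bob@example.com": {"roles/viewer"},
    },
    "instance/linux-instance": {},
}


def effective_roles(node):
    """Union the roles granted at `node` with those granted at every ancestor."""
    roles = defaultdict(set)
    while node is not None:
        for member, granted in POLICIES.get(node, {}).items():
            roles[member] |= granted
        node = PARENT[node]
    return dict(roles)


print(effective_roles("instance/linux-instance"))
```

Because the result is a union, the empty policy on the instance does not remove anything granted at the project or organization level.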
Interacting with Google Cloud Platform
1. Cloud Console
2. Google Cloud SDK - CLI for managing resources and applications; gcloud (many common GCP tasks), gsutil (Cloud Storage), bq (work with data in BigQuery)
3. RESTful API
https://cloud.google.com/sdk
Cloud Shell - interactive web-based shell environment accessed from the web console; includes a temporary Compute Engine VM, 5GB persistent storage, pre-installed SDK and other tools, built-in authorization; 1 hour timeout for inactivity
RESTful API - programmatic access to GCP resources; uses JSON; uses OAuth 2.0 for authentication and authorization; enabled via the GCP console; APIs have daily quotas; can experiment with the API Explorer
gcloud [GROUP] [GROUP] [COMMAND] - arguments
Ex: gcloud compute instances create instance-1 --zone us-central1-a
gcloud config set project [PROJECT_ID]
https://cloud.google.com/sdk/gcloud/reference
gcloud config list
gcloud projects list
API Explorer
Used for interacting with GCP via your own applications
https://developers.google.com/apis-explorer
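As a rough sketch of the RESTful access described above (JSON bodies, OAuth 2.0 bearer token), the snippet below builds, but does not send, a request to list Compute Engine instances in a zone. The project, zone, and token values are placeholders, the two helper functions are hypothetical, and obtaining a real access token is outside the scope of the sketch.

```python
import json
import urllib.request

API_ROOT = "https://compute.googleapis.com/compute/v1"


def build_list_instances_request(project, zone, access_token):
    """Build (without sending) a GET request that lists instances in one zone."""
    url = f"{API_ROOT}/projects/{project}/zones/{zone}/instances"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",  # OAuth 2.0 access token
            "Accept": "application/json",
        },
        method="GET",
    )


def instance_names(response_body):
    """Pull instance names out of a JSON list response body."""
    return [item["name"] for item in json.loads(response_body).get("items", [])]


request = build_list_instances_request("my-project", "us-central1-a", "PLACEHOLDER_TOKEN")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would perform the call; that step is left out here because it needs a valid token and an enabled API.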
Compute Options Comparison
Methods of hosting apps on GCP - 1) Google Compute Engine 2) Google App Engine 3) Google Container Engine 4) Google Cloud Functions
GCP Architect: Given a set of business and technical requirements, know how to choose and implement the right tool for the task.
https://cloudplatform.googleblog.com
Compute Options Decision Tree
https://cloudplatform.googleblog.com/2017/07/choosing-the-right-compute-option-in-GCP-a-decision-tree.html
Google Compute Engine:
----------------------
Compute Engine (GCE) is IaaS running virtual machines (instances) on Google infra.
Robust network features: custom networks, firewall rules, regional HTTP(S) load balancing, network load balancing, subnetworks
Boots quickly, local SSD performance
Low cost + automatic discounts: billed in per-minute increments with a 10-minute minimum charge (since replaced by per-second billing with a 1-minute minimum); automatic sustained-use discounts
Global private fiber network
Extremely flexible
Basic Instance Management:
Create, stop, start, reset, and delete instances via the console
Instance Options:
gcloud and REST Reference:
gcloud compute instances create "test-2" --zone "us-central1-a" --machine-type "f1-micro"
Connect to a Linux Instance and More gcloud Commands
gcloud compute --project "compute-engine-overview" instances create "linux-instance" --zone "us-central1-a"
SSH into the instance and do a graceful shutdown via the "sudo poweroff" command
IAM -> NETWORKING -> FIREWALL RULES
Connecting to Windows Instances
freerdp or Remmina.org => Free RDP clients
Editing Instance Specifications:
Instance type can be changed only while the instance is stopped; the zone cannot be changed once the instance is allocated
Creating, Editing, and Manipulating Disks:
A disk can be edited to increase its size, but the size cannot be decreased
Additional disks can be created, but their zone cannot be changed after creation
Windows Instance -> Change disk size, RDP in, and run Extend Volume in Disk Management
Linux Instance -> Change disk size, SSH in, and run "df -h"
https://cloud.google.com/compute/docs/disks/add-persistent-disk
Snapshots and Images:
Snapshots - Backup and disaster recovery; can be taken of persistent disks even while they are attached to running instances; lower cost than a custom image; differential backups
Available only in the project in which they are created
Images - used to create instances (modified root volume); available across projects; stop the instance first to create a custom image
Preemptible VMs, Instance Templates, and Groups:
Preemptible VMs - short-lived, low-cost VMs; up to 80% cheaper; max lifetime of 24 hrs; suitable for short-term batch processing
Instance group - Group of VMs; managed or unmanaged; autoscaling; instance templates define and deploy the group; multiple instances acting as one
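The preemptible VM figures above (up to 80% cheaper, 24-hour maximum lifetime) lend themselves to a quick estimate. The sketch below assumes the full 80% discount and uses a made-up regular price of $0.10 per hour; neither the price nor the function name comes from Google's pricing pages.

```python
def preemptible_cost(regular_hourly_price, hours, discount=0.80, max_lifetime_hours=24):
    """Estimate the cost of one preemptible VM run, rounded to 4 decimal places."""
    if hours > max_lifetime_hours:
        raise ValueError("a preemptible VM runs for at most 24 hours")
    return round(regular_hourly_price * (1 - discount) * hours, 4)


# Hypothetical regular price of $0.10 per hour for a 10-hour batch job.
print(preemptible_cost(0.10, 10))
```

A job longer than 24 hours would have to be split into restartable pieces, which is why the function rejects it rather than pricing it.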
Cloud Launcher (Marketplace):
Quickly deploy functional software packages that run on GCP; free to use; manage and view deployment info with Deployment Manager
Networking Overview:
VPC Networks - virtual version of a traditional physical network; limited to internal resources; resources in a VPC get an internal IP from a subnet
External IP - Ephemeral or static; a static IP can be reserved (e.g. for prod servers)
Firewall rules - Every VPC has a managed firewall (ingress/egress traffic); rules can be applied to the entire VPC or to individual instances
Routes - map an IP range to a destination; default or custom routes
Load balancing - distributes user requests among a set of instances; works with instance groups; used for autoscaling, batch, traffic distribution, and fault tolerance
Cloud DNS - high-performance, resilient, global DNS service that publishes your domain names to the global DNS in a cost-effective way. Create managed zones (add, edit, and delete DNS records)
VPN - on-premises to GCP through IPsec; private internal connection over the public internet; supports gateway-to-gateway (site-to-site) connections
Cloud Router - dynamic routing for Google VPN; managed service for handling routes
Cloud CDN - places online content closer to users for fast response times; content is cached in 80+ edge cache sites around the globe.
VPC:
Subnets can span multiple zones (within one region); a network can span multiple regions
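To make the network/subnet relationship concrete, the sketch below models one VPC network with a subnet per region and looks up which regional subnet an internal IP belongs to. The CIDR ranges and the SUBNETS table are illustrative assumptions, not values read from a real project.

```python
import ipaddress

# Hypothetical VPC network: one subnet per region, each with its own internal range.
SUBNETS = {
    "us-central1": ipaddress.ip_network("10.128.0.0/20"),
    "europe-west1": ipaddress.ip_network("10.132.0.0/20"),
}


def region_for_ip(internal_ip):
    """Return the region whose subnet contains the address, or None if no subnet does."""
    address = ipaddress.ip_address(internal_ip)
    for region, network in SUBNETS.items():
        if address in network:
            return region
    return None


print(region_for_ip("10.128.0.2"))
```

Instances in different zones of us-central1 would all draw their internal IPs from the same us-central1 range, while the network as a whole covers both regions.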
IP Addressing and Firewall Rules:
Every instance comes with a private IP based on its subnet; a public IP is enabled by default (ephemeral or static); an instance can have one external address
An unassigned static IP costs $0.01 (1 cent) per hour
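The hourly charge for an unassigned static IP adds up over a month. A minimal arithmetic sketch, using the $0.01 rate from the notes above and an assumed 30-day month (the function name is hypothetical):

```python
UNASSIGNED_STATIC_IP_HOURLY_USD = 0.01  # rate taken from the notes above


def idle_static_ip_cost(hours, addresses=1):
    """Cost in USD of keeping reserved-but-unassigned static IPs for `hours` hours."""
    return round(UNASSIGNED_STATIC_IP_HOURLY_USD * hours * addresses, 2)


# One idle address over an assumed 30-day month.
print(idle_static_ip_cost(24 * 30))
```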
Firewall rules - protect resources from unapproved network connections; each rule allows or denies traffic based on IP addresses, port, or protocol
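The allow/deny matching on IP address, protocol, and port can be sketched as a first-match rule list. This is a simplification under stated assumptions: list order stands in for rule priority, unmatched traffic is denied by default, and the Rule class and evaluate function are hypothetical rather than part of any GCP library.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    action: str    # "allow" or "deny"
    source: str    # source range in CIDR notation
    protocol: str  # e.g. "tcp"
    port: int


def evaluate(rules, source_ip, protocol, port, default="deny"):
    """Return the action of the first matching rule, or the default if none match."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (
            address in ipaddress.ip_network(rule.source)
            and rule.protocol == protocol
            and rule.port == port
        ):
            return rule.action
    return default


rules = [
    Rule("deny", "203.0.113.0/24", "tcp", 22),  # block SSH from one range
    Rule("allow", "0.0.0.0/0", "tcp", 22),      # allow SSH from everywhere else
]
print(evaluate(rules, "198.51.100.7", "tcp", 22))
```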
Operation and Management Tools:
Google Stackdriver - provides powerful monitoring, logging, and diagnostics for cloud operations; natively monitors GCP, AWS, or a hybrid of both
Stackdriver components - 1. Monitoring 2. Logging 3. Error Reporting 4. Debugger 5. Trace
Deployment Manager - Infrastructure as code; uses YAML; repeatable deployment process; declarative approach; template driven
Source Repositories - Git repos hosted on GCP; built-in source code editor; integrates with Stackdriver Debugger; can connect to GitHub or Bitbucket
Google App Engine:
PaaS - Focus on app dev; managed infra; pay per use vs pay per allocation
App Engine is GCP's tool for building modern web and mobile apps on an open cloud platform; fully managed; supports custom languages; no vendor lock-in
gcloud app deploy -
App Engine - Standard Environment and Flexible Environment
Standard Env - runs in a secure, sandboxed env; cannot write to the local file system or modify the runtime; pricing based on instance hours
Flexible Env - Based on Compute Engine; more customization; more native support; pricing based on CPU, memory, and disk usage
App Engine is regional and available in selected regions
Use cases: Nintendo and DeNA - Super Mario Run
Deploy a sample Python app called BookShelf - App Engine Standard Runtime Env
https://codelabs.developers.google.com/codelabs/cp100-app-engine
1. 'git' the code
2. review the code
3. install requirements
4. deploy (gcloud app deploy)
gcloud init
gcloud source repos clone default --project=bookshelf-project-228207
cd default
git push -u origin master
git pull https://github.com/GoogleCloudPlatformTraining/cp100-bookshelf
cd app-engine
pip install -r requirements.txt -t lib
gcloud app deploy
Beginning deployment of service [default]...
╔════════════════════════════════════════════════════════════╗
╠═ Uploading 235 files to Google Cloud Storage              ═╣
╚════════════════════════════════════════════════════════════╝
File upload done.
Updating service [default]...done.
Setting traffic split for service [default]...done.
Deployed service [default] to [https://bookshelf-project-228207.appspot.com]
You can stream logs from the command line by running:
$ gcloud app logs tail -s default
To view your application in the web browser run:
$ gcloud app browse
App Engine Resources:
Versions
Instances
Datastore
Storage
Cloud Endpoints:
Create, deploy, protect, monitor, analyze, and serve your APIs
APIs - Standardized interface for developers to build apps Ex: Google Drive API
Using Cloud Endpoints - Build your own API in App Engine Standard; expose the API through a RESTful interface; OAuth 2.0 authorization; supports Python and Java
Google Cloud Storage Options:
Bigtable (Analytics) (Non-relational)
Datastore (Non-relational)
Firestore
Storage (Unstructured)
SQL (Relational)
Spanner (new category) (Horizontal scalability)
Memorystore
Filestore
SQL - Consistency (based on ACID); NoSQL - scalability/flexibility
Database Options - Closer Look:
Cloud SQL - Create instance/region/size; hosted MySQL service; vertical scaling (read/write); horizontal scaling (read); limited scalability
Datastore - Originated as the App Engine data repository; scale and flexibility; fully managed; scales from 0 to TBs of data; cost efficient; supports ACID transactions
Bigtable - managed NoSQL for analytics (TB to PB); hosted version of Google's own internal Bigtable technology; high-volume writes; millisecond latency; pricier than Datastore.
Cloud Spanner - RDBMS with better (horizontal) scalability; billed as the best of both worlds (NewSQL)
Use cases:
Cloud SQL - web frameworks, CMS, eCommerce
Datastore - user profiles, product catalogs, game states
Bigtable - high-throughput analytics, IoT, AdTech
Spanner - scale + consistency, financial services, global supply chain
Cloud Storage (Unstructured Data):
Ex: pictures, videos, objects, docs, multimedia, etc. (BLOB); integrates with Compute Engine, App Engine, BigQuery, Cloud SQL
Unified object storage; price competitive; pay per use; highly scalable; multiple storage classes based on storage needs
Not a file system, but can be set up as one using third-party tools; data encrypted in transit and at rest
Bucket (basic container; buckets cannot be nested; name must be unique), Objects (files, up to 5TB per single file), Data opacity (no knowledge of the data's structure)
Storage classes - Multi-regional (geo-redundant), Regional (single geographic region), Nearline (low cost per GB, 30-day minimum, infrequent access), Coldline (lower cost, 90-day minimum, cold data)
(all have the same throughput, low latency, and high durability); availability 99.95%/99.9%/99%/99%
Storage cost - $26/$20/$10/$7 per TB per month
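The per-class prices above make a simple comparison possible. The sketch below uses only the $26/$20/$10/$7 per TB per month figures from the notes; it deliberately ignores retrieval fees, minimum storage durations, and network charges, and the dictionary keys and function name are hypothetical.

```python
# USD per TB per month, taken from the notes above.
PRICE_PER_TB_MONTH = {
    "multi_regional": 26,
    "regional": 20,
    "nearline": 10,
    "coldline": 7,
}


def monthly_storage_cost(storage_class, terabytes):
    """Storage-only monthly cost in USD; retrieval and network fees are not modeled."""
    return PRICE_PER_TB_MONTH[storage_class] * terabytes


for storage_class in PRICE_PER_TB_MONTH:
    print(storage_class, monthly_storage_cost(storage_class, 5))
```

For data that is read often, the cheaper classes may not come out ahead once retrieval charges are included, which this storage-only figure does not show.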
gsutil mb -l us-central1 gs://bookshelf-prabha
gsutil defacl ch -u AllUsers:R gs://bookshelf-prabha
Hands-On - Cloud Storage with Third Party Application:
Cloudberry Backup - For backup and restore
Create a bucket with the Nearline storage class
Use the free Home Edition
Google Container Engine
What are Containers and Kubernetes?:
Docker
Kubernetes ("Helmsman") - Open source container manager
Master - Controls K8s nodes
Node - Machine that performs tasks. Controlled by the K8s master.
Pod - Group of 1 or more containers on a node.
Replication Controller - ensures the specified number of pod replicas are running at any one time across nodes
kubectl - CLI tool for K8s
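What the replication controller does (keep the specified number of pod replicas running) comes down to comparing desired state with observed state. The function below is a toy sketch of that comparison, not Kubernetes code; the pod names and the reconcile helper are hypothetical.

```python
def reconcile(desired_replicas, running_pods):
    """Compare desired and observed state.

    Returns (number_of_pods_to_start, list_of_pods_to_stop).
    """
    shortfall = desired_replicas - len(running_pods)
    if shortfall > 0:
        return shortfall, []
    # Too many (or exactly enough) pods: stop any beyond the desired count.
    return 0, running_pods[desired_replicas:]


print(reconcile(3, ["bookshelf-a"]))
print(reconcile(2, ["bookshelf-a", "bookshelf-b", "bookshelf-c"]))
```

A real controller runs this kind of comparison continuously, so a pod lost to a node failure shows up as a shortfall on the next pass and is replaced.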
GKE:
Fully managed env for deploying containerized apps; uses GCE resources; self-healing; autoscaling; powered by K8s; custom OS (Container-Optimized OS)
K8s the hard way - https://github.com/kelseyhightower/kubernetes-the-hard-way
GKE Organization/Components - Container Cluster, K8s Master, Pods, Nodes, Replication Controller, Services, Container Registry
Use case: Niantic - Pokemon Go
Reference for demo: https://codelabs.developers.google.com/codelabs/cp100-container-engine/#0
Enable the User Info and Cloud Platform scopes when creating the Kubernetes Engine cluster
cd container-engine
gcloud config set container/cluster bookshelf
docker build -t gcr.io/bookshelf-project-228207/bookshelf .
gcloud docker -- push gcr.io/bookshelf-project-228207/bookshelf
gcloud container clusters get-credentials bookshelf --zone us-central1-a --project bookshelf-project-228207
kubectl create -f bookshelf-frontend.yaml
kubectl get pods
kubectl get services
Big Data and Machine Learning
Big Data - Extremely large datasets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.
GCP - serverless, managed infrastructure; both batch and stream processing; Spark and Hadoop in the cloud; industry-leading ML capabilities
GCP Big Data Services:
GCP Big Data Suite - BigQuery, Dataflow, Dataproc, Datalab, Dataprep, Pub/Sub
BigQuery - Enterprise data warehouse; stores and queries massive datasets; SQL syntax for queries; real-time analytics
Dataflow - Fully managed data processing service; batch and stream processing; open source; tight integration with GCP
Dataproc - Fully managed service for running Apache Spark and Hadoop clusters; scalable clusters; billed by the minute; open source and integrated
Ideal for migrating Hadoop jobs to the cloud, analyzing data stored in Cloud Storage, using Spark for data mining and analysis, and using Spark ML libraries to run classification algorithms
Datalab - Interactive tool for data exploration, analysis, visualization, and ML; open source (Jupyter); supports ML models based on TensorFlow
Pub/Sub - Messaging middleware; apps publish and subscribe to topics; ideal for stream processing; decouples senders and receivers; connects other GCP services to one another.
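The decoupling that Pub/Sub provides (publishers and subscribers know only the topic, not each other) can be shown with a toy in-memory broker. This is not the Cloud Pub/Sub client library: the Broker class, topic name, and message are all hypothetical, and delivery here is synchronous and in-process.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory broker: publishers and subscribers share only a topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to receive every message published to `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver `message` to each subscriber of `topic`; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
audit_log, analytics = [], []
broker.subscribe("orders", audit_log.append)
broker.subscribe("orders", analytics.append)
broker.publish("orders", {"order_id": 1})
print(audit_log, analytics)
```

The publisher never references either subscriber, so a third consumer could be added without touching the publishing code.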
Big Data Life cycle:
Ingest Process Store Analyze
------ ------- ----- -------
Google App Engine Cloud Dataproc BigQuery Storage Big Query Analytics
Cloud Pub/Sub Cloud Dataflow
Cloud Monitoring Cloud Dataflow Cloud Storage Apache Hadoop
Cloud Storage Apache Spark
BigQuery Organization
Projects - same as GCP projects; can be shared
Dataset - grouping of tables; lowest level of access control
Tables - row/column structure; hold the actual data
Jobs - queuing of large requests
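The project -> dataset -> table nesting above is reflected in how a table is referenced: project.dataset.table. The parser below is a small sketch of that naming; the dataset and table names are made up, the helper is hypothetical, and it assumes the project ID itself contains no dots.

```python
from typing import NamedTuple, Optional


class TableRef(NamedTuple):
    project: str
    dataset: str
    table: str


def parse_table_ref(reference: str, default_project: Optional[str] = None) -> TableRef:
    """Parse 'project.dataset.table', or 'dataset.table' with a default project."""
    parts = reference.split(".")
    if len(parts) == 3 and all(parts):
        return TableRef(*parts)
    if len(parts) == 2 and all(parts) and default_project:
        return TableRef(default_project, *parts)
    raise ValueError(f"cannot resolve table reference {reference!r}")


# Hypothetical dataset and table inside the bookshelf project.
print(parse_table_ref("bookshelf-project-228207.library.books"))
```

Since the dataset is the lowest level of access control, the dataset part of the reference is the one that determines who may query the table.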
https://www.reddit.com/r/bigquery/comments/3dg9le/analyzing_50_billion_wikipedia_pageviews_in_5/
Google Cloud Machine Learning Platform
machines and apps can learn and adapt new
creating (intelligent) apps that can see, hear, and understand the world around them
Cloud ML Engine, Cloud Vision API, Cloud Speech API, Cloud Jobs API, Cloud Translation API, Cloud Natural Language API, Cloud Video Intelligence API
Built on TensorFlow, an open source tool to build and run neural network models
ML Engine - managed service for creating your own machine learning service; ideal for custom predictive analysis. Use cases: data security, financial trading, health care, marketing, fraud detection, smart cars.
Cloud Vision API - Image recognition; detects and extracts text for OCR
Cloud Natural Language API - Reveals the structure and meaning of text and the sentiment behind it
Cloud Translation API - language translation and detection
Cloud Speech API - speech recognition; converts audio to text; over 110 languages supported
Cloud Video Intelligence - video analysis; makes video searchable by content