Created January 20, 2019 10:24

GCP Cloud Architect - Part 3
Case Studies
Refreshed Nov 9th 2018; JencoMart completely dropped
Overview - 3 case studies; 40-50% of the exam; question on one side, case study on the other
Layout of a case study - 1.Company Overview 2.Solution Concept - current goal 3.Existing Technical Env 4.Requirements (Tech/Business) 5.Executive Statement
Mountkirk Games
Dress4Win
TerramEarth
Mountkirk Games
https://cloud.google.com/solutions/mobile/mobile-gaming-analysis-telemetry
Business Requirements:
Single global HTTP LB
Pub/Sub, Datastore, BigQuery, Cloud Storage
Dataflow
Monitor with Stackdriver
Multi-regional GCE backends; multi-region Datastore
Technical Requirements:
Game Backend Platform
----------------------
Autoscaling managed IG
Cloud Datastore
BigQuery
Cloud Dataflow
Managed IG - custom images
Game Analytics Platform
-----------------------
Autoscaling services
Connect services with Pub/Sub, process with Dataflow
Dataflow accounts for late/out-of-order data
BigQuery
Upload to storage; process via Dataflow
Dress4Win
Executive Priorities:
Scale
Contain costs - improve TCO
Too many resources sitting idle
Solution Concept:
Move Dev/Test env to Google Cloud (separate project for each env)
DR site - hybrid cloud/on-premise env; connect over VPN
Business Requirements:
Create equivalent setup on the cloud (lift and shift)
Principle of least privilege; separate test/dev envs
Automate infra creation: gcloud/SDK; rapid deployment (Deployment Manager)
Stackdriver - monitor infra with Stackdriver Monitoring; get notified of errors with Stackdriver Logging; troubleshoot with Debug/Error Reporting
Technical Requirements:
Best practices for migration; move data first, then applications
Deployment Manager, other IaC products
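Automated infra creation with Deployment Manager is driven by a declarative YAML config; a minimal sketch (deployment and config names are illustrative):

```shell
# create a deployment from a config that declares the desired resources
gcloud deployment-manager deployments create dev-env --config vm.yaml

# preview changes before applying an update to the same deployment
gcloud deployment-manager deployments update dev-env --config vm.yaml --preview
```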
CI/CD pipeline; Jenkins etc.
Failover - MySQL replicating to Cloud SQL; on-premise/cloud app servers - DNS cutover
All data encrypted by default; customer-supplied encryption
VPN
Databases: MySQL - Cloud SQL (native MySQL support; 10TB size limit; single region)
Migration - create a replica server managed by Cloud SQL; once the replica is synced, 1.Update app to point to the replica 2.Promote the replica to a standalone instance
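The replica-then-promote migration above maps roughly to these Cloud SQL commands (instance names are placeholders; for an on-premises source there is additional external-master setup, omitted here):

```shell
# create a Cloud SQL read replica of an existing primary instance
gcloud sql instances create mysql-replica --master-instance-name=mysql-primary

# once the replica has caught up and the app points at it,
# promote it to a standalone instance (irreversible)
gcloud sql instances promote-replica mysql-replica
```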
Redis 3 - 1.Run Redis server on Compute Engine 2.Use the new Memorystore managed Redis database
40 web app servers - Managed IG - autoscaling; use custom machine types
20 Apache Hadoop - Cloud Dataproc
3 RabbitMQ - 1.Pub/Sub 2.Deploy same env on CE IG
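Replacing RabbitMQ with Pub/Sub amounts to a topic plus a subscription per consumer; a minimal sketch (topic/subscription names are placeholders):

```shell
# the topic takes the role of the RabbitMQ exchange/queue
gcloud pubsub topics create orders

# each consumer gets its own subscription (pull by default)
gcloud pubsub subscriptions create orders-worker --topic=orders

# quick smoke test: publish one message and pull it back
gcloud pubsub topics publish orders --message="hello"
gcloud pubsub subscriptions pull orders-worker --auto-ack
```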
iSCSI and Fibre Channel SAN - Persistent Disk in place of the SAN cluster
NAS - Cloud Storage
TerramEarth
Heavy equipment, mining, agriculture
500 dealers all over the world
Mission = make customers more productive
Current Setup
Collect analytics on vehicles
Increase efficiency
Predict breakdowns and pre-stage replacement parts
20 million vehicles - each collects 120 fields per second
Data stored locally, then uploaded (batch) when at a dealer
200,000 use a cellular connection - always streaming data; 9TB per day total upload
Problem to solve
Turnaround time 4 weeks - needs to be 1 week
Management priority - business agility
Data Ingest - Data Warehouse - Analytics
GCP Approach 1 - increase cellular connectivity to a higher %; migrate FTP batch upload to streaming upload
Cloud IoT Core -> Cloud Pub/Sub -> Cloud Dataflow -> BigQuery -> Cloud Datalab or Data Studio
                                                  -> Cloud Functions
BigQuery -> Cloud ML -> Cloud Dataflow
GCP Approach 2 - 100% local service center server (batch)
Transfer via API -> Cloud Storage regional bucket -> Cloud Dataflow -> BigQuery -> Insights
BigQuery -> Cloud ML -> Cloud Dataflow -> Cloud Storage
Custom API for dealers and partners; App Engine + Cloud Endpoints
Data transfer to GC - IoT Core
Planning Your Cloud Transition
Making the Case for the Cloud and GCP
Why move to cloud? 1.Cost 2.Future-proof infra 3.Scale to meet demand 4.Greater business agility 5.Managed services 6.Global reach 7.Security at scale
Cost Optimization - Sustained use discounts; custom machine types (0.9-6.5GB RAM per vCPU); rightsizing recommendations (based on 8 days of usage);
Preemptible VMs (fault-tolerant, batch processing); Coldline storage (archive/DR, millisecond access); committed use discounts
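Custom machine types and preemptible VMs from the list above are both just flags on instance creation; an illustrative sketch (instance name and zone are placeholders):

```shell
# custom machine type: 2 vCPUs, 4 GB RAM (must stay within 0.9-6.5 GB per vCPU)
# --preemptible makes it cheap but revocable, so suit it to batch work
gcloud compute instances create batch-worker \
    --zone=us-central1-a \
    --custom-cpu=2 --custom-memory=4GB \
    --preemptible
```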
Architecting Cloud Apps:
App design requirements - 5 principles: HA, Scalability, Security, DR, Cost
Migrating to Google Cloud
Planning a Successful Cloud Migration
Assess, Pilot, Move Data, Move Applications, Cloudify & Optimize
Assess - 3 categories 1.Easy to move 2.Hard to move 3.Can't move; Evaluation criteria: criticality of app, compliance, license, ROI; consider app dependencies
Pilot - POC/test run; non-critical or easily duplicated services; small steps at first; considerations - licensing, rollback plan, process changes
Start mapping roles: projects, separation of duties, test/prod envs, VPCs
Move Data - data before app; evaluate storage options; transfer methods (gsutil, Transfer Appliance, batch upload, Storage Transfer Service, mysqldump)
Move Apps - self-service or partner-assisted; lift & shift recommended; VM import freely available via CloudEndure; hybrid, backup as migration
Optimize - cloud makeover; retool processes and apps with modern GCP tools - 1.Offload static assets to CS 2.Enable autoscaling 3.Enhance redundancy with AZs
4.Enhanced monitoring with Stackdriver 5.Managed services 6.Decouple stateful storage from app
Storage Transfer Service
1.Import online data (AWS S3, HTTP(S) location, another CS bucket) into CS 2.Import from an online data source to a data sink (CS bucket)
Transfer operation configured through a transfer job; requires owner or editor project IAM role + access to source and sink
gsutil (on-premise) vs Storage Transfer Service (CSP - GCS, AWS, HTTP)
Migrating Applications
Migrating from on-premises = migrating servers; app migration = server migration; map to GCP services
Before moving a server: 1.Create a project 2.Determine network config (VPC) - firewall, region, subnets 3.Determine IAM roles
Lift and shift (GCE public image, direct image import)
Data Migration Best Practices
Cloud - Storage Transfer Service; on-premise - gsutil; slow network - "mail it in"
Make data transfer easier - 1.Decrease data size 2.Increase network bandwidth (Direct Peering/Cloud Interconnect)
gsutil - multithreaded, parallel upload, resumable
Physical media - Google Transfer Appliance; 20TB or more of data
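The gsutil features above (multithreaded, parallel, resumable) are mostly flags and config; a sketch with placeholder paths and bucket name:

```shell
# -m runs copies in parallel threads/processes; cp resumes interrupted
# transfers of large objects automatically
gsutil -m cp -r ./export gs://my-migration-bucket/

# split files over 150 MB into parallel composite uploads
gsutil -o GSUtil:parallel_composite_upload_threshold=150M \
    cp big-backup.tar gs://my-migration-bucket/
```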
Mapping storage solutions
Unstructured data - CS
Relational data - Cloud SQL or Spanner
Non-relational data - Bigtable and Datastore
Big data analysis - BigQuery
In-memory database - Memorystore
Other - Persistent Disk
Cloud Solution Infrastructure
Preemptible VMs - rendering, media transcoding, big data analytics
Best practices - use smaller machine types; run at off-peak times; preserve disk on machine termination; use shutdown scripts
Create and terminate machines to save costs, but preserve disk state: --no-auto-delete --disk example-disk
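The `--no-auto-delete --disk example-disk` flags above belong to `gcloud compute instances set-disk-auto-delete`; in context (the instance name here is an assumed placeholder):

```shell
# keep example-disk around when its instance is deleted
gcloud compute instances set-disk-auto-delete example-instance \
    --no-auto-delete --disk=example-disk

# alternatively, keep all attached disks at delete time
gcloud compute instances delete example-instance --keep-disks=all
```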
Managed IG with PVMs keeps recreating instances every minute - check health check/firewall configuration
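The usual fix for instances being recreated every minute is to let the health checkers reach the backends; an illustrative firewall rule (network and port are placeholders; the source ranges are Google's documented health-check IP ranges):

```shell
# if these probes are blocked, the autohealer marks instances
# unhealthy and recreates them in a loop
gcloud compute firewall-rules create allow-health-checks \
    --network=default --allow=tcp:80 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16
```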
Backup and Disaster Recovery - back up individual instances (snapshots); database backup (use a cron job to back up data to CS or persistent disk);
CS backup/rollback (object versioning + lifecycle mgmt); distributed computing app rollback (rolling update for managed IG/version control, split traffic for GAE);
scheduled automated backup (cron jobs, apply to snapshots, database backup)
Rollback plan for managed IG serving a website - 100s of instances - object versioning on static data in CS; rolling updates; NOT snapshots
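A rolling update on a managed IG swaps instance templates gradually, which is what makes it a rollback/rollout mechanism; sketch (group, template, and zone names are placeholders):

```shell
# roll the group to a new template, replacing a limited number
# of instances at a time; rolling back is the same command with
# the previous template
gcloud compute instance-groups managed rolling-action start-update web-mig \
    --version=template=web-template-v2 \
    --max-unavailable=1 --zone=us-central1-a
```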
Back up critical database with zero downtime & minimal resource usage - scheduled cron job; back up database data to another location (CS, Persistent Disk)
App Engine, need to push a risky update to the live env - versioning/split traffic, canary update
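A canary on App Engine is versioning plus traffic splitting, roughly (service/version names are placeholders):

```shell
# deploy the risky change as a new version without shifting traffic
gcloud app deploy --version=v2 --no-promote

# send 10% of traffic to v2, keep 90% on v1 (canary)
gcloud app services set-traffic default --splits=v1=0.9,v2=0.1

# roll back instantly by pointing all traffic at v1
gcloud app services set-traffic default --splits=v1=1
```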
Security
Methods of Securing the GCP Environment
Exam scenarios -
1.IG VMs keep restarting every minute - failing health check/configure firewall to allow traffic to the IG VMs from the LB IP
2.On-premise network access to the proper network resources - restrict ingress firewall access to the on-premise network IP range
3.Failover from on-premise LB-hosted app to GCP-hosted IG - consider security & compliance; allow firewall access at the IG from the outside source
4.External SSH access disabled, but ops team needs to remotely manage VMs - give the ops team access to Cloud Shell; not the same scenario as removing the external IP
Legal Compliance and Audits
Designing for LC & audits; considerations include legislation, audits, certification
Audit, auditor, access logs, compliance - think Stackdriver Logging
Billing data exported directly to CS/BigQuery
Automating/exporting logging data for audits - analysis (BigQuery), access for external parties (CS)
Analyze PCI data - PCI DSS: securely handle credit card info; stream to BigQuery for analysis
Send log data to BigQuery for analysis - data travels from squid proxy to Stackdriver Logging/Monitoring; export from Stackdriver Logging to BigQuery
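Exporting Stackdriver logs to BigQuery is done with a logging sink; sketch (project, dataset, sink name, and filter are placeholders):

```shell
# route matching log entries into a BigQuery dataset for analysis
gcloud logging sinks create audit-to-bq \
    bigquery.googleapis.com/projects/my-project/datasets/audit_logs \
    --log-filter='resource.type="gce_instance"'

# the command prints the sink's service account; grant it
# BigQuery Data Editor on the dataset so the writes succeed
```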
Securely migrating database data - migrate database to Datastore; app authentication OAuth 2.0 -> export database info to CS -> import into Datastore
If migrating via app/API = authenticate with OAuth 2.0 with a service account and export to GCS
If exporting as a simple copy = gsutil copy to GCS
Development Practices
SDLC, CI/CD, Blue/Green model for deployment, application microservices
SDLC - produces s/w with high quality, lowest cost, shortest time; plan to develop, alter, maintain, and replace software systems; separate envs; separate projects
CI/CD - CI: integrate code into the main branch of a shared repo early and often; minimize cost of integration; CD: focus on automating the software delivery process;
automatically deploy each build that passes the full lifecycle; GCP Container Builder, Jenkins, Spinnaker
Blue/Green Deployment
Break monolith into microservices - reduce unplanned rollbacks due to errors; best practices? Blue-Green model, break monolith into microservices
Application Error Examples
Java digest error - re-sign the JAR file
News mobile app caching under load, needs to prevent caching - overwrite Datastore entries; set app to work from a single instance; modify API to prevent caching (set HTTP cache flag to -1)
Data Flow Lifecycle
Data Flow - Putting the Pieces Together
Managing data's life cycle - big data focus, 4 stages: Ingest, Store, Process and Analyze, Explore and Visualize
Ingest - GAE, GCE, GKE, Cloud Pub/Sub, Stackdriver Logging, Storage Transfer Service, Transfer Appliance
Store - Cloud Storage, Cloud SQL, Cloud Datastore, Cloud Bigtable, BigQuery, Cloud Spanner, CS for Firebase, Cloud Firestore
Process & Analyze - Cloud Dataflow, Dataproc, BigQuery, ML, Vision API, Speech API, Translate API, Natural Language API, Dataprep, Video Intelligence API
Explore & Visualize - Datalab, Data Studio, Google Sheets
Structured -> Transactional (Cloud SQL, Cloud Spanner), Analytical (BigQuery)
Semi-structured -> Fully indexed (Cloud Datastore), Row key (Cloud Bigtable)
Unstructured -> Cloud Storage
Cloud Dataproc - existing Hadoop/Spark apps; ML/DS ecosystem; tunable cluster parameters
Cloud Dataflow - new data processing pipelines, unified streaming & batch, fully managed
Cloud Dataprep - UI-driven preparation, scaled on-demand, fully managed
Data Flow Hands-On and Reference Material
gcp.solutions