Created
March 17, 2018 04:14
BIS618 - Dataproc Google Cloud
Create the cluster in a zone you have not used before, so that it stays within the free-tier limit of 8 vCPUs and 100 GB of disk.
Master: 4 vCPUs, 15 GB RAM, 40 GB disk
2 workers: 2 vCPUs, 7.5 GB RAM, 30 GB disk each
Image version: 1.2
Initialization actions:
gs://dataproc-initialization-actions/zeppelin/zeppelin.sh
gs://dataproc-initialization-actions/hue/hue.sh
(See more initialization actions at https://github.com/GoogleCloudPlatform/dataproc-initialization-actions)
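The sizing above can be checked with a little arithmetic. This is a minimal sketch: the machine sizes and the 8 vCPU / 100 GB limits are copied from the notes, and the variable names are made up for illustration.

```shell
# Sizes and limits copied from the notes above; variable names are illustrative.
master_vcpus=4;  master_disk_gb=40
worker_count=2;  worker_vcpus=2;  worker_disk_gb=30
limit_vcpus=8;   limit_disk_gb=100

# One master plus all workers.
total_vcpus=$(( master_vcpus + worker_count * worker_vcpus ))
total_disk_gb=$(( master_disk_gb + worker_count * worker_disk_gb ))

echo "vCPUs: ${total_vcpus}/${limit_vcpus}, disk: ${total_disk_gb}/${limit_disk_gb} GB"
if [ "$total_vcpus" -le "$limit_vcpus" ] && [ "$total_disk_gb" -le "$limit_disk_gb" ]; then
  echo "within limits"
else
  echo "over limits"
fi
```

With these numbers the cluster uses exactly 8 vCPUs and 100 GB, so there is no headroom for a larger machine type or an extra worker.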
-------------- One-time setup ---------------------
In Google Cloud
Set up firewall rules
Go to VPC --> Firewall rules --> Create a firewall rule
Use the following settings
Name: zeppelin
Priority: 8080
Targets: master
Source IP ranges: 0.0.0.0/0
Protocols and ports: tcp
Name: hue
Priority: 8888
Targets: master
Source IP ranges: 0.0.0.0/0
Protocols and ports: tcp
---------- end --------------------------
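The console settings above could probably also be applied from the command line. The sketch below only prints the gcloud commands so they can be reviewed first. It rests on several assumptions not stated in the notes: the rules are meant to open port 8080 for Zeppelin and 8888 for Hue, the network is "default", "master" is a network tag on the master VM, and the priority is left at gcloud's default. The helper name fw_rule_cmd is hypothetical.

```shell
# Dry run: print one gcloud command per rule instead of executing it.
# Assumptions: ports 8080 (Zeppelin) and 8888 (Hue), network "default",
# and "master" being a network tag attached to the master VM.
fw_rule_cmd() {
  name=$1; port=$2
  echo "gcloud compute firewall-rules create $name" \
       "--network default --direction INGRESS --allow tcp:$port" \
       "--source-ranges 0.0.0.0/0 --target-tags master"
}

fw_rule_cmd zeppelin 8080
fw_rule_cmd hue 8888
```

A source range of 0.0.0.0/0 accepts connections from any address; replacing it with your own IP range keeps the same workflow while exposing less.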
Import the example database
1------------
Use SSH or Zeppelin (%sh) at Ip-master:8080
wget https://raw.githubusercontent.com/praisan/hello-world/master/retail_hive_export.zip
unzip retail_hive_export.zip
hdfs dfs -put retail_export /user/zeppelin/
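Before the upload it may help to confirm that the archive unpacked as expected. This is a minimal sketch that assumes the zip extracts to retail_export with one subdirectory per table, as the paths in step 2 suggest. The function name missing_tables is hypothetical.

```shell
# Table names taken from the IMPORT statements in step 2.
tables="categories customers departments order_items orders products"

# Print each table directory missing under the given export directory;
# return 0 only when all of them are present.
missing_tables() {
  dir=$1; status=0
  for t in $tables; do
    if [ ! -d "$dir/$t" ]; then
      echo "missing: $dir/$t"
      status=1
    fi
  done
  return $status
}

missing_tables retail_export && echo "retail_export looks complete"
```

After the upload, hdfs dfs -ls /user/zeppelin/retail_export should list the same six directories.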
2------------
Use Hive in Hue at Ip-master:8888
DROP DATABASE IF EXISTS retail CASCADE;
CREATE DATABASE retail;
USE retail;
IMPORT FROM '/user/zeppelin/retail_export/categories';
IMPORT FROM '/user/zeppelin/retail_export/customers';
IMPORT FROM '/user/zeppelin/retail_export/departments';
IMPORT FROM '/user/zeppelin/retail_export/order_items';
IMPORT FROM '/user/zeppelin/retail_export/orders';
IMPORT FROM '/user/zeppelin/retail_export/products';
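The statements above follow one pattern per table, so they can also be generated rather than typed. This is a minimal sketch: the HDFS path and table names are copied from the statements above, while the function name gen_import_sql and the output file name import_retail.hql are made up for illustration.

```shell
# Emit the HiveQL from step 2, one IMPORT per table directory.
# The HDFS path and table list are copied from the statements above.
export_root=/user/zeppelin/retail_export

gen_import_sql() {
  echo "DROP DATABASE IF EXISTS retail CASCADE;"
  echo "CREATE DATABASE retail;"
  echo "USE retail;"
  for t in categories customers departments order_items orders products; do
    echo "IMPORT FROM '$export_root/$t';"
  done
}

# Write the script to a file for review.
gen_import_sql > import_retail.hql
```

The generated file can be pasted into the Hue editor, or, assuming the hive command-line client is available on the master node, run there with hive -f import_retail.hql.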
gcloud dataproc --region us-east1 clusters create adhoc-analytic \
  --subnet default --zone us-east1-b \
  --master-machine-type n1-standard-4 --master-boot-disk-size 40 \
  --num-workers 2 --worker-machine-type n1-standard-2 --worker-boot-disk-size 30 \
  --image-version 1.2 --project quiet-biplane-186814 \
  --initialization-actions 'gs://dataproc-initialization-actions/zeppelin/zeppelin.sh','gs://dataproc-initialization-actions/hue/hue.sh'