@bugcy013
Created January 21, 2013 13:07
EC2 with Apache Hadoop notes
ec2-api-tools used in scripts
-----------------------------
launch-hadoop-cluster <cluster name> <number of slave nodes>
launch-hadoop-cluster dhanaHadoop 500
dhanaHadoop --> cluster name (used for the master node)
500 --> number of slave nodes (data nodes)
This example uses 500 slave nodes.
Total EC2 instances running = 501 [1 master + 500 slaves]
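The instance count above is one master plus the requested slaves. A minimal shell sketch of that arithmetic follows; the function name total_instances is hypothetical and is not part of the launch scripts.

```shell
#!/bin/sh
# total_instances is a hypothetical helper, not part of the hadoop-ec2 scripts.
# It prints the number of EC2 instances a cluster will run:
# one master node plus the requested number of slave nodes.
total_instances() {
  slaves="$1"
  echo $((1 + slaves))
}

total_instances 500   # prints 501
```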
Before launching, check the user-level authentication and security credentials. All of these details are specified in hadoop-ec2-env.sh.
All the scripts look for hadoop-ec2-env.sh.
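Since every script depends on hadoop-ec2-env.sh, it may help to verify the credentials before launching anything. The sketch below is one possible pre-flight check, not something the scripts themselves provide; check_ec2_env is a hypothetical name, and the three variables it inspects are the credential entries listed for hadoop-ec2-env.sh in these notes.

```shell
#!/bin/sh
# check_ec2_env is a hypothetical helper. It sources an env file and reports
# any of the three credential variables that are missing or empty.
check_ec2_env() {
  env_file="$1"
  # "." looks up a bare file name on PATH, so anchor it to the current directory.
  case "$env_file" in
    */*) ;;
    *) env_file="./$env_file" ;;
  esac
  if [ ! -f "$env_file" ]; then
    echo "missing env file: $env_file" >&2
    return 1
  fi
  # Use a subshell so the sourced variables do not leak into the caller.
  (
    # Ignore values inherited from the environment; only the file counts.
    unset AWS_ACCOUNT_ID AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
    . "$env_file"
    status=0
    for name in AWS_ACCOUNT_ID AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
      eval "value=\${$name:-}"
      if [ -z "$value" ]; then
        echo "$name is not set in $env_file" >&2
        status=1
      fi
    done
    exit "$status"
  )
}

# Example use (assumes the env file is in the current directory):
#   check_ec2_env hadoop-ec2-env.sh && launch-hadoop-cluster dhanaHadoop 500
```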
launch-hadoop-cluster --
|__launch-hadoop-master
|
|__launch-hadoop-slaves
launch-hadoop-master ec2-api-tools list
========================================
1.ec2-describe-instances
2.ec2-describe-group
3.ec2-add-group
4.ec2-authorize
5.ec2-describe-images -a
6.ec2-run-instances [the command that starts the instance]
Steps:
1.Get the details of the master instance, including its IP.
2.Run the instance with USER_DATA_FILE.
3.From the master instance, set up passwordless access across all the nodes by copying the private key to every node.
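The group-related tools in the list above (ec2-describe-group, ec2-add-group, ec2-authorize) could fit together roughly as follows. This is a sketch under assumptions, not the actual launch-hadoop-master code: the helper names run and ensure_master_group, the DRY_RUN switch, the group name dhanaHadoop-master, and the choice to open only the SSH port are all illustrative. With DRY_RUN=1 (the default here) the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch only: helper names, group name, and opened port are assumptions.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of calling EC2

# run prints the command in dry-run mode and executes it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# ensure_master_group creates the security group if it is not listed yet
# and allows SSH (port 22) into it.
ensure_master_group() {
  group="$1"
  # Only query EC2 for existing groups when not in dry-run mode.
  if [ "$DRY_RUN" != "1" ] &&
     ec2-describe-group | grep -q "[[:space:]]$group[[:space:]]"; then
    return 0
  fi
  run ec2-add-group "$group" -d "Group for Hadoop master"
  run ec2-authorize "$group" -p 22
}

ensure_master_group dhanaHadoop-master
```

Set DRY_RUN=0 only on a machine where the ec2-api-tools are installed and the credentials from hadoop-ec2-env.sh are exported.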
launch-hadoop-slaves ec2-api-tools list
========================================
echo "Adding $1 node(s) to cluster group $CLUSTER with AMI $AMI_IMAGE"
ec2-run-instances $AMI_IMAGE -n "$NO_INSTANCES" -g "$CLUSTER" -k "$KEY_NAME" -f "$bin"/$USER_DATA_FILE.slave -t "$INSTANCE_TYPE" -z "$MASTER_ZONE" $KERNEL_ARG | grep INSTANCE | awk '{print $2}'
1.Get variables from hadoop-ec2-env.sh, such as AWS_ACCOUNT_ID, KEY_NAME, and AWS_ACCESS_KEY_ID.
2.Get the master IP from the master instance.
3.Apply the same master IP to all the slave instances.
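The tail of the ec2-run-instances pipeline above (grep INSTANCE | awk '{print $2}') keeps only the instance IDs. The sketch below isolates that step so it can be tried without EC2 access; extract_instance_ids is a hypothetical name, and the sample lines are made up for illustration (real output has more columns).

```shell
#!/bin/sh
# extract_instance_ids reads ec2-run-instances style output on stdin and
# prints the second field of every line containing INSTANCE, exactly as the
# pipeline in the launch script does.
extract_instance_ids() {
  grep INSTANCE | awk '{print $2}'
}

# Made-up sample output, for illustration only.
printf 'RESERVATION\tr-00000000\t000000000000\tdhanaHadoop\nINSTANCE\ti-00000001\tami-00000000\tpending\nINSTANCE\ti-00000002\tami-00000000\tpending\n' |
  extract_instance_ids
# prints:
# i-00000001
# i-00000002
```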
hadoop-ec2-env.sh
=================
AWS_ACCOUNT_ID=
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
HADOOP_VERSION=0.19.0
JAVA_VERSION=1.6.0_07
INSTANCE_TYPE="m1.small"
m1 -- standard (general-purpose) instances
c1 -- CPU-intensive (High-CPU) instances
terminate-hadoop-cluster
========================
ec2-terminate-instances
How to get the master node IP
===========================
wget -q -O - http://169.254.169.254/latest/meta-data/local-hostname
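The wget call above queries the EC2 instance metadata service, which answers for the instance the command runs on, so it has to be run on the master itself. local-hostname returns the private DNS name; the local-ipv4 key returns the private IP address directly. A small sketch wrapping both, where the helper names metadata_url and fetch_metadata are hypothetical:

```shell
#!/bin/sh
METADATA_BASE="http://169.254.169.254/latest/meta-data"

# metadata_url is a hypothetical helper that builds the URL for one metadata key.
metadata_url() {
  echo "$METADATA_BASE/$1"
}

# fetch_metadata is a hypothetical helper; it only works from inside an
# EC2 instance, because the metadata address is not reachable elsewhere.
fetch_metadata() {
  wget -q -O - "$(metadata_url "$1")"
}

metadata_url local-hostname
# prints http://169.254.169.254/latest/meta-data/local-hostname

# On the master instance:
#   fetch_metadata local-hostname   # private DNS name
#   fetch_metadata local-ipv4       # private IP address
```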