@jeongho
Created February 25, 2016 17:43
Using the CLI to access the cluster from your own host
#!/usr/bin/env bash
# Using the CLI to access the cluster from your own host
# Step 1. Set up your Hadoop config
# In the Cloudera Manager UI, go to Services > All Services > Client Configuration URLs,
# download the client configuration and unpack it into the directory used as HADOOP_CONF_DIR.
# Step 2. Download CDH4 and set up your environment
# 1. Point your browser at CDH Tarballs
# 2. Click on CDH4 tarballs and download hadoop-2-x
# 3. Add the exports below to your shell environment (~/.bash_profile is a good bet)
export HADOOP_BASE=~/path/to/cdh/dir
export HADOOP_HOME=${HADOOP_BASE}/share/hadoop/mapreduce1
export PATH=${PATH}:${HADOOP_BASE}/bin-mapreduce1
export HADOOP_CONF_DIR=~/hadoop-conf/dev
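# Optional pre-flight check (a sketch, not part of the original steps): confirm
# that the client configuration from Step 1 is present in HADOOP_CONF_DIR before
# talking to the cluster. check_hadoop_conf is a hypothetical helper, and the two
# file names are an assumption about what the downloaded client configuration
# contains; adjust the list to match your archive.
check_hadoop_conf() {
  local conf_dir="$1"
  local f
  for f in core-site.xml hdfs-site.xml; do
    if [ ! -f "${conf_dir}/${f}" ]; then
      echo "missing ${conf_dir}/${f}" >&2
      return 1
    fi
  done
}
check_hadoop_conf "${HADOOP_CONF_DIR:-}" || echo "warning: HADOOP_CONF_DIR looks incomplete" >&2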
# Sanity check: list the root of the cluster's HDFS
hadoop fs -ls /
# Step 3. Test with MapReduce
# The "-" source tells -put to read the file contents from stdin
echo "the cat on the mat" | hadoop fs -put - input.txt
# The job fails if the "output" directory already exists, so remove it before re-running
hadoop jar ${HADOOP_BASE}/share/hadoop/mapreduce1/hadoop-examples-*.jar wordcount input.txt output
hadoop fs -cat "output/part*"
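# For comparison, a local sketch of the same word count using standard Unix tools.
# local_wordcount is a hypothetical helper, not part of Hadoop. It prints
# "word<TAB>count" lines sorted by word, which is the shape the wordcount
# example is expected to produce, so the two outputs can be compared by eye.
local_wordcount() {
  tr -s '[:space:]' '\n' | sed '/^$/d' | LC_ALL=C sort | uniq -c | awk '{ print $2 "\t" $1 }'
}
echo "the cat on the mat" | local_wordcount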