John Benninghoff jbenninghoff

  • Ventura, CA, United States
@jbenninghoff
jbenninghoff / runSparkTeraSort.sh
Last active March 26, 2019 19:34
Spark TeraSort launch example
#!/bin/bash
# Generate 50G of TeraSort input data with TeraGen, submitted to YARN in client mode.
# The generated data is written to /user/$USER/spark-terasort.
/opt/mapr/spark/spark-1.2.1/bin/spark-submit --master yarn-client \
--class org.apache.spark.examples.terasort.TeraGen \
--name 'TeraGen' \
--conf 'mapreduce.terasort.num.partitions=5' \
--executor-cores 30 \
--executor-memory 7G \
--num-executors 9 \
terasort-project_2.10-1.0.jar 50G "/user/$USER/spark-terasort"
jbenninghoff / text
Last active August 29, 2015 14:16
clush2ansible-hosts
root@cent01 redhat 4C 02:43pm# sed -n '/^#/d;/@/d;s/\(^.*\): \(.*$\)/[\1]\n\2\n/p' /etc/clustershell/groups | sed '/\[.*\]/s/-/:/;/.* .*/s/ /\n/'
[all]
cent[01:05]
[zk]
cent[01:03]
[cldb]
cent01
cent02
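For readers who find the sed pipeline above hard to follow, here is a rough Python sketch of the same conversion. It is not part of the original gist. The function name clush_to_ansible is hypothetical, and it assumes each group line has the form "name: nodes" with nodes separated by whitespace, as in the output shown. Unlike the sed version, which splits only on the first space, this sketch puts every node on its own line, and it always emits a blank line between groups.

```python
import re


def clush_to_ansible(groups_text):
    """Convert ClusterShell 'name: nodes' group lines to Ansible INI inventory text.

    Hypothetical helper; assumes whitespace-separated nodes on each group line.
    """
    sections = []
    for line in groups_text.splitlines():
        # Skip comments and lines that reference other groups with '@',
        # mirroring the '/^#/d;/@/d' part of the sed command.
        if line.startswith("#") or "@" in line:
            continue
        name, sep, nodes = line.partition(": ")
        if not sep:
            continue  # not a 'name: nodes' line
        hosts = []
        for node in nodes.split():
            # ClusterShell ranges use '-', Ansible ranges use ':'
            # (cent[01-05] becomes cent[01:05]).
            hosts.append(re.sub(r"\[(\d+)-(\d+)\]", r"[\1:\2]", node))
        sections.append("[" + name + "]\n" + "\n".join(hosts) + "\n")
    # Separate groups with a blank line.
    return "\n".join(sections)


if __name__ == "__main__":
    sample = "# cluster groups\nall: cent[01-05]\nzk: cent[01-03]\ncldb: cent01 cent02\n"
    print(clush_to_ansible(sample), end="")
```

To use it on a real cluster, the text could be read from /etc/clustershell/groups and the result written to an Ansible hosts file.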
jbenninghoff / gist:13982fe5468c591c43df
Created January 22, 2015 20:51
Another wordcount in pig
hduser@master:~$ cat wordcount.pig
A = load '/user/jbenninghoff/somefile.txt';
B = foreach A generate flatten(TOKENIZE((chararray)$0)) as word;
C = filter B by word matches '\\w+';
D = group C by word;
E = foreach D generate COUNT(C), group;
store E into '/user/jbenninghoff/somefileWordcount';
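As a cross-check of what the Pig script computes, the same logic can be sketched in plain Python on a small sample. This is an illustration only, not part of the original gist. The function name wordcount is hypothetical. The sketch assumes Pig's default loader splits each line on tabs (so $0 is the first tab-separated field) and approximates TOKENIZE by splitting on whitespace, double quotes, commas, parentheses, and asterisks.

```python
import re
from collections import Counter


def wordcount(lines):
    """Count words in the first tab-separated field of each line (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        # $0 in the Pig script: the first field under the assumed tab delimiter.
        first_field = line.rstrip("\n").split("\t")[0]
        # Approximation of TOKENIZE's default delimiters.
        for token in re.split(r'[\s",()*]+', first_field):
            # Pig's 'matches' requires the whole string to match, hence fullmatch.
            # An empty token never matches \w+, so it is dropped here too.
            if re.fullmatch(r"\w+", token, flags=re.ASCII):
                counts[token] += 1
    return counts


if __name__ == "__main__":
    sample = ["the cat, the dog\tignored", "it's the end"]
    for word, count in sorted(wordcount(sample).items()):
        # Same column order as the Pig output: count first, then the word.
        print(count, word, sep="\t")
```

Note that tokens containing non-word characters, such as "it's", are filtered out entirely, just as the filter step in the Pig script drops them.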
jbenninghoff / 0_reuse_code.js
Created July 31, 2014 06:49
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console