Bing Chiang (chiangbing)

chiangbing / hadoop_streaming_cmd
Created July 4, 2012 02:54
Basic Hadoop streaming command.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming*.jar \
-input /path/to/input/data/dir \
-output /path/to/output/result/dir \
-mapper mapcmd \
-reducer reducecmd
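For example, the classic cat/wc smoke test from the Hadoop streaming documentation (the input and output paths are placeholders):
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming*.jar \
  -input /user/me/input \
  -output /user/me/output \
  -mapper /bin/cat \
  -reducer /usr/bin/wc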
chiangbing / get_bash_var.py
Last active December 18, 2015 11:59
Read variables from a bash script by sourcing the file and echoing each variable.
import subprocess

def get_var(srcfile, *variables):
    """Get variables' values from bash shell source file.
    Return a dictionary that maps from variable name to its value.
    """
    cmd = ". " + srcfile + ";"
    for var in variables:
        cmd += "echo ${" + var + "};"
    proc = subprocess.Popen([cmd], stdout=subprocess.PIPE, shell=True)
    # assumed completion (the gist preview cuts off at the line above):
    # each echo prints one line, in the same order as the requested variables
    out, _ = proc.communicate()
    return dict(zip(variables, out.decode().splitlines()))
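A quick way to try it from the shell (the file and variable names below are made up for illustration and assume the snippet is saved as get_bash_var.py):
cat > env.sh <<'EOF'
JAVA_HOME=/usr/java/default
HADOOP_HEAPSIZE=1024
EOF
python -c "from get_bash_var import get_var; print(get_var('env.sh', 'JAVA_HOME', 'HADOOP_HEAPSIZE'))"
# expected to print a dict like {'JAVA_HOME': '/usr/java/default', 'HADOOP_HEAPSIZE': '1024'}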
chiangbing / battery.lua
Last active December 19, 2015 00:18
A battery widget for the awesome window manager.
-- {{{ Battery bar
mybatterybar = awful.widget.progressbar()
mybatterybar:set_border_color(theme.border_normal)
mybatterybar:set_background_color(theme.bg_normal)
mybatterybar:set_color(theme.bg_focus)
mybatterybar:set_width(50)
mytimer = timer({ timeout = 30 })
mytimer:connect_signal("timeout", function()
    local f = io.popen('acpi -b', 'r')  -- mode should be the string 'r'
    -- assumed continuation (preview truncated): read the charge percentage
    local out = f:read('*a')
    f:close()
    mybatterybar:set_value((tonumber(out:match('(%d+)%%')) or 0) / 100)
end)
mytimer:start()
-- }}}
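The widget parses the output of acpi -b, which typically looks like this (numbers are illustrative):
$ acpi -b
Battery 0: Discharging, 76%, 02:11:33 remaining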
chiangbing / minhash1.py
Last active December 23, 2015 15:39
A code snippet that solves Exercise 3.3.1(b) of the book *Mining of Massive Datasets*.
# A code snippet that solves Exercise 3.3.1(b) of *Mining of Massive Datasets*.
from functools import reduce  # works on Python 2.6+; required on Python 3

def permute(items):
    """Iterate all permutations of a list of items."""
    length = len(items)
    # number of permutations of `length` items is length!
    iternum = reduce(lambda x, y: x * y, range(1, length + 1))
    for i in range(iternum):
        # fac_base_digits (defined later in the gist, not shown in this preview)
        # presumably maps i to its factorial-base digits, i.e. a Lehmer code
        digs = fac_base_digits(i)
chiangbing / centos_sunjdk_alternatives.sh
Last active December 28, 2015 15:59
Run after the JDK is installed so that it becomes the default Java version.
JDK_VERSION=1.7.0_45
sudo alternatives --install /usr/bin/java java /usr/java/jdk${JDK_VERSION}/jre/bin/java 200000
sudo alternatives --install /usr/bin/javaws javaws /usr/java/jdk${JDK_VERSION}/jre/bin/javaws 200000
sudo alternatives --install /usr/lib/mozilla/plugins/libjavaplugin.so libjavaplugin.so /usr/java/jdk${JDK_VERSION}/jre/lib/i386/libnpjp2.so 200000
sudo alternatives --install /usr/lib64/mozilla/plugins/libjavaplugin.so libjavaplugin.so.x86_64 /usr/java/jdk${JDK_VERSION}/jre/lib/amd64/libnpjp2.so 200000
sudo alternatives --install /usr/bin/javac javac /usr/java/jdk${JDK_VERSION}/bin/javac 200000
sudo alternatives --install /usr/bin/jar jar /usr/java/jdk${JDK_VERSION}/bin/jar 200000
sudo alternatives --install /usr/lib/jvm/java-1.7.0 java_sdk_1.7.0 /usr/java/jdk${JDK_VERSION} 200000
sudo alternatives --install /usr/lib/jvm/jre-1.7.0 jre_1.7.0 /usr/java/jdk${JDK_VERSION}/jre 200000
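Once registered, a typical follow-up is to make the new JDK the active alternative and verify it (same JDK_VERSION as above):
sudo alternatives --set java /usr/java/jdk1.7.0_45/jre/bin/java
java -version    # should now report 1.7.0_45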
chiangbing / enable-hadoop.sh
Last active December 28, 2015 15:59
Start/enable Hadoop (CDH) services.
sudo chkconfig hadoop-hdfs-namenode on
sudo chkconfig hadoop-hdfs-datanode on
sudo chkconfig hadoop-yarn-resourcemanager on
sudo chkconfig hadoop-yarn-nodemanager on
sudo chkconfig hadoop-mapreduce-historyserver on
sudo -u hdfs hadoop fs -mkdir /solr
sudo -u hdfs hadoop fs -chown solr /solr
solrctl init
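Note that chkconfig only arranges the services to start at boot; to bring them up immediately on a running node, a typical follow-up (assuming the standard CDH init scripts) is:
sudo service hadoop-hdfs-namenode start
sudo service hadoop-hdfs-datanode start
sudo service hadoop-yarn-resourcemanager start
sudo service hadoop-yarn-nodemanager start
sudo service hadoop-mapreduce-historyserver start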
chiangbing / install_storm.sh
Last active December 30, 2015 18:49
Storm installation via SaltStack.
sudo salt '*' cp.get_file salt://storm-0.9.0.1.tar.gz /tmp/storm-0.9.0.1.tar.gz
sudo salt '*' cmd.retcode 'tar zxf /tmp/storm-0.9.0.1.tar.gz -C /home/hadoop/apps/' 'runas=hadoop'
sudo salt '*' cp.get_file salt://storm.yaml /home/hadoop/apps/storm-0.9.0.1/conf/storm.yaml
sudo salt '*' file.chown /home/hadoop/apps/storm-0.9.0.1/conf/storm.yaml hadoop hadoop
sudo salt '*' file.mkdir /var/lib/storm
sudo salt '*' file.chown /var/lib/storm hadoop hadoop
# install storm native libraries
sudo salt '*' cp.get_file salt://zeromq-2.1.7.tar.gz /tmp/zeromq-2.1.7.tar.gz
sudo salt '*' cmd.retcode 'tar zxf /tmp/zeromq-2.1.7.tar.gz -C /tmp'
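The preview stops after extracting zeromq; a plausible continuation, assuming the usual autotools build on the same Salt targets (not shown in the original), would be:
sudo salt '*' cmd.retcode './configure' cwd=/tmp/zeromq-2.1.7
sudo salt '*' cmd.retcode 'make' cwd=/tmp/zeromq-2.1.7
sudo salt '*' cmd.retcode 'make install' cwd=/tmp/zeromq-2.1.7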
chiangbing / hugepage_config.sh
Created December 20, 2013 02:29
An example of configuring huge pages on CentOS (from willis).
tlbuser=clay
totalgb=$(( 32*1024*1024*1024 ))
pgsize=$(( 2048*1024 ))
sysctl=/etc/sysctl.conf
limits=/etc/security/limits.conf
configured=`grep kernel.shmmax $sysctl`
if [ -n "$configured" ]; then
    echo 'hugepage is configured'
    exit
else
    # configure hadoop
    wget http://public-repo-1.hortonworks.com/HDP/tools/2.0.6.0/hdp_manual_install_rpm_helper_files-2.0.6.76.tar.gz
    tar xf hdp_manual_install_rpm_helper_files-2.0.6.76.tar.gz
    (TODO ...)
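The preview ends at the TODO; for reference, a minimal sketch of the huge-page settings such a script usually appends (using the variables defined above; pgcount is a helper introduced here, and the values are illustrative) is:
pgcount=$(( totalgb / pgsize ))
echo "kernel.shmmax = $totalgb"                  >> $sysctl
echo "vm.nr_hugepages = $pgcount"                >> $sysctl
echo "vm.hugetlb_shm_group = $(id -g $tlbuser)"  >> $sysctl
echo "$tlbuser soft memlock unlimited"           >> $limits
echo "$tlbuser hard memlock unlimited"           >> $limits
sysctl -p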