Dhanasekaran Anbalagan bugcy013

# Add Cloudera RPM-GPG-KEY and repo
rpm --import http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
rpm -ivh http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
# note: if you want to install a specific version,
# modify /etc/yum.repos.d/cloudera-cdh4.repo accordingly.
# For example, if you want to install 4.2.1, use the following baseurl:
# baseurl=http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4.2.1/
# Install CDH4 httpfs Base
bugcy013 / setup
Created November 6, 2013 18:52 — forked from tariqmislam/setup
##########
# For verification, you can display the OS release.
##########
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=11.10
DISTRIB_CODENAME=oneiric
DISTRIB_DESCRIPTION="Ubuntu 11.10"
##########
sudo apt-get install oozie oozie-client
sudo apt-get install mysql-server-5.1
mysql -u root -p
create database oozie;
grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
exit
sudo vim /etc/oozie/conf/oozie-site.xml
This gist includes the components of an Oozie workflow: scripts/code, sample data,
and commands. Oozie actions covered: shell action, email action.
Action 1: The shell action executes a shell script that counts the lines in the files
matching a supplied glob and writes the line count to standard output.
Action 2: The email action emails the output of action 1.
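As a rough illustration of action 1, the line count could look something like the sketch below. This is not the gist's actual script: it assumes the files are on the local filesystem rather than in HDFS, and the function name `count_lines` and the output key `NumberOfLines` are hypothetical. The result is printed as `key=value` because, as far as I know, Oozie's `<capture-output/>` expects standard output in Java properties format.

```bash
#!/bin/bash
# Hypothetical helper (not from the original gist): sum the line counts of
# every file passed in. The caller's shell expands the glob before the call.
count_lines() {
  local total=0 n f
  for f in "$@"; do
    # Skip anything that is not a regular file, e.g. an unmatched glob pattern.
    [ -f "$f" ] || continue
    n=$(wc -l < "$f")
    total=$((total + n))
  done
  # key=value form, so a later action can read the value by name.
  echo "NumberOfLines=${total}"
}
```

A call such as `count_lines /var/log/*.log` prints one `NumberOfLines=<n>` line, which an email action could then reference in its body.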
Pictorial overview of job:
--------------------------
This gist includes HiveQL scripts that create an external partitioned table for
syslog-generated log files using the regex SerDe.
Use case: Count the number of occurrences of each logged process, by year, month,
day, and process.
Includes:
---------
Sample data and structure: 01-SampleDataAndStructure
Data download: 02-DataDownload
Data load commands: 03-DataLoadCommands
bugcy013 / doit
Created October 2, 2013 17:30 — forked from stantonk/doit
#!/bin/bash
# Source: http://toomuchdata.com/2012/06/25/how-to-install-python-2-7-3-on-centos-6-2/
yum groupinstall "Development tools"
yum install zlib-devel
yum install bzip2-devel openssl-devel ncurses-devel
wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tar.bz2
tar xf Python-2.7.3.tar.bz2
cd Python-2.7.3
#!/bin/bash
set -o nounset
set -o errexit
if [ $# -lt 1 ]; then
echo "Usage: $0 <User>@<Host>"
echo ""
echo " Copies your id_rsa.pub file to the remote host and adds it to the"
echo " authorized keys."
hbase(main):014:0> add_peer '1', 'localhost:2181:/hbase-2'
0 row(s) in 0.0580 seconds
hbase(main):015:0> start_replication
2011-02-11 18:04:58,347 INFO org.apache.hadoop.hbase.replication.ReplicationZookeeper: Replication is now started
0 row(s) in 0.0500 seconds
hbase(main):016:0> put 'test', '2011-02-11 18:05:22,003 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [10.0.0.57,60020,1297437319991]
2011-02-11 18:05:22,016 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for 10.0.0.57,60020,1297437319991
2011-02-11 18:05:22,016 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=10.0.0.57,60020,1297437319991 to dead servers, submitted shutdown handler to be executed, root=true, meta=true