@SwathiMystery
SwathiMystery / start watcher
Created April 15, 2013 16:29
Start the watcher script
Go to the Watcher repository
$ cd ~/Watcher
$ sudo python watcher.py start
When started, this creates the ~/.watcher directory, which contains
watcher.log.
SwathiMystery / clean_log.sh
Created April 15, 2013 16:23
clean log files
#!/bin/bash
##.........................................................##
## Clean watcher log                                       ##
##.........................................................##
sudo -s <<EOF
cat /dev/null>/home/ubuntu/.watcher/watcher.log
EOF
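The same truncation can be sketched as a small reusable function. This is a minimal sketch, assuming the script already runs with enough privileges to write the log (for example from root's crontab), so the `sudo -s` heredoc is not needed. The name `truncate_log` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# truncate_log (hypothetical helper): empty a log file in place without
# deleting it, so a process that holds the file open keeps writing to it.
truncate_log() {
    local log_file="$1"
    if [ ! -f "$log_file" ]; then
        echo "no such log file: $log_file" >&2
        return 1
    fi
    # ':' is the shell no-op; redirecting its (empty) output truncates the file.
    : > "$log_file"
}

# Example call, using the path from the script above:
# truncate_log /home/ubuntu/.watcher/watcher.log
```

Failing loudly when the file is missing makes a wrong path visible in cron mail instead of silently creating a new empty file.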
SwathiMystery / crontab
Last active December 16, 2015 06:08
Cron jobs to clean the log file and retry failed uploads
$ crontab -e
Add the following lines at the end and save.
# EVERY SATURDAY 8:00AM clean watcher log
0 8 * * 6 sudo sh /home/ubuntu/s3sync/clean_log.sh
# EVERY DAY at 10:00AM check failed uploads of the previous day
0 10 * * * sudo sh /home/ubuntu/s3sync/re-upload.sh
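If the entries need to be installed without opening an editor, one possible approach is to filter the existing crontab through a function that appends a line only when it is missing. This is a sketch, not part of the original notes; `ensure_cron_line` is a hypothetical name.

```shell
#!/bin/bash
# ensure_cron_line (hypothetical helper): read an existing crontab on stdin and
# print it back, appending the given entry only if it is not already present.
ensure_cron_line() {
    local entry="$1"
    local current
    current="$(cat)"
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -x matches whole lines, -F treats the entry (including '*') literally.
    if ! printf '%s\n' "$current" | grep -qxF -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Example (not run here): install the weekly clean-up entry non-interactively.
# crontab -l 2>/dev/null \
#   | ensure_cron_line '0 8 * * 6 sudo sh /home/ubuntu/s3sync/clean_log.sh' \
#   | crontab -
```

Running the pipeline twice leaves the crontab unchanged, which makes it safe to use in a provisioning script.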
SwathiMystery / re-upload.sh
Created April 15, 2013 16:04
Upload the failed files
Go to the s3sync directory.
$ cd s3sync
$ sudo vim re-upload.sh
#!/bin/bash
##.........................................................##
## script to detect failed uploads of other date directories
## and re-try ##
##.........................................................##
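The body of the script is not shown here, so how it detects failed uploads is unknown. As one illustrative piece, the previous day's date directory could be located as sketched below, assuming the /home/ubuntu/data/yyyy/mm/dd layout and GNU `date`. The name `previous_day_dir` is hypothetical.

```shell
#!/bin/bash
# previous_day_dir (hypothetical helper): print yesterday's date directory
# under a data root laid out as <root>/yyyy/mm/dd.
# Assumes GNU date (the -d option), as shipped on Ubuntu.
previous_day_dir() {
    local data_root="${1:-/home/ubuntu/data}"
    printf '%s/%s\n' "$data_root" "$(date -d 'yesterday' +%Y/%m/%d)"
}

# Example: list whatever is in yesterday's directory as retry candidates.
# ls "$(previous_day_dir)"
```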
SwathiMystery / monitor.sh
Created April 15, 2013 15:49
Monitor the directory /home/ubuntu/data/yyyy/mm/dd/ for new files and upload each one to S3
Go to the s3sync directory
$ cd ~/s3sync
$ sudo vim monitor.sh
#!/bin/bash
##...........................................................##
## script to upload to S3BUCKET, once the change is detected ##
##...........................................................##
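The upload logic itself is not shown in this excerpt. One step such a script might need is mapping a local file path to an object key that keeps the yyyy/mm/dd layout. The sketch below assumes that key layout; `s3_key_for` is a hypothetical name, and the actual upload command is left to the script.

```shell
#!/bin/bash
# s3_key_for (hypothetical helper): strip the local data root from a file path
# so the remaining yyyy/mm/dd/<file> part can be reused as the object key.
s3_key_for() {
    local file_path="$1"
    local data_root="${2:-/home/ubuntu/data}"
    case "$file_path" in
        "$data_root"/*) printf '%s\n' "${file_path#"$data_root"/}" ;;
        *) echo "not under $data_root: $file_path" >&2; return 1 ;;
    esac
}

# Example:
# s3_key_for /home/ubuntu/data/2013/04/15/report.csv
```

Rejecting paths outside the data root guards against uploading unrelated files if the watcher is ever pointed at the wrong directory.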
SwathiMystery / watcher-s3upload
Last active December 16, 2015 05:59
Using watcher.py for real-time S3 uploads
$ cd Watcher/
Start the script:
$ sudo python watcher.py start
This creates a .watcher directory in /home/ubuntu.
Now stop it:
$ sudo python watcher.py stop
Go to the newly created .watcher directory and edit jobs.yml:
set the directory to watch (watch:) and the action to run
when a change is detected (command:).
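For illustration, a job entry could be generated as sketched below. Only `watch:` and `command:` come from the notes above; the other keys (`label`, `events`, `recursive`) and the `$filename` placeholder are recalled from Watcher's sample jobs.yml and should be treated as assumptions to check against the file that watcher.py generates. `write_job` is a hypothetical name.

```shell
#!/bin/bash
# write_job (hypothetical helper): write a single-job jobs.yml that watches a
# directory and runs a command when a file is created.
# Keys other than watch: and command: are assumptions; verify them against
# the sample jobs.yml created in ~/.watcher.
write_job() {
    local jobs_file="$1" watch_dir="$2" command="$3"
    cat > "$jobs_file" <<EOF
job1:
  label: Upload new files to S3
  watch: $watch_dir
  events: ['create']
  recursive: true
  command: $command
EOF
}

# Example, assuming monitor.sh takes the changed file as its argument:
# write_job /home/ubuntu/.watcher/jobs.yml /home/ubuntu/data \
#   'sh /home/ubuntu/s3sync/monitor.sh $filename'
```

The command is passed in single quotes so that `$filename` reaches jobs.yml literally instead of being expanded by the shell.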
Go to https://github.com/greggoryhz/Watcher
Copy https://github.com/greggoryhz/Watcher.git to your clipboard
Install git if it is not already installed
Clone the Watcher repository
$ git clone https://github.com/greggoryhz/Watcher.git
$ cd Watcher/
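The install-and-clone steps above can be sketched as a guarded script that skips work already done. This assumes an apt-based system such as Ubuntu; `have_cmd` and `fetch_watcher` are hypothetical names.

```shell
#!/bin/bash
# have_cmd (hypothetical helper): succeed if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# fetch_watcher (hypothetical helper): install git if it is missing, then
# clone Watcher unless the target directory already exists.
fetch_watcher() {
    local target="${1:-Watcher}"
    have_cmd git || sudo apt-get install -y git
    [ -d "$target" ] || git clone https://github.com/greggoryhz/Watcher.git "$target"
}

# fetch_watcher    # uncomment to run; needs network access and sudo
```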
SwathiMystery / install-s3sync
Created April 15, 2013 14:16
Installation of s3sync
Install Ruby from the repository
$ sudo apt-get install ruby libopenssl-ruby
Confirm the installed version
$ ruby -v
Download and unzip s3sync
$ wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
$ tar -xvzf s3sync.tar.gz
Install the certificates.
SwathiMystery / Configure proxy for Namenode and JobTracker WebUI
Created February 14, 2013 11:12
How to run a proxy and view the web UI of the NameNode and JobTracker on EC2 instances in a Hadoop (CDH4) cluster.
Launch Firefox (version 3.0 or later).
Download the FoxyProxy extension from https://addons.mozilla.org/en-US/firefox/addon/2464.
Steps to configure and access the UI
Select Tools > FoxyProxy > Options
Click the “Add New Proxy” button.
Select “Manual Proxy Configuration”
Enter “localhost” for the “Host or IP Address” field.
Enter “6666” for the “Port” field.
Click on the “General” tab at the top of the dialog box.
Enter “EC2” for the “Proxy Name” field.
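These settings assume something is already listening on localhost:6666. The step that starts the proxy is not shown in this excerpt; one common way is SSH dynamic port forwarding, sketched below. The host name is a placeholder, and `socks_proxy_cmd` is a hypothetical name. The function only prints the command so it can be reviewed before running.

```shell
#!/bin/bash
# socks_proxy_cmd (hypothetical helper): print an ssh command that opens a
# SOCKS proxy on localhost for FoxyProxy to use. Nothing is executed here.
#   -N  do not run a remote command
#   -D  dynamic (SOCKS) port forwarding on the given local port
socks_proxy_cmd() {
    local remote="$1"
    local port="${2:-6666}"
    printf 'ssh -N -D %s %s\n' "$port" "$remote"
}

# Example with a placeholder host; replace it with your namenode's address:
# socks_proxy_cmd ubuntu@namenode.example.com
```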
## Cluster name goes here.
whirr.cluster-name=yarncluster
# Change the number of machines in the cluster here
whirr.instance-templates=1 hadoop-namenode+yarn-resourcemanager+mapreduce-historyserver,2 hadoop-datanode+yarn-nodemanager
# Install JAVA
whirr.java.install-function=install_openjdk
whirr.java.install-function=install_oab_java