Balaji Bal (ragingbal)
# Before you start, please set up an AWS CLI IAM profile for the AWS account you are using.
# Based on instructions from http://blog.melnicki.com/2014/03/20/Set-up-public-and-private-subnets-using-AWS-VPC/
# Design concepts explained here http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html
export AWS_DEFAULT_PROFILE='your_iam_profile_for_aws_cli'
export VPC_ID=$(aws ec2 create-vpc --cidr-block '10.0.0.0/16' --query 'Vpc.VpcId' --output text)
echo "VPC_ID=$VPC_ID" >> vpc-details.txt
echo "New VPC Created"
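A more robust alternative to text-munging the CLI's JSON output is to parse it as JSON. This is a minimal sketch against a hypothetical, trimmed-down `create-vpc` response (the real response contains many more fields):

```python
import json

def extract_vpc_id(create_vpc_json):
    """Pull the VpcId out of `aws ec2 create-vpc` JSON output."""
    return json.loads(create_vpc_json)["Vpc"]["VpcId"]

# Hypothetical, trimmed-down response for illustration:
sample = '{"Vpc": {"VpcId": "vpc-0abc1234", "CidrBlock": "10.0.0.0/16"}}'
print(extract_vpc_id(sample))  # → vpc-0abc1234
```

The same extraction is available natively in the CLI via `--query 'Vpc.VpcId' --output text`, which avoids a second process entirely.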
import cmd
import sys
import argparse
import ClassifierLib  # project-local library, not on PyPI

class ClassifierShell(cmd.Cmd):
    intro = 'Welcome to the Classifier shell. Type help or ? to list commands.\n'
    prompt = '(classifier) '
    file = open('history.txt', 'a')
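The snippet above builds on `cmd.Cmd` from the standard library. Since `ClassifierLib` is not included in the gist, here is a self-contained sketch of the same shell pattern with a hypothetical `echo` command:

```python
import cmd

class MiniShell(cmd.Cmd):
    intro = 'Welcome to the mini shell. Type help or ? to list commands.\n'
    prompt = '(mini) '

    def do_echo(self, arg):
        """Repeat the argument back: echo hello"""
        print(arg)

    def do_exit(self, arg):
        """Exit the shell."""
        return True  # returning a truthy value stops cmdloop()
```

Running `MiniShell().cmdloop()` starts the interactive loop; `onecmd('echo hello')` dispatches a single command, which is handy for scripting and testing.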
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
sudo apt-get install oracle-java8-set-default
#!/bin/bash
export JAVA_HOME=/usr/local/java
JAVA_MIRROR_DOWNLOAD=https://www.reucon.com/cdn/java/jdk-8u51-linux-x64.tar.gz
ELASTICSEARCH_MIRROR_DOWNLOAD=https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.1.1/elasticsearch-2.1.1.tar.gz
KIBANA_MIRROR_DOWNLOAD=https://download.elastic.co/kibana/kibana/kibana-4.3.1-linux-x86.tar.gz
JAVA_ARCHIVE=jdk-8u51-linux-x64.tar.gz
ELASTICSEARCH_ARCHIVE=elasticsearch-2.1.1.tar.gz
KIBANA_ARCHIVE=kibana-4.3.1-linux-x86.tar.gz
#!/bin/bash
export JAVA_HOME=/usr/local/java
export HADOOP_PREFIX=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive
HADOOP_ARCHIVE=hadoop-2.7.1.tar.gz
JAVA_ARCHIVE=jdk-7u51-linux-x64.tar.gz
MAHOUT_ARCHIVE=apache-mahout-distribution-0.11.1.tar.gz
HIVE_ARCHIVE=apache-hive-1.2.1-bin.tar.gz
HADOOP_MIRROR_DOWNLOAD=http://www.eu.apache.org/dist/hadoop/common/stable2/hadoop-2.7.1.tar.gz
HIVE_MIRROR_DOWNLOAD=http://www.eu.apache.org/dist/hive/stable/apache-hive-1.2.1-bin.tar.gz
import sys
import json
'''
Base data of the user.
CREATE TABLE imp_base (person_id string, name string, first_name string, last_name string, username string, country_code string, age string, email string, gender string, birthday string);
add file /vagrant/base_data.py;
INSERT OVERWRITE TABLE imp_base SELECT TRANSFORM(lines) USING 'python base_data.py' AS (person_id, name, first_name, last_name, username, country_code, age, email, gender, birthday) FROM u_data;
select * from imp_base limit 100;
'''
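The gist does not include `base_data.py` itself. Hive's TRANSFORM streams rows to the script's stdin and reads tab-separated columns back from stdout; assuming each input row holds one JSON record (consistent with the JSON source data used elsewhere in these gists), a sketch could look like:

```python
import json

# Output column order must match the AS (...) clause in the INSERT statement.
FIELDS = ["person_id", "name", "first_name", "last_name", "username",
          "country_code", "age", "email", "gender", "birthday"]

def transform_line(line):
    """Turn one JSON record into the tab-separated row Hive's TRANSFORM expects."""
    record = json.loads(line)
    return "\t".join(str(record.get(field, "")) for field in FIELDS)

# In the real script, the driver would stream stdin to stdout:
#   import sys
#   for line in sys.stdin:
#       if line.strip():
#           print(transform_line(line.strip()))
```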
#!/bin/bash
FILES=/usr/local/data/active_users_big/*.json
for f in $FILES
do
echo "Processing $f file..."
hadoop fs -appendToFile "$f" /user/data/sourcedata/active_users.json
done
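The same loop can be driven from Python, which keeps the command construction testable without a live cluster. The `hadoop fs -appendToFile` invocation mirrors the shell version above:

```python
import glob

def append_commands(pattern, hdfs_target):
    """Build one `hadoop fs -appendToFile` command per matching local file."""
    return [["hadoop", "fs", "-appendToFile", path, hdfs_target]
            for path in sorted(glob.glob(pattern))]

# Each command list can then be executed with subprocess.run(cmd, check=True);
# not run here, since it requires a live Hadoop installation.
```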
ragingbal / fabric_vagrant.py
Last active March 12, 2016 21:15
Snippet: manage a Vagrant VM using Fabric
from fabric.api import env, local, run

def vagrant():
    """Allow Fabric to manage a Vagrant VM/LXC container."""
    env.user = 'vagrant'
    # Parse `vagrant ssh-config` output into a key -> value dict
    v = dict(
        line.strip().split(None, 1)
        for line in local('vagrant ssh-config', capture=True).split('\n')
        if line.strip()
    )
    # Build a valid host entry
    env.hosts = ["%s:%s" % (v['HostName'], v['Port'])]
    # Use the Vagrant SSH key; ssh-config quotes paths containing spaces
    if v['IdentityFile'][0] == '"':
        env.key_filename = v['IdentityFile'][1:-1]
    else:
        env.key_filename = v['IdentityFile']
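The dict built from `vagrant ssh-config` can be exercised against canned output; the values below are illustrative, not from a real VM:

```python
def parse_ssh_config(text):
    """Parse `vagrant ssh-config`-style output into a key -> value dict."""
    return dict(
        line.strip().split(None, 1)  # split key from value on first whitespace
        for line in text.splitlines()
        if line.strip()
    )

# Canned sample for illustration:
SAMPLE = """\
Host default
  HostName 127.0.0.1
  User vagrant
  Port 2222
  IdentityFile "/home/user/.vagrant.d/insecure_private_key"
"""
```

Note that `split(None, 1)` keeps quoted paths with spaces intact, whereas a bare `split()` would break them into multiple tokens.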
ragingbal / fabexec.py
Created March 13, 2016 11:25
Execute a fabfile from another Python script
from fabric.context_managers import settings
from fabfile import deployFiles, deployConfiguration

with settings(host_string='[email protected]'):
    deployFiles()
    deployConfiguration('master', 7)
ragingbal / es_dsl_sample.py
Last active March 17, 2016 15:30
Some samples with the Elasticsearch DSL in Python
from datetime import datetime

from elasticsearch import Elasticsearch
from elasticsearch_dsl import DocType, String, Date, Nested, Boolean, analyzer
from elasticsearch_dsl import Search
from elasticsearch_dsl.connections import connections

class MerchantTerm(DocType):
    # Field definitions are cut off in the gist preview
    ...