thanooj kalathuru thanoojgithub

@thanoojgithub
thanoojgithub / row_number() ranking in Hive - DE-DUPLICATION
Last active March 4, 2016 07:17
row_number() ranking in Hive - DE-DUPLICATION
ubuntu@ubuntu:~$ jps
3330 Jps
ubuntu@ubuntu:~$ start-dfs.sh
16/03/03 21:47:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/ubuntu/softwares/hadoop-2.7.2/logs/hadoop-ubuntu-namenode-ubuntu.out
localhost: starting datanode, logging to /home/ubuntu/softwares/hadoop-2.7.2/logs/hadoop-ubuntu-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/ubuntu/softwares/hadoop-2.7.2/logs/hadoop-ubuntu-secondarynamenode-ubuntu.out
16/03/03 21:47:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
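The preview above stops at the Hadoop start-up log, before any query appears. As a rough sketch of the de-duplication pattern the title names, the snippet below runs a ROW_NUMBER() OVER (PARTITION BY ...) query against an in-memory SQLite database from Python. It is not the author's query: the table and rows are borrowed from the cust_credit listing in the collect_set entry, the choice of partition columns (name, doa, location) is an assumption, and SQLite stands in for Hive, so details of HiveQL syntax may differ. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Rows copied from the cust_credit listing; ids 1 and 7 differ only in id.
rows = [
    (1, "sriram", "2015-10-12", "ayodhya"),
    (2, "seeta", "2015-09-12", "midhila"),
    (3, "lakshman", "2015-11-12", "ayodhya"),
    (4, "bharata", "2015-12-12", "ayodhya"),
    (5, "sathrugna", "2015-12-12", "ayodhya"),
    (6, "hanuma", "2015-11-12", "ayodhya"),
    (7, "sriram", "2015-10-12", "ayodhya"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust_credit (id INTEGER, name TEXT, doa TEXT, location TEXT)"
)
conn.executemany("INSERT INTO cust_credit VALUES (?, ?, ?, ?)", rows)

# Number the rows inside each group of identical (name, doa, location)
# values, lowest id first, then keep only the first row of every group.
dedup_sql = """
SELECT id, name, doa, location
FROM (
    SELECT c.*,
           ROW_NUMBER() OVER (
               PARTITION BY name, doa, location
               ORDER BY id
           ) AS rn
    FROM cust_credit c
) ranked
WHERE rn = 1
ORDER BY id
"""

deduped = conn.execute(dedup_sql).fetchall()
for row in deduped:
    print(row)  # ids 1 to 6; id 7 is dropped as a duplicate of id 1
```

Ordering by id inside the partition makes the surviving row deterministic; ordering by a timestamp column instead would keep the earliest or latest record.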
thanoojgithub / collect_set_collect_list in Hive.sql
Last active August 9, 2018 12:01
collect_set collect_list in Hive
hive> select * from thanooj.cust_credit;
OK
cust_credit.id cust_credit.name cust_credit.doa cust_credit.location
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
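The preview shows only the table, not the aggregation itself. As a hedged illustration of the difference between the two functions, the plain-Python sketch below groups the rows above by location and builds both a list that keeps duplicates (the collect_list behaviour) and one that drops them (the collect_set behaviour). Grouping by location is an assumption made for the example, and the element order produced here is not something Hive guarantees.

```python
from collections import defaultdict

# Rows copied from the cust_credit listing above.
rows = [
    (1, "sriram", "2015-10-12", "ayodhya"),
    (2, "seeta", "2015-09-12", "midhila"),
    (3, "lakshman", "2015-11-12", "ayodhya"),
    (4, "bharata", "2015-12-12", "ayodhya"),
    (5, "sathrugna", "2015-12-12", "ayodhya"),
    (6, "hanuma", "2015-11-12", "ayodhya"),
    (7, "sriram", "2015-10-12", "ayodhya"),
]

# Group the name column by location (the GROUP BY key assumed here).
names_by_location = defaultdict(list)
for _id, name, _doa, location in rows:
    names_by_location[location].append(name)

# collect_list keeps every value, duplicates included.
collect_list = {loc: list(names) for loc, names in names_by_location.items()}

# collect_set keeps each distinct value once; dict.fromkeys removes
# duplicates while preserving first-seen order.
collect_set = {
    loc: list(dict.fromkeys(names)) for loc, names in names_by_location.items()
}

for loc in collect_list:
    print(loc, collect_list[loc], collect_set[loc])
```

For ayodhya the list version contains sriram twice (ids 1 and 7), while the set version contains it once.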
thanoojgithub / case when in Hive.sql
Last active May 29, 2018 08:01
case when in Hive
hive> set hive.cli.print.header=true;
hive> select location,(case when doa is not null then concat(location,',',doa) when doa is null then location end ) as location_doa from thanooj.cust_credit;
OK
location location_doa
ayodhya ayodhya,2015-10-12
midhila midhila,2015-09-12
ayodhya ayodhya,2015-11-12
ayodhya ayodhya,2015-12-12
ayodhya ayodhya,2015-12-12
ayodhya ayodhya,2015-11-12
thanoojgithub / PlayJavaOne
Last active March 10, 2016 14:58
Play-Java start up
D:\thanooj\installers\typesafe-activator-1.3.7-minimal\activator-1.3.7-minimal>set PATH=%PATH%;D:\thanooj\installers\typesafe-activator-1.3.7-minimal\activator-1.3.7-minimal
D:\thanooj\installers\typesafe-activator-1.3.7-minimal\activator-1.3.7-minimal>activator new
Getting com.typesafe.activator activator-launcher 1.3.7 ...
downloading https://repo.typesafe.com/typesafe/ivy-releases/com.typesafe.activator/activator-launcher/1.3.7/jars/activator-launcher.jar ...
[SUCCESSFUL ] com.typesafe.activator#activator-launcher;1.3.7!activator-launcher.jar (9731ms)
downloading https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.11.7/scala-library-2.11.7.jar ...
[SUCCESSFUL ] org.scala-lang#scala-library;2.11.7!scala-library.jar (21344ms)
downloading https://repo.typesafe.com/typesafe/ivy-releases/com.typesafe.activator/activator-props/1.3.7/jars/activator-props.jar ...
ubuntu@ubuntu:~/softwares/hadoop-2.7.2$ find . -type f | xargs grep "8088"
ubuntu@ubuntu:~/softwares/hadoop-2.7.2$ find -name '*.*' | xargs grep -i '8088'
thanoojgithub / File_Formats_Apache_HIVE.sql
Last active September 26, 2021 04:13
File Formats in Apache HIVE
[mapr@maprdemo work]$ cat /home/mapr/work/cust_credit_ext.txt
1,sriram,2015-10-12,ayodhya
2,seeta,2015-09-12,midhila
3,lakshman,2015-11-12,ayodhya
4,bharata,2015-12-12,ayodhya
5,sathrugna,2015-12-12,ayodhya
6,hanuma,2015-11-12,ayodhya
7,sriram,2015-10-12,ayodhya
8,seeta,2015-09-12,midhila
9,lakshman,2015-11-12,ayodhya
thanoojgithub / Issues in installation and configuration Hadoop
Last active December 13, 2019 18:25
Issues in installation and configuration Hadoop
1. connect to host localhost port 22: Connection refused
Stopping namenodes on [localhost]
localhost: ssh: connect to host localhost port 22: Connection refused
localhost: ssh: connect to host localhost port 22: Connection refused
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: ssh: connect to host 0.0.0.0 port 22: Connection refused
Fix:
thanooj@ubuntu:/$ sudo apt-get install openssh-server
Reading package lists... Done
1. Install JAVA
2.
thanooj@thanooj-Inspiron-3521:~$ sudo addgroup hadoop
Adding group `hadoop' (GID 1001) ...
Done.
thanooj@thanooj-Inspiron-3521:~$ sudo adduser --ingroup hadoop hadoopuser
Adding user `hadoopuser' ...
Adding new user `hadoopuser' (1001) with group `hadoop' ...
thanoojgithub / HadoopConfigureFilesChanges
Created February 11, 2017 14:36
Hadoop configure files changes
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export SCALA_HOME=/home/thanooj/Scala/scala-2.12.1
export PATH=$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin
export HADOOP_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export HADOOP_MAPRED_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export HADOOP_COMMON_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export HADOOP_HDFS_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export YARN_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export HADOOP_CONF_DIR=/home/thanooj/bigdata/hadoop-2.7.3/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=/home/thanooj/bigdata/hadoop-2.7.3/lib/native
thanoojgithub / spark_notes.sh
Created February 12, 2017 20:30
apache spark with scala
Apache Spark is a general-purpose computation/execution engine.
It uses RDDs (Resilient Distributed Datasets), which are resilient: a lost partition is recomputed from its lineage, starting from the underlying storage such as HDFS.
A transformation produces a new RDD from an existing one; RDDs are immutable, which keeps them consistent.
Evaluation is lazy: nothing is computed until an action is called.
Benefits:
Fault recovery using lineage
Optimized for in-memory computation - work is scheduled optimally using a directed acyclic graph (DAG)
Easy programming - apply transformations to RDDs, then trigger the computation by calling an action
Rich library support - MLlib (machine learning), GraphX, DataFrames, covering both batch and streaming
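The ideas in these notes (immutable datasets, transformations that only record lineage, and actions that trigger the work) can be illustrated with a toy class in plain Python. This is not the Spark API: ToyRDD, read_numbers and the calls list are hypothetical names invented for the sketch, and real Spark distributes partitions across a cluster rather than replaying a Python list.

```python
class ToyRDD:
    """Toy stand-in for an RDD: immutable, lazy, rebuilt from its lineage."""

    def __init__(self, source, lineage=()):
        self._source = source            # callable that re-reads the base data
        self._lineage = tuple(lineage)   # recorded transformations, never mutated

    def map(self, fn):
        # Transformation: returns a NEW ToyRDD and only records the step.
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, pred):
        # Transformation: also lazy, nothing is read or computed here.
        return ToyRDD(self._source, self._lineage + (("filter", pred),))

    def collect(self):
        # Action: re-read the source and replay the lineage step by step.
        # Because the full recipe is kept, a lost result can always be
        # recomputed the same way.
        data = iter(self._source())
        for kind, fn in self._lineage:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)


calls = []  # records every time the base data is actually read


def read_numbers():
    calls.append("read")
    return [1, 2, 3, 4, 5]


base = ToyRDD(read_numbers)
evens_squared = base.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
print(calls)    # [] : the transformations have not touched the data yet

result = evens_squared.collect()
print(result)   # [4, 16]
print(calls)    # ['read'] : the action triggered exactly one read
```

The base object is unchanged by filter and map, so calling base.collect() afterwards still returns the original five numbers.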