thanooj kalathuru (thanoojgithub)
@thanoojgithub
thanoojgithub / pysparkstreamingusingnc.py
Created November 27, 2020 07:49
pyspark streaming using netcat as a socket text source
hduser@thanoojubuntu-Inspiron-3521:~$ nc -lk 9999
helllo word hello python hello spark hello pyspark hellow streaming pyspark
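The preview above shows only the netcat side of the gist. The per-batch word count such a streaming job typically computes can be sketched in plain Python; this mirrors the usual flatMap/map/reduceByKey chain (in actual PySpark the input would come from something like `ssc.socketTextStream("localhost", 9999)` — names and structure here are illustrative, not the gist's code):

```python
from collections import Counter

def word_counts(batch_lines):
    """Count words across one micro-batch of text lines,
    mirroring the classic flatMap -> map -> reduceByKey word count."""
    words = (w for line in batch_lines for w in line.split())
    return Counter(words)

# One simulated micro-batch, as typed into `nc -lk 9999` above.
batch = ["helllo word hello python hello spark hello pyspark"]
counts = word_counts(batch)
```

In PySpark the same logic would be applied to every batch of the DStream rather than to a Python list.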
thanoojgithub / AutoGenClassUtil.java
Created November 24, 2020 18:45
Basic Auto Gen Class Util
package com.autogenclass;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
thanoojgithub / gist:528cbcc042998068add7261dfec93124
Created November 20, 2020 17:07
How to run HiveServer2 (Hive 2.3.3) on Ubuntu 20.04
Pre-requisites:
1. Hadoop and Hive are installed and well configured
2. The hive CLI works as expected
Then we can try running HiveServer2:
hduser@thanoojubuntu-Inspiron-3521:~/softwares/apache-hive-2.3.3-bin/conf$ hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console
2020-11-20 21:25:32: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hduser/softwares/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
thanoojgithub / SparkSQLOne
Created November 4, 2020 05:58
Spark SQL notes
start-dfs.sh
start-yarn.sh
jps
sudo mkdir /tmp/spark-events
sudo chown -R hduser:hadoop /tmp/spark-events
hduser@thanoojubuntu-Inspiron-3521: start-master.sh
hduser@thanoojubuntu-Inspiron-3521: start-slave.sh spark://thanoojubuntu-Inspiron-3521:7077
starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark-2.4.6-bin-hadoop2.7/logs/spark-hduser-org.apache.spark.deploy.worker.Worker-1-thanoojubuntu-Inspiron-3521.out
hduser@thanoojubuntu-Inspiron-3521:/tmp$ spark-shell --master spark://thanoojubuntu-Inspiron-3521:7077
thanoojgithub / MySQL_Notes.sql
Last active April 22, 2022 09:47
MySQL Notes
# MySQL installation in WSL2 ubuntu
# How to access mysql with the default password in Ubuntu 20.04:
--------------------------------------------------------
sudo apt update
sudo apt upgrade
sudo apt install mysql-server
sudo apt install mysql-client
mysql --version
sudo usermod -d /var/lib/mysql/ mysql
thanoojgithub / Spring Boot notes
Created September 19, 2020 11:41
Spring Boot notes
Spring Boot is an open-source, Java-based framework used to develop stand-alone, production-grade Spring applications that you can just run.
thanoojgithub / How to find java.home
Last active August 9, 2020 07:06
How to find java.home
For Linux and macOS, let's use grep:
java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home'
And for Windows, let's use findstr:
java -XshowSettings:properties -version 2>&1 | findstr "java.home"
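The grep/findstr filtering step can also be done portably in a few lines of Python. This sketch only parses the `-XshowSettings:properties` output format; a hard-coded sample string stands in for actually invoking `java`, and the path in it is illustrative:

```python
def find_java_home(settings_output):
    """Extract the java.home value from `java -XshowSettings:properties` output."""
    for line in settings_output.splitlines():
        line = line.strip()
        if line.startswith("java.home"):
            # Lines look like: "java.home = /usr/lib/jvm/java-11-openjdk-amd64"
            return line.split("=", 1)[1].strip()
    return None

sample = """
    java.class.version = 55.0
    java.home = /usr/lib/jvm/java-11-openjdk-amd64
    java.io.tmpdir = /tmp
"""
home = find_java_home(sample)
```

To use it for real, feed it the stderr of `java -XshowSettings:properties -version` (e.g. via `subprocess.run(..., capture_output=True)`).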
----------------------------------------------------------------------------------------------------
thanoojgithub / docker-mysql-standalone
Last active June 20, 2020 07:29
docker-mysql connecting using mysql workbench
PS C:\Users\thanooj> docker pull mysql
PS C:\Users\thanooj> docker images
REPOSITORY                       TAG      IMAGE ID       CREATED        SIZE
springio/gs-spring-boot-docker   latest   c8778cb72ef5   5 days ago     527MB
openjdk                          8        b190ad78b520   10 days ago    510MB
mysql                            latest   be0dbf01a0f3   11 days ago    541MB
hello-world                      latest   bf756fb1ae65   5 months ago   13.3kB
PS C:\Users\thanooj>
PS C:\Users\thanooj> docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
Spark SQL is a Spark module for structured data processing. Unlike the basic Spark RDD API, the interfaces provided by Spark SQL provide Spark with more information about the structure of both the data and the computation being performed. Internally, Spark SQL uses this extra information to perform extra optimizations. There are several ways to interact with Spark SQL including SQL and the Dataset API. When computing a result the same execution engine is used, independent of which API/language you are using to express the computation. This unification means that developers can easily switch back and forth between different APIs based on which provides the most natural way to express a given transformation.
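As an analogy for that unification (using the Python stdlib, not Spark itself): the same query expressed as SQL and as a programmatic API call over the same data should yield the same answer. Here sqlite3 plays the SQL side and a comprehension plays the "Dataset-style" side; the data and names are made up for illustration:

```python
import sqlite3

rows = [("alice", 34), ("bob", 36), ("carol", 30)]

# SQL path: declare the query as text and let the engine run it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
sql_result = sorted(conn.execute(
    "SELECT name FROM people WHERE age > 32").fetchall())

# API path: the same filter and projection expressed programmatically.
api_result = sorted((name,) for name, age in rows if age > 32)

assert sql_result == api_result  # same logical query, same answer
```

In Spark the two paths would be `spark.sql("SELECT name FROM people WHERE age > 32")` and roughly `df.filter(df.age > 32).select("name")`, and, as the paragraph above notes, both compile to the same execution engine.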
thanoojgithub / HadoopHiveSparkHBase
Last active February 11, 2020 07:42
Hadoop Hive Spark configuration on Ubuntu 16.04
sudo apt-get install ssh
sudo apt-get install rsync
sudo apt install openssh-client
sudo apt install openssh-server
ssh localhost
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys