@kumar-de
kumar-de / kill_port_process.sh
Created May 3, 2018 13:25 — forked from marcosvidolin/kill_port_process.sh
Linux: How to list and kill a process using a port
# list all listening ports and the processes using them
sudo netstat -ntlp | grep LISTEN
# filter for a specific port, e.g. 8080
sudo netstat -ntlp | grep :8080
# the output looks something like this:
tcp 0 0 0.0.0.0:27370 0.0.0.0:* LISTEN 4394/skype
tcp 0 0 127.0.1.1:53 0.0.0.0:* LISTEN 2216/dnsmasq
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 4912/cupsd
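The last column of the output above is `PID/program`. A minimal sketch of extracting that PID and killing the process (the sample line is taken from the output above):

```shell
# take one line of netstat output (sample from above)
line="tcp 0 0 0.0.0.0:27370 0.0.0.0:* LISTEN 4394/skype"
# field 7 is "PID/program"; strip everything after the slash
pid=$(echo "$line" | awk '{print $7}' | cut -d/ -f1)
echo "$pid"
# then terminate it (try a plain kill before resorting to -9):
# sudo kill "$pid"
```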
@kumar-de
kumar-de / install_scala_REPL.md
Last active April 19, 2020 21:01
Hassle-free installation of scala-REPL on Ubuntu 18.04 LTS. This hasn't been tested on other versions. #scala #deb #freeze #hold

Remove any existing version:

sudo apt-get remove scala-library scala

Download the .deb for the version you want (e.g. 2.11.8; adjust the URL for another version such as 2.11.12):

sudo wget www.scala-lang.org/files/archive/scala-2.11.8.deb
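The #freeze and #hold tags in the description suggest the remaining steps are installing the .deb and pinning the package so apt does not upgrade it. A sketch, assuming the .deb downloaded above:

```shell
VER="2.11.8"              # change to the version you downloaded
DEB="scala-${VER}.deb"
echo "$DEB"
# install the package, then pull in any missing dependencies:
# sudo dpkg -i "$DEB"
# sudo apt-get install -f
# freeze the version so apt-get upgrade leaves it alone:
# sudo apt-mark hold scala
```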
@kumar-de
kumar-de / OpenVPNonUbuntu.md
Last active August 20, 2019 21:47
Install openvpn, its network-manager and network-manager-gnome

sudo apt-get install openvpn network-manager-openvpn network-manager-openvpn-gnome
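With the network-manager plugin installed, a VPN profile can be imported from an .ovpn file. A sketch; the profile filename is a placeholder:

```shell
OVPN="client.ovpn"   # hypothetical .ovpn profile from your VPN provider
CMD="nmcli connection import type openvpn file ${OVPN}"
echo "$CMD"
# run the command above, then activate the connection from the
# network-manager applet, or with: nmcli connection up <name>
```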
@kumar-de
kumar-de / prepare-alpine-docker-for-opcua.md
Last active April 19, 2020 21:02
Preparation of Alpine-based docker containers to run freeopcua's minimal client #alpine #opcua #docker
apk add python wget py-pip python-dev gcc musl-dev libxml2-dev libxslt-dev

echo "Installing opcua via pip may take longer than expected..."

pip install setuptools opcua
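Inside a Dockerfile these steps collapse to a couple of RUN lines. A sketch that writes such a Dockerfile; the base image tag is an assumption (pick whichever Alpine release you target):

```shell
# write a minimal Dockerfile wrapping the steps above
cat > Dockerfile <<'EOF'
FROM alpine:3.7
RUN apk add --no-cache python wget py-pip python-dev gcc musl-dev libxml2-dev libxslt-dev
RUN pip install setuptools opcua
EOF
cat Dockerfile
```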
@kumar-de
kumar-de / read_from_spark_shell.md
Last active September 7, 2020 09:52
Read file directly from spark-shell for experimenting #parquet #sequencefile #spark #shell

Parquet

val sqlContext = new org.apache.spark.sql.SQLContext(sc) // on Spark 2.x+, use the built-in `spark` session instead

val df = sqlContext.read.parquet("/path/to/file") // plain HDFS path, without the hdfs:// prefix

df.printSchema

df.count
@kumar-de
kumar-de / HDFS from Spark.md
Last active September 2, 2020 21:45
access HDFS from Spark
val conf = sc.hadoopConfiguration
val fs = org.apache.hadoop.fs.FileSystem.get(conf)
val path = new org.apache.hadoop.fs.Path("/path/on/hdfs")
val exists = fs.exists(path) // works for both files and directories

// list the sub-directories under a path, then the files inside each
val subDirs = fs.listStatus(path).filter(_.isDirectory).map(_.getPath)
subDirs.foreach { dirPath =>
  val files = fs.listStatus(dirPath).filter(_.isFile).map(_.getPath.toString)
}
@kumar-de
kumar-de / mount_parquet_into_impala.sh
Last active May 21, 2019 12:03
sample mounting of parquet file as a table in Impala
CREATE EXTERNAL TABLE table_name (
  vehicle_id STRING,
  start_time STRING,
  av0 DOUBLE
)
STORED AS PARQUET
LOCATION '/path/in/hdfs'; -- plain HDFS path, without the hdfs:// prefix
@kumar-de
kumar-de / create-fat-jar.md
Last active June 6, 2022 07:59
Create a fat jar using Maven

Create a fat jar using Maven

<?xml version="1.0" encoding="UTF-8"?>
<build>
    <plugins>
        <!-- Set a compiler level -->
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>2.3.2</version>
@kumar-de
kumar-de / Kafka-console-consumer-with-kerberos.md
Last active March 1, 2023 03:59
Kafka console consumer with Kerberos

1. Create a jaas.conf file with the following contents:

KafkaClient {
   com.sun.security.auth.module.Krb5LoginModule required
   useKeyTab=true
   keyTab="keytabFile.keytab"
   storeKey=true
   useTicketCache=false
   serviceName="kafka"
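The consumer then needs a properties file referencing the Kerberos setup, and the JVM must be pointed at the jaas.conf from step 1. A sketch; the broker and topic names are placeholders, and depending on the distribution the script is called kafka-console-consumer or kafka-console-consumer.sh:

```shell
# consumer properties for a Kerberized cluster
cat > client.properties <<'EOF'
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
EOF
# point the JVM at the jaas.conf created in step 1
export KAFKA_OPTS="-Djava.security.auth.login.config=jaas.conf"
cat client.properties
# then run the consumer (broker and topic are placeholders):
# kafka-console-consumer --bootstrap-server broker1:9092 --topic my-topic \
#   --consumer.config client.properties --from-beginning
```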
@kumar-de
kumar-de / create-external-table-with-partitions.sh
Created June 21, 2019 11:51
Create external Impala table partitioned by a certain column and recover old partitions of the same table
# The following example is for a job that produces daily parquet files partitioned by a 'day' column
DB="database"
BASE_DIR="/output/parquets"
TBL="${DB}.table" # name of the Impala table
PQT="${BASE_DIR}/day" # parent directory containing the output sub-directories (named 'day=ddMMyyyy')
# create/mount impala table
impala-shell -q "drop table if exists $TBL;
create external table $TBL (
account_id string,