Markus Hanses (marhan)

marhan / logstash-haspa-bank-account.conf
Last active April 29, 2017 21:25
Logstash 5.3.2 configuration file to read the CSV files provided by Hamburger Sparkasse (Haspa) and send the data to Elasticsearch 5.3.2
input {
  stdin {
    type => "stdin-type"
  }
  file {
    path => ["/PATH/bank-account-files/*"]
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => plain { charset => 'Windows-1252' }
  }
}
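The preview ends after the input section; a minimal sketch of the remaining filter and output stages, assuming the semicolon-separated Haspa column layout used in the Spark gists below and a local Elasticsearch 5.3.2 node, could look like this:
filter {
  csv {
    # assumed Haspa column layout and separator (matches the Spark schema gist below)
    separator => ";"
    columns => ["Buchung", "Wert", "Verwendungszweck", "Betrag"]
  }
}
output {
  elasticsearch {
    # assumed local node and index name
    hosts => ["localhost:9200"]
    index => "bank-account"
  }
}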
marhan / copy_bank_account_files.sh
Created April 27, 2017 06:10
Copies the CSV files recursively to another location and guarantees file name uniqueness by appending a random suffix to each file name.
#!/bin/bash
DESTINATION_PATH="/PATH_TO_DESTINATION"
for SOURCE_FILE in $(find "/PATH_SOURCE" -name '*.csv'); do
  FILE_NAME=$(basename "$SOURCE_FILE")
  FILE_NAME_EXTENSION="${FILE_NAME##*.}"
  FILE_NAME_PART="${FILE_NAME%.*}"
  DESTINATION_FILE_NAME="${FILE_NAME_PART}-${RANDOM}.${FILE_NAME_EXTENSION}"
  cp "$SOURCE_FILE" "${DESTINATION_PATH}/${DESTINATION_FILE_NAME}"
done
marhan / extract_file_name.sh
Created April 27, 2017 05:33
Extract the file name and its extension from an absolute file path
#!/bin/bash
full_file_path="/PATH/test.csv"
# complete file name without path
file_name="${full_file_path##*/}"
# file name without extension
file_name_part="${file_name%.*}"
# extension only
file_extension="${file_name##*.}"
marhan / export_postgres.scala
Created April 18, 2017 21:06
Export a Spark data frame into a PostgreSQL database (Zeppelin paragraph)
%spark
data.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://SERVER:5432/DATABASE")
  .option("dbtable", "public.account")
  .option("user", "zeppelin")
  .option("password", "PASSWORD")
  .option("driver", "org.postgresql.Driver")
  .mode("overwrite")
  .save()
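As a quick sanity check after the export, the same (placeholder) connection settings can be used to read the table back; this is a sketch and not part of the original gist:
%spark
// verify the export by reading public.account back with the same placeholder settings
val check = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://SERVER:5432/DATABASE")
  .option("dbtable", "public.account")
  .option("user", "zeppelin")
  .option("password", "PASSWORD")
  .option("driver", "org.postgresql.Driver")
  .load()
check.count()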
marhan / haspa_transform.scala
Last active April 18, 2017 21:02
Transform the Haspa Spark data frame columns into computable types, i.e. decimal amounts and dates (Zeppelin paragraph)
%spark
import org.apache.spark.sql.functions.{col, regexp_replace, to_date, unix_timestamp}
import org.apache.spark.sql.types.DecimalType
// German number format ("1.234,56") -> decimal; "dd.MM.yyyy" strings -> dates
val account = df.withColumn("Betrag", regexp_replace(regexp_replace(df("Betrag"), "\\.", ""), "\\,", ".").cast(DecimalType(10, 2)))
  .withColumn("Buchung", to_date(unix_timestamp(col("Buchung"), "dd.MM.yyyy").cast("timestamp")))
  .withColumn("Wert", to_date(unix_timestamp(col("Wert"), "dd.MM.yyyy").cast("timestamp")))
marhan / haspa_import_files.scala
Last active April 18, 2017 21:04
Import the Haspa bank account CSV files into the Spark context (Zeppelin paragraph)
%spark
import org.apache.spark.sql.types.{StructType, StructField, StringType, DecimalType}

val filesPath = "/PATH/*.csv"
val customSchema = StructType(Array(
  StructField("Buchung", StringType, true),
  StructField("Wert", StringType, true),
  StructField("Verwendungszweck", StringType, true),
  StructField("Betrag", StringType, true)))
Copy all local SSH public keys to a remote host's authorized_keys file:
cat ~/.ssh/*.pub | ssh user@remote-system 'umask 077; cat >>.ssh/authorized_keys'
marhan / install_jenkins.sh
Last active February 6, 2017 20:37
Install Jenkins on a Raspberry Pi 3
#!/bin/bash
wget -q -O - http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | sudo apt-key add -
sudo sh -c 'echo deb http://pkg.jenkins-ci.org/debian-stable binary/ > /etc/apt/sources.list.d/jenkins.list'
sudo apt-get update
sudo apt-get install -y jenkins
marhan / Dockerfile
Created February 3, 2017 16:01 — forked from nknapp/Dockerfile
Traefik setup as a reverse proxy with Docker and Let's Encrypt
FROM traefik:camembert
ADD traefik.toml .
EXPOSE 80
EXPOSE 8080
EXPOSE 443
marhan / config
Created January 22, 2017 20:09
Fix for macOS Sierra asking for the SSH key passphrase every time when connecting to another server via SSH
Host *
  AddKeysToAgent yes
  UseKeychain yes
  IdentityFile ~/.ssh/id_rsa
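Besides the config above, the key's passphrase usually has to be stored in the keychain once (flag as of macOS Sierra):
# one-time step: add the key and store its passphrase in the macOS keychain
ssh-add -K ~/.ssh/id_rsa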