@zouzias
zouzias / clean_code.md
Created September 8, 2018 18:42 — forked from wojteklu/clean_code.md
Summary of 'Clean code' by Robert C. Martin

Code is clean if it can be understood easily – by everyone on the team. Clean code can be read and enhanced by a developer other than its original author. With understandability comes readability, changeability, extensibility and maintainability.


General rules

  1. Follow standard conventions.
  2. Keep it simple stupid. Simpler is always better. Reduce complexity as much as possible.
  3. Boy scout rule. Leave the campground cleaner than you found it.
  4. Always find root cause. Always look for the root cause of a problem.

Design rules

@zouzias
zouzias / load_parquet_s3.py
Created August 10, 2018 20:43 — forked from asmaier/load_parquet_s3.py
PySpark script for downloading a single Parquet file from Amazon S3 via the s3a protocol. It reads the credentials from "~/.aws/credentials", so they do not need to be hardcoded. See also https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html
#
# Some constants
#
aws_profile = "your_profile"
aws_region = "your_region"
s3_bucket = "your_bucket"
#
# Reading environment variables from aws credential file
#
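
The preview stops before the credentials are actually read. As a minimal sketch of that step, assuming the credentials file uses the usual INI layout with one section per profile, the standard library's configparser is enough. The function name `read_aws_credentials` and its `path` parameter are illustrative assumptions, not part of the original script.

```python
import configparser
import os


def read_aws_credentials(profile, path="~/.aws/credentials"):
    """Return (access_key_id, secret_access_key) for one profile.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch, not taken from the original gist.
    """
    config = configparser.ConfigParser()
    # ConfigParser.read() silently skips missing files and returns the
    # list of files it parsed, so an empty result means nothing was read.
    if not config.read(os.path.expanduser(path)):
        raise FileNotFoundError(f"no credentials file at {path}")
    if not config.has_section(profile):
        raise KeyError(f"profile {profile!r} not found in {path}")
    return (
        config.get(profile, "aws_access_key_id"),
        config.get(profile, "aws_secret_access_key"),
    )
```

Calling `read_aws_credentials(aws_profile)` with the constant defined above would return the key pair, which could then be handed to the Hadoop s3a configuration instead of being written into the script.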

Keybase proof

I hereby claim:

  • I am zouzias on github.
  • I am zouzias (https://keybase.io/zouzias) on keybase.
  • I have a public key ASD5s6mtU_Xas0oIByPzMdlx1dBBDUkmPeEz0tmrWJs2rQo

To claim this, I am signing this object:

@zouzias
zouzias / sql.py
Created March 18, 2018 08:40 — forked from jorisvandenbossche/sql.py
Patched version of pandas.io.sql to support PostgreSQL
"""
Patched version to support PostgreSQL
(original version: https://github.com/pydata/pandas/blob/v0.13.1/pandas/io/sql.py)
Adapted functions are:
- added _write_postgresql
- updated table_exist
- updated get_sqltype
- updated get_schema
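
The preview lists the adapted functions but not their bodies. As a rough illustration of what a PostgreSQL-aware type lookup and schema builder could look like, here is a sketch under stated assumptions: the names `PG_TYPES`, `pg_sqltype` and `pg_schema` and the type mapping itself are hypothetical and are not the patched module's code.

```python
import datetime

# Hypothetical mapping from Python value types to PostgreSQL column types.
# The patched module's real table is not shown in the preview.
PG_TYPES = {
    bool: "BOOLEAN",
    int: "BIGINT",
    float: "DOUBLE PRECISION",
    datetime.datetime: "TIMESTAMP",
    datetime.date: "DATE",
    str: "TEXT",
}


def pg_sqltype(py_type):
    """Return a PostgreSQL column type, falling back to TEXT."""
    return PG_TYPES.get(py_type, "TEXT")


def pg_schema(table, columns):
    """Build a CREATE TABLE statement from (name, python_type) pairs.

    Identifiers are wrapped in double quotes but not escaped, so this
    sketch is only safe for trusted table and column names.
    """
    cols = ",\n".join(f'  "{name}" {pg_sqltype(t)}' for name, t in columns)
    return f'CREATE TABLE "{table}" (\n{cols}\n);'
```

For example, `pg_schema("users", [("id", int), ("name", str)])` produces a statement with a BIGINT column and a TEXT column.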
@zouzias
zouzias / gist:da5b6939c40858148b4e1f8c7a214c9c
Created March 7, 2018 16:18
Parser Combinators in Haskell
https://two-wrongs.com/parser-combinators-parsing-for-haskell-beginners.html
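
The linked article works in Haskell. To give a flavour of the idea without leaving this page, here is a minimal sketch in Python rather than Haskell. It assumes a parser is just a function from the remaining input to either `(value, rest)` or `None` on failure. All names below (`satisfy`, `char`, `many1`, `seq`, `fmap`) are choices made for this sketch and do not mirror any particular library.

```python
def satisfy(pred):
    """Parser that consumes one character if pred accepts it."""
    def parse(s):
        if s and pred(s[0]):
            return s[0], s[1:]
        return None
    return parse


def char(c):
    """Parser for one specific character."""
    return satisfy(lambda ch: ch == c)


def many1(p):
    """Apply p one or more times and collect the results in a list.

    Assumes p consumes input whenever it succeeds.
    """
    def parse(s):
        values = []
        result = p(s)
        while result is not None:
            value, s = result
            values.append(value)
            result = p(s)
        return (values, s) if values else None
    return parse


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parse(s):
        values = []
        for p in parsers:
            result = p(s)
            if result is None:
                return None
            value, s = result
            values.append(value)
        return values, s
    return parse


def fmap(f, p):
    """Transform the value produced by p, leaving the rest untouched."""
    def parse(s):
        result = p(s)
        if result is None:
            return None
        value, rest = result
        return f(value), rest
    return parse


# Small parsers combined into larger ones.
digit = satisfy(lambda ch: ch in "0123456789")
number = fmap(lambda ds: int("".join(ds)), many1(digit))
addition = fmap(lambda parts: parts[0] + parts[2], seq(number, char("+"), number))
```

Here `addition("12+30")` yields `(42, "")`, while `addition("12+")` yields `None` because the second number is missing.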
@zouzias
zouzias / README.md
Created February 16, 2018 16:22 — forked from phatak-dev/README.md
Functional Programming in C++

#Compiling
You need g++ 4.9 to compile this code. Follow these steps to install g++-4.9

After installing, run the following command to compile:

/usr/bin/g++-4.9 -std=c++11 lambda.cpp

#Running

./a.out
@zouzias
zouzias / install_python3.6_opensuse42.3.sh
Created January 25, 2018 14:25 — forked from amoilanen/install_python3.6_opensuse42.3.sh
Installing Python 3.6 on OpenSUSE Leap 42.3
#!/bin/bash
# Step 1. Install pyenv
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n eval "$(pyenv init -)"\nfi' >> ~/.bashrc
# Step 2. Install missing headers for all the Python modules to be built
@zouzias
zouzias / 1-sozial-hilfe-template.txt
Created January 22, 2018 20:18
Permit C Documents Switzerland Zuerich
If you lived in the city of Zurich during this period, send a copy of this
letter to: Sozialzentrum Hoenggerstrasse. The confirmation will be sent to you by post.
--------------------------------
Dear case officer,
My name is XXX and I live in Zurich.
@zouzias
zouzias / TestLuceneRDD.scala
Created January 11, 2018 13:09
LuceneRDD Hello World
// Start Spark Shell using
// bin/spark-shell --packages org.zouzias:spark-lucenerdd_2.11:0.3.1
// import required spark classes
import org.zouzias.spark.lucenerdd.LuceneRDD
import org.zouzias.spark.lucenerdd._
// set the implicit value for sc (required by LuceneRDD)
implicit val sc = spark.sparkContext
val array = Array("Hello", "world")
@zouzias
zouzias / lda.py
Created January 10, 2018 10:39 — forked from aronwc/lda.py
Example using GenSim's LDA and sklearn
""" Example using GenSim's LDA and sklearn. """
import numpy as np
from gensim import matutils
from gensim.models.ldamodel import LdaModel
from sklearn import linear_model
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer