Artem Ervits dbist
@dbist
dbist / install_airflow.txt
Last active July 18, 2018 20:47
install Airflow
sudo easy_install pip
export AIRFLOW_HOME=/usr/lib/airflow
sudo pip install "apache-airflow[hdfs,hive,password]"
# Cannot uninstall 'Markdown'. It is a distutils installed project
# and thus we cannot accurately determine which files belong to it
# which would lead to only a partial uninstall.
dbist / airflow-usecase
Last active July 24, 2018 15:32
airflow-usecase
hdfs dfs -mkdir /tmp/data
hdfs dfs -chmod -R 777 /tmp/data
hdfs dfs -mv /tmp/data.csv /tmp/data
//hive
!connect jdbc:hive2://aervits-hdp1:10000 "" ""
CREATE EXTERNAL TABLE IF NOT EXISTS traffic_csv (
dbist / mvn_only_show_warnings.md
Created September 5, 2018 20:10
maven show only warnings

MAVEN_OPTS=-Dorg.slf4j.simpleLogger.defaultLogLevel=warn mvn clean package -DskipTests
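The log-level option above and the timestamp options that appear later in this listing are all SLF4J SimpleLogger properties, so they can be set together. A minimal sketch that only uses properties already shown here; combining them this way is an assumption, not something the gist itself does:

```shell
# Build MAVEN_OPTS step by step from the SimpleLogger properties in this listing.
# Level "warn" keeps warnings and errors and drops info/debug output.
MAVEN_OPTS="-Dorg.slf4j.simpleLogger.defaultLogLevel=warn"
# Prefix each log line with a timestamp.
MAVEN_OPTS="$MAVEN_OPTS -Dorg.slf4j.simpleLogger.showDateTime=true"
# Timestamp format: hours, minutes, seconds, milliseconds.
MAVEN_OPTS="$MAVEN_OPTS -Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss,SSS"
export MAVEN_OPTS

# Print the result so it can be checked before running mvn.
echo "$MAVEN_OPTS"
```

With the variable exported, `mvn clean package -DskipTests` can be run in the same shell without repeating the options on the command line.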

export MAVEN_OPTS="-Dorg.slf4j.simpleLogger.showDateTime=true -Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss,SSS"
./dev/make-distribution.sh --name hadoop3.2 --pip --tgz -Phadoop-3.2 -Pyarn
# This does not build R support. You must be on a branch that has a profile for the
# Hadoop version you are running; this example uses the master branch.
# The pom for Spark master has the following profile; activate it with -Phadoop-3.2.
# For versions without a profile, create your own profile and match dependencies if necessary.
# <profile>
# <id>hadoop-3.2</id>
# <properties>
# <hadoop.version>3.2.0</hadoop.version>
# <curator.version>2.13.0</curator.version>
# The PR option will be against origin, not the fork
git checkout -b new-branch origin/master
git cherry-pick old-branch
git push origin new-branch
<create PR from new-branch>
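The steps above can be wrapped in a small shell function. This is a sketch, not the author's script: the function name `pr_branch_from_origin`, the `run` helper, and the `DRY_RUN` switch are hypothetical, while the git commands are the ones listed above.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the steps above.
#   $1 = new branch to create from origin/master
#   $2 = existing branch whose tip commit is cherry-picked
pr_branch_from_origin() {
  new_branch=$1
  old_branch=$2
  # Stop at the first failing step so a failed cherry-pick is never pushed.
  run git checkout -b "$new_branch" origin/master &&
  run git cherry-pick "$old_branch" &&
  run git push origin "$new_branch"
}
```

Running `DRY_RUN=1 pr_branch_from_origin new-branch old-branch` prints the three git commands without touching the repository. Note that `git cherry-pick old-branch` applies only the commit at the tip of `old-branch`; a branch with several commits needs a commit range instead.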
dbist / keep_fork_updated_with_master.md
Last active April 29, 2019 15:03 — forked from CristinaSolana/gist:1885435
Keeping a fork up to date

1. Clone your fork:

git clone git@github.com:YOUR-USERNAME/YOUR-FORKED-REPO.git

2. Add remote from original repository in your forked repository:

cd into/cloned/fork-repo
git remote add upstream https://github.com/ORIGINAL-DEV-USERNAME/REPO-YOU-FORKED-FROM.git
git fetch upstream
git branch -r
# let's say you want branch-1 (checking out a remote-tracking branch leaves HEAD detached)
git checkout origin/branch-1
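Once the `upstream` remote exists, a local branch can be brought up to date from it. A minimal sketch, assuming the remote names used above (`origin`, `upstream`) and a local branch with the same name as the upstream one; the function name `sync_fork_branch` and the `GIT` override variable are hypothetical, and a fast-forward merge is one choice among several (a rebase is another), not something this listing prescribes.

```shell
# Hypothetical override: set GIT="echo git" to print the commands instead of running them.
GIT=${GIT:-git}

# Hypothetical helper: update one local branch from upstream and push it to the fork.
#   $1 = branch name (defaults to master)
sync_fork_branch() {
  branch=${1:-master}
  # $GIT is left unquoted on purpose so "echo git" splits into two words.
  $GIT fetch upstream &&
  $GIT checkout "$branch" &&
  # --ff-only refuses to merge if the local branch has diverged from upstream.
  $GIT merge --ff-only "upstream/$branch" &&
  $GIT push origin "$branch"
}
```

For example, `sync_fork_branch branch-1` fetches `upstream`, fast-forwards the local `branch-1`, and pushes the result to the fork.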

Install the Components from the Ubuntu Repositories

sudo apt-get update
sudo apt-get install python3-pip python3-dev libpq-dev

Create a Database and Database User

My CockroachDB instance is running on 10.142.0.46 in insecure mode on port 26257. These details will be needed when we configure Django. You can validate connectivity by accessing

Launch Multipass

multipass launch --name django

Access the Multipass instance via shell

multipass shell django