Hassan Abedi habedi
@habedi
habedi / start_single_worker_spark_cluster.sh
Created February 12, 2020 21:50
Starting a Spark cluster with minimum requirements
## run the following commands in BASH
start-master.sh
# go to http://localhost:8080 and check that Spark's master service has started
start-slave.sh spark://$(hostname):7077
# if the worker service started successfully, the worker should be listed in the Workers section at http://localhost:8080
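The manual check in the browser can also be approximated from a script. The following is a minimal sketch, not part of the original gist: the helper names `master_url` and `port_open` are hypothetical, and it assumes the default ports used above (8080 for the web UI, 7077 for the master) and that the master listens on the machine's hostname.

```python
import socket


def master_url(host: str, port: int = 7077) -> str:
    """Build the spark:// URL that a worker is pointed at (hypothetical helper)."""
    return f"spark://{host}:{port}"


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    hostname = socket.gethostname()
    print("master URL:", master_url(hostname))
    # Assumption: the web UI is reachable on localhost, the master on the hostname.
    print("web UI (8080) reachable:", port_open("localhost", 8080))
    print("master (7077) reachable:", port_open(hostname, 7077))
```

An open port only shows that something is listening; whether the worker actually registered still has to be confirmed on the master's web page.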
@habedi
habedi / get_spark.sh
Last active February 12, 2020 21:18
Simple commands to download and extract Apache Spark's pre-built binaries from its website
## run the following commands in BASH
cd # go back to your home directory
wget -c https://www-us.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz # download Spark (-c resumes a partial download)
tar xvfz spark-2.4.5-bin-hadoop2.7.tgz # extract the downloaded archive into the current directory
mv spark-2.4.5-bin-hadoop2.7 spark # rename the extracted folder to "spark"
# append the JAVA_HOME and SPARK_HOME environment variables to the end of your Bash startup script
# this assumes that JRE 8 is installed in "/usr/lib/jvm/java-1.8.0-openjdk-amd64"
cat >> .bashrc <<'EOF'
@habedi
habedi / first_time_ubuntu_setup.sh
Last active February 12, 2020 21:03
Installing and setting up required packages and utilities in Ubuntu 18.04
## run the following commands in BASH
sudo apt-get update -y && sudo apt-get upgrade -y
# you may need to enter your password when a command is prefixed with "sudo"
# if you are asked a yes/no question while the previous command runs, choose "Yes"
sudo apt-get install -y htop nload netcat emacs nano openjdk-8-jdk-headless python-pip python3-pip wget \
curl python-mode scala-mode-el
# be patient; it can take a while for all the packages to be downloaded and installed
sudo pip install pyspark
sudo pip3 install pyspark
# again, it may take a while for the PySpark packages to be downloaded for both Python 2 (which has reached end of life) and Python 3
@habedi
habedi / test_whoosh.py
Created December 6, 2019 13:27 — forked from turicas/test_whoosh.py
Some tests with whoosh (full-text search library written entirely in Python)
#!/usr/bin/env python
# coding: utf-8
# To bootstrap the environment:
# mkvirtualenv whoosh
# pip install whoosh
import os
from whoosh.index import create_in, open_dir
@habedi
habedi / gzip.h
Created August 25, 2019 18:02
C++ gzip compression/decompression of strings with Boost
#ifndef __GZIP_H__
#define __GZIP_H__
#include <sstream>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
class Gzip {
public:
@habedi
habedi / lmdb.tcl
Created July 31, 2019 14:01 — forked from antirez/lmdb.tcl
LMDB -- First version of Redis written in Tcl
# LVDB - LLOOGG Memory DB
# Copyright (C) 2009 Salvatore Sanfilippo <[email protected]>
# All Rights Reserved
# TODO
# - cron with cleanup of timedout clients, automatic dump
# - the dump should use array startsearch to write it line by line
# and may just use gets to read element by element and load the whole state.
# - 'help','stopserver','saveandstopserver','save','load','reset','keys' commands.
# - ttl with milliseconds resolution 'ttl a 1000'. Check ttl in dump!
@habedi
habedi / twitter_sentiment_analysis_convnet.py
Created June 6, 2019 18:39 — forked from giuseppebonaccorso/twitter_sentiment_analysis_convnet.py
Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks
import keras.backend as K
import multiprocessing
import tensorflow as tf
from gensim.models.word2vec import Word2Vec
from keras.callbacks import EarlyStopping
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Flatten
from keras.layers.convolutional import Conv1D
@habedi
habedi / search.py
Created December 13, 2018 15:29 — forked from pavelk2/search.py
graph = {
1: {2:1,5:1},
2: {1:1,3:1,5:1},
3: {2:1,4:1},
4: {3:1,5:1,6:1},
5: {1:1,2:1,4:1},
6: {4:1}
}
time = 0
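The preview above is cut off after the graph definition, so the original search routine is not shown. As an illustration only, here is a minimal sketch of a breadth-first traversal over the same adjacency dictionary (each node maps to its neighbours, with every edge weighted 1). The function name `bfs_order` is hypothetical and is not taken from the original gist.

```python
from collections import deque

# Same adjacency dictionary as in the snippet: node -> {neighbour: edge weight}
graph = {
    1: {2: 1, 5: 1},
    2: {1: 1, 3: 1, 5: 1},
    3: {2: 1, 4: 1},
    4: {3: 1, 5: 1, 6: 1},
    5: {1: 1, 2: 1, 4: 1},
    6: {4: 1},
}


def bfs_order(graph, start):
    """Return the nodes reachable from start in breadth-first visiting order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        # Neighbours are taken in dictionary insertion order.
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


print(bfs_order(graph, 1))  # [1, 2, 5, 3, 4, 6]
```

Marking a node as visited when it is enqueued, rather than when it is dequeued, keeps each node in the queue at most once.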
@habedi
habedi / autolog.py
Created December 5, 2018 13:00 — forked from brendano/autolog.py
Python decorators to log all method calls and show call graphs in real time
# Written by Brendan O'Connor, [email protected], www.anyall.org
# * Originally written Aug. 2005
# * Posted to gist.github.com/16173 on Oct. 2008
# Copyright (c) 2003-2006 Open Source Applications Foundation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
@habedi
habedi / Makefile
Created August 22, 2018 16:03 — forked from Nemo157/Makefile
A simple Makefile for LaTeX projects.
# Author:
# Wim Looman
# Copyright:
# Copyright (c) 2010 Wim Looman
# License:
# GNU General Public License (see http://www.gnu.org/licenses/gpl-3.0.txt)
## User interface, just set the main filename and it will do everything for you
# If you have any extra code or images included list them in EXTRA_FILES
# This should work as long as you have all the .tex, .sty and .bib files in