Elie A. eliasah

@eliasah
eliasah / installing_keras_tf.md
Last active October 8, 2018 11:52
Install Keras 2.0.5 on a TensorFlow 1.2.1 backend with Anaconda

create conda env

conda create --name keras
source activate keras

installing utilities

conda install python=3.5 numpy scikit-learn=0.18.1 jupyter matplotlib pip
conda install pandas h5py pillow lxml
@eliasah
eliasah / Anaconda from Powershell.md
Last active December 14, 2018 08:46
Running Anaconda 4 from PowerShell on Windows 10

To work correctly with Anaconda in PowerShell:

  • Run the Anaconda command:
C:\Users\elias\Anaconda3> cmd C:\Users\elias\Anaconda3\envs\py35\Scripts\activate.bat C:\Users\elias\Anaconda3\envs\py35\Scripts\activate.bat
Microsoft Windows [Version 10.0.14393]
(c) 2016 Microsoft Corporation. All rights reserved.
  • Activate your env:
import pandas as pd
import numpy as np
import scipy
import scipy.stats as sts
import random
import pyspark
import pyspark.sql.types as stypes
import pyspark.sql.functions as sfunctions
@eliasah
eliasah / IntelliJ_IDEA__Perf_Tuning.txt
Created January 9, 2017 15:02 — forked from P7h/IntelliJ_IDEA__Perf_Tuning.txt
Performance tuning parameters for IntelliJ IDEA. Add these parameters to the idea64.exe.vmoptions or idea.exe.vmoptions file in IntelliJ IDEA. If you are using JDK 8.x, remove the PermSize and MaxPermSize parameters from the tuning configuration.
-server
-Xms2048m
-Xmx2048m
-XX:NewSize=512m
-XX:MaxNewSize=512m
-XX:PermSize=512m
-XX:MaxPermSize=512m
-XX:+UseParNewGC
-XX:ParallelGCThreads=4
-XX:MaxTenuringThreshold=1
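The JDK 8.x note above can be applied mechanically. The following is a minimal sketch in Python, assuming the options are stored one flag per line as in the listing; the function name `strip_permgen_flags` and the `SAMPLE` constant are hypothetical helpers for illustration and are not part of IntelliJ IDEA or the original gist.

```python
def strip_permgen_flags(vmoptions_text):
    """Return the options text without the PermSize / MaxPermSize lines."""
    # Assumption: one JVM flag per line, as in the listing above.
    permgen_prefixes = ("-XX:PermSize=", "-XX:MaxPermSize=")
    kept = [
        line
        for line in vmoptions_text.splitlines()
        if not line.strip().startswith(permgen_prefixes)
    ]
    return "\n".join(kept)


# The flags shown in the gist preview, copied verbatim.
SAMPLE = """-server
-Xms2048m
-Xmx2048m
-XX:NewSize=512m
-XX:MaxNewSize=512m
-XX:PermSize=512m
-XX:MaxPermSize=512m
-XX:+UseParNewGC
-XX:ParallelGCThreads=4
-XX:MaxTenuringThreshold=1
"""

if __name__ == "__main__":
    print(strip_permgen_flags(SAMPLE))
```

Matching on the full flag prefix rather than the substring `PermSize` keeps the filter from touching any unrelated option that happens to contain that text.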
@eliasah
eliasah / dupes.sh
Created December 22, 2016 09:56
Find duplicate files based on MD5 signatures.
#!/bin/bash
# usage: dupes location type
if [ "$#" -ne 2 ]; then
  echo "Usage: dupes location type (pdf,gz)"
  exit 1
fi
LOCATION=$(readlink -f "$1")
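The preview of the script stops after resolving the location, so the rest of the author's approach is not visible here. As an illustration of the general idea only (grouping files by MD5 digest), here is a minimal Python sketch. It assumes the two arguments are a directory and a file extension, as the usage message suggests; the names `md5_of_file` and `find_duplicates` are hypothetical and do not come from the gist.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(location, extension):
    """Return groups of paths under `location` that share an MD5 digest.

    Only files whose name ends with `.extension` are considered
    (an assumption based on the script's usage message).
    """
    by_digest = defaultdict(list)
    suffix = "." + extension.lower()
    for root, _dirs, files in os.walk(location):
        for name in files:
            if name.lower().endswith(suffix):
                path = os.path.join(root, name)
                by_digest[md5_of_file(path)].append(path)
    # Keep only digests seen more than once; sort paths for stable output.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 3:
        sys.exit("Usage: dupes location type (pdf,gz)")
    for group in find_duplicates(sys.argv[1], sys.argv[2]):
        print("\n".join(group))
        print()
```

Reading in chunks keeps memory use flat for large archives. MD5 is adequate for spotting accidental duplicates, though it should not be relied on where an adversary could craft colliding files.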
@eliasah
eliasah / install_mesos.sh
Created December 20, 2016 13:00
Install Mesos in a Standalone Pseudo-cluster.
#!/bin/bash
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)
echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | sudo tee /etc/apt/sources.list.d/mesosphere.list
sudo apt-get -y update
sudo apt-get -y install mesos marathon

Hello, I am using a linear SVM to train my model and generate a line through my data. However, my model always predicts 1 for all the feature examples. Here is my code:

print data_rdd.take(5)
[LabeledPoint(1.0, [1.9643,4.5957]), LabeledPoint(1.0, [2.2753,3.8589]), LabeledPoint(1.0, [2.9781,4.5651]), LabeledPoint(1.0, [2.932,3.5519]), LabeledPoint(1.0, [3.5772,2.856])]

from pyspark.mllib.classification import SVMWithSGD
from pyspark.mllib.linalg import Vectors
from sklearn.svm import SVC

@eliasah
eliasah / xgb_aws_emr.sh
Last active August 3, 2016 09:01 — forked from walterreade/xgb_aws.txt
XGBoost on AWS EMR
#!/bin/bash
sudo yum -y install make
sudo yum -y update
sudo yum -y install gcc gcc-c++ git
git clone https://github.com/dmlc/xgboost --recursive
cd xgboost
make -j4
cd python-package; sudo python setup.py install
export PYTHONPATH=~/xgboost/python-package
import numpy as np
import tensorflow as tf
import os
from tensorflow.python.platform import gfile
import os.path
import re
import sys
import tarfile
from subprocess import Popen, PIPE, STDOUT
def run(cmd):
import org.apache.http.client.methods.HttpGet
import org.apache.http.impl.client.{BasicResponseHandler, HttpClientBuilder}
import org.apache.spark.mllib.fpm.PrefixSpan
// sequence database
val sequenceDatabase = {
val url = "http://www.philippe-fournier-viger.com/spmf/datasets/SIGN.txt"
val client = HttpClientBuilder.create().build()
val request = new HttpGet(url)
val response = client.execute(request)