Elie A. eliasah

Lead Data Scientist interested in recommender systems and beyond. I'm also a Scala & Spark evangelist. @awesome-spark @kiliba-codebase

110 followers · 84 following


eliasah / 0.question_svm_with_sgd.md

Last active November 30, 2016 15:50 — forked from aditya1702/Data used for training SVM model

SVM with SGD clarification for question http://apache-spark-user-list.1001560.n3.nabble.com/Need-help-with-SVM-tt27955.html

Hello, I am using a linear SVM to train my model and fit a separating line through my data. However, my model always predicts 1 for every feature example. Here is my code:

	print data_rdd.take(5)
	[LabeledPoint(1.0, [1.9643,4.5957]),
	 LabeledPoint(1.0, [2.2753,3.8589]),
	 LabeledPoint(1.0, [2.9781,4.5651]),
	 LabeledPoint(1.0, [2.932,3.5519]),
	 LabeledPoint(1.0, [3.5772,2.856])]

	from pyspark.mllib.classification import SVMWithSGD
	from pyspark.mllib.linalg import Vectors
	from sklearn.svm import SVC
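The five rows printed in the question all carry label 1.0. That says nothing about the rest of the dataset, but it suggests one cheap check before looking at the optimizer: confirm that both classes actually appear in the training data. Below is a minimal sketch in plain Python, not the asker's code. `sample` and `label_counts` are hypothetical names, and the tuples stand in for the `LabeledPoint` rows shown in the question.

```python
from collections import Counter

# Hypothetical stand-in for data_rdd.take(5): (label, features) pairs
# copied from the rows printed in the question.
sample = [
    (1.0, [1.9643, 4.5957]),
    (1.0, [2.2753, 3.8589]),
    (1.0, [2.9781, 4.5651]),
    (1.0, [2.932, 3.5519]),
    (1.0, [3.5772, 2.856]),
]


def label_counts(rows):
    """Count how many rows carry each label."""
    return Counter(label for label, _ in rows)


counts = label_counts(sample)
print(counts)
```

On the actual RDD, `data_rdd.map(lambda p: p.label).countByValue()` should give the same kind of summary for the full dataset. If only one label shows up, or one class heavily outnumbers the other, a constant prediction would not be surprising.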

eliasah / xgb_aws_emr.sh

Last active August 3, 2016 09:01 — forked from walterreade/xgb_aws.txt

XGBoost on AWS EMR

	#!/bin/bash

	sudo yum -y install make
	sudo yum -y update
	sudo yum -y install gcc gcc-c++ git
	git clone https://github.com/dmlc/xgboost --recursive
	cd xgboost
	make -j4
	cd python-package; sudo python setup.py install
	export PYTHONPATH=~/xgboost/python-package

eliasah / Tensorflow spark demo

Created August 2, 2016 14:51 — forked from tnachen/Tensorflow spark demo

	import numpy as np
	import tensorflow as tf
	import os
	from tensorflow.python.platform import gfile
	import os.path
	import re
	import sys
	import tarfile
	from subprocess import Popen, PIPE, STDOUT
	def run(cmd):

eliasah / Spark_PrefixSpan_ASL_example.scala

Created July 8, 2016 13:14 — forked from feynmanliang/Spark_PrefixSpan_ASL_example.scala

	import org.apache.http.client.methods.HttpGet
	import org.apache.http.impl.client.{BasicResponseHandler, HttpClientBuilder}
	import org.apache.spark.mllib.fpm.PrefixSpan

	// sequence database
	val sequenceDatabase = {
	  val url = "http://www.philippe-fournier-viger.com/spmf/datasets/SIGN.txt"
	  val client = HttpClientBuilder.create().build()
	  val request = new HttpGet(url)
	  val response = client.execute(request)

eliasah / aggregateByKeyStatCounter.java

Last active May 30, 2016 14:25 — forked from zero323/aggregateByKey.java

	import org.apache.spark.SparkConf;
	import org.apache.spark.api.java.JavaPairRDD;
	import org.apache.spark.api.java.JavaRDD;
	import org.apache.spark.api.java.JavaSparkContext;
	import org.apache.spark.util.StatCounter;
	import scala.Tuple2;
	import scala.Tuple3;

	import java.util.Arrays;
	import java.util.List;

eliasah / Update remote repo

Created May 27, 2016 20:29 — forked from mandiwise/Update remote repo

Transfer repo from Bitbucket to GitHub

	// Reference: http://www.blackdogfoundry.com/blog/moving-repository-from-bitbucket-to-github/
	// See also: http://www.paulund.co.uk/change-url-of-git-repository

	$ cd $HOME/Code/repo-directory
	$ git remote rename origin bitbucket
	$ git remote add origin https://github.com/mandiwise/awesome-new-repo.git
	$ git push origin master

	$ git remote rm bitbucket

eliasah / testIndexer.java

Created April 22, 2016 17:46 — forked from anonymous/testIndexer.java

	import java.io.IOException;
	import java.util.ArrayList;
	import java.util.LinkedList;
	import java.util.List;

	import org.apache.lucene.analysis.Analyzer;
	import org.apache.lucene.analysis.en.EnglishAnalyzer;
	import org.apache.lucene.analysis.standard.StandardAnalyzer;
	import org.apache.lucene.analysis.util.CharArraySet;
	import org.apache.lucene.document.Document;

eliasah / PrepareData.scala

Created April 8, 2016 12:14 — forked from oluies/PrepareData.scala

	package com.combient.sparkjob.tedsds

	/**
	* Created by olu on 09/03/16.
	*/

	import org.apache.spark.{SparkContext, SparkConf}
	import org.apache.spark.sql.hive.HiveContext
	import org.apache.spark.sql.expressions.Window
	import org.apache.spark.sql.functions._

eliasah / RunRandomForest2.scala

Created April 8, 2016 12:14 — forked from oluies/RunRandomForest2.scala

	/*
	* Licensed to the Apache Software Foundation (ASF) under one or more
	* contributor license agreements. See the NOTICE file distributed with
	* this work for additional information regarding copyright ownership.
	* The ASF licenses this file to You under the Apache License, Version 2.0
	* (the "License"); you may not use this file except in compliance with
	* the License. You may obtain a copy of the License at
	*
	* http://www.apache.org/licenses/LICENSE-2.0
	*

eliasah / null_test.py

Created April 4, 2016 15:59 — forked from opikalo/null_test.py

	#!/usr/bin/env python
	# encoding: utf-8

	# This file lives in tests/project_test.py in the usual distutils structure
	# Remember to set the SPARK_HOME environment variable to the path of your Spark installation

	import logging
	import sys
	import unittest
