Elie A. eliasah

Hello, I am using a linear SVM to train my model and generate a line through my data. However, my model always predicts 1 for every example. Here is my code:

print data_rdd.take(5)
[LabeledPoint(1.0, [1.9643,4.5957]),
 LabeledPoint(1.0, [2.2753,3.8589]),
 LabeledPoint(1.0, [2.9781,4.5651]),
 LabeledPoint(1.0, [2.932,3.5519]),
 LabeledPoint(1.0, [3.5772,2.856])]


from pyspark.mllib.classification import SVMWithSGD
from pyspark.mllib.linalg import Vectors
from sklearn.svm import SVC
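
One possible first check for the symptom described above, offered only as a sketch: the five rows printed all carry the label 1.0, and if the full training set is similarly one-sided, a linear classifier may end up predicting 1 everywhere. The helper `label_counts` below is hypothetical (it is not part of the original code) and works on plain `(label, features)` tuples rather than on the RDD itself. On the RDD, something like `data_rdd.map(lambda p: p.label).countByValue()` should give the same tally.

```python
from collections import Counter


def label_counts(points):
    """Tally how many examples carry each label.

    `points` is assumed to be an iterable of (label, features) pairs;
    this is an illustrative stand-in for the LabeledPoint rows above.
    """
    return Counter(label for label, _features in points)


# The five rows shown in the question, rewritten as plain tuples.
sample = [
    (1.0, [1.9643, 4.5957]),
    (1.0, [2.2753, 3.8589]),
    (1.0, [2.9781, 4.5651]),
    (1.0, [2.932, 3.5519]),
    (1.0, [3.5772, 2.856]),
]

counts = label_counts(sample)
print(counts)
```

If the tally for the whole data set shows only one label, or a heavy imbalance, that is worth ruling out before tuning the SVM itself. A five-row sample cannot settle this either way.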

@eliasah
eliasah / xgb_aws_emr.sh
Last active August 3, 2016 09:01 — forked from walterreade/xgb_aws.txt
XGBoost on AWS EMR
#!/bin/bash
sudo yum -y install make
sudo yum -y update
sudo yum -y install gcc gcc-c++ git
git clone https://github.com/dmlc/xgboost --recursive
cd xgboost
make -j4
cd python-package; sudo python setup.py install
export PYTHONPATH=~/xgboost/python-package
import numpy as np
import tensorflow as tf
import os
from tensorflow.python.platform import gfile
import os.path
import re
import sys
import tarfile
from subprocess import Popen, PIPE, STDOUT
def run(cmd):
import org.apache.http.client.methods.HttpGet
import org.apache.http.impl.client.{BasicResponseHandler, HttpClientBuilder}
import org.apache.spark.mllib.fpm.PrefixSpan
// sequence database
val sequenceDatabase = {
val url = "http://www.philippe-fournier-viger.com/spmf/datasets/SIGN.txt"
val client = HttpClientBuilder.create().build()
val request = new HttpGet(url)
val response = client.execute(request)
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.util.StatCounter;
import scala.Tuple2;
import scala.Tuple3;
import java.util.Arrays;
import java.util.List;
eliasah / Update remote repo
Created May 27, 2016 20:29 — forked from mandiwise/Update remote repo
Transfer repo from Bitbucket to Github
// Reference: http://www.blackdogfoundry.com/blog/moving-repository-from-bitbucket-to-github/
// See also: http://www.paulund.co.uk/change-url-of-git-repository
$ cd $HOME/Code/repo-directory
$ git remote rename origin bitbucket
$ git remote add origin https://github.com/mandiwise/awesome-new-repo.git
$ git push origin master
$ git remote rm bitbucket
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.document.Document;
package com.combient.sparkjob.tedsds
/**
* Created by olu on 09/03/16.
*/
import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
#!/usr/bin/env python
# encoding: utf-8
# This file lives in tests/project_test.py in the usual distutils structure
# Remember to set the SPARK_HOME environment variable to the path of your Spark installation
import logging
import sys
import unittest