Elie A. (eliasah)

@eliasah
eliasah / gist:adeacd2537640d733fb1
Created October 17, 2015 15:41 — forked from rezazadeh/gist:5a3bb88d9fdf423dd861
CosineSimilarity DIMSUM Example
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
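The preview above stops inside the Apache license header, so none of the actual DIMSUM code is visible. As a minimal sketch of what the gist title describes, computing all-pairs column cosine similarities with MLlib's RowMatrix.columnSimilarities, assuming a tiny hand-made matrix and an illustrative threshold of 0.1 rather than the gist's own data:

// Minimal DIMSUM sketch (illustrative data and threshold, not the forked gist's code).
// Runnable in spark-shell, where sc is the active SparkContext.
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Each row is an observation; similarities are computed between columns.
val rows = sc.parallelize(Seq(
  Vectors.dense(1.0, 2.0, 0.0),
  Vectors.dense(0.0, 1.0, 3.0),
  Vectors.dense(4.0, 0.0, 1.0)))
val mat = new RowMatrix(rows)

// Brute-force, exact cosine similarities between all column pairs.
val exact = mat.columnSimilarities()

// DIMSUM: sampling-based approximation that prunes pairs below the threshold.
val approx = mat.columnSimilarities(0.1)

approx.entries.collect().foreach { e =>
  println(s"columns ${e.i} and ${e.j}: cosine similarity ${e.value}")
}

Both calls return a CoordinateMatrix, so the exact and approximate results can be compared entry by entry to gauge the approximation error.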
@eliasah
eliasah / Setup.md
Last active August 29, 2015 14:20 — forked from xrstf/setup.md
Nutch 2.3 crawler + HBase 0.94 + Elasticsearch 1.4.2

Info

This guide sets up a non-clustered Nutch crawler, which stores its data via HBase. We will not cover how to set up Hadoop et al., just the bare minimum needed to crawl and index websites on a single machine.

Terms

  • Nutch - the crawler (fetches and parses websites)
  • HBase - filesystem storage for Nutch (Hadoop component, basically)
/*
This example uses Scala. Please see the MLlib documentation for a Java example.
Try running this code in the Spark shell. It may produce different topics each time (since LDA includes some randomization), but it should give topics similar to those listed above.
This example is paired with a blog post on LDA in Spark: http://databricks.com/blog
Spark: http://spark.apache.org/
*/
import scala.collection.mutable
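The preview above is cut off after the first import. As a rough sketch of the kind of LDA run the comment describes, assuming a toy term-count corpus and two topics in place of the gist's real pipeline:

// Hedged MLlib LDA sketch (toy corpus, not the gist's data).
// Runnable in spark-shell, where sc is the active SparkContext.
import org.apache.spark.mllib.clustering.LDA
import org.apache.spark.mllib.linalg.Vectors

// Each document is (id, term-count vector) over a shared vocabulary of 5 terms.
val corpus = sc.parallelize(Seq(
  (0L, Vectors.dense(1.0, 2.0, 0.0, 0.0, 1.0)),
  (1L, Vectors.dense(0.0, 0.0, 3.0, 1.0, 0.0)),
  (2L, Vectors.dense(2.0, 1.0, 0.0, 0.0, 2.0))))

// Fit a 2-topic model; results vary from run to run because LDA is randomized.
val ldaModel = new LDA().setK(2).setMaxIterations(20).run(corpus)

// Print the top 3 terms (by weight) for each topic.
ldaModel.describeTopics(maxTermsPerTopic = 3).zipWithIndex.foreach {
  case ((termIndices, termWeights), topic) =>
    println(s"Topic $topic: " + termIndices.zip(termWeights).mkString(", "))
}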
# delete all data
curl -XDELETE localhost:9200/test
# create an index and define specific french stop_words
curl -XPUT localhost:9200/test -d '{
"settings" : {
"index" : {
"analysis" : {
"analyzer" : {
"french" : {
// derived from http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm
function map() {
emit(1, // Or put a GROUP BY key here
{sum: this.value, // the field you want stats for
min: this.value,
max: this.value,
count:1,
diff: 0, // M2,n: sum((val-mean)^2)
});
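The map step above only emits per-value partials (sum, min, max, count, and diff for the running M2); the matching reduce step is cut off. Below is a hedged Scala sketch of the same parallel-variance merge from the linked Wikipedia algorithm, with fields mirroring the map output, not the gist's own reduce function:

// Sketch of the parallel-variance combine step (Chan et al.), mirroring the
// partials emitted by the map function above.
case class Partial(sum: Double, min: Double, max: Double, count: Long, diff: Double)

def merge(a: Partial, b: Partial): Partial = {
  val count = a.count + b.count
  val delta = b.sum / b.count - a.sum / a.count   // difference of the two means
  // M2 of the union: M2_a + M2_b + delta^2 * n_a * n_b / (n_a + n_b)
  val diff = a.diff + b.diff + delta * delta * a.count * b.count / count
  Partial(a.sum + b.sum, math.min(a.min, b.min), math.max(a.max, b.max), count, diff)
}

// Example: single-value partials, as the map step would emit them.
val stats = Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
  .map(v => Partial(v, v, v, 1L, 0.0))
  .reduce(merge)
val variance = stats.diff / stats.count            // population variance (~4.0 here)

Because the merge is associative, the partials can be combined pairwise in any grouping, which is what makes the map-reduce formulation work.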
Kibana 3 against Elasticsearch 1.4 throws a **Connection Failed** screen. The error text says to set `http.cors.allow-origin`, but it leaves out the equally important `http.cors.enabled: true`.
Working config:
$ grep cors elasticsearch-1.4.0.Beta1/config/elasticsearch.yml
http.cors.allow-origin: "/.*/"
http.cors.enabled: true
* [Ref](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-http.html)
* [Ref](http://elasticsearch-users.115913.n3.nabble.com/Kibana-upgrade-trouble-nor-4-0BETA1-neither-3-11-work-now-td4064625.html)
wget http://apt.puppetlabs.com/puppetlabs-release-precise.deb
sudo dpkg -i puppetlabs-release-precise.deb
sudo apt-get update
@eliasah
eliasah / install_scala_sbt.sh
Last active August 29, 2015 14:02 — forked from visenger/install_scala_sbt.sh
Install Scala 2.10.3 with SBT 0.13 on Ubuntu 12.04
#!/bin/sh
# This script installs Scala 2.10.3 with SBT 0.13 on Ubuntu 12.04
wget http://www.scala-lang.org/files/archive/scala-2.10.3.tgz
tar zxf scala-2.10.3.tgz
sudo mv scala-2.10.3 /usr/local/share/scala
sudo ln -s /usr/local/share/scala/bin/scala /usr/bin/scala
sudo ln -s /usr/local/share/scala/bin/scalac /usr/bin/scalac
sudo ln -s /usr/local/share/scala/bin/fsc /usr/bin/fsc
sudo ln -s /usr/local/share/scala/bin/scaladoc /usr/bin/scaladoc
@eliasah
eliasah / fr.sh
Created May 21, 2014 12:15 — forked from dadoonet/fr.sh
#!/bin/bash
ES='http://localhost:9200'
ESIDX='test3'
ESTYPE='test'
curl -XDELETE $ES/$ESIDX
curl -XPUT $ES/$ESIDX/ -d '{
"settings" : {
@eliasah
eliasah / commands.sh
Created April 29, 2014 08:01 — forked from tralston/commands.sh
Find the MD5 sum of the current directory
# Find the MD5 sum of the current directory
# (hashes each file outside .git, then hashes the combined output; md5 is the
# BSD/macOS tool, and the result depends on the order in which find lists files)
find . -type f | grep -v "^./.git" | xargs md5 | md5