
Elie A. eliasah

@monperrus
monperrus / Apriori.java
Last active August 11, 2023 11:22
Java implementation of the Apriori algorithm for mining frequent itemsets
import java.io.*;
import java.util.*;
/** The class encapsulates an implementation of the Apriori algorithm
* to compute frequent itemsets.
*
* Datasets contain integers (>= 0) separated by spaces, one transaction per line, e.g.
* 1 2 3
* 0 9
* 1 9
@guenter
guenter / Main.scala
Last active September 17, 2020 11:25
A simple Mesos "Hello World": downloads and starts a Python web server on every node in the cluster.
import mesosphere.mesos.util.FrameworkInfo
import org.apache.mesos.MesosSchedulerDriver
/**
* @author Tobi Knaup
*/
object Main extends App {
@vincentbernat
vincentbernat / elasticsearch.erl
Created November 30, 2013 07:50
Erlang module for Tsung to extract data from large files
-module(elasticsearch).
-export([autocomplete/1, autocomplete/0,
search/1, search/0,
start/0,
stop/0,
loop/1]).
-on_load(start/0).
%% Return an autocomplete request
autocomplete() ->
package demo;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.MapWritable;
@debasishg
debasishg / gist:8172796
Last active December 31, 2025 22:20
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used to implement approximation algorithms.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet.
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
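To give a flavour of the frequent-items problem that the last two entries address, here is a minimal sketch of the Misra-Gries summary, a classic counter-based algorithm for this task. It is an illustration only and is not taken from any of the linked material; the function name `misra_gries` and the parameter `k` are assumptions chosen for this example.

```python
def misra_gries(stream, k):
    """Summarise a stream using at most k - 1 counters.

    Any item that occurs more than n / k times in a stream of length n
    is guaranteed to survive in the returned dict. The stored counts are
    lower bounds: each underestimates the true count by at most n / k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: increment its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter slot is available: start tracking the item.
            counters[item] = 1
        else:
            # No free slot: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical stream of 10 items in which 1 occurs 6 times (more than n / 2).
stream = [1, 1, 1, 2, 1, 3, 1, 4, 1, 5]
candidates = misra_gries(stream, k=2)
```

The summary only yields candidates: an item that survives is not necessarily frequent, so a second pass over the data (when one is possible) is the usual way to confirm exact counts.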
@jeremyfelt
jeremyfelt / open-source-search-compare.md
Last active January 3, 2025 13:33
Comparing open source search solutions
@clintongormley
clintongormley / load_test_data.sh
Last active November 28, 2025 02:45
Run these commands in your shell to set up the test data for Chapter 5
curl -XPUT 'http://localhost:9200/us/user/1?pretty=1' -d '
{
"email" : "[email protected]",
"name" : "John Smith",
"username" : "@john"
}
'
curl -XPUT 'http://localhost:9200/gb/user/2?pretty=1' -d '
{
@jprante
jprante / ack.txt
Last active August 29, 2015 13:55
JDBC river - start river, run each minute, select orders, put them by created column as _id into Elasticsearch, send acknowledge info back to DB - https://github.com/jprante/elasticsearch-river-jdbc
mysql> select * from ack;
+------+---------------------+------+
| n | t | c |
+------+---------------------+------+
| 1 | 2014-01-31 23:37:00 | 6 |
| 2 | 2014-01-31 23:38:00 | 6 |
| 3 | 2014-01-31 23:39:00 | 6 |
+------+---------------------+------+
3 rows in set (0.00 sec)
@stormpython
stormpython / README.md
Last active March 21, 2022 13:45
Data Visualization with Elasticsearch Aggregations and D3 (Tutorial)

Data Visualization with Elasticsearch Aggregations and D3

Link to the Elasticsearch Blog.

require(ggplot2)
# Figure 1
ggplot(GermanCredit, aes(x = Class)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
labs(y = "prob.")