- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the hostorical perspectives and how it all began with Flajolet
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
package niochat; | |
import java.net.*; | |
import java.nio.*; | |
import java.nio.channels.*; | |
import java.io.IOException; | |
import java.util.*; | |
public class NiochatServer implements Runnable { | |
private final int port; |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
/* The Grid ---------------------- */ | |
.lt-ie9 .row { width: 940px; max-width: 100%; min-width: 768px; margin: 0 auto; } | |
.lt-ie9 .row .row { width: auto; max-width: none; min-width: 0; margin: 0 -15px; } | |
.lt-ie9 .row.large-collapse .column, | |
.lt-ie9 .row.large-collapse .columns { padding: 0; } | |
.lt-ie9 .row .row { width: auto; max-width: none; min-width: 0; margin: 0 -15px; } | |
.lt-ie9 .row .row.large-collapse { margin: 0; } | |
.lt-ie9 .column, .lt-ie9 .columns { float: left; min-height: 1px; padding: 0 15px; position: relative; } | |
.lt-ie9 .column.large-centered, .columns.large-centered { float: none; margin: 0 auto; } |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import urllib2 | |
import re | |
import sys | |
from collections import defaultdict | |
from random import random | |
""" | |
PLEASE DO NOT RUN THIS QUOTED CODE FOR THE SAKE OF daemonology's SERVER, IT IS | |
NOT MY SERVER AND I FEEL BAD FOR ABUSING IT. JUST GET THE RESULTS OF THE | |
CRAWL HERE: http://pastebin.com/raw.php?i=nqpsnTtW AND SAVE THEM TO "archive.txt" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
package com.neolitec.examples; | |
import org.apache.commons.codec.binary.Base64; | |
import org.apache.commons.lang.StringUtils; | |
import org.slf4j.Logger; | |
import org.slf4j.LoggerFactory; | |
import javax.servlet.*; | |
import javax.servlet.http.HttpServletRequest; | |
import javax.servlet.http.HttpServletResponse; |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import os | |
app = '{YOUR-WSGI-APPLICATION}' | |
# Sample Gunicorn configuration file. | |
# | |
# Server socket | |
# | |
# bind - The socket to bind. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/usr/bin/python3 | |
# By Steve Hanov, 2011. Released to the public domain. | |
# Please see http://stevehanov.ca/blog/index.php?id=115 for the accompanying article. | |
# | |
# Based on Daciuk, Jan, et al. "Incremental construction of minimal acyclic finite-state automata." | |
# Computational linguistics 26.1 (2000): 3-16. | |
# | |
# Updated 2014 to use DAWG as a mapping; see | |
# Kowaltowski, T.; CL. Lucchesi (1993), "Applications of finite automata representing large vocabularies", | |
# Software-Practice and Experience 1993 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/usr/bin/env python | |
# -*- coding: utf-8 -*- | |
""" | |
pip install networkx distance pattern | |
In Flipboard's article[1], they kindly divulge their interpretation | |
of the summarization technique called LexRank[2]. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
If you use the findAll(Specification, Pageable) method, a count query is first executed and then the | |
data query is executed if the count returns a value greater than the offset. | |
For what I was doing I did not need pageable, but simply wanted to limit my results. This is easy | |
to do with static named queries and methodNameMagicGoodness queries, but from my research (googling | |
for a few hours) I couldn't find a way to do it with dynamic criteria queries using Specifications. | |
During my search I found two things that helped me to figure out how to just do it myself. | |
1.) A stackoverflow question. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// Usage: | |
// Copy and paste all of this into a debug console window of the "Who is Hiring?" comment thread | |
// then use as follows: | |
// | |
// query(term | [term, term, ...], term | [term, term, ...], ...) | |
// | |
// When arguments are in an array then that means an "or" and when they are seperate that means "and" | |
// | |
// Term is of the format: | |
// ((-)text/RegExp) ( '-' means negation ) |
OlderNewer