- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits some of the historical perspectives and how it all began with Flajolet.
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
extern crate zip;
extern crate quick_xml;
extern crate html_entities;
use std::io::BufReader;
use std::fs::File;
use std::io::Read;
use std::ops::Deref;
use std::collections::BTreeMap;
use zip::read::ZipArchive;
NPR,Fresh Air,http://www.npr.org/rss/podcast.php?id=381444908
,Wait Wait... Don't Tell Me,http://www.npr.org/rss/podcast.php?id=344098539
,Bullseye with Jesse Thorn,http://npr.org/rss/podcast.php?id=510309
,On Point With Tom Ashbrook,http://www.npr.org/rss/podcast.php?id=510053
,Only A Game,http://www.npr.org/rss/podcast.php?id=510052
,Here & Now,http://www.npr.org/rss/podcast.php?id=510051
,Latino USA,http://www.npr.org/rss/podcast.php?id=510016
,Car Talk,http://www.npr.org/rss/podcast.php?id=510208
,Piano Jazz Shorts,http://www.npr.org/rss/podcast.php?id=510056
,From The Top,http://www.npr.org/rss/podcast.php?id=510026
from django.core.cache import cache
from django.core.cache.backends.base import DEFAULT_TIMEOUT
from django.db import connection, transaction
from hashlib import md5
def cache_chained_calculation(characteristic_str, calculate_function, timeout=DEFAULT_TIMEOUT, force_update=False):
    """
    Attempt to obtain result of @calculate_function, represented by @characteristic_str, through cache or calling the
    function. Should only allow one caller to be calculating the value at once (enforced using postgres advisory locks),
// Usage:
// Copy and paste all of this into a debug console window of the "Who is Hiring?" comment thread
// then use as follows:
//
// query(term | [term, term, ...], term | [term, term, ...], ...)
//
// Terms grouped in an array are combined with "or"; separate arguments are combined with "and"
//
// Term is of the format:
// ((-)text/RegExp) ( '-' means negation )
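
The matching rules in the usage comment above can be illustrated with a small standalone sketch. This is written in Python rather than the gist's JavaScript and is not the gist's actual implementation: the function names `term_matches` and `query`, the case-insensitive substring comparison, and passing the comment text as the first argument are all assumptions made for illustration. Negation is only handled for plain-text terms here, since the comment does not show how a negated RegExp would be written.

```python
import re


def term_matches(term, text):
    """Return True if one term matches the text.

    A term is either a compiled regular expression or a string.
    A string starting with '-' is a negated term (assumed semantics).
    """
    if isinstance(term, re.Pattern):
        return term.search(text) is not None
    negate = term.startswith("-")
    needle = term[1:] if negate else term
    # Case-insensitive substring match is an assumption of this sketch.
    found = needle.lower() in text.lower()
    return not found if negate else found


def query(text, *groups):
    """AND across arguments; a list argument is an OR over its terms."""
    for group in groups:
        alternatives = group if isinstance(group, (list, tuple)) else [group]
        if not any(term_matches(t, text) for t in alternatives):
            return False
    return True


if __name__ == "__main__":
    comment = "Remote Python role in Berlin"
    # (python OR rust) AND NOT onsite
    print(query(comment, ["python", "rust"], "-onsite"))
```

Each positional argument acts as one "and" clause, and wrapping several terms in a list turns that clause into an "or", which mirrors the `query(term | [term, term, ...], ...)` signature described in the comment.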
If you use the findAll(Specification, Pageable) method, a count query is first executed and then the
data query is executed if the count returns a value greater than the offset.
For what I was doing I did not need pageable, but simply wanted to limit my results. This is easy
to do with static named queries and methodNameMagicGoodness queries, but from my research (googling
for a few hours) I couldn't find a way to do it with dynamic criteria queries using Specifications.
During my search I found two things that helped me to figure out how to just do it myself.
1.) A stackoverflow question.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
pip install networkx distance pattern
In Flipboard's article[1], they kindly divulge their interpretation
of the summarization technique called LexRank[2].
#!/usr/bin/python3
# By Steve Hanov, 2011. Released to the public domain.
# Please see http://stevehanov.ca/blog/index.php?id=115 for the accompanying article.
#
# Based on Daciuk, Jan, et al. "Incremental construction of minimal acyclic finite-state automata."
# Computational linguistics 26.1 (2000): 3-16.
#
# Updated 2014 to use DAWG as a mapping; see
# Kowaltowski, T.; CL. Lucchesi (1993), "Applications of finite automata representing large vocabularies",
# Software-Practice and Experience 1993
import os
app = '{YOUR-WSGI-APPLICATION}'
# Sample Gunicorn configuration file.
#
# Server socket
#
# bind - The socket to bind.
package com.neolitec.examples;
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.lang.StringUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;