- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet.
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
import selenium
import time
from selenium import webdriver
browser = webdriver.PhantomJS("phantomjs")
browser.get("https://twitter.com/StackStatus")
print(browser.title)
pause = 3
#!/usr/bin/env python
"""
The twitter-text-python library (https://pypi.python.org/pypi/twitter-text-python) can be used
to urlify text containing @<username>s and #<hashtag>s. It is a bit trickier if you want to do
the same with HTML, but BeautifulSoup makes it straightforward.
"""
from bs4 import BeautifulSoup, NavigableString
from ttp import ttp
parse_text = ttp.Parser().parse
from sklearn.metrics import confusion_matrix
def print_cm(cm, labels, hide_zeroes=False, hide_diagonal=False, hide_threshold=None):
    """Pretty-print a confusion matrix."""
    columnwidth = max([len(x) for x in labels] + [5])  # 5 is value length
    empty_cell = " " * columnwidth
    # Print header
    print(" " + empty_cell, end=" ")
    for label in labels:
        print("%{0}s".format(columnwidth) % label, end=" ")
-- Build a sorted word frequency list from a file, trimmed to a given quantile.
--
-- Usage: WordStats <book.txt> <quantile>
--
-- `quantile` is a number between 0 and 1.
--
-- Example:
-- ./WordStats "Don Quijote.txt" 0.85 > "Don Quijote.words.85"
import Control.Applicative
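
The header comment above describes a sorted word frequency list trimmed to a quantile, but the Haskell preview is cut off before any logic appears. The Python sketch below is one possible reading of that description, not the author's implementation. It assumes "trimmed to a quantile" means keeping the most frequent words until they account for that fraction of all word occurrences; the function name `word_stats` and the `\w+` tokenization are hypothetical choices made for illustration.

```python
import re
from collections import Counter


def word_stats(text, quantile):
    """Return (word, count) pairs, most frequent first.

    Assumption: words are kept until their cumulative count reaches
    `quantile` (between 0 and 1) of all word occurrences in `text`.
    """
    if not 0 <= quantile <= 1:
        raise ValueError("quantile must be between 0 and 1")
    # Hypothetical tokenization: lowercase runs of word characters.
    counts = Counter(re.findall(r"\w+", text.lower()))
    total = sum(counts.values())
    kept = []
    covered = 0
    for word, count in counts.most_common():
        if covered >= quantile * total:
            break
        kept.append((word, count))
        covered += count
    return kept
```

With a quantile of 1 every distinct word is kept, and with a quantile of 0 the result is empty; values in between drop the long tail of rare words.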
# 3 repetitions of 10-fold CV to determine feature sparsity percentage via RFE
# X = concatenated text features for training set (title, body, url) transformed via TfidfVectorizer
# y = training set classification (0, 1)
import numpy as np
import pandas as pd
import sklearn.linear_model as lm
from sklearn.model_selection import KFold
from sklearn import metrics
var doctors = [
  { number: 1, actor: "William Hartnell", begin: 1963, end: 1966 },
  { number: 2, actor: "Patrick Troughton", begin: 1966, end: 1969 },
  { number: 3, actor: "Jon Pertwee", begin: 1970, end: 1974 },
  { number: 4, actor: "Tom Baker", begin: 1974, end: 1981 },
  { number: 5, actor: "Peter Davison", begin: 1982, end: 1984 },
  { number: 6, actor: "Colin Baker", begin: 1984, end: 1986 },
  { number: 7, actor: "Sylvester McCoy", begin: 1987, end: 1989 },
  { number: 8, actor: "Paul McGann", begin: 1996, end: 1996 },
  { number: 9, actor: "Christopher Eccleston", begin: 2005, end: 2005 },
I wrote this in early January 2012, but never finished it. The research and thinking in this area led to a lot of the design of Yeoman and talks like "Javascript Development Workflow of 2013", "Web Application Development Workflow" and "App development stack for JS developers" (surprisingly little overlap in those talks, btw).
Now it's June 2013 and the state of web app tooling has matured quite a bit. But here's a snapshot of the story from 18 months ago, even if a little ugly and incomplete. :p
- Intro to tooling
- node.js
- Installation paths: use one of these techniques to install node and npm without having to sudo.
- Node.js HOWTO: Install Node+NPM as user (not root) under Unix OSes
- Felix's Node.js Guide
- Creating a REST API using Node.js, Express, and MongoDB
- Node Cellar Sample Application with Backbone.js, Twitter Bootstrap, Node.js, Express, and MongoDB
- JavaScript Event Loop
- Node.js for PHP programmers
Attention: the list was moved to
https://github.com/dypsilon/frontend-dev-bookmarks
This page is not maintained anymore, please update your bookmarks.