Additions wanted - please just fork and add.
- Parsing PDFs by Thomas Levine
- [Get Started With Scraping – Extracting Simple Tables from PDF Documents][scoda-simple-tables]
Each of these commands runs an ad hoc static HTTP server in your current (or specified) directory, available at http://localhost:8000. Use this power wisely.
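
For readers who want to see roughly what such a one-liner does, here is a minimal Python 3 sketch built on the standard library's `http.server` module (Python 3.7 or later). The helper name `make_static_server` and the choice to bind to `127.0.0.1` only are assumptions made for this illustration; they are not part of the original list of commands.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_static_server(directory=".", port=8000, host="127.0.0.1"):
    """Build (but do not start) a static file server rooted at `directory`.

    `make_static_server` is a hypothetical helper name used for this sketch.
    Binding to 127.0.0.1 keeps the server reachable from this machine only.
    """
    # SimpleHTTPRequestHandler accepts a `directory` keyword since Python 3.7.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)
```

Calling `make_static_server().serve_forever()` should then serve the current directory at http://localhost:8000 until interrupted. Passing `port=0` lets the operating system pick a free port, which can be read back from `server.server_address[1]`.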

    $ python -m SimpleHTTPServer 8000

    #!/usr/bin/env node
    // Verify the most famous MD5 collision example in JavaScript, using nothing but
    // built-in Node modules.
    var crypto = require('crypto');
    var ucs2encode = require('punycode').ucs2.encode;
    var assert = require('assert');
    var md5 = function(string) {

The Python scripts attached here take care of the following tedious work and should help you get started quickly with real work on the corpus:

    doInstall <- TRUE
    toInstall <- c("wnominate", "ggplot2")
    if(doInstall){install.packages(toInstall, repos = "http://cran.us.r-project.org")}
    lapply(toInstall, library, character.only = TRUE)

    # Load most recent senate roll call data:
    rollCall <- readKH("http://amypond.sscnet.ucla.edu/rollcall/static/S112.ord")

    # Run wnominate on the roll call object
    nDims <- 3


    # cribbed from http://vimeo.com/52569901 (Twilio carrier call origination moderation)
    # The idea is that many fan-in queues can enqueue at any rate, but
    # dequeue needs to happen in a rate-controlled manner without allowing
    # any individual input queue to starve other queues.
    # http://en.wikipedia.org/wiki/Leaky_bucket (second sense, "This version is referred to here as the leaky bucket as a queue.")
    #
    # requires:
    # redis 2.6+
    # redis-py>=2.7.0
    # anyjson

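
The idea described in the comments above (many input queues, one rate-controlled dequeue, no starvation) can be sketched without Redis. The following is an in-memory stand-in and not the original script: the class name `FairLeakyBucket`, the round-robin fairness policy, and the injectable `clock` parameter are all assumptions made for this illustration.

```python
import time
from collections import OrderedDict, deque


class FairLeakyBucket:
    """In-memory sketch of a 'leaky bucket as a queue' with fan-in fairness.

    Hypothetical illustration only: the script described above relies on
    Redis, whereas this version keeps everything in local data structures.
    """

    def __init__(self, rate_per_sec, clock=time.monotonic):
        self.interval = 1.0 / rate_per_sec   # minimum gap between dequeues
        self.clock = clock                   # injectable for testing
        self.queues = OrderedDict()          # queue name -> deque of items
        self.next_allowed = clock()          # earliest time of next dequeue

    def enqueue(self, queue_name, item):
        # Producers may enqueue at any rate.
        self.queues.setdefault(queue_name, deque()).append(item)

    def dequeue(self):
        """Return (queue_name, item), or None if rate-limited or empty."""
        now = self.clock()
        if now < self.next_allowed or not self.queues:
            return None
        # Take one item from the queue at the front of the rotation ...
        name, q = self.queues.popitem(last=False)
        item = q.popleft()
        # ... and send that queue to the back so others get their turn.
        if q:
            self.queues[name] = q
        self.next_allowed = now + self.interval
        return (name, item)
```

With a rate of 2 per second, a busy queue `"a"` and a quiet queue `"b"` would alternate: `"a"` is served once, then `"b"`, then `"a"` again, with at least half a second between dequeues. Round-robin is only one way to prevent starvation; a weighted rotation would fit the same structure.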

    # (c) 2012 Andreas Mueller
    # License: BSD 2-Clause
    #
    # See my blog for details: http://peekaboo-vision.blogspot.com

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation


    package twitter4j.internal.http;

    public class HttpResponseHelper {
        public static HttpClientConfiguration getHttpClientConfiguration(
                HttpResponse res) {
            return res.CONF;
        }
    }


    require "rubygems"
    require "twitter"
    require "json"

    # things you must configure
    TWITTER_USER = "your_username"
    MAX_AGE_IN_DAYS = 1 # anything older than this is deleted

    # get these from dev.twitter.com
    CONSUMER_KEY = "your_consumer_key"