David Pennington xeoncross

<?php
class Fetcher
{
private $_torIp = '127.0.0.1';
private $_torProxyPort = '8118';
protected function _request($url)
<?php
function route($method, $path, $callback) {
$pattern = '{^' . preg_replace('{:(\w+)}', '(?<\1>[^/]+)', $path) . '$}';
return function ($requiredMethod, $requiredPath) use ($pattern, $method, $callback) {
if (preg_match("{^$method$}i", $requiredMethod)
&& preg_match($pattern, $requiredPath, $matches)) {
$keys = array_filter(array_keys($matches), 'is_string');
$parameters = array_intersect_key($matches, array_flip($keys));
call_user_func_array($callback, $parameters);
public function timeAgo(\Datetime $then)
{
$now = new \Datetime('now');
$interval = $then->diff($now);
$points = array(
'y' => array('year', 'years'),
'm' => array('month', 'months'),
'd' => array('day', 'days'),
<?php
// STEP 1: read POST data
// Reading POSTed data directly from $_POST causes serialization issues with array data in the POST.
// Instead, read raw POST data from the input stream.
$raw_post_data = file_get_contents('php://input');
$raw_post_array = explode('&', $raw_post_data);
$myPost = array();
foreach ($raw_post_array as $keyval) {
import nltk
text = """The Buddha, the Godhead, resides quite as comfortably in the circuits of a digital
computer or the gears of a cycle transmission as he does at the top of a mountain
or in the petals of a flower. To think otherwise is to demean the Buddha...which is
to demean oneself."""
# Used when tokenizing words
sentence_re = r'''(?x) # set flag to allow verbose regexps
([A-Z])(\.[A-Z])+\.? # abbreviations, e.g. U.S.A.
# coding=UTF-8
from __future__ import division
import re
# This is a naive text summarization algorithm
# Created by Shlomi Babluki
# April, 2013
class SummaryTool(object):
<?php
$server = "localhost";
$port = 11211;
/**
* Taken directly from memcache PECL source
*
* http://pecl.php.net/package/memcache
*
*/
<?php
/**
* DecoratorAbstract allows the creation of flexible nested decorators.
* Decorators can be stacked. They can also have methods that overwrite each other.
* Decorators can omit methods that parent decorators have defined and/or child decorators have defined.
* Methods will cascade to the original child object.
* Properties will read and set from the original child object except when your instance has the property defined.
*/
abstract class DecoratorAbstract{
xeoncross / cors.php
Created October 15, 2013 14:52 — forked from renoirb/cors.php
<?php
/**
* CORS header manipulation handler
*
* This utility will add header in the
* HTTP response and help us execute JavaScript
* from different sub-domains.
*
* @author Renoir Boulanger <renoir@w3.org>

== Overview of Datasets ==

The examples in this book use the "Chimpmark" datasets: a set of freely redistributable datasets, converted to simple standard formats, with traceable provenance and documented schemas. They are the same datasets used in the upcoming Chimpmark Challenge big-data benchmark. The datasets are:

  • Wikipedia English-language Article Corpus (wikipedia_corpus; 38 GB, 619 million records, 4 billion tokens): the full text of every English-language Wikipedia article, in

  • Wikipedia Pagelink Graph (wikipedia_pagelinks; ) --

  • Wikipedia Pageview Stats (wikipedia_pageviews; 2.3 TB, about 250 billion records (FIXME: verify num records)) -- hour-by-hour pageview