@pbinkley
pbinkley / BIR-items-1914-1920.txt
Created November 7, 2012 00:56
Topic Models from BIR items, 1914-1920
0 0.10269 war german germany great peace world people british britain allies russia nations ger nation
1 0.07261 tho foi lie fiom bo aie pi und lo ho ihe oi mi hie
2 0.11068 government mr states united minister president london made act committee conference house general council
3 0.06439 ot king years god india prince tbe church queen london japan great china japanese
4 0.02709 island bow prices store street price stock main buy car goods burdett ii good
5 0.05188 mr mrs week island mis bow tho mi sunday day hat local held town
6 0.05623 canada men canadian war service soldiers military army cross boy canadians officers british scout
7 0.25065 man news local jou don day good asked time tbe wife mr woman back
8 0.06251 local news worms worm child mother children graves miller powders exterminator de suffering mothers
9 0.19944 ii tin li il ill tl ot mi lie lo la si ti tho
pbinkley / BIR-articles-1914-1920.txt
Created November 7, 2012 03:31
Topic Models from BIR articles, 1914-1920
0 0.05384 food eggs milk sugar meat bread birds pounds butter pound flour poultry cents fruit
1 0.1172 government act mr board made committee minister law ment public president council house labor
2 0.10952 ot tbe tho la tha tor ln aa tne bo time ths tlon lng
3 0.34568 work time great ing good men made country make man people fact life means
4 0.07993 miles north railway river west great land line south feet lake east roads pacific
5 0.06452 men war canadian soldiers canada army service british military officers cross general london front
6 0.10948 war germany german world great people peace british britain allies nation nations empire united
7 0.01672 town council island meeting bow municipal held foi gas ii motion school municipality paid
8 0.28501 news local man don jou good asked day wife time ou young thing woman
9 0.02227 corn news local corns drug lift pain bottle cure callus small fingers feet store
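The listings above can be read programmatically. The following is a minimal sketch, assuming each line holds a topic index, a weight, and then the topic's top words, all separated by whitespace (that layout is inferred from the data shown, not documented here). The function names `parse_topic_line` and `rank_topics` are hypothetical; the sample lines are copied from the articles listing.

```python
def parse_topic_line(line):
    """Split one 'index weight word word ...' line into (index, weight, words)."""
    index, weight, words = line.split(None, 2)
    return int(index), float(weight), words.split()


def rank_topics(lines):
    """Parse all non-empty lines and sort topics by weight, heaviest first."""
    topics = [parse_topic_line(line) for line in lines if line.strip()]
    return sorted(topics, key=lambda topic: topic[1], reverse=True)


# Three lines copied from the articles listing above.
sample = [
    "0 0.05384 food eggs milk sugar meat bread birds pounds butter pound flour poultry cents fruit",
    "3 0.34568 work time great ing good men made country make man people fact life means",
    "8 0.28501 news local man don jou good asked day wife time ou young thing woman",
]

for index, weight, words in rank_topics(sample):
    # Show the topic number, its weight, and its first five words.
    print(index, weight, " ".join(words[:5]))
```

Sorting by weight makes it easier to see which topics dominate; the OCR-noise topics can then be filtered by hand.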
1 processes: 3.59 seconds
2 processes: 3.73 seconds
3 processes: 3.70 seconds
4 processes: 3.83 seconds
5 processes: 2.62 seconds
6 processes: 2.99 seconds
7 processes: 2.89 seconds
8 processes: 3.08 seconds
iMac:
1 processes: 2.11 seconds
2 processes: 1.34 seconds
3 processes: 1.02 seconds
4 processes: 0.91 seconds
5 processes: 0.86 seconds
6 processes: 0.79 seconds
7 processes: 0.70 seconds
8 processes: 0.70 seconds
bench.py results for Raspberry Pi:
1 processes: 38.57 seconds
2 processes: 37.72 seconds
3 processes: 37.17 seconds
4 processes: 37.66 seconds
5 processes: 37.98 seconds
6 processes: 38.39 seconds
7 processes: 39.55 seconds
8 processes: 38.81 seconds
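One way to read these timings is as speedup relative to the single-process run. The sketch below is only arithmetic on the iMac and Raspberry Pi figures listed above; the function name `speedups` and the dictionary layout are assumptions made for illustration, and nothing here reproduces bench.py itself.

```python
def speedups(timings):
    """Return {process_count: t(1 process) / t(n processes)}."""
    base = timings[1]
    return {n: base / seconds for n, seconds in sorted(timings.items())}


# Timings in seconds, copied from the results above.
imac = {1: 2.11, 2: 1.34, 3: 1.02, 4: 0.91, 5: 0.86, 6: 0.79, 7: 0.70, 8: 0.70}
raspberry_pi = {1: 38.57, 2: 37.72, 3: 37.17, 4: 37.66,
                5: 37.98, 6: 38.39, 7: 39.55, 8: 38.81}

for name, timings in (("iMac", imac), ("Raspberry Pi", raspberry_pi)):
    for n, factor in speedups(timings).items():
        print(f"{name}: {n} processes -> {factor:.2f}x")
```

On these numbers the iMac run gets roughly three times faster by 8 processes, while the Raspberry Pi timings stay essentially flat.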
func postUrls(urls chan string) {
	for url := range urls {
		n := NewUrl{url}
		data, _ := json.Marshal(n)
	post:
		resp, err := http.Post(*ginger, "application/json", bytes.NewReader(data))
		if err != nil {
			log.Fatal("post error: ", err)
		} else if resp.StatusCode == http.StatusCreated {
			log.Println("added ", url)
pbinkley / gist:6732586
Created September 27, 2013 18:04
Extracting file from warc by url with warctools
warcindex drupalib.interoperating.info.warc > drupalib.interoperating.info.warc.csv
warcpayload drupalib.interoperating.info.warc:`grep http://drupalib.interoperating.info/files/screencaps/nash-library-thumbnail.jpg drupalib.interoperating.info.warc.csv | grep " response " | head -1 | awk '{ print $2; }'` > nash-library-thumbnail.jpg
identify nash-library-thumbnail.jpg
nash-library-thumbnail.jpg JPEG 320x320 320x320+0+0 8-bit sRGB 21.2KB 0.000u 0:00.000
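The lookup step in the pipeline above (grep for the URL, keep " response " lines, take the first match, print the second field) can be sketched in Python. This is an illustration only: it assumes, as the `awk '{ print $2; }'` call implies, that the index is whitespace-separated with the record offset in the second field. The function name `find_response_offset`, the sample index rows, and their offsets are made up for the example and are not real warcindex output.

```python
def find_response_offset(index_lines, url):
    """Return field 2 of the first 'response' line mentioning url, else None."""
    for line in index_lines:
        # Same filters as the two grep calls in the shell pipeline.
        if url in line and " response " in line:
            # Same as awk '{ print $2; }': second whitespace-separated field.
            return line.split()[1]
    return None


url = "http://drupalib.interoperating.info/files/screencaps/nash-library-thumbnail.jpg"

# Hypothetical index rows: filename, offset, record type, URL.
sample_index = [
    "drupalib.interoperating.info.warc 1024 request " + url,
    "drupalib.interoperating.info.warc 2048 response " + url,
    "drupalib.interoperating.info.warc 4096 response " + url,
]

offset = find_response_offset(sample_index, url)
print("drupalib.interoperating.info.warc:" + str(offset))
```

The printed `filename:offset` string has the same shape as the argument passed to warcpayload in the transcript.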
Alexandria:temp peterbinkley$ mogrify -crop 50%x100% +repage testa.tif
testa.tif
Alexandria:temp peterbinkley$ identify testa.tif
testa.tif[0] TIFF 290x480 290x480+0+0 8-bit sRGB 256c 282KB 0.000u 0:00.000
testa.tif[1] TIFF 290x480 290x480+0+0 8-bit sRGB 256c 282KB 0.000u 0:00.000
Alexandria:temp peterbinkley$ convert -crop 50%x100% +repage testb.tif testb-%d.tif
Alexandria:temp peterbinkley$ ls
testa.tif testb-0.tif testb-1.tif testb.tif
Alexandria:temp peterbinkley$ identify testb-0.tif
testb-0.tif TIFF 290x480 290x480+0+0 8-bit sRGB 256c 141KB 0.000u 0:00.000
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
xmlns:ead="urn:isbn:1-931666-22-9"
exclude-result-prefixes="xd ead"
version="1.0">
<xd:doc scope="stylesheet">
<xd:desc>
<xd:p><xd:b>Created on:</xd:b> Dec 20, 2010</xd:p>
<xd:p><xd:b>Author:</xd:b> pbinkley</xd:p>
pbinkley / gist:4f31a50d2059266d35e5
Last active August 29, 2015 14:14
Twarc-report waterfall mockup
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<title>twarc-report waterfall mockup</title>
<meta name="generator" content="Bootply" />
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<link href="http://netdna.bootstrapcdn.com/bootstrap/3.0.3/css/bootstrap.min.css" rel="stylesheet">