Daniel Krech eikeon

Greetings,

At the Library of Congress we've recently been exploring rewriting a [Java web archiving tool][1] in Go. So far this has involved working with an existing body (~500TB) of data encoded using [ISO/DIS 28500][2] aka the WARC file format. One of the features of WARC is its use of [Gzip][3] as a packaging format, which allows individual WARC records to be represented as separate members in the larger Gzip file. Or as the spec says:

> Per section 2.2 of the GZIP specification, a valid GZIP file consists of any number of gzip "members", each independently compressed. Where possible, this property should be exploited to compress each record of a WARC file independently. This results in a valid GZIP file whose per-record subranges also stand alone as valid GZIP files. External indexes of WARC file content may then be used to record each record's starting position in the GZIP file, allowing for random access of individual records without requiring decompression of all preceding records.

We ran into di

	FROM ubuntu:15.04
	MAINTAINER Daniel Krech <[email protected]>
	ENV DEBIAN_FRONTEND noninteractive
	RUN apt-get -qq update && apt-get -qqy install software-properties-common python3-software-properties
	RUN add-apt-repository ppa:snappy-dev/beta && apt-get -qq update && apt-get -qqy install snappy-tools bzr git
	CMD /bin/bash

	ubuntu@rachael:~/src/github.com/nogiushi/marvin$ sudo apt-get install marvin
	Reading package lists... Done
	Building dependency tree
	Reading state information... Done
	The following packages will be upgraded:
	marvin
	1 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
	Need to get 1,607 kB of archives.
	After this operation, 42.9 MB disk space will be freed.
	Get:1 http://nogiushi.com/ubuntu/ saucy/main marvin armhf 0.8.2-1 [1,607 kB]

	package main

	import (
		"bufio"
		"fmt"
		"io"
		"io/ioutil"
		"log"
		"os"

	type stateServer struct {
		marvin *marvin.Marvin
		send   chan marvin.State
	}

	func (s stateServer) wsHandler(ws *websocket.Conn) {
		s.send = make(chan marvin.State)
		s.marvin.Register(&s.send)
		defer func() { s.marvin.Unregister(&s.send) }()
		go func() {

	type stateChanged struct {
		states     chan State
		register   chan *chan State
		unregister chan *chan State
		observers  map[*chan State]bool
	}

	// Register takes a pointer so it matches the element type of the register channel.
	func (sc stateChanged) Register(c *chan State) {
		sc.register <- c
	}

	passwd

	sudo dpkg-reconfigure tzdata

	sudo apt-get install emacs24-nox

	- set hostname:

	sudo emacs /etc/hostname
	sudo service hostname stop

	package main

	import (
		"encoding/csv"
		"flag"
		"fmt"
		"io"
		"log"
		"os/exec"
		"sort"

	Request URL:http://chroniclingamerica2.loc.gov/lccn/sn92070581/1913-03-12/ed-1/seq-1/image_203x258_from_4088,4088_to_5712,6152.jpg
	Request Method:GET
	Status Code:200 OK
	Request Headers
	Accept:*/*
	Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
	Accept-Encoding:gzip,deflate,sdch
	Accept-Language:en-US,en;q=0.8
	Cache-Control:no-cache
	Connection:keep-alive