
You got the documents. Now what?

[omg documents.png]

Congratulations! Your Freedom of Information request finally yielded a big brown envelope in the mail. You are the lucky recipient of a juicy leak. You've managed to scrape all the PDFs from that stone-age government portal. Now all you have to do is the reporting.

Would that it were so easy. Your next steps depend on what you've got and what you're trying to do. You might have one page or one million pages. You could be starting with a tall stack of paper or a CSV file or anything in between. Maybe you already know exactly what you're looking for, or maybe that anonymous tip was maddeningly non-specific. In the course of my work on the Overview document-mining software I've seen just about every problem that a journalist can have with a document-driven story. These are the tales of unreadable formats, heaps of paper, and late nights reading. This post is organized as a sort of flowchart, a series of questions you can ask

@caseyg
caseyg / metadata.md
Last active December 30, 2016 15:07

Author/Contributor Metadata

Information about an author can include: firstname, lastname, fullname, lastname-comma-firstname, role, primary-or-notprimary, photo, photo credit, photo orientation, residence, biographical information. Some books have multiple contributors.
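As a rough illustration of the fields listed above, here is a minimal Python sketch of a contributor record. The class name `Contributor`, the attribute names, the default role, and the sample people are all hypothetical; none of them come from any store's actual schema. The two name formats are derived from the first and last name rather than stored separately.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contributor:
    """One author/contributor record (hypothetical field names)."""
    firstname: str
    lastname: str
    role: str = "Author"          # assumed default, not from any store
    primary: bool = True          # primary-or-notprimary
    photo: Optional[str] = None   # path or URL to the photo, if any
    photo_credit: Optional[str] = None
    photo_orientation: Optional[str] = None
    residence: Optional[str] = None
    biography: Optional[str] = None

    @property
    def fullname(self) -> str:
        # "Firstname Lastname"
        return f"{self.firstname} {self.lastname}"

    @property
    def lastname_comma_firstname(self) -> str:
        # "Lastname, Firstname"
        return f"{self.lastname}, {self.firstname}"


# Some books have multiple contributors, so a book holds a list of them.
# These two people are made-up sample data.
contributors = [
    Contributor("Jane", "Doe"),
    Contributor("John", "Roe", role="Illustrator", primary=False),
]

for c in contributors:
    print(c.fullname, "|", c.lastname_comma_firstname, "|", c.role, "|", c.primary)
```

Deriving both name formats from one pair of fields avoids the two going out of sync, though a store that asks for both as free text may still need them entered by hand.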

| Site | Notes |
| --- | --- |
| iBooks | Author information is entered in a native Mac app, iTunes Producer, on the "details" tab of a new book file, and stored locally as an .itmsp file. Dropdown for Role (with an insanely long list of predetermined roles), dropdown specifying whether the role is Primary or Not Primary, two text areas for author name: "Firstname Lastname" and "Lastname, Firstname". |
| Nook | Separate fields for First Name, Last Name, and Role, entered on the Title & Description page of the NOOK Book Details section. Max of 5 "contributor" objects. Plain textarea limited to 2,500 characters for "About the Author(s)". |
| Kobo | Single text field for "Firstname Lastname". First entry is automatically designated as Primary autho |

So you want your own Reading.am Twitter Bot?

caseyg / getting-started.md
Last active January 14, 2016 05:20
Getting Started with Jekyll and GitHub Pages

Here are a bunch of links to solid resources on how to make your first GitHub Pages/Jekyll project:

Tools:

Pomodoro 1 (25m) {time:short}

Break 1 (5m) {time +25m:short}

Pomodoro 2 (25m) {time +25m+5m:short}
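
The offsets in the template above (`+25m`, then `+25m+5m`) add up the durations of everything that came before. A minimal Python sketch of the same arithmetic follows, assuming the `{time ...}` placeholders mean "the current time plus the given offset". The function name `pomodoro_schedule` and the 09:00 start time are made up for illustration.

```python
from datetime import datetime, timedelta


def pomodoro_schedule(start, rounds=2, work_min=25, break_min=5):
    """Return (label, start_time) pairs for alternating work and break blocks."""
    schedule = []
    offset = timedelta()
    for n in range(1, rounds + 1):
        schedule.append((f"Pomodoro {n} ({work_min}m)", start + offset))
        offset += timedelta(minutes=work_min)
        if n < rounds:
            # A break only goes between two work blocks.
            schedule.append((f"Break {n} ({break_min}m)", start + offset))
            offset += timedelta(minutes=break_min)
    return schedule


# Arbitrary example start time; a real template would use the current time.
for label, when in pomodoro_schedule(datetime(2016, 1, 1, 9, 0)):
    print(label, when.strftime("%H:%M"))
```

With a 09:00 start this prints 09:00, 09:25, and 09:30 for the three entries, matching the offsets of 0, 25, and 30 minutes in the template.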