
Ben W. Brumfield benwbrum

@benwbrum
benwbrum / unscored_words_and_frequencies.csv
Created July 1, 2015 01:33
Words occurring within Barker's Papers of Stephen F. Austin which do not have scores in the textmood sentiment analysis dictionary
word total_count
the 57615
of 38252
to 38050
and 34108
i 21368
in 18440
a 16188
that 13057
you 11489
benwbrum / titles_weights_and_categories.json
Created May 30, 2015 15:58
Export of titles, weights, and categories for bubble graph
[{"title":"Long Island, Virginia","page_count":48,"category_title":"Towns"},{"title":"clear","page_count":117,"category_title":"Weather"},{"title":"cold","page_count":50,"category_title":"Weather"},{"title":"Benjamin Franklin Brumfield, Sr.","page_count":975,"category_title":"Julia's Children"},{"title":"Sally Joseph Carr Brumfield","page_count":916,"category_title":"Julia's Household"},{"title":"Edmonds","page_count":14,"category_title":"People"},{"title":"Renan, Virginia","page_count":92,"category_title":"Towns"},{"title":"Charles Brumfield","page_count":34,"category_title":"Julia's Children"},{"title":"Helen Brumfield","page_count":15,"category_title":"Lee Brumfield Family"},{"title":"Virginia Harvey","page_count":50,"category_title":"Kate Brumfield Harvey Family"},{"title":"breakfast","page_count":39,"category_title":"Food"},{"title":"milking","page_count":72,"category_title":"Animals"},{"title":"cows","page_count":99,"category_title":"Animals"},{"title":"dinner","page_count":685,"category_title":"Food"},
benwbrum / trash.txt
Created April 14, 2015 22:00
Field distribution of vld file DUR/RG093789.VLD from FreeCEN
benwbrum / fields_from_query.rb
Last active August 29, 2015 14:19
sample workings of parser to extract fields from query parameters
1.9.3-p327 :079 > SearchQuery.all.each do |query|
1.9.3-p327 :080 > pp query.search_params
1.9.3-p327 :081?> pp SearchRecord.fields_from_params(query.search_params)
1.9.3-p327 :082?> print "\n"
1.9.3-p327 :083?> end;0
{:chapman_code=>{"$in"=>["BDF"]},
"search_names"=>{"$elemMatch"=>{"type"=>"p", "last_name"=>"smith"}}}
["chapman_code", "search_names.type", "search_names.last_name"]
{:record_type=>"ba",
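The session above shows each query's parameters followed by the flattened list of field names extracted from them. The following is a minimal sketch of how such an extraction could work, not the actual `SearchRecord.fields_from_params` implementation, whose source is not shown here. The standalone method and its `prefix` argument are assumptions made for illustration. It assumes keys beginning with `$` are MongoDB-style operators that contribute no path segment of their own.

```ruby
# Hypothetical stand-in for SearchRecord.fields_from_params; the real
# method's source is not part of this listing.
def fields_from_params(params, prefix = nil)
  fields = params.flat_map do |key, value|
    key = key.to_s
    if key.start_with?("$")
      # Operator key: "$elemMatch" wraps sub-fields, while operators such
      # as "$in" wrap plain values and so refer to the enclosing field.
      value.is_a?(Hash) ? fields_from_params(value, prefix) : [prefix]
    else
      # Ordinary key: extend the dotted path and recurse into nested hashes.
      path = prefix ? "#{prefix}.#{key}" : key
      value.is_a?(Hash) ? fields_from_params(value, path) : [path]
    end
  end
  # Drop nils (operators seen at the top level) and repeated paths.
  fields.compact.uniq
end
```

Called on the first set of parameters printed in the session, this sketch returns the same three dotted field names shown there.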
benwbrum / v2_pearls_pages_tei.xml
Created January 24, 2015 15:20
ocr to tei pearls adams three pages
<pb xml:id="F9850" n="10" facs="http://www.archive.org/download/stringofpearlsor00ryme/page/leaf9.jpg"/>
<div xml:id="P9850">
<fw type="pageNum">
untitled page 10
</fw>
<p>
g ' THE STRING OF PEARLS.
</p>
benwbrum / plowing_with_weather.json
Created January 15, 2015 03:23
plowing with related subjects in the category weather
{
"root":{"title":"plowing","id":283},
"related_subjects":[
{"title":"beautiful","id":279,"relatedness":3},
{"title":"clear","id":21,"relatedness":8},
{"title":"cloud","id":1,"relatedness":21},
{"title":"cloudy","id":187,"relatedness":3},
{"title":"cold","id":13,"relatedness":4},
{"title":"colde","id":443,"relatedness":1},
{"title":"cool","id":274,"relatedness":7},
benwbrum / v2_vulgate_pages.xml
Created January 8, 2015 22:37
ocr to tei v2 vulgate pages
<pb xml:id="F9431" n="13" facs="http://www.archive.org/download/vulgateversionof04sommuoft/page/leaf12.jpg"/>
<div xml:id="P9431">
<fw type="pageNum">
untitled page 13
</fw>
<p>
LE LIVRE DE LANCELOT DEL LAC
</p>
benwbrum / v2_moreniana_pages.xml
Created January 8, 2015 22:32
ocr to tei v2 moreniana three pages
<pb xml:id="F5995" n="5" facs="http://www.archive.org/download/MorenianaVol11396Test1/page/leaf4.jpg"/>
<div xml:id="P5995">
<fw type="pageNum">
untitled page 5
</fw>
<p>
Cart, Sec. XVI, mm. 225 X 165. Carte 58: bianche le ce. 1, 3 e le<lb/>
ultime sei. Scrittura regolare con 18 r. circa per f. Appartenne a Dome-<lb/>
nico Maria Manni che sotto il titolo, a c. 2 a , scrisse alcune annotazioni. —<lb/>
benwbrum / v2_mazzatinti_pages.xml
Created January 8, 2015 22:29
ocr to tei v2 mazzatinti three pages
<pb xml:id="F6697" n="11" facs="http://www.archive.org/download/MazzatintiALLTest1/page/leaf10.jpg"/>
<div xml:id="P6697">
<fw type="pageNum">
8
</fw>
<p>
8
</p>
benwbrum / v2_animals_pages.xml
Created January 8, 2015 22:25
ocr to tei v2 animals three pages
<pb xml:id="F8640" n="18" facs="http://www.archive.org/download/animalmanagement00grea/page/leaf17.jpg"/>
<div xml:id="P8640">
<fw type="pageNum">
untitled page 18
</fw>
<p>
CHAPTER I.
</p>