Brendan O'Connor brendano
#!/usr/bin/env python
r"""
vertunion File1 File2 ....
Iterates through parallel files, each row
"DocID \t JSON1" "DocID \t JSON2" ....
and outputs
"DocID \t UnionOfJSONs"
Union of key-value pairs, that is.
The economy 's temperature will be taken from several vantage points this week , with readings on trade , output , housing and inflation . {"root_mar":[0.00008,0.00798,0.00182,0.01688,0.87625,0.01018,0.01229,0.00165,0.00052,0.00076,0.01282,0.00046,0.00662,0.00095,0.01759,0.00537,0.00439,0.00125,0.00062,0.00356,0.00063,0.01118,0.00293,0.00226,0.00097],"edge_mar":[[0.00001,0.00047,0.00121,0.00102,0.00128,0.00042,0.00087,0.00035,0.00027,0.00022,0.00134,0.00021,0.0014,0.00228,0.00408,0.00097,0.00177,0.00039,0.00188,0.00074,0.00187,0.00175,0.00159,0.00052,0.0029],[0.83539,0.00001,0.93406,0.0138,0.01333,0.00313,0.01055,0.0033,0.0029,0.00128,0.00686,0.00241,0.008,0.02723,0.02736,0.00517,0.0124,0.00233,0.0233,0.00504,0.02344,0.01223,0.01524,0.00345,0.01125],[0.00251,0.00049,0.00001,0.00273,0.00634,0.00138,0.00285,0.00112,0.00076,0.00072,0.004,0.00061,0.00469,0.00675,0.01124,0.00284,0.00427,0.00119,0.00533,0.00235,0.00504,0.00571,0.00479,0.00163,0.01046],[0.08547,0.75282,0.01562,0.00001,0.01579,0.00701,0.01237,0.0046,
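The docstring above describes a row-wise merge: each parallel file contributes one "DocID \t JSON" row per document, and the output row carries the union of the key-value pairs. The following is a minimal sketch of that idea, not the gist's own code (the preview is cut off before the implementation). The names `union_rows` and `vertunion_lines` are hypothetical, and the rule that a later file wins when two files share a key is an assumption, since the docstring does not say how collisions are handled.

```python
import json
from contextlib import ExitStack


def union_rows(rows):
    """Merge parallel 'DocID<TAB>JSON' rows that describe the same document."""
    doc_id = None
    merged = {}
    for row in rows:
        row_id, payload = row.rstrip("\n").split("\t", 1)
        if doc_id is None:
            doc_id = row_id
        elif row_id != doc_id:
            raise ValueError("rows are not parallel: %r vs %r" % (doc_id, row_id))
        # Assumption: on a key collision, the later file overwrites the earlier one.
        merged.update(json.loads(payload))
    return doc_id + "\t" + json.dumps(merged, sort_keys=True)


def vertunion_lines(paths):
    """Yield one merged row per document, reading all files in lockstep."""
    with ExitStack() as stack:
        handles = [stack.enter_context(open(p)) for p in paths]
        # zip stops at the shortest file; files are assumed to have equal length.
        for rows in zip(*handles):
            yield union_rows(rows)
```

For example, `union_rows(['d1\t{"a": 1}', 'd1\t{"b": 2}'])` gives a single row for `d1` holding both keys.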
brendano / sim.py
Created March 24, 2012 02:35
matrix-tree thm for CRF marginal dependencies via matrix inversion, from koo et al. 2007
In [9]: run -i sim
for each word: prob connect to root
[ 0.28026994 0.16394082 0.10616135 0.17767563 0.12675216 0.1452001 ]
for (head,child) entries: P(head <- child)
[[ 0.          0.12563458  0.27335659  0.17451717  0.24229165  0.24617475]
 [ 0.25410789  0.          0.21883649  0.09361327  0.17785164  0.2048422 ]
 [ 0.12280784  0.12047786  0.          0.13921346  0.12119944  0.11342211]
 [ 0.11823039  0.27609723  0.15487263  0.          0.22541093  0.15197973]
 [ 0.11249058  0.1968143   0.09766069  0.21988013  0.          0.13838112]
 [ 0.11209336  0.11703521  0.14911225  0.19510033  0.10649417  0.        ]]
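The session above prints marginals but the preview does not show sim.py itself. The sketch below is one way such marginals can be computed with the matrix-tree construction for multi-root dependency structures, as I understand it from Koo et al. 2007; it is not the gist's code. The function name `tree_marginals` and the argument layout (`edge_w[h, m]` is the non-negative weight of head `h` taking child `m`) are assumptions made for illustration. With the Laplacian built column-wise, the partition function is its determinant, and the marginals fall out of its inverse.

```python
import numpy as np


def tree_marginals(root_w, edge_w):
    """Root and edge marginals over multi-root dependency structures.

    root_w[m]    : weight of attaching word m to the root
    edge_w[h, m] : weight of head h taking child m (diagonal is ignored)
    """
    root_w = np.asarray(root_w, dtype=float)
    A = np.array(edge_w, dtype=float)
    np.fill_diagonal(A, 0.0)  # a word cannot head itself

    # Laplacian: L[m, m] = root_w[m] + sum_h A[h, m]; L[h, m] = -A[h, m] otherwise.
    L = -A
    L[np.diag_indices_from(L)] = root_w + A.sum(axis=0)

    Linv = np.linalg.inv(L)
    d = np.diag(Linv)

    root_mar = root_w * d                  # r_m * Linv[m, m]
    edge_mar = A * (d[None, :] - Linv.T)   # A[h, m] * (Linv[m, m] - Linv[m, h])
    return root_mar, edge_mar
```

Under this layout, each word has exactly one parent, so `root_mar[m] + edge_mar[:, m].sum()` should equal 1 for every word `m`. The printed output above has the same property: each column of the matrix plus the matching root probability sums to 1.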
brendano / grep.sh
Created February 28, 2012 06:12
view framenet 1.5 annotations xml files
#!/bin/zsh
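# Usage: grep.sh PATTERN
pat="$1"
# Lines of context to show around each match
before=10
after=10
# Find the files that contain the pattern, render them with view.py, then grep the rendered text
grep -l "$pat" fndata-1.5/fulltext/*.xml | xargs python "$(dirname "$0")/view.py" | grep --color=always -B"$before" -A"$after" "$pat"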
How to install RPy (the original version, not RPy2, which is hard to use)
"pip" is from: http://packages.python.org/distribute/
$ pip install https://rpy.svn.sourceforge.net/svnroot/rpy/trunk/rpy/
$ python
>>> import rpy
>>> rpy.r.rnorm(10)
[-0.13543865314803508, 1.4040046442151894, -0.7555979642938236, -0.5311732761616562, 0.9271128349911155, -1.4425396174821694, -1.136552652395736, 0.4320017586687098, 0.4661676224124814, -1.0162902394860223]
ALLWORDS -- only for tracks with lyrics
                       preds
                         1  2   3    4    5  6   7   8    9  10
classic_pop_and_rock   102  2  11   54    6  2  71  75  132  31
classical                3  0   0    5    5  0   1  14    3   1
dance_and_electronica    8  0   7    9    3  1  26  13   37   1
folk                    52  0   5  126    3  3  25  58   74   5
hiphop                  10  2   4    4  199  0  16  31   18  10
jazz                    12  1   2    6    2  0  10  11   20   2
# -*- encoding: utf8 -*-
"""
Utilities for parse trees
Brendan O'Connor Nov 2011
Represents s-expressions simply as lists of lists and strings.
Console print example:
% echo '(ROOT (S (N bob) (VP (V is) (V running) )))' | python parsetools.py dump
# -*- encoding: utf8 -*-
"""
Utilities for parse trees
Brendan O'Connor Jan 2011
Represents s-expressions simply as lists of lists and strings.
"""
import sys
def parse_sexpr(s, add_root=True):
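The preview stops at the signature, so the body of `parse_sexpr` is not shown. To illustrate the representation the docstring describes (s-expressions as lists of lists and strings), here is a small independent sketch. The name `read_sexpr` is hypothetical, it is not the author's implementation, and it deliberately leaves out the `add_root` option because its behavior is not visible in the source.

```python
import re


def read_sexpr(s):
    """Parse a single s-expression into nested lists of strings."""
    # Tokens are "(", ")", or any run of characters without spaces or parens.
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    stack = [[]]  # stack[0] collects the top-level expressions
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) < 2:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    if len(stack[0]) != 1:
        raise ValueError("expected exactly one top-level expression")
    return stack[0][0]
```

Applied to a bracketed parse such as `(ROOT (S (N bob) (VP (V is) (V running) )))`, this yields a list whose first element is the label and whose remaining elements are subtrees or words.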
Their menu is very large -LRB- pasta to gyros to salads , not to mention their actual pizzas -RRB- which I think hurts them a bit .
PRP$ NN VBZ RB JJ -LRB- NN TO NNS TO NNS , RB TO VB PRP$ JJ NNS -RRB- WDT PRP VBP VBZ PRP DT NN .
NP-------- CONJP--------- NP---------- NP----
NP----------------- NP---------
NP------------------------------------------ SBAR-------------
PP--------------------------------------------- S----------------
NP--------------------------------------------------- VP---------------
PP-----------