{
"translatorID": "8a68255b-24e5-47e0-afe5-f65fff578170",
"translatorType": 3,
"label": "BibTeX (nschneid, long form)",
"creator": "Simon Kornblith and Richard Karnesky and Nathan Schneider",
"target": "bib",
"minVersion": "2.1.9",
"maxVersion": null,
"priority": 200,
"inRepository": true, |
WSJ/00/WSJ_0001.MRG news
WSJ/00/WSJ_0002.MRG news
WSJ/00/WSJ_0003.MRG news
WSJ/00/WSJ_0004.MRG news
WSJ/00/WSJ_0005.MRG news
WSJ/00/WSJ_0006.MRG news
WSJ/00/WSJ_0007.MRG news
WSJ/00/WSJ_0008.MRG news
WSJ/00/WSJ_0009.MRG news
WSJ/00/WSJ_0010.MRG news |
abandonment.01 verb-abandon.01 -
abatement.01 verb-abate.01 -
abduction.01 verb-abduct.01 -
abolition.01 verb-abolish.01 -
abomination.01 verb-abominate.01 ARG1
abortion.01 verb-abort.01 -
absence.01 verb-absent.01 -
absorber.01 verb-absorb.01 ARG0
absorption.01 verb-absorb.01 -
abuse.01 verb-abuse.01 - |
{
"translatorID":"04623cf0-313c-11df-9aae-0800200c9a66",
"translatorType":2,
"label":"ZotSelect Link",
"creator":"Scott Campbell, Avram Lyon, Nathan Schneider",
"target":"html",
"minVersion":"2.0",
"maxVersion":"",
"priority":200,
"inRepository":false, |
# http://nlp.cs.nyu.edu/wiki/corpuswg/AnnotationCompatibilityReport
# Table 1: Part of Speech Compatibility
# (Initial Version from Manning and Schütze 1998, pp. 141-142)
# Extended to cover Claws1 and ICE
# cf. http://www.scs.leeds.ac.uk/ccalas/tagsets/brown.html
# Nathan Schneider, 2011-02-19:
# * Fixed some errors in the Brown column, e.g.: DT1 => DTI, PP0 => PPO, NRS => NPS
# * Added last column (Twitter tagset) and several special tags at the end
Category	Examples	Claws c5, Claws1	Brown	PTB	ICE	Twitter
Adjective	happy, bad	AJ0	JJ	JJ	ADJ.ge	A
#coding=UTF-8
'''
to run the code:
METHOD 1: .stem_pos files
$ export PYTHONPATH=/path/to/AQMAR
$ python2.7 supersenseDefaults.py [mode] ar.stem_pos > ar.lexiconsst
METHOD 2: parallel .tok and .wd_pos_ne.txt files |
import numpy as np
import scipy
import random
import math
import sys
INFINITY = float('inf')
def logadd(a,b):
""" |
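
The preview cuts off before the body of `logadd`, so the author's implementation is not shown. Going only by the function's name and the `INFINITY` constant above, one plausible reading is that it computes log(exp(a) + exp(b)) for log-domain probabilities. A minimal sketch under that assumption (the body below is illustrative, not the original code):

```python
import math

INFINITY = float('inf')

def logadd(a, b):
    """Return log(exp(a) + exp(b)) without exponentiating large values.

    Assumed behaviour: -INFINITY stands for log(0) and acts as the identity.
    """
    if a == -INFINITY:
        return b
    if b == -INFINITY:
        return a
    hi, lo = max(a, b), min(a, b)
    # Factor out the larger term so that exp() only sees a value <= 0.
    return hi + math.log1p(math.exp(lo - hi))
```

Factoring out the larger argument keeps `math.exp` from overflowing, which is the usual reason for working in log space in the first place.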
#!/bin/bash
# Git pre-commit hook to look for untracked files mentioned in the LaTeX and BibTeX logs.
# Fail if any are found. Note that this is not foolproof, as included .tex files
# not generating any errors or warnings may not be mentioned in the log.
#
# Goes in file .git/hooks/pre-commit under the repository root.
#
# Nathan Schneider ([email protected]), 2015-02-26
# Adapted from http://stackoverflow.com/a/10932301
#
# |
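
The hook's body is not visible in this preview. The idea described in the header comment can be sketched in Python as follows; this is an illustration of the check, not the author's Bash code. The function names `untracked_files` and `mentioned_in_log` are hypothetical, and the plain substring match is an assumption about how "mentioned" might be decided.

```python
import subprocess

def untracked_files():
    """List untracked, non-ignored paths as reported by git."""
    out = subprocess.check_output(
        ["git", "ls-files", "--others", "--exclude-standard"])
    return [line for line in out.decode("utf-8").splitlines() if line]

def mentioned_in_log(paths, log_text):
    """Return the paths that appear verbatim anywhere in the log text."""
    return [p for p in paths if p in log_text]
```

A hook built on this would read the `.log` and `.blg` files, call `mentioned_in_log(untracked_files(), text)` on each, and exit with a nonzero status if the result is non-empty, which makes git abort the commit.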
#!/usr/bin/env python2.7
'''
Converts new-style PTB POS tags to the English tagset from the Universal Dependencies project
(see universal-pos-en.html, from http://universaldependencies.github.io/docs/en/pos/all.html).
There are 17 such tags, expanded from the original 12 Universal POS tags of Petrov et al. 2011.
See "limitations" comment below for some details on our interpretation of the difficult-to-map
categories.
In new-style PTB, TO only applies to prepositional (not infinitival) "to". |
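
The mapping table itself lies beyond this preview. For the PTB tags that correspond to a single Universal Dependencies tag, the conversion amounts to a dictionary lookup. The sketch below covers only a small, uncontroversial subset and is an assumption for illustration; `PTB_TO_UD` and `ptb_to_ud` are hypothetical names, and the script's actual table and its handling of the difficult categories may differ.

```python
# Illustrative subset only; NOT the script's actual mapping table.
PTB_TO_UD = {
    "JJ": "ADJ", "JJR": "ADJ", "JJS": "ADJ",
    "NN": "NOUN", "NNS": "NOUN",
    "NNP": "PROPN", "NNPS": "PROPN",
    "RB": "ADV", "RBR": "ADV", "RBS": "ADV",
    "CD": "NUM",
    "UH": "INTJ",
}

def ptb_to_ud(tag, default="X"):
    """Map a PTB tag to a UD tag, falling back to `default` if unlisted."""
    return PTB_TO_UD.get(tag, default)
```

Tags such as `IN`, `TO`, and the verb tags are left out on purpose: they need context or lexical information to choose between several UD tags, which is presumably what the "limitations" comment addresses.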