Nathan Schneider nschneid

@nschneid
nschneid / POSMappings.txt
Created September 7, 2013 15:50
Scripts for working with part-of-speech tagsets: describing the morphosyntactic attributes encoded by tags, and converting between different tagsets. Cf. https://gist.github.com/nschneid/4231292
# http://nlp.cs.nyu.edu/wiki/corpuswg/AnnotationCompatibilityReport
# Table 1: Part of Speech Compatibility
# (Initial Version from Manning and Schütze 1999, pp. 141-142)
# Extended to cover Claws1 and ICE
# cf. http://www.scs.leeds.ac.uk/ccalas/tagsets/brown.html
# Nathan Schneider, 2011-02-19:
# * Fixed some errors in brown column, e.g.: DT1 => DTI, PP0 => PPO, NRS => NPS
# * Added last column (Twitter tagset) and several special tags at the end
Category   Examples    Claws c5, Claws1  Brown  PTB  ICE     Twitter
Adjective  happy, bad  AJ0               JJ     JJ   ADJ.ge  A
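A minimal sketch of how such a table could be used to convert a tag from one tagset to another. It assumes the file is tab-separated with the header row shown above and that "Claws c5, Claws1" is a single column; the sample reproduces only the one data row visible in this preview, and `load_mappings` and `convert` are hypothetical helper names, not part of the gist.

```python
import csv

# Assumption: columns are tab-separated. Only the header and the single
# data row visible in the preview are reproduced here.
SAMPLE = (
    "# Table 1: Part of Speech Compatibility\n"
    "Category\tExamples\tClaws c5, Claws1\tBrown\tPTB\tICE\tTwitter\n"
    "Adjective\thappy, bad\tAJ0\tJJ\tJJ\tADJ.ge\tA\n"
)


def load_mappings(text):
    """Return one dict per table row, keyed by column name.

    Blank lines and lines starting with '#' are skipped as comments.
    """
    lines = (ln for ln in text.splitlines() if ln and not ln.startswith("#"))
    return list(csv.DictReader(lines, delimiter="\t"))


def convert(rows, tag, source, target):
    """Hypothetical helper: look up `tag` in the `source` column and
    return the value in the `target` column, or None if not found."""
    for row in rows:
        if row[source] == tag:
            return row[target]
    return None


rows = load_mappings(SAMPLE)
print(convert(rows, "JJ", "PTB", "Twitter"))
```

Returning the first matching row keeps the sketch short; a real table may list the same tag under several categories, in which case collecting all matches would be safer.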
nschneid / universal_tags.py
Created December 7, 2012 06:50
Utility for mapping to universal part-of-speech tagset
'''
Interface for converting POS tags from various treebanks
to the universal tagset of Petrov, Das, & McDonald.
The tagset consists of the following 12 coarse tags:
VERB - verbs (all tenses and modes)
NOUN - nouns (common and proper)
PRON - pronouns
ADJ - adjectives
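As an illustration of the kind of conversion described above, here is a minimal sketch that maps a handful of Penn Treebank tags to the four coarse tags listed so far. The dictionary is a small illustrative subset and `to_universal` is a hypothetical name; neither is taken from the gist's actual interface.

```python
# Illustrative subset only: a full mapping covers many more PTB tags
# and all 12 universal tags.
PTB_TO_UNIVERSAL = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN", "NNPS": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBG": "VERB",
    "VBN": "VERB", "VBP": "VERB", "VBZ": "VERB",
    "PRP": "PRON",
    "JJ": "ADJ", "JJR": "ADJ", "JJS": "ADJ",
}


def to_universal(tag, mapping=PTB_TO_UNIVERSAL):
    """Hypothetical helper: return the coarse tag for `tag`,
    or None when the tag is not in the (partial) mapping."""
    return mapping.get(tag)


tagged = [("She", "PRP"), ("sees", "VBZ"), ("happy", "JJ"), ("dogs", "NNS")]
coarse = [(word, to_universal(tag)) for word, tag in tagged]
print(coarse)
```

Returning None for unmapped tags makes gaps in the partial table visible instead of silently assigning a category.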
nschneid / zotselect-link.js
Last active November 4, 2024 20:26
Zotero export translator that generates a zotero://select link to an item in the Zotero library. Includes a simple version and a version that displays minimal citation information and stores further details as title text.
nschneid / deverbals-from-nombank-examples.txt
Created June 25, 2012 22:44
List deverbal nominalizations in NomBank
abandonment.01 verb-abandon.01 -
abatement.01 verb-abate.01 -
abduction.01 verb-abduct.01 -
abolition.01 verb-abolish.01 -
abomination.01 verb-abominate.01 ARG1
abortion.01 verb-abort.01 -
absence.01 verb-absent.01 -
absorber.01 verb-absorb.01 ARG0
absorption.01 verb-absorb.01 -
abuse.01 verb-abuse.01 -
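The lines above can be read programmatically with a few lines of Python. This is a sketch under stated assumptions: fields are whitespace-separated, the second field always carries a `verb-` prefix, and a `-` in the third field means that no argument label is given. The `Deverbal` type and `parse_line` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Deverbal(NamedTuple):
    noun_roleset: str        # e.g. "absorber.01"
    verb_roleset: str        # e.g. "absorb.01" (the "verb-" prefix is stripped)
    arg: Optional[str]       # e.g. "ARG0", or None when the file has "-"


def parse_line(line):
    """Split one line into its three whitespace-separated fields."""
    noun, verb, arg = line.split()
    if verb.startswith("verb-"):
        verb = verb[len("verb-"):]
    return Deverbal(noun, verb, None if arg == "-" else arg)


sample = [
    "abandonment.01 verb-abandon.01 -",
    "absorber.01 verb-absorb.01 ARG0",
]
entries = [parse_line(line) for line in sample]
print(entries[1].verb_roleset, entries[1].arg)
```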
nschneid / allcats.txt
Created June 25, 2012 15:52
Document-to-category mapping for NLTK ptb module (full Penn Treebank corpus reader)
WSJ/00/WSJ_0001.MRG news
WSJ/00/WSJ_0002.MRG news
WSJ/00/WSJ_0003.MRG news
WSJ/00/WSJ_0004.MRG news
WSJ/00/WSJ_0005.MRG news
WSJ/00/WSJ_0006.MRG news
WSJ/00/WSJ_0007.MRG news
WSJ/00/WSJ_0008.MRG news
WSJ/00/WSJ_0009.MRG news
WSJ/00/WSJ_0010.MRG news
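A mapping file in this shape can be loaded without any corpus library. The sketch below assumes one file ID and one category per line, separated by whitespace; `load_categories` and `files_by_category` are hypothetical helpers, and the sample repeats only a few of the lines shown above.

```python
from collections import defaultdict

SAMPLE = """\
WSJ/00/WSJ_0001.MRG news
WSJ/00/WSJ_0002.MRG news
WSJ/00/WSJ_0003.MRG news
"""


def load_categories(text):
    """Return a dict mapping each file ID to its category."""
    doc_to_cat = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fileid, category = line.split(maxsplit=1)
        doc_to_cat[fileid] = category
    return doc_to_cat


def files_by_category(doc_to_cat):
    """Invert the mapping: category -> list of file IDs, in input order."""
    grouped = defaultdict(list)
    for fileid, category in doc_to_cat.items():
        grouped[category].append(fileid)
    return dict(grouped)


doc_to_cat = load_categories(SAMPLE)
print(files_by_category(doc_to_cat)["news"])
```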
nschneid / BibTeX_nschneid_longform.js
Last active May 11, 2019 23:31
Custom BibTeX exporters for Zotero
{
"translatorID": "8a68255b-24e5-47e0-afe5-f65fff578170",
"translatorType": 3,
"label": "BibTeX (nschneid, long form)",
"creator": "Simon Kornblith and Richard Karnesky and Nathan Schneider",
"target": "bib",
"minVersion": "2.1.9",
"maxVersion": null,
"priority": 200,
"inRepository": true,