@kowey
Created April 14, 2015 09:47
"""
Basic interface that all parsers should respect
"""

from abc import ABCMeta, abstractmethod

from six import with_metaclass


class Parser(with_metaclass(ABCMeta, object)):
    """
    Parsers follow the scikit-learn fit/transform idiom. They are learned
    from some training data via the `fit()` function. Once fitted to the
    training data, they can be set loose on anything you might want to
    parse: the `transform()` function will produce graphs from the EDUs.

    If the learning process is expensive, it would make sense to offer the
    ability to initialise a parser from a cached model.
    """

    @abstractmethod
    def fit(self, mpack):
        """
        Extract whatever models or other information from the multipack
        that is necessary to make the parser operational

        Parameters
        ----------
        mpack : MultiPack
        """
        raise NotImplementedError

    @abstractmethod
    def transform(self, dpack):
        """
        Parse a single document into a list of N-best graphs (ordered
        by what the parser considers to be match quality, best first)

        Parameters
        ----------
        dpack : DataPack

        Returns
        -------
        predictions : [ [(string, string, string)] ]
            list of list of (id, id, label) tuples;
            predictions are ordered by what the parser considers to be
            match quality, best first
        """
        raise NotImplementedError


class Labeller(with_metaclass(ABCMeta, object)):
    """
    A labeller assigns labels to edges in a graph. Note that it may
    sometimes make sense to implement labellers that preserve
    pre-existing labels (i.e. only rewriting the ones which have
    the `UNKNOWN` label). Doing so would allow you to stack
    multiple labellers.
    """

    @abstractmethod
    def fit(self, mpack):
        """
        Extract whatever models or other information from the multipack
        that is necessary to make the labeller operational

        :type mpack: MultiPack
        """
        raise NotImplementedError

    @abstractmethod
    def transform(self, dpack):
        """
        Label a single document, returning a list of N-best graphs (ordered
        by what the labeller considers to be match quality, best first)

        :type dpack: DataPack
        :rtype: [ [(string, string, string)] ]
        """
        raise NotImplementedError
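

# A minimal usage sketch, not part of the interface itself. It illustrates the
# fit/transform idiom and the idea of stacking labellers that only rewrite
# `UNKNOWN` labels. Everything concrete here is an assumption made for the
# example: `ChainParser`, `RuleLabeller` and `MajorityLabeller` are
# hypothetical classes; a "multipack" is stood in for by a list of graphs and
# a "datapack" by a plain list of EDU ids (or, for labellers, a single graph);
# `fit()` returns `self` in the scikit-learn manner, which the interface above
# does not require. The two base classes are redeclared with Python 3
# metaclass syntax so that the sketch runs without `six`.

```python
from abc import ABCMeta, abstractmethod
from collections import Counter

UNKNOWN = "UNKNOWN"  # placeholder label named in the Labeller docstring


class Parser(metaclass=ABCMeta):
    """Stand-in for the Parser interface (Python 3 only, no `six`)."""

    @abstractmethod
    def fit(self, mpack):
        raise NotImplementedError

    @abstractmethod
    def transform(self, dpack):
        raise NotImplementedError


class Labeller(metaclass=ABCMeta):
    """Stand-in for the Labeller interface (Python 3 only, no `six`)."""

    @abstractmethod
    def fit(self, mpack):
        raise NotImplementedError

    @abstractmethod
    def transform(self, dpack):
        raise NotImplementedError


class ChainParser(Parser):
    """Hypothetical baseline: attach each EDU to the one before it."""

    def fit(self, mpack):
        return self  # nothing to learn for this baseline

    def transform(self, dpack):
        # ASSUMPTION: dpack is a plain list of EDU ids in document order
        edges = [(src, tgt, UNKNOWN) for src, tgt in zip(dpack, dpack[1:])]
        return [edges]  # N-best list holding a single candidate graph


class RuleLabeller(Labeller):
    """Hypothetical labeller driven by a hand-written rule."""

    def __init__(self, rule):
        self.rule = rule  # callable: (source id, target id) -> label

    def fit(self, mpack):
        return self  # rules are fixed, so there is nothing to learn

    def transform(self, dpack):
        # ASSUMPTION: dpack is one graph, a list of (id, id, label) tuples.
        # Labels other than UNKNOWN are preserved so labellers can be stacked.
        return [[(s, t, self.rule(s, t) if lab == UNKNOWN else lab)
                 for s, t, lab in dpack]]


class MajorityLabeller(Labeller):
    """Hypothetical fallback: fill UNKNOWN with the commonest training label."""

    def __init__(self):
        self.label_ = UNKNOWN

    def fit(self, mpack):
        # ASSUMPTION: mpack is a list of graphs, as described above
        counts = Counter(lab for graph in mpack
                         for _, _, lab in graph if lab != UNKNOWN)
        if counts:
            self.label_ = counts.most_common(1)[0][0]
        return self

    def transform(self, dpack):
        return [[(s, t, self.label_ if lab == UNKNOWN else lab)
                 for s, t, lab in dpack]]


training = [[("e1", "e2", "elaboration"),
             ("e2", "e3", "elaboration"),
             ("e3", "e4", "contrast")]]
labellers = [
    RuleLabeller(lambda s, t: "ROOT" if s == "ROOT" else UNKNOWN).fit(training),
    MajorityLabeller().fit(training),
]

# parse first, then let each labeller refine the best graph in turn
graph = ChainParser().fit(training).transform(["ROOT", "e1", "e2", "e3"])[0]
for labeller in labellers:
    graph = labeller.transform(graph)[0]
print(graph)
```

# Because each labeller leaves labels other than `UNKNOWN` untouched, the
# order of the stack matters: the rule-based labeller gets first say and the
# majority labeller only fills in what is left.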