{
  "title": "Schema for validation Qordoba Spacy Match/Replace format",
  "type": "object",
  "definitions": {
    "spacyMatch": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/spacyAttribute"
      },
      "minItems": 1
from fitbert import FitBert

# currently supported models: bert-large-uncased and distilbert-base-uncased
# this takes a while and loads a whole big BERT into memory
fb = FitBert()

masked_string = "Why Bert, you're looking ***mask*** today!"
options = ['buff', 'handsome', 'strong']
unmasked_string = "Why Bert, you're looks handsome today!"
span_to_mask = (17, 22)
filled_in = fb.mask_fitb(unmasked_string, span_to_mask)
# >>> "Why Bert, you're looking handsome today!"
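The span `(17, 22)` above covers the word "looks" in Python string indices. As a minimal sketch of what masking such a span could look like, assuming the `***mask***` token used in the other examples (the helper `mask_span` is hypothetical and is not FitBert's own implementation):

```python
from typing import Tuple

MASK_TOKEN = "***mask***"  # mask token as written in the examples above


def mask_span(text: str, span: Tuple[int, int]) -> str:
    """Replace text[start:end] with the mask token (hypothetical helper)."""
    start, end = span
    if not 0 <= start <= end <= len(text):
        raise ValueError(f"span {span} is out of range for text of length {len(text)}")
    return text[:start] + MASK_TOKEN + text[end:]


unmasked_string = "Why Bert, you're looks handsome today!"
masked = mask_span(unmasked_string, (17, 22))
print(masked)  # Why Bert, you're ***mask*** handsome today!
```

The span follows Python slice conventions: the start index is inclusive and the end index is exclusive.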
masked_string = "Your 17 ***mask*** burritos are on their way !"
options = ['hot', 'cold', 'sweet', 'delicious', 'artisanal']
fb.fitb(masked_string, options=options)
# >>> 'Your 17 delicious burritos are on their way !'
# example from "Targeted Syntactic Evaluation of Language Models"
# https://arxiv.org/abs/1808.09031
masked_string = "the author that the guard likes ***mask***"
options = ['laugh', 'laughs']
fb.rank_with_prob(masked_string, options)
# >>> (['laughs', 'laugh'], [4.14195717654553e-12, 3.3748110100755013e-13])
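Both probabilities in the output above are tiny, so what matters is their relative order. The sketch below shows one way such a result could be produced from per-option scores; `rank_by_prob` is a hypothetical helper written for illustration, not FitBert's code, and the scores are simply copied from the output above.

```python
from typing import Dict, List, Tuple


def rank_by_prob(scores: Dict[str, float]) -> Tuple[List[str], List[float]]:
    """Sort options by probability, highest first (hypothetical helper)."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    options = [option for option, _ in ordered]
    probs = [prob for _, prob in ordered]
    return options, probs


# Probabilities copied from the rank_with_prob output shown above.
scores = {"laugh": 3.3748110100755013e-13, "laughs": 4.14195717654553e-12}
ranked, probs = rank_by_prob(scores)
print(ranked)  # ['laughs', 'laugh']

# Relative preference for the top option over the runner-up.
ratio = probs[0] / probs[1]
print(f"{ratio:.1f}")
```

Here the grammatical form "laughs" scores roughly 12 times higher than "laugh".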
mkpoetryproj ()
{
    if [ $# -eq 1 ]; then
        poetry new "$1"
        # return (not exit) so a failed cd does not close the calling shell
        cd "$1" || return 1
        # get gitignore
        curl https://raw.githubusercontent.com/github/gitignore/master/Python.gitignore -o .gitignore
        {
            echo ""
            echo ".vscode/"
import itertools
from typing import Tuple


def ucp_to_utf16_charmap(s: str):
    """
    mostly copied from
    https://stackoverflow.com/questions/56280011/keeping-java-string-offsets-with-unicode-consistent-in-python
    converts from python indices (unicode code points) to indices
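The preview above is cut off, so the original function body is not shown. As an independent sketch of the idea its docstring describes, the following maps Python string indices (Unicode code points) to UTF-16 code unit indices, which is what Java string offsets use. The name `utf16_offsets` and its return shape are assumptions made for this illustration, not the gist's actual implementation.

```python
from typing import List


def utf16_offsets(s: str) -> List[int]:
    """Return offsets where offsets[i] is the UTF-16 index of code point i.

    The list has len(s) + 1 entries so that exclusive end indices map too.
    """
    offsets = [0]
    for ch in s:
        # Code points above U+FFFF are encoded as a surrogate pair,
        # which takes two UTF-16 code units instead of one.
        offsets.append(offsets[-1] + (2 if ord(ch) > 0xFFFF else 1))
    return offsets


text = "a\U0001F600b"  # "a", an emoji outside the BMP, then "b"
offsets = utf16_offsets(text)
print(offsets)  # [0, 1, 3, 4]
```

A Python span `(start, end)` then corresponds to the UTF-16 span `(offsets[start], offsets[end])`.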
import torch
from torch import nn
from torch.nn import BCEWithLogitsLoss, CrossEntropyLoss, MSELoss
from transformers.modeling_outputs import SequenceClassifierOutput


class T5EncoderClassificationHead(nn.Module):
    """Head for sentence-level classification tasks."""

    def __init__(self, config):