@rdemorais
Created July 23, 2022 12:31
Import a transformer model into spaCy v3
# Requires the spacy-transformers package, which registers the "transformer" factory.
from thinc.api import Config
import spacy
DEFAULT_CONFIG_STR = """
[transformer]
max_batch_items = 4096
[transformer.set_extra_annotations]
@annotation_setters = "spacy-transformers.null_annotation_setter.v1"
[transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "shc-cn-v2"
tokenizer_config = {"use_fast": true}
transformer_config = {}
mixed_precision = false
grad_scaler_config = {}
[transformer.model.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
window = 128
stride = 96
"""
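# Parse the config string into a nested, dict-like Config object.
DEFAULT_CONFIG = Config().from_str(DEFAULT_CONFIG_STR)
# Start from a blank Portuguese pipeline and add the transformer component.
nlp = spacy.blank("pt")
trf = nlp.add_pipe("transformer", config=DEFAULT_CONFIG["transformer"])
# Initializing the model loads the Hugging Face model and tokenizer referenced by `name`.
trf.model.initialize()
doc = nlp('text here')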
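
The `window = 128` and `stride = 96` settings in `[transformer.model.get_spans]` mean that long documents are cut into overlapping slices of up to 128 tokens, with a new slice starting every 96 tokens (so neighbouring slices share 32 tokens). The sketch below is a plain-Python illustration of that windowing, not the spacy-transformers implementation; `strided_windows` is a hypothetical helper name, and it works on token counts rather than real `Doc` objects.

```python
def strided_windows(n_tokens, window=128, stride=96):
    """Return (start, end) token offsets of overlapping windows.

    Hypothetical helper for illustration only; it mimics the idea behind
    the strided span getter, not its actual code.
    """
    windows = []
    start = 0
    while start < n_tokens:
        # Each window covers at most `window` tokens, clipped at the end of the text.
        end = min(start + window, n_tokens)
        windows.append((start, end))
        if end == n_tokens:
            break  # the last window already reaches the final token
        start += stride  # the next window begins `stride` tokens later
    return windows


# A 300-token text is covered by three overlapping windows.
print(strided_windows(300))
```

A smaller stride relative to the window gives more overlap between slices, at the cost of running the transformer over more tokens in total.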