Created
May 5, 2020 18:35
bert_input = tokenizer.encode_plus(
    test_sentence,
    add_special_tokens=True,      # add [CLS] and [SEP]
    max_length=max_length_test,   # maximum length of the text passed to BERT
    pad_to_max_length=True,       # add [PAD] tokens up to max_length
                                  # (deprecated since transformers 3.0 in favor of padding='max_length')
    return_attention_mask=True,   # return an attention mask so the model ignores [PAD] tokens
)
print('encoded', bert_input)
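To make the effect of these options concrete, here is a minimal sketch that mimics the shape of the output with a toy whitespace tokenizer. Everything in it is an assumption made for illustration: `toy_encode_plus`, `TOY_VOCAB`, and the token ids are hypothetical and are not the real BERT WordPiece tokenizer or vocabulary, and truncation to `max_length` is assumed rather than taken from the snippet above.

```python
# Toy illustration only: a hypothetical whitespace tokenizer and vocabulary,
# not the real BERT WordPiece tokenizer or its vocabulary ids.
TOY_VOCAB = {"[PAD]": 0, "[CLS]": 1, "[SEP]": 2, "[UNK]": 3, "hello": 4, "world": 5}


def toy_encode_plus(text, max_length):
    """Mimic the shape of an encode_plus result for a single sentence."""
    tokens = text.lower().split()
    # Leave room for [CLS] and [SEP]; truncate longer inputs (an assumption).
    tokens = tokens[: max_length - 2]
    ids = [TOY_VOCAB["[CLS]"]]
    ids += [TOY_VOCAB.get(t, TOY_VOCAB["[UNK]"]) for t in tokens]
    ids.append(TOY_VOCAB["[SEP]"])
    # Real tokens get mask 1; padding positions get mask 0.
    attention_mask = [1] * len(ids)
    pad_count = max_length - len(ids)
    ids += [TOY_VOCAB["[PAD]"]] * pad_count
    attention_mask += [0] * pad_count
    return {"input_ids": ids, "attention_mask": attention_mask}


toy_input = toy_encode_plus("hello world", max_length=6)
print("encoded", toy_input)
```

The special tokens wrap the sentence, the `[PAD]` id fills the remaining positions, and the attention mask is 0 exactly where padding was added.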