Last active
January 14, 2021 03:33
# Set the runtime to GPU first!
# !pip install -U ckiptagger[tf,gdown]
import os

from ckiptagger import data_utils, WS

# Download the model files; WS below loads them from ./data.
data_utils.download_data_gdown("./")

# Expose GPU 0 before the model is loaded.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
ws = WS("./data", disable_cuda=False)


def tokenize(myStr):
    res = ws([myStr])  # ws expects a list of sentences, not a bare string.
    return res[0]  # One sentence in, so return the first (and only) token list.
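
Since `ws` takes a list, several sentences can be segmented in one call instead of calling `tokenize` once per sentence. The following is a minimal sketch, not part of the original gist: `tokenize_many` and `fake_ws` are hypothetical names, and the sketch assumes the segmenter returns one token list per input sentence, as the `res[0]` indexing above implies. `fake_ws` is only a stand-in that splits each sentence into characters so the example runs without downloading the model.

```python
from typing import Callable, List, Sequence

# Assumed shape of a segmenter such as `ws`: list of sentences in,
# one list of tokens per sentence out.
Segmenter = Callable[[List[str]], List[List[str]]]


def tokenize_many(sentences: Sequence[str], segmenter: Segmenter) -> List[List[str]]:
    """Segment several sentences with a single call to the segmenter (hypothetical helper)."""
    batch = list(sentences)
    if not batch:
        # Skip the segmenter entirely when there is nothing to segment.
        return []
    return segmenter(batch)


def fake_ws(batch: List[str]) -> List[List[str]]:
    """Stand-in for `ws`: splits each sentence into single characters."""
    return [list(sentence) for sentence in batch]


if __name__ == "__main__":
    print(tokenize_many(["ab", "c"], fake_ws))
```

With the real model loaded, the call would be `tokenize_many(my_sentences, ws)`, where `my_sentences` is any list of strings.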