# Print index info before (info() is also new!)
embeddings.info()

# Reindex
embeddings.reindex({"path": "sentence-transformers/paraphrase-MiniLM-L3-v2"})
print("------")

# Print index info after
embeddings.info()
# Save index as tar.xz
embeddings.save("index.tar.xz")

#tar -tvJf index.tar.xz
#echo
#xz -l index.tar.xz
#echo

# Reload index
embeddings.load("index.tar.xz")
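The commented-out `tar -tvJf` and `xz -l` commands above inspect the saved archive from a shell. As a rough Python equivalent, the standard-library `tarfile` module can list the members of a `.tar.xz` file. This is only a sketch: `list_archive` is a hypothetical helper, and the example builds a throwaway archive with a placeholder file name so that it runs without an existing index.

```python
import os
import tarfile
import tempfile


def list_archive(path):
    """Return (name, size) pairs for each member of an xz-compressed tar archive."""
    with tarfile.open(path, "r:xz") as archive:
        return [(member.name, member.size) for member in archive.getmembers()]


# Build a small stand-in archive (not a real txtai index) to demonstrate the call
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "config")  # placeholder file name (assumption)
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("placeholder")

    target = os.path.join(workdir, "demo.tar.xz")
    with tarfile.open(target, "w:xz") as archive:
        archive.add(source, arcname="config")

    members = list_archive(target)
    print(members)
```

Pointing `list_archive` at `index.tar.xz` after the save call should show whatever files the index archive actually contains.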
import numpy as np
import requests

def transform(inputs):
    response = requests.post("https://api-inference.huggingface.co/pipeline/feature-extraction/sentence-transformers/nli-mpnet-base-v2",
                             json={"inputs": inputs})
    return np.array(response.json(), dtype=np.float32)

# Index data using vectors from Inference API
# pip install spacy --upgrade
# python -m spacy download en_core_web_md
import spacy

# Load spacy
nlp = spacy.load("en_core_web_md")

def transform(inputs):
    return [result.vector for result in nlp.pipe(inputs)]
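Both `transform` functions above share the same shape: a list of texts goes in and one fixed-length vector per text comes out. The sketch below illustrates that contract with a toy, standard-library stand-in. The hashing scheme and the `dimensions` parameter are assumptions made for illustration only; the resulting vectors carry no semantic meaning and are not a substitute for a real model.

```python
import hashlib


def transform(inputs, dimensions=8):
    """Toy stand-in: map each text to a fixed-length list of floats in [0, 1]."""
    vectors = []
    for text in inputs:
        # Hash the text so the same input always yields the same vector
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([byte / 255.0 for byte in digest[:dimensions]])
    return vectors


vectors = transform(["first text", "second text"])
print(len(vectors), len(vectors[0]))
```

A stand-in like this can be handy for checking the surrounding plumbing before wiring in a remote API or a spaCy model.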
embeddings:
  path: sentence-transformers/nli-mpnet-base-v2
  content: true
tabular:
  idcolumn: url
  textcolumns:
    - title
workflow:
  index:
    tasks:
summary:
  path: sshleifer/distilbart-cnn-12-6
textractor:
  join: true
  lines: false
  minlength: 100
  paragraphs: true
  sentences: false
workflow:
  summary:
translation: {}
workflow:
  translate:
    tasks:
      - action: translation
        args:
          - fr
summary:
  path: sshleifer/distilbart-cnn-12-6
textractor:
  join: true
  lines: false
  minlength: 100
  paragraphs: true
  sentences: false
translation: {}
workflow:
embeddings:
  path: sentence-transformers/nli-mpnet-base-v2
  content: true
  functions:
    - {name: translation, argcount: 2, function: translation}
tabular:
  idcolumn: url
  textcolumns:
    - title
translation:
from txtai import Embeddings

data = [
    "US tops 5 million confirmed virus cases",
    "Canada's last fully intact ice shelf has suddenly collapsed, forming a Manhattan-sized iceberg",
    "Beijing mobilises invasion craft along coast as Taiwan tensions escalate",
    "The National Park Service warns against sacrificing slower friends in a bear attack",
    "Maine man wins $1M from $25 lottery ticket",
    "Make huge profits without work, earn up to $100,000 a day"
]
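txtai examples commonly pass documents to the index as `(id, text, tags)` tuples. As a minimal sketch, assuming that input shape, a list like the one above can be prepared with `enumerate`. The snippet deliberately stops short of calling txtai, uses a shortened copy of the list, and the `documents` name is only illustrative.

```python
# Shortened copy of the list above so the sketch is self-contained
data = [
    "US tops 5 million confirmed virus cases",
    "Maine man wins $1M from $25 lottery ticket",
]

# Pair each text with a sequential id; the third slot (tags) is left empty
documents = [(uid, text, None) for uid, text in enumerate(data)]

for document in documents:
    print(document)
```

Using the list position as the id keeps it easy to map a search result back to its original text.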