@tazarov
Created August 15, 2023 13:43
This gist shows how to store the embedding vectors of your documents in Chroma without storing the document text itself:
import uuid
import chromadb
from chromadb.utils import embedding_functions

# Embed the texts locally so that only the vectors are handed to the collection.
ef = embedding_functions.DefaultEmbeddingFunction()
docs = ["Article by john", "Article by Jack", "Article by Jill"]
client = chromadb.Client()
embeddings = ef(docs)
collection = client.get_or_create_collection("test-where-list")
collection.upsert(
    documents=["" for _ in docs],  # empty placeholders instead of the real text
    embeddings=embeddings,
    metadatas=[
        {"source": "blogger.com", "author": "John"},
        {"source": "medium", "author": "Jack"},
        {"source": "notion", "author": "Jill"},
    ],
    ids=[str(uuid.uuid4()) for _ in docs],
)
# collection.get(include=["embeddings", "metadatas"])
qr = collection.query(query_texts=["All articles by John"], include=["metadatas", "distances"])
print(qr)
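
To see why the text is not needed at query time, here is a toy, pure-Python sketch of a nearest-neighbour lookup over stored vectors and metadata. It is not Chroma's implementation: `records`, `squared_l2`, and `query` are hypothetical names, the two-dimensional vectors are made up, and squared Euclidean distance is chosen here only for illustration.

```python
# Hypothetical in-memory stand-in for a vector store: each record keeps an id,
# an embedding and metadata, but never the original document text.
records = [
    {"id": "a", "embedding": [1.0, 0.0], "metadata": {"author": "John"}},
    {"id": "b", "embedding": [0.0, 1.0], "metadata": {"author": "Jack"}},
    {"id": "c", "embedding": [0.7, 0.7], "metadata": {"author": "Jill"}},
]


def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def query(query_embedding, n_results=2):
    """Return (metadata, distance) pairs for the closest stored vectors."""
    scored = [(r["metadata"], squared_l2(query_embedding, r["embedding"])) for r in records]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:n_results]


# Made-up query vector that lies closest to record "a".
results = query([0.9, 0.1])
for metadata, distance in results:
    print(metadata, round(distance, 3))
```

The ranking depends only on the vectors, so the metadata (for example `author` and `source`) is what lets you map a hit back to the original article kept elsewhere.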