@AFirooz
Last active June 16, 2025 17:01
Vector Stores

Rules of Thumb

  1. For Small, Fast Retrieval (≤10M vectors) → Use FAISS or Annoy (in-memory)
  2. For Prototyping with Serverless, Persistent Search → Use Chroma DB (DuckDB + Parquet)
  3. For On-Premise Search with Persistence → Use Milvus, Weaviate, or Qdrant
  4. For Fully Managed, Scalable Search (Cloud) → Use Pinecone or Weaviate Cloud Service (WCS)
  5. For Multi-Modal Hybrid Search (keywords + vectors) → Use Weaviate, Qdrant, Elasticsearch, or MongoDB.
  6. For Large-Scale AI Pipelines → Use Milvus, Weaviate, or a hybrid FAISS+OLTP Database (e.g. Postgres) setup.

Pros and Cons

FAISS

  • Pros: Extremely fast, efficient for in-memory search, great for research and prototyping.
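  • Cons: No database-style persistence (an index can only be written to and read back from a file as a whole), limited distributed support, not production-ready on its own for large-scale or multi-user deployments.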

Annoy

  • Pros: Simple, lightweight, good for read-heavy workloads, easy to use.
  • Cons: In-memory, slower build times, index cannot be extended once built, limited scalability, no advanced filtering.

Chroma DB with DuckDB + Parquet

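There is no separate database server to install, run, or connect to. This option is more powerful than FAISS (option 1) but simpler than Milvus (option 3). Compared to the hybrid FAISS + OLTP database setup, this is an integrated, serverless architecture: a single, self-contained system in which both the vectors and the metadata payloads are managed together within Chroma. The DuckDB + Parquet backend described here applies to Chroma releases before 0.4; later releases moved local storage to SQLite, but the embedded, in-process model is the same.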

  1. DuckDB acts as an embedded query engine for the metadata.
  2. Parquet files provide the on-disk storage for both vectors and metadata.
  3. The system is the glue. Chroma's API handles the entire hybrid search process internally. You submit one query that specifies both the vector search and the metadata filters (where clauses), and Chroma figures out how to execute it. It runs entirely within your Python process, reading and writing to local files.
  • Pros: Zero setup friction, Python-native experience, no network latency, persistent.
  • Cons: Limited scalability, since everything runs inside a single process on a single machine.
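
To make the "one query, vectors plus metadata filter" idea concrete, here is a minimal standard-library sketch. It is not Chroma's API or implementation: `ToyVectorStore`, its `add` and `query` methods, and the sample data are all hypothetical, chosen only to show the shape of an integrated, in-process store where a single call carries both the query vector and the `where` filter.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-process store: vectors and metadata live side by side."""

    def __init__(self):
        self._rows = {}  # id -> (embedding, metadata)

    def add(self, ids, embeddings, metadatas):
        for id_, emb, meta in zip(ids, embeddings, metadatas):
            self._rows[id_] = (emb, meta)

    def query(self, query_embedding, n_results=2, where=None):
        """One call: apply the metadata filter, then rank by similarity."""
        where = where or {}
        candidates = [
            (id_, cosine_similarity(query_embedding, emb))
            for id_, (emb, meta) in self._rows.items()
            if all(meta.get(key) == value for key, value in where.items())
        ]
        candidates.sort(key=lambda pair: pair[1], reverse=True)
        return [id_ for id_, _ in candidates[:n_results]]


store = ToyVectorStore()
store.add(
    ids=["a", "b", "c"],
    embeddings=[[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    metadatas=[{"lang": "en"}, {"lang": "de"}, {"lang": "en"}],
)

# The caller never touches a second system: filter and vector search in one call.
hits = store.query([1.0, 0.0], n_results=2, where={"lang": "en"})
```

The caller has no glue code to write, which is the contrast with the decoupled FAISS + OLTP setup. The toy scans every row, whereas a real store would use an approximate nearest-neighbor index.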

Milvus

  • Pros: Scalable, supports persistence, distributed, strong community, supports multiple index types.
  • Cons: Requires setup and resources, more complex to operate than in-memory libraries.

Weaviate

  • Pros: Hybrid search (text + vector), schema support, RESTful API, cloud and on-premise, multi-modal, easy to use.
  • Cons: Can be resource-intensive, some advanced features may require the cloud version.

Qdrant

  • Pros: Fast, persistent, easy to deploy, supports filtering and payloads, open-source.
  • Cons: Fewer integrations than Weaviate, smaller community.

Pinecone

  • Pros: Fully managed, scalable, easy to use, high availability, no infrastructure management.
  • Cons: Cloud-only, can be costly at scale, less control over infrastructure.

Elasticsearch

  • Pros: Mature, supports hybrid search, strong filtering and analytics, large ecosystem.
  • Cons: Vector search is newer, less efficient for pure vector workloads, can be complex to tune.

Hybrid FAISS + OLTP Database

This is a decoupled, specialized architecture.

  1. The OLTP database (like Postgres) is the "source of truth." It stores all your metadata (IDs, text, prices, user info) and is responsible for data consistency, transactions, and complex, filtered queries.
  2. FAISS is a highly specialized, in-memory search index library. It stores only the vector embeddings and their corresponding IDs from the database. It does one thing: it finds the nearest-neighbor IDs for a given query vector, and it does so extremely fast.
  3. Your application code is the glue. A typical query involves first hitting FAISS to get a list of promising IDs, then taking those IDs and performing a structured WHERE id IN (...) query against the OLTP database to retrieve the full data and apply any final filters.
  • Pros: Combines fast vector search with relational data, flexible.
  • Cons: More complex to set up and maintain, not as seamless as dedicated vector DBs.
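
The query flow described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not a reference implementation: `sqlite3` stands in for the OLTP database (the notes name Postgres), a brute-force Python search stands in for the FAISS index, and the `products` table, `nearest_ids`, and `search` are hypothetical names.

```python
import sqlite3

# Stand-in for the OLTP database; sqlite3 is used only so the sketch runs
# without a server. It holds the metadata and is the source of truth.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
db.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "red mug", 8.0), (2, "blue mug", 25.0), (3, "green lamp", 12.0)],
)

# Stand-in for the FAISS index: only IDs and their vectors, nothing else.
vector_index = {1: [1.0, 0.0], 2: [0.9, 0.1], 3: [0.0, 1.0]}


def nearest_ids(query, k):
    """Brute-force squared-L2 search; a real setup would query FAISS here."""
    def sq_dist(vec):
        return sum((q - v) ** 2 for q, v in zip(query, vec))

    ranked = sorted(vector_index, key=lambda id_: sq_dist(vector_index[id_]))
    return ranked[:k]


def search(query, k=2, max_price=None):
    """Application glue: vector search first, then WHERE id IN (...)."""
    ids = nearest_ids(query, k)
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT id, title, price FROM products WHERE id IN ({placeholders})"
    params = list(ids)
    if max_price is not None:
        sql += " AND price <= ?"  # final structured filter applied in SQL
        params.append(max_price)
    rows = {row[0]: row for row in db.execute(sql, params)}
    # Restore the similarity order, which the SQL query does not preserve.
    return [rows[i] for i in ids if i in rows]


results = search([1.0, 0.0], k=2, max_price=10.0)
```

Because the filter runs after the vector search, a query can return fewer than `k` rows: here the second-nearest item is dropped by the price filter. Fetching more candidate IDs than needed is one common way to compensate.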