Note: This gist has been updated to be far simpler than the original implementation, focusing on a more streamlined approach to selectively querying documents based on metadata.
When working with LlamaIndex and other Retrieval-Augmented Generation (RAG) systems, most tutorials focus on ingesting and querying a single document. You typically read the document from a source, parse it, embed it, and store it in your vector store. Once there, querying is straightforward. But what if you have multiple documents and want to selectively query only one, such as Document #2 (doc_id=2), from your vector store?
This article demonstrates how to encapsulate the creation of a filtered query engine, which lets you specify the nodes to query based on custom metadata. Keeping the filter setup in one place restricts retrieval to the documents you select and makes the querying code easier to manage as the number of documents grows.
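To make the idea concrete before touching any library, here is a minimal, library-independent sketch of metadata-filtered querying. Everything in it is an assumption made for illustration: the `Node` class, the `make_filtered_query` helper, and the word-overlap score (a stand-in for embedding similarity) are hypothetical and are not LlamaIndex's API. In recent LlamaIndex versions the same role is usually played by `MetadataFilters` and `ExactMatchFilter` passed to `index.as_query_engine(filters=...)`, but check the documentation for the version you have installed.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Node:
    """Hypothetical stand-in for a stored chunk: text plus ingestion-time metadata."""
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


def make_filtered_query(nodes: list[Node], **required: Any) -> Callable[..., list[Node]]:
    """Return a query function that only sees nodes whose metadata matches `required`."""
    # Filter first, so nodes from other documents can never appear in the results.
    allowed = [
        n for n in nodes
        if all(n.metadata.get(key) == value for key, value in required.items())
    ]

    def query(question: str, top_k: int = 2) -> list[Node]:
        terms = set(question.lower().split())
        # Toy relevance score: count of shared words (assumed stand-in for embeddings).
        ranked = sorted(
            allowed,
            key=lambda n: len(terms & set(n.text.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]

    return query


nodes = [
    Node("Refunds are processed within 5 days", {"doc_id": 1}),
    Node("Refunds require a receipt", {"doc_id": 2}),
    Node("Shipping takes 3 days", {"doc_id": 2}),
]

# Build a query function scoped to Document #2 only.
query_doc2 = make_filtered_query(nodes, doc_id=2)
results = query_doc2("how do refunds work", top_k=1)
print(results[0].text)
```

The design choice worth noting is that the filter is applied before ranking rather than after it, so a highly similar chunk from Document #1 cannot crowd out results from Document #2.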