Hybrid Search Methods for RAG Systems Beyond RAG Fusion

This Markdown file addresses the question of hybrid search methods other than RAG Fusion, alongside full-text search (e.g., proximity-based queries like words near each other) and traditional keyword search. It complements the lecture content on Retrieval-Augmented Generation (RAG) systems, particularly the sections on "Similarity Search in Action" and "How Similarity Search Works," by exploring advanced retrieval strategies. The file also highlights Azure AI Search’s out-of-the-box support for these methods, as referenced in the question and supported by recent documentation (e.g., Microsoft Learn, 2025).

Question

Besides RAG Fusion, what other hybrid search methods, as well as full-text (e.g., words near each other) and traditional keyword search, are available? Some services like Azure AI Search offer this out of the box now.

Answer

Overview

In RAG systems, the retrieval step identifies the documents that ground the language model’s responses, as emphasized in the lecture’s discussion of vector databases and similarity search. RAG Fusion merges several ranked result lists with Reciprocal Rank Fusion (RRF) to improve relevance; other hybrid search methods, full-text search techniques (e.g., proximity queries), and traditional keyword search offer complementary approaches. Azure AI Search, as noted in the question, supports these methods out of the box, which makes it a practical platform for RAG applications. Below, we explore these methods, their mechanics, their use cases in RAG, and how they fit the lecture’s focus on practical implementation, such as the Berkshire Hathaway shareholder letters example.

1. Hybrid Search Methods Beyond RAG Fusion

Hybrid search combines vector search (semantic similarity) and keyword search (exact term matching) to leverage their complementary strengths, as described in the lecture and Azure AI Search documentation. Beyond RAG Fusion (which uses RRF to merge results), other hybrid approaches include weighted hybrid search, multi-vector hybrid search, and semantic ranking-enhanced hybrid search. These methods are supported natively by Azure AI Search, as per recent documentation (e.g., Microsoft Learn, 2025).

Weighted Hybrid Search

Definition: Assigns different weights to vector and keyword search results to prioritize one method over the other, adjusting the influence of each in the final ranking.
Mechanics:
- Executes vector and keyword queries in parallel, as in RAG Fusion.
- Instead of RRF’s equal weighting, applies user-defined weights to each result set. For example, vector search might be weighted 2x higher than keyword search for semantic-heavy queries.
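- Azure AI Search exposes this through a weight property on each vector query: a weight of 2 doubles that query’s contribution to the fused RRF score, and the text query has no weight setting of its own, so the weighting is relative to it. The preview hybridSearch parameter covers related options such as maxTextRecallSize. Azure Cosmos DB uses a different form, passing a weights array (e.g., [2, 1] for vector vs. keyword) to its RRF function, as noted in the Azure Cosmos DB documentation.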
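The weighting step above can be sketched in plain Python. This is an illustrative version of weighted Reciprocal Rank Fusion, not the service’s internal implementation; the function name `weighted_rrf`, the constant `k=60` (a commonly used RRF default), and the document IDs are assumptions made for the example.

```python
from collections import defaultdict

def weighted_rrf(ranked_lists, weights=None, k=60):
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion."""
    if weights is None:
        weights = [1.0] * len(ranked_lists)  # equal weights give plain RRF
    if len(weights) != len(ranked_lists):
        raise ValueError("one weight per ranked list is required")
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical result lists for one query
vector_hits = ["letter_2008", "letter_1996", "letter_2014"]
keyword_hits = ["letter_2014", "letter_2008", "letter_2020"]

fused = weighted_rrf([vector_hits, keyword_hits], weights=[2.0, 1.0])
print(fused)
```

Raising the weight of one list lets its top results outrank documents that appear in both lists, which is why the weights usually need tuning against real queries.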
Use Case in RAG:
- Ideal when one search type is more relevant for your domain. For the lecture’s Berkshire Hathaway example, weighting vector search higher could prioritize semantic matches for queries like “Buffett’s investment philosophy,” while keyword search ensures exact matches for terms like “dividend.”
- Example query in Azure AI Search:
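```python
from azure.search.documents.models import VectorizedQuery

# Assumes search_client and generate_embeddings are defined elsewhere, and an SDK release
# that supports vector weighting (the preview maxTextRecallSize option is omitted here).
query = "investment strategy"
vector_query = VectorizedQuery(
    vector=generate_embeddings(query),
    k_nearest_neighbors=3,
    fields="content_vector",
    weight=2.0,  # Vector 2x relative to keyword
)
results = search_client.search(search_text=query, vector_queries=[vector_query])
```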
Azure Support: Vector weighting in Azure AI Search was introduced through the preview APIs, so confirm that your API version and SDK release include it. It allows fine-tuning of weights to balance semantic and exact matches.
Pros and Cons:
- Pros: Flexible; allows domain-specific tuning; supported out of the box by Azure AI Search.
- Cons: Requires experimentation to set optimal weights; may overemphasize one method if misconfigured.

Multi-Vector Hybrid Search

Definition: Combines multiple vector queries (targeting different vector fields) with keyword search, merging results into a unified set.
Mechanics:
- A search index contains multiple vector fields (e.g., embeddings for title, content, and metadata) alongside text fields.
- Executes multiple vector queries (e.g., one per field) and a keyword query in parallel, merging results with RRF or custom scoring.
- Azure AI Search supports this for indices with multiple vector fields, with up to 11 query executions for complex scenarios.
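As a rough illustration of the request shape, the sketch below builds a hybrid request body with one vector query per vector field. The helper `build_multi_vector_request` and the field names `title_vector` and `content_vector` are hypothetical; the body follows the `search` plus `vectorQueries` layout of the Azure AI Search REST API as I understand it, so check it against the API version you target.

```python
import json

def build_multi_vector_request(text, embeddings_by_field, k=3, top=5):
    """Build a hybrid request body with one vector query per vector field."""
    vector_queries = [
        {"kind": "vector", "vector": list(vector), "fields": field, "k": k}
        for field, vector in embeddings_by_field.items()
    ]
    return {"search": text, "vectorQueries": vector_queries, "top": top}

# Tiny placeholder embeddings; real ones come from an embedding model
body = build_multi_vector_request(
    "Buffett's risk management in 2023",
    {"title_vector": [0.1, 0.2], "content_vector": [0.3, 0.4]},
)
print(json.dumps(body, indent=2))
```

Each entry in `vectorQueries` runs as its own query, and the service fuses all result sets with the keyword results.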
Use Case in RAG:
- Useful for complex documents like shareholder letters, where separate embeddings for sections (e.g., financial summary vs. narrative) improve retrieval precision.
- For example, a query like “Buffett’s risk management in 2023” could search title vectors, content vectors, and keyword fields simultaneously, ensuring comprehensive coverage.
Azure Support: Natively supported in Azure AI Search, with the ability to target multiple vector fields in a single request.
Pros and Cons:
- Pros: Captures diverse aspects of documents; improves recall for multifaceted queries.
- Cons: Increases computational complexity; requires careful index design with multiple vector fields.

Semantic Ranking-Enhanced Hybrid Search

Definition: Applies a secondary deep learning-based ranking step to hybrid search results, improving relevance beyond RRF’s initial fusion.
Mechanics:
- Combines vector and keyword search results using RRF, as in RAG Fusion.
- Uses a semantic ranker (adapted from Microsoft Bing) to re-rank the top 50 results based on query-document semantic alignment, using fields like title and content (up to 2,000 tokens).
- Azure AI Search enables this with query_type="semantic" in the Python SDK (queryType=semantic in REST), together with a semantic configuration defined on the index, enhancing results for text-heavy queries.
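The two-stage pattern (fuse first, then re-rank only the head of the list) can be sketched as follows. The scoring function here is a toy term-overlap measure standing in for a deep learning ranker; it is an assumption for illustration and bears no relation to the actual model. Keeping the tail in its original order is also a choice of this sketch.

```python
def overlap_score(query, text):
    """Toy relevance score: fraction of query terms that appear in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0

def rerank(query, fused_docs, score_fn=overlap_score, top_n=50):
    """Re-rank the first top_n fused results; leave the tail in its original order."""
    head, tail = fused_docs[:top_n], fused_docs[top_n:]
    # sorted() is stable, so ties keep their first-stage order
    head = sorted(head, key=lambda doc: score_fn(query, doc["content"]), reverse=True)
    return head + tail

# Hypothetical fused results, in first-stage order
docs = [
    {"id": "a", "content": "Dividends were not paid this year"},
    {"id": "b", "content": "Value investing principles guide every purchase"},
    {"id": "c", "content": "Our investing approach favors value"},
]
reranked = rerank("value investing principles", docs)
print([doc["id"] for doc in reranked])
```

In a real system `score_fn` would call a cross-encoder or the service’s ranker; the `top_n` cutoff exists because that second stage is far more expensive per document than the first.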
Use Case in RAG:
- Enhances relevance for generative AI scenarios, as emphasized in the lecture, by promoting the most contextually relevant chunks. For example, in the Berkshire Hathaway repo, it ensures queries like “value investing principles” retrieve the most semantically accurate passages.
- Azure AI Search benchmarks report that hybrid search with semantic ranking outperforms vector-only or keyword-only search.
Azure Support: Available in Azure AI Search for Basic tier or higher, with semantic ranker enabled.
Pros and Cons:
- Pros: Significantly improves relevance; leverages deep learning; seamless in Azure AI Search.
- Cons: Limited to top 50 results; requires additional compute; not useful for non-text fields.

2. Full-Text Search (e.g., Words Near Each Other)

Full-text search, as described in the lecture’s discussion of traditional search methods, focuses on matching query terms to document content using techniques like inverted indices and relevance scoring.

Proximity-Based Search

Definition: Prioritizes documents where query terms appear close to each other (e.g., within a specified number of words), capturing contextual relationships.
Mechanics:
- Uses an inverted index to locate documents containing query terms.
- Uses the term positions recorded in the index to match documents where the terms fall within the requested distance; the matches are then scored for relevance (BM25 in Azure AI Search).
- Azure AI Search supports proximity queries through the full Lucene query syntax (queryType=full), e.g., "risk management"~5 to find “risk” and “management” within 5 words of each other.
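The proximity check itself is simple to express over token positions. The sketch below is a simplified stand-in: it treats distance as the difference between word positions, whereas Lucene’s slop is defined in terms of position moves, so edge cases can differ. The function name and sample sentence are made up for the example.

```python
def within_distance(text, term_a, term_b, max_distance):
    """Return True if the two terms occur at most max_distance word positions apart."""
    tokens = [token.strip(".,;:!?\"'()").lower() for token in text.split()]
    positions_a = [i for i, token in enumerate(tokens) if token == term_a.lower()]
    positions_b = [i for i, token in enumerate(tokens) if token == term_b.lower()]
    return any(
        0 < abs(i - j) <= max_distance
        for i in positions_a
        for j in positions_b
    )

sentence = "Our approach to risk, above all, shapes how management allocates capital."
print(within_distance(sentence, "risk", "management", 5))
print(within_distance(sentence, "risk", "management", 4))
```

A search engine avoids rescanning text like this by storing positions in the inverted index at indexing time, but the comparison it performs is the same in spirit.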
Use Case in RAG:
- Ideal for queries requiring specific phrases or concepts, such as “Buffett’s risk management strategy” in the lecture’s example, where “risk” and “management” appearing together are critical.
- Enhances retrieval when semantic search alone might miss exact phrase matches.
Azure Support: Out-of-the-box support in Azure AI Search with BM25 scoring and phrase queries, integrable with hybrid search.
Pros and Cons:
- Pros: Precise for phrase-based queries; captures local context; fast with inverted indices.
- Cons: Misses semantic relationships; sensitive to exact wording; less effective for broad queries.

Other Full-Text Techniques

Fuzzy Search: Matches terms with minor misspellings or variations (e.g., “risck~” matches “risk” in the full Lucene syntax). Useful for user-generated queries in RAG, supported by Azure AI Search.
Wildcard Search: Allows partial term matches (e.g., “invest*” matches “investment,” “investing”). Helps with incomplete queries in the lecture’s Q&A scenarios.
Boolean Queries: Combines terms with AND, OR, NOT operators for complex filtering (e.g., “risk AND management NOT loss”). Enhances precision in structured RAG queries.
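Fuzzy and wildcard matching can both be illustrated against a small vocabulary. This sketch uses plain Levenshtein distance and the standard library’s `fnmatchcase`; search engines typically use optimized variants, and the `max_edits=1` threshold and the vocabulary are assumptions for the example.

```python
from fnmatch import fnmatchcase

def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]

def fuzzy_match(term, vocabulary, max_edits=1):
    """Return vocabulary words within max_edits single-character edits of term."""
    return [word for word in vocabulary if edit_distance(term, word) <= max_edits]

def wildcard_match(pattern, vocabulary):
    """Return vocabulary words matching a shell-style pattern such as 'invest*'."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]

vocabulary = ["risk", "risky", "investment", "investing", "dividend"]
print(fuzzy_match("risck", vocabulary))
print(wildcard_match("invest*", vocabulary))
```

Both techniques expand one query term into several index terms before lookup, which is why they cost more than an exact match.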

3. Traditional Keyword Search

Definition: Matches query terms exactly or partially against document text using an inverted index, typically scored with BM25 (a probabilistic ranking function).
Mechanics:
- Breaks documents into terms during indexing, storing them in an inverted index for fast lookup.
- Scores documents based on term frequency, inverse document frequency, and document length (BM25), as noted in Azure AI Search documentation.
- Supports filtering, faceting, and sorting, as highlighted in the lecture’s discussion of traditional search methods.
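The indexing and scoring steps above can be sketched in a few lines. This is a minimal BM25 over a whitespace-tokenized inverted index, assuming the commonly cited defaults `k1=1.2` and `b=0.75`; real engines add analyzers, stemming, and stop-word handling. The sample documents are invented.

```python
import math
from collections import Counter, defaultdict

def build_index(docs):
    """Map each term to {doc_id: term frequency} and record document lengths."""
    index, lengths = defaultdict(dict), {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths

def bm25(query, index, lengths, k1=1.2, b=0.75):
    """Score documents for a query; returns (doc_id, score) pairs, best first."""
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        idf = math.log(1 + (n_docs - len(postings) + 0.5) / (len(postings) + 0.5))
        for doc_id, tf in postings.items():
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda item: -item[1])

docs = {
    "1996": "the dividend policy stayed unchanged",
    "2014": "we paid no dividend and retained earnings",
    "2020": "share repurchases continued through the year",
}
index, lengths = build_index(docs)
ranked = bm25("dividend policy", index, lengths)
print(ranked)
```

Documents with no query terms never enter the score table, which is what makes inverted-index lookup fast: only the postings for the query terms are touched.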
Use Case in RAG:
- Excels at exact matches, such as retrieving documents with specific terms like “dividend” or “2023” in the Berkshire Hathaway letters.
- Complements vector search in hybrid setups to ensure precision for product codes, jargon, or names, as noted in Azure AI Search benchmarks.
Azure Support: Core feature of Azure AI Search, with BM25 scoring and advanced query syntax (e.g., search=dividend AND 2023 with queryType=full; the default simple syntax writes the same query as dividend + 2023).
Pros and Cons:
- Pros: Fast; precise for exact matches; supports structured queries; widely supported.
- Cons: Limited to lexical matches; misses semantic relationships; sensitive to query phrasing.

Integration in RAG Systems

Hybrid Search: Azure AI Search’s hybrid search combines vector and keyword queries in a single request, merging results with RRF or weighted scoring. Adding semantic ranking further refines results, aligning with the lecture’s goal of context-enhanced responses.
Full-Text and Keyword Search: Use proximity or keyword queries for precise term matching, then pass results to the LLM for generation. For example, a query like "risk management"~5 in the Berkshire Hathaway repo retrieves chunks with exact phrasing, which the LLM can synthesize into a coherent answer.
Practical Example:
- Setup: Index shareholder letters with text fields (content) and vector fields (embeddings from all-MiniLM-L6-v2).
- Query: For “Buffett’s risk management,” use Azure AI Search’s hybrid query:
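```python
from azure.search.documents.models import VectorizedQuery

# Assumes search_client, generate_embeddings, and a semantic configuration named "default" (placeholder).
query = "risk management"
vector_query = VectorizedQuery(vector=generate_embeddings(query), k_nearest_neighbors=3, fields="content_vector")
results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    query_type="semantic",  # Enable semantic ranking
    semantic_configuration_name="default",
    select=["title", "content"],
)
```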
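- Outcome: Combines semantic matches (vector search) with exact term matches (keyword search), and semantic ranking promotes the most relevant chunks. The query as written does not apply proximity matching; that would require a proximity operator such as "risk management"~5 in the search text, using the full Lucene query syntax.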
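Once chunks are retrieved, they still have to be handed to the LLM. One simple way to do that is sketched below, assuming each result exposes the `title` and `content` fields selected in the query; the helper name, the instruction wording, and the sample chunk are illustrative assumptions, not part of the lecture repo.

```python
def build_grounded_prompt(question, chunks, max_chunks=3):
    """Format retrieved chunks as numbered sources ahead of the user question."""
    sources = [
        f"[{i}] {chunk['title']}: {chunk['content']}"
        for i, chunk in enumerate(chunks[:max_chunks], start=1)
    ]
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        + "\n".join(sources)
        + f"\n\nQuestion: {question}"
    )

# Hypothetical retrieved chunk
chunks = [{"title": "Shareholder letter", "content": "Risk comes from not knowing what you are doing."}]
prompt = build_grounded_prompt("How does Buffett think about risk?", chunks)
print(prompt)
```

Capping the number of chunks keeps the prompt within the model’s context window and limits the influence of low-ranked results.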

Why Azure AI Search?

Azure AI Search provides out-of-the-box support for these methods, as noted in the question. It integrates:

Hybrid Search: Combines vector and keyword search with RRF, weighted scoring, or multi-vector queries.
Semantic Ranking: Enhances hybrid results with deep learning models.
Full-Text Features: Supports proximity, fuzzy, wildcard, and boolean queries.
Scalability: Handles enterprise-scale RAG with indexing, chunking, and embedding pipelines.

This aligns with the lecture’s emphasis on robust retrieval for generative AI, making Azure AI Search ideal for implementing these methods in the Berkshire Hathaway example.

Conclusion

Beyond RAG Fusion, hybrid methods like weighted hybrid search, multi-vector hybrid search, and semantic ranking-enhanced hybrid search offer flexible ways to balance semantic and exact matching in RAG systems. Full-text search, with proximity queries, and traditional keyword search provide precision for specific terms or phrases, complementing vector search. Azure AI Search’s out-of-the-box support for these methods, with features like BM25, RRF, and semantic ranking, makes it a powerful platform for RAG, as seen in the lecture’s focus on practical retrieval for shareholder letters. Experiment with these methods in your vector database to optimize retrieval for your specific use case.
