RAG and Fine Tuning
How RAG Works:
RAG consists of two main components:
Retriever:
The retriever is responsible for fetching relevant documents or passages from a knowledge base (e.g., Wikipedia, a database, or a custom corpus).
It does not fetch from the generative model itself. Instead, it uses a separate mechanism (e.g., dense retrieval like DPR or sparse retrieval like BM25) to search a large collection of documents.
Generator:
The generator is a pre-trained sequence-to-sequence model (e.g., BART or T5) that takes the query and the retrieved documents as input and generates a response.
Key Point:
The retriever and the generator are separate components:
The retriever is trained to find relevant documents in an external knowledge base.
The generator is trained to produce coherent and contextually relevant responses based on the query and the retrieved documents.
Example Workflow:
Query: "What is the capital of France?"
Retriever:
Searches a knowledge base (e.g., Wikipedia) and retrieves relevant documents, such as:
"Paris is the capital of France."
"France is a country in Europe, and its capital is Paris."
Generator:
Combines the query and the retrieved documents to generate a response:
"The capital of France is Paris."
Why the Retriever Doesn't Fetch from the Generative Model:
The generative model (e.g., BART, T5) is not designed to store or reliably retrieve factual knowledge. It generates text from patterns learned during training, so it can produce outdated or fabricated facts when asked for specifics.
The retriever, on the other hand, is specifically designed to search a large collection of documents and surface relevant, verifiable information.
How the Retriever Works:
The retriever in RAG is typically implemented using one of the following approaches (both are sketched in the code after this list):
Dense Retrieval:
Uses dense vector representations (embeddings) to find semantically similar documents.
Example: Dense Passage Retrieval (DPR), which encodes queries and documents into dense vectors and retrieves documents based on vector similarity.
Sparse Retrieval:
Uses traditional information retrieval techniques like TF-IDF or BM25 to find documents based on keyword matching.
Example: BM25, which ranks documents by term frequency weighted by inverse document frequency.
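As a minimal, self-contained sketch of the two approaches (assuming the sentence-transformers and rank-bm25 packages are installed; the all-MiniLM-L6-v2 checkpoint is just one common bi-encoder choice, not something mandated by RAG):

```python
# Dense retrieval: embed query and documents, rank by cosine similarity.
from sentence_transformers import SentenceTransformer, util

docs = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
]
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint
doc_emb = encoder.encode(docs, convert_to_tensor=True)
query_emb = encoder.encode("What is the capital of France?", convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_emb)[0]
print(docs[int(scores.argmax())])  # -> "Paris is the capital of France."

# Sparse retrieval: BM25 ranks by keyword overlap weighted by term rarity.
from rank_bm25 import BM25Okapi

bm25 = BM25Okapi([d.lower().split() for d in docs])
print(bm25.get_top_n("capital of france".split(), docs, n=1))  # -> best match
```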
Example of RAG with a Custom Knowledge Base:
If you want to use RAG with a custom knowledge base, you would:
Build a document index (e.g., from a collection of text files, a database, or a Wikipedia dump).
Train or use a pre-trained retriever to search this index.
Use a pre-trained generator (e.g., BART or T5) to produce responses based on the retrieved documents.
Here's an example of how you might set this up with Hugging Face's transformers library (the facebook/rag-sequence-nq checkpoint and the file paths below are illustrative; the custom index must be built first, as shown further down):
```python
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

# Load RAG components. "facebook/rag-sequence-nq" pairs RagSequenceForGeneration
# with its matching retriever; index_name="custom" additionally requires the
# paths to your saved passages dataset and FAISS index (placeholders here).
tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq",
    index_name="custom",
    passages_path="my_knowledge_dataset",  # dataset saved with save_to_disk()
    index_path="my_index.faiss",           # FAISS index built over its embeddings
)
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

# Input query
query = "What is the capital of France?"

# Tokenize, retrieve, and generate
input_ids = tokenizer(query, return_tensors="pt").input_ids
generated = model.generate(input_ids)
response = tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
print(response)  # e.g. "The capital of France is Paris."
```
Custom Knowledge Base:
To use a custom knowledge base, you would:
Prepare your documents (e.g., text files, JSON, or a database).
Build an index using a library like FAISS (for dense retrieval) or Elasticsearch (for sparse retrieval), as sketched below.
Configure the retriever to use this index.
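Here is a minimal sketch of that indexing step, following the recipe Hugging Face publishes for using your own knowledge dataset with RAG. The passage texts, file paths, and the DPR checkpoint name are illustrative choices; the 768-dimension index matches DPR's output size:

```python
import faiss
import torch
from datasets import Dataset
from transformers import DPRContextEncoder, DPRContextEncoderTokenizerFast

# Example passages; in practice, load your own documents here.
passages = [
    {"title": "France", "text": "Paris is the capital of France."},
    {"title": "Germany", "text": "Berlin is the capital of Germany."},
]
dataset = Dataset.from_list(passages)

ctx_tokenizer = DPRContextEncoderTokenizerFast.from_pretrained(
    "facebook/dpr-ctx_encoder-multiset-base"
)
ctx_encoder = DPRContextEncoder.from_pretrained(
    "facebook/dpr-ctx_encoder-multiset-base"
)

def embed(batch):
    # Encode title + text pairs into 768-dim DPR embeddings.
    inputs = ctx_tokenizer(batch["title"], batch["text"], truncation=True,
                           padding="longest", return_tensors="pt")
    with torch.no_grad():
        return {"embeddings": ctx_encoder(**inputs).pooler_output.numpy()}

dataset = dataset.map(embed, batched=True, batch_size=16)

# Save passages, then build and save a FAISS index over the embeddings;
# these two paths are what passages_path and index_path point to above.
dataset.save_to_disk("my_knowledge_dataset")
dataset.add_faiss_index("embeddings", custom_index=faiss.IndexFlatIP(768))
dataset.get_index("embeddings").save("my_index.faiss")
```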
Summary:
The retriever in RAG fetches documents from an external knowledge base, not from the generative model.
The generative model uses the retrieved documents to produce a response.
You can use a custom knowledge base by building an index and configuring the retriever to search it.
Let me know if you'd like a step-by-step guide on setting up RAG with a custom knowledge base!