Chirag Agrawal ag-chirag

ag-chirag / prompt.py
Created September 24, 2024 03:17
Data Gemma Query Expansion Prompt
RAG_IN_CONTEXT_PROMPT = """
Given a QUERY below, your task is to come up with a maximum of 25
STATISTICAL QUESTIONS that help in answering QUERY.
Here are the only forms of STATISTICAL QUESTIONS you can generate:
1. "What is $METRIC in $PLACE?"
2. "What is $METRIC in $PLACE $PLACE_TYPE?"
3. "How has $METRIC changed over time in $PLACE $PLACE_TYPE?"
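The preview of the prompt is cut off here, so the exact output format the model is asked to produce is not shown. As a rough sketch only, assuming the model returns one question per line (possibly numbered or bulleted), generated text could be filtered down to the allowed question forms like this. The helper name `extract_questions` and the regular expressions are hypothetical and are not part of the gist.

```python
import re

# Hypothetical patterns mirroring the question forms listed in the prompt.
# Forms 1 and 2 share the same "What is ... in ...?" shape.
QUESTION_PATTERNS = [
    re.compile(r"^What is .+ in .+\?$"),
    re.compile(r"^How has .+ changed over time in .+\?$"),
]


def extract_questions(model_output: str, limit: int = 25) -> list[str]:
    """Keep unique lines that match an allowed form, up to `limit` questions."""
    questions: list[str] = []
    for line in model_output.splitlines():
        # Assumption: the model may prefix lines with "1." / "1)" / "-" / "*".
        candidate = re.sub(r"^(\d+[.)]|[-*])\s*", "", line.strip())
        if candidate in questions:
            continue
        if any(p.match(candidate) for p in QUESTION_PATTERNS):
            questions.append(candidate)
            if len(questions) == limit:
                break
    return questions
```

The default limit of 25 matches the maximum stated in the prompt; lines that do not fit any of the forms are dropped.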
ag-chirag / data_commons.py
Created September 24, 2024 03:14
Util method to convert DataCommons response to markdown
@staticmethod
def pretty_print(q2resp: dict[str, dg.base.DataCommonsCall]):
    markdown_output = "# Data Commons Response\n"
    for k, v in q2resp.items():
        markdown_output += f"**{k}**\n\n"
        markdown_output += f"{v.answer()}\n\n"
    return markdown_output
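To see what this produces without calling Data Commons, the same formatting logic can be run against a stand-in object. The sketch below is illustrative only: `FakeCall` is a hypothetical substitute for `dg.base.DataCommonsCall` that mimics nothing but the `answer()` method used above, and the question and answer text are made up.

```python
from dataclasses import dataclass


@dataclass
class FakeCall:
    """Hypothetical stand-in for dg.base.DataCommonsCall (only answer() is mimicked)."""
    text: str

    def answer(self) -> str:
        return self.text


def pretty_print(q2resp: dict[str, FakeCall]) -> str:
    # Same layout as the gist: a heading, then each question in bold
    # followed by its answer, separated by blank lines.
    parts = ["# Data Commons Response\n"]
    for question, call in q2resp.items():
        parts.append(f"**{question}**\n\n")
        parts.append(f"{call.answer()}\n\n")
    return "".join(parts)


markdown = pretty_print({"What is $METRIC in $PLACE?": FakeCall("placeholder answer")})
print(markdown)
```

Collecting the pieces in a list and joining once gives the same string as repeated `+=` while avoiding repeated string copies for long responses.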
ag-chirag / data_commons.py
Created September 24, 2024 03:12
Simple Client to call DataCommons NL API
import logging

import data_gemma as dg

# DC_API_KEY is expected to be defined elsewhere (it is not shown in this preview).


class DataCommonsClient:
    def __init__(self):
        self.data_fetcher = dg.DataCommons(api_key=DC_API_KEY)

    def call_dc(self, questions: list[str]) -> dict[str, dg.base.DataCommonsCall]:
        try:
            q2resp = self.data_fetcher.calln(questions, self.data_fetcher.point)
        except Exception as e:
            logging.warning(e)
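The preview stops after the warning is logged, so what `call_dc` returns is not shown. One possible way to round out the error handling, offered as an assumption rather than the author's code, is to return the responses on success and an empty dict on failure so callers always get a dict. The fetcher classes below are hypothetical stubs that only imitate the `calln` and `point` names used above.

```python
import logging


class EchoFetcher:
    """Hypothetical stub: answers every question with a canned string."""

    def point(self, question: str) -> str:
        return f"answer to {question}"

    def calln(self, questions, func):
        return {q: func(q) for q in questions}


class BrokenFetcher(EchoFetcher):
    """Hypothetical stub that simulates a failing API call."""

    def calln(self, questions, func):
        raise RuntimeError("simulated API failure")


class SafeClient:
    def __init__(self, data_fetcher):
        self.data_fetcher = data_fetcher

    def call_dc(self, questions: list[str]) -> dict:
        try:
            return self.data_fetcher.calln(questions, self.data_fetcher.point)
        except Exception as e:
            # Assumed fallback: log and return an empty mapping instead of None.
            logging.warning(e)
            return {}
```

Returning `{}` keeps downstream code such as a markdown formatter working without a `None` check, at the cost of hiding the failure from the caller; re-raising would be the stricter alternative.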
ag-chirag / model.py
Created September 24, 2024 03:10
Loading Data Gemma from HuggingFace
from huggingface_hub import hf_hub_download
from llama_cpp import Llama


class DataGemma:
    def __init__(self, model_id: str = "bartowski/datagemma-rag-27b-it-GGUF", model_file: str = "datagemma-rag-27b-it-Q2_K.gguf"):
        self.generation_kwargs = {
            "max_tokens": 4096,  # Max number of new tokens to generate
        }
        # Download the GGUF file (or reuse the local cache) and load it with llama.cpp
        self.model_path = hf_hub_download(model_id, model_file)
        self.llm = Llama(
            self.model_path
        )
        self.name = "DataGemma"
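The preview does not show how the loaded model is invoked. As a minimal sketch, assuming the llama-cpp-python convention that calling a `Llama` object with a prompt returns a dict whose generated text sits under `choices[0]["text"]`, a completion helper could pass `generation_kwargs` through like this. The `complete` function and the `fake_llm` stub are hypothetical and exist only so the example runs without downloading the model.

```python
def complete(llm, prompt: str, generation_kwargs: dict) -> str:
    """Run one completion and return only the generated text."""
    response = llm(prompt, **generation_kwargs)
    # Assumption: the response follows the completion-style layout
    # {"choices": [{"text": ...}, ...]} used by llama-cpp-python.
    return response["choices"][0]["text"]


def fake_llm(prompt: str, max_tokens: int = 16) -> dict:
    """Hypothetical stub imitating the response shape, not a real model."""
    return {"choices": [{"text": f"[{max_tokens}] {prompt.upper()}"}]}


text = complete(fake_llm, "hello", {"max_tokens": 4096})
print(text)
```

With the real class, the call would presumably look like `complete(model.llm, prompt, model.generation_kwargs)`, where `model` is a `DataGemma` instance.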