@grahama1970
grahama1970 / get_env_value.sh
Last active January 22, 2025 01:10
Efficient .env Key Retrieval for Raycast. A Raycast Script Command to search and copy environment variable values from a .env file. Supports exact matching, abbreviation shortcuts, and fuzzy search with fzf. Outputs the value to the terminal and clipboard for seamless workflows.
#!/bin/bash
# Description:
# A Raycast script for quickly finding environment variables in .env files.
# Matches keys in three ways:
# 1. Abbreviations: "aak" → "AWS_ACCESS_KEY", "gpt" → "GITHUB_PAT_TOKEN"
# 2. Partial matches: "shap" → "SHAPE", "aws" → "AWS_KEY"
# 3. Fuzzy finding: "ath" → "AUTH_TOKEN"
# Matched values are printed to the terminal and copied to the clipboard using pbcopy (ships with macOS)
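The three matching modes listed in the script header can be sketched in Python. This is an illustrative sketch only: the gist itself is a bash script that delegates fuzzy search to fzf, and here fuzzy matching is approximated with a simple in-order subsequence test. The function names `abbreviation`, `is_subsequence`, and `match_keys` are hypothetical and do not come from the gist.

```python
def abbreviation(key: str) -> str:
    """Return the lowercase initials of an underscore-separated key."""
    return "".join(part[0] for part in key.split("_") if part).lower()


def is_subsequence(query: str, text: str) -> bool:
    """True if the characters of query appear in text in order (gaps allowed)."""
    remaining = iter(text)
    return all(ch in remaining for ch in query)


def match_keys(query: str, keys: list[str]) -> list[str]:
    """Try abbreviation, then substring, then subsequence matching (assumed order)."""
    q = query.lower()
    by_abbreviation = [k for k in keys if abbreviation(k) == q]
    if by_abbreviation:
        return by_abbreviation
    by_substring = [k for k in keys if q in k.lower()]
    if by_substring:
        return by_substring
    # Stand-in for fzf: keep keys that contain the query as a subsequence.
    return [k for k in keys if is_subsequence(q, k.lower())]


keys = ["AWS_ACCESS_KEY", "GITHUB_PAT_TOKEN", "SHAPE", "AWS_KEY", "AUTH_TOKEN"]
print(match_keys("aak", keys))
print(match_keys("ath", keys))
```

Trying the strictest rule first keeps short queries such as "aak" from matching every key that happens to contain those letters.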
@grahama1970
grahama1970 / bm25_embedding_keyword_combined.aql
Last active January 16, 2025 14:03
ArangoDB hybrid search implementation combining BM25 text search, embedding similarity (using sentence-transformers), and keyword matching. Includes Python utilities and an AQL query for intelligent document retrieval with configurable thresholds and scoring. RapidFuzz may be added later for post-processing.
LET results = (
// Get embedding results
LET embedding_results = (
FOR doc IN glossary_view
LET similarity = COSINE_SIMILARITY(doc.embedding, @embedding_search)
FILTER similarity >= @embedding_similarity_threshold
SORT similarity DESC
LIMIT @top_n
RETURN {
doc: doc,
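The embedding branch of the query above (score, filter by threshold, sort descending, limit) can be mirrored in plain Python to make the logic easy to check outside the database. This is a sketch under assumptions: the AQL preview is truncated after `doc: doc,`, so the `similarity` field in the returned dictionaries is a guess, and `cosine_similarity` and `embedding_results` are hypothetical helper names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def embedding_results(docs, query_embedding, threshold, top_n):
    """Mirror FILTER similarity >= threshold, SORT similarity DESC, LIMIT top_n."""
    scored = [(cosine_similarity(d["embedding"], query_embedding), d) for d in docs]
    kept = [(s, d) for s, d in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    # The "similarity" field name is an assumption; the original RETURN is truncated.
    return [{"doc": d, "similarity": s} for s, d in kept[:top_n]]


docs = [
    {"_key": "a", "embedding": [1.0, 0.0]},
    {"_key": "b", "embedding": [0.0, 1.0]},
    {"_key": "c", "embedding": [1.0, 1.0]},
]
for row in embedding_results(docs, [1.0, 0.0], threshold=0.5, top_n=2):
    print(row["doc"]["_key"], round(row["similarity"], 3))
```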
@grahama1970
grahama1970 / aql_utils.py
Last active January 11, 2025 14:59
This script implements a hybrid search system using ArangoDB that combines (1) vector similarity search using COSINE_SIMILARITY, (2) BM25 text search with a custom text analyzer, and (3) fuzzy string matching using Levenshtein distance.
import os
from loguru import logger
from typing import List, Dict
def load_aql_query(filename: str) -> str:
"""
Load an AQL query from a file.
"""
try:
file_path = os.path.join("app/backend/vllm/beta/utils/aql", filename)
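The description of this gist mentions fuzzy string matching using Levenshtein distance. For readers unfamiliar with the metric, here is a minimal, self-contained sketch of the standard dynamic-programming computation. It is not taken from the gist, which may rely on a database or library function instead; the name `levenshtein` is hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the edit-distance table are kept at a time, so memory use grows with the shorter string rather than with the product of both lengths.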
@grahama1970
grahama1970 / hf_only_inference_sanity_check.py.py
Last active December 27, 2024 21:07
For dynamic adapter loading and inference, Unsloth inference works fine; plain Hugging Face inference does not work (the outputs are garbled)
# Doesn't Work. Outputs are garbled
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from loguru import logger
# Configuration
BASE_MODEL_NAME = "unsloth/Phi-3.5-mini-instruct"
ADAPTER_PATH = "/home/grahama/dev/vllm_lora/training_output/Phi-3.5-mini-instruct_touch-rugby-rules_adapter/final_model"
@grahama1970
grahama1970 / tinyllama_custom_adaptor.py
Last active December 21, 2024 22:26
tinyllama_model_merge_wip: I thought I could make this work; leaving it for another time
import os
import torch
import gc
from peft import PeftModel
from huggingface_hub import snapshot_download
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
pipeline,
@grahama1970
grahama1970 / full output.txt
Last active December 20, 2024 14:44
Comparing LoRAX requests to an OpenAI call: curl = 11 seconds, request = 19 seconds, OpenAI = 30 seconds
2024-12-20 09:39:32.640 | INFO | __main__:run_curl_version:10 -
=== Running curl version ===
2024-12-20 09:39:32.641 | INFO | __main__:run_curl_version:43 - Initial request time: 0.00 seconds
2024-12-20 09:39:32.641 | INFO | __main__:run_curl_version:47 - Response tokens:
To determine the number of rugby players on a touch rugby team, we can refer to the relevant section of the document.
1. **Understanding Team Composition**: The document states that a team consists of a maximum of 14 players. However, this number includes reserves, meaning that only six (6) players are allowed on the field at any given time during a match.
2. **Player Limitation**: Additionally, teams are encouraged to include mixed genders (four males and four females), indicating that
2024-12-20 09:39:43.862 | INFO | __main__:run_curl_version:68 -
Tokens generated: 100
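The "curl = 11 seconds" figure in the description can be checked against the log timestamps above. The sketch below parses the two timestamps shown in the curl run (09:39:32.641 and 09:39:43.862) and derives the elapsed time and an approximate throughput from the reported 100 tokens. The helper name `elapsed_seconds` is hypothetical, and treating these two log lines as the start and end of generation is an assumption.

```python
from datetime import datetime

# Timestamp layout used at the start of each log line in the output above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two log timestamps."""
    delta = datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)
    return delta.total_seconds()


elapsed = elapsed_seconds("2024-12-20 09:39:32.641", "2024-12-20 09:39:43.862")
tokens_per_second = 100 / elapsed  # 100 is the "Tokens generated" value in the log
print(f"{elapsed:.3f} s, {tokens_per_second:.1f} tokens/s")
```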
@grahama1970
grahama1970 / get_project_root.py
Last active January 26, 2025 21:39
Analyzes Python project files to generate a structured report of directory trees, dependencies, and imports. Helps LLMs understand project architecture and relationships between files.
from pathlib import Path
from dotenv import load_dotenv
def get_project_root(marker_file=".git"):
"""
Find the project root directory by looking for a marker file.
Args:
marker_file (str): File/directory to look for (default: ".git")
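The docstring above describes finding the project root by looking for a marker file. Since the preview is truncated, the following is only a sketch of one common way to do this: walk from a starting directory up through its parents and return the first directory that contains the marker. The name `find_project_root`, the `start` parameter, and returning `None` when nothing is found are all assumptions, not the gist's actual behavior.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker_file: str = ".git") -> Optional[Path]:
    """Return the nearest ancestor of start (inclusive) that contains marker_file."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / marker_file).exists():
            return candidate
    return None  # assumed fallback; the original may raise instead


if __name__ == "__main__":
    print(find_project_root(Path.cwd()))
```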
@grahama1970
grahama1970 / arango_utils.py
Created November 21, 2024 14:47
This Python script orchestrates the lifecycle of a RunPod container, executes LLM requests using Qwen2.5-1.5B, caches results in ArangoDB, and ensures clean-up in all scenarios with a robust finally block for stopping the container. It supports scalable, efficient, and reliable LLM pipelines.
import asyncio
from loguru import logger
from verifaix.arangodb_helper.arango_client import connect_to_arango_client
async def truncate_cache_collection(arango_config, db=None):
logger.info(f"Attempting to truncate cache collection '{arango_config['cache_collection_name']}'")
if db is None:
logger.info(f"Connecting to ArangoDB at {arango_config['host']}")
db = await asyncio.to_thread(connect_to_arango_client, arango_config)
@grahama1970
grahama1970 / pipeline_ex.py
Last active November 20, 2024 02:24
The pipeline dynamically launches a RunPod container for LLM processing, waits for it to reach a "RUNNING" state, executes queries asynchronously using LiteLLM, tracks container activity, and shuts down after a specified inactivity period, optimizing resource usage.
import asyncio
import os
import runpod
from datetime import datetime, timedelta, timezone
from dotenv import load_dotenv
from loguru import logger
from tenacity import retry, stop_after_attempt, wait_fixed, retry_if_exception_type
from verifaix.llm_client.get_litellm_response import get_litellm_response
from verifaix.arangodb_helper.arango_client import (
connect_to_arango_client,