Skip to content

Instantly share code, notes, and snippets.

View philschmid's full-sized avatar

Philipp Schmid philschmid

View GitHub Profile
Classify user search queries as either "Good Google Search Query" or "Bad Google Search Query" based on their likelihood of yielding relevant and helpful results from Google Search.
Input: User search query (text string).
Output: Classification label:
* Good Google Search Query: The query is likely to be effectively answered by Google Search.
* Bad Google Search Query: The query is unlikely to be effectively answered by Google Search. Further categorize "Bad" queries into subtypes for better understanding and classifier training (optional but highly recommended):
* Chit-Chat/Conversational/Social
* Personal/Subjective/Opinion-Based (Un-searchable)
* Vague/Ambiguous/Lacking Specificity
@philschmid
philschmid / get_memory_size.py
Created January 16, 2025 13:53
Get needed GPU per precision for a Hugging Face Model Id
from typing import Dict, Union
from huggingface_hub import get_safetensors_metadata
import argparse
import sys
# Example:
# python get_gpu_memory.py Qwen/Qwen2.5-7B-Instruct
# Dictionary mapping dtype strings to their byte sizes
bytes_per_dtype: Dict[str, float] = {
from time import time
from datasets import load_dataset
from semhash import SemHash
# if greater than 0.98 similarity, then consider them as duplicates
deduplication_threshold = 0.98
# Load a dataset to deduplicate
ds = load_dataset("arcee-ai/The-Tome", split="train")
# convert message to prompt test
# pip install google-genai
from google import genai
# create client
client = genai.Client(api_key='API_KEY')
# use Gemini 2.0 with Flash Thinking
stream = client.models.generate_content_stream(
model='gemini-2.0-flash-thinking-exp-1219',
contents=f"""Can you crack the code?
import asyncio
import base64
import json
import os
import pyaudio
from websockets.asyncio.client import connect
class SimpleGeminiVoice:
def __init__(self):
Begin by enclosing all thoughts within <thinking> tags, exploring multiple angles and approaches.
Break down the solution into clear steps within <step> tags. Start with a 20-step budget, requesting more for complex problems if needed.
Use <count> tags after each step to show the remaining budget. Stop when reaching 0.
Continuously adjust your reasoning based on intermediate results and reflections, adapting your strategy as you progress.
Regularly evaluate progress using <reflection> tags. Be critical and honest about your reasoning process.
Assign a quality score between 0.0 and 1.0 using <reward> tags after each reflection. Use this to guide your approach:
0.8+: Continue current approach
0.5-0.7: Consider minor adjustments
Below 0.5: Seriously consider backtracking and trying a different approach
Understand the Task: Grasp the main objective, goals, requirements, constraints, and expected output.
- Minimal Changes: If an existing prompt is provided, improve it only if it's simple. For complex prompts, enhance clarity and add missing elements without altering the original structure.
- Reasoning Before Conclusions: Encourage reasoning steps before any conclusions are reached. ATTENTION! If the user provides examples where the reasoning happens afterward, REVERSE the order! NEVER START EXAMPLES WITH CONCLUSIONS!
- Reasoning Order: Call out reasoning portions of the prompt and conclusion parts (specific fields by name). For each, determine the ORDER in which this is done, and whether it needs to be reversed.
- Conclusion, classifications, or results should ALWAYS appear last.
- Examples: Include high-quality examples if helpful, using placeholders [in brackets] for complex elements.
- What kinds of examples may need to be included, how many, and whether they are complex enough to benefit from p
import os
import asyncio
import subprocess
import time
from typing import List, Dict
import torch
from openai import AsyncOpenAI
from tqdm.asyncio import tqdm
import logging
from typing import Dict, List
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
class ArmoRMPipeline:
def __init__(self, model_id, device_map="auto", torch_dtype=torch.bfloat16, truncation=True, trust_remote_code=False, max_length=4096):
self.model = AutoModelForSequenceClassification.from_pretrained(
model_id,
device_map=device_map,
import requests as r
from huggingface_hub import HfFolder
from tqdm import tqdm
from datasets import Dataset
headers = {"Authorization": f"Bearer {HfFolder.get_token()}"}
sess = r.Session()
sess.headers.update(headers)