Classify user search queries as either "Good Google Search Query" or "Bad Google Search Query" based on their likelihood of yielding relevant and helpful results from Google Search.
Input: User search query (text string).
Output: Classification label:
* Good Google Search Query: The query is likely to be effectively answered by Google Search.
* Bad Google Search Query: The query is unlikely to be effectively answered by Google Search. Further categorize "Bad" queries into subtypes for better understanding and classifier training (optional but highly recommended):
  * Chit-Chat/Conversational/Social
  * Personal/Subjective/Opinion-Based (Un-searchable)
  * Vague/Ambiguous/Lacking Specificity
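
The label scheme described above could be represented in code roughly as follows. This is a minimal sketch, not part of the original prompt: the names `QueryLabel`, `BadSubtype`, and `Classification` are hypothetical, and the subtype list only covers the three subtypes visible in the preview, which may be cut off.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class QueryLabel(Enum):
    GOOD = "Good Google Search Query"
    BAD = "Bad Google Search Query"


class BadSubtype(Enum):
    # Only the subtypes visible in the preview; the full prompt may list more.
    CHIT_CHAT = "Chit-Chat/Conversational/Social"
    PERSONAL = "Personal/Subjective/Opinion-Based (Un-searchable)"
    VAGUE = "Vague/Ambiguous/Lacking Specificity"


@dataclass(frozen=True)
class Classification:
    query: str
    label: QueryLabel
    subtype: Optional[BadSubtype] = None  # optional, only meaningful for BAD

    def __post_init__(self) -> None:
        # A subtype on a Good query would contradict the scheme above.
        if self.label is QueryLabel.GOOD and self.subtype is not None:
            raise ValueError("subtype only applies to Bad queries")


example = Classification("how are you today?", QueryLabel.BAD, BadSubtype.CHIT_CHAT)
print(example.label.value, "/", example.subtype.value)
```

Keeping the subtype optional mirrors the prompt, which treats subtyping as recommended rather than required.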
from typing import Dict, Union
from huggingface_hub import get_safetensors_metadata
import argparse
import sys

# Example:
# python get_gpu_memory.py Qwen/Qwen2.5-7B-Instruct

# Dictionary mapping dtype strings to their byte sizes
bytes_per_dtype: Dict[str, float] = {
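
The preview ends at the opening brace of the table, so the original values and the rest of the script are not shown. As a rough illustration of the arithmetic such a table supports, the sketch below multiplies a per-dtype parameter count by a byte size and converts the total to GiB. The table `BYTES_PER_DTYPE`, the function `estimate_weight_gib`, and the example counts are all assumptions for illustration, not the author's code, and the result covers weights only (no activations or KV cache).

```python
from typing import Dict

# Hypothetical byte sizes keyed by safetensors-style dtype strings.
# The original table is truncated, so these entries are assumptions.
BYTES_PER_DTYPE: Dict[str, float] = {
    "F32": 4.0,
    "F16": 2.0,
    "BF16": 2.0,
    "I8": 1.0,
}


def estimate_weight_gib(parameter_count: Dict[str, int]) -> float:
    """Return the memory needed for the weights alone, in GiB.

    `parameter_count` maps a dtype string to the number of parameters
    stored in that dtype. An unknown dtype raises KeyError.
    """
    total_bytes = sum(
        BYTES_PER_DTYPE[dtype] * count for dtype, count in parameter_count.items()
    )
    return total_bytes / 1024**3


# Made-up example: 7 billion parameters stored as BF16.
print(f"{estimate_weight_gib({'BF16': 7_000_000_000}):.2f} GiB")
```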
from time import time
from datasets import load_dataset
from semhash import SemHash

# if greater than 0.98 similarity, then consider them as duplicates
deduplication_threshold = 0.98

# Load a dataset to deduplicate
ds = load_dataset("arcee-ai/The-Tome", split="train")

# convert message to prompt test
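
To make the meaning of the 0.98 threshold concrete, here is a naive, self-contained sketch of similarity-threshold deduplication over embedding vectors. It is not SemHash's implementation or API; the helpers `cosine` and `deduplicate` are hypothetical, the vectors are toy data, and the quadratic loop is only suitable for small inputs.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(vectors: Sequence[Sequence[float]], threshold: float = 0.98) -> List[int]:
    """Return indices of vectors to keep.

    A vector is dropped when its similarity to an already kept vector
    is strictly greater than `threshold`.
    """
    kept: List[int] = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) <= threshold for j in kept):
            kept.append(i)
    return kept


# Toy embeddings: the second vector is nearly parallel to the first.
toy = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
print(deduplicate(toy))
```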
# pip install google-genai
from google import genai

# create client
client = genai.Client(api_key='API_KEY')

# use Gemini 2.0 with Flash Thinking
stream = client.models.generate_content_stream(
    model='gemini-2.0-flash-thinking-exp-1219',
    contents=f"""Can you crack the code?
import asyncio
import base64
import json
import os
import pyaudio
from websockets.asyncio.client import connect


class SimpleGeminiVoice:
    def __init__(self):
Begin by enclosing all thoughts within <thinking> tags, exploring multiple angles and approaches.
Break down the solution into clear steps within <step> tags. Start with a 20-step budget, requesting more for complex problems if needed.
Use <count> tags after each step to show the remaining budget. Stop when reaching 0.
Continuously adjust your reasoning based on intermediate results and reflections, adapting your strategy as you progress.
Regularly evaluate progress using <reflection> tags. Be critical and honest about your reasoning process.
Assign a quality score between 0.0 and 1.0 using <reward> tags after each reflection. Use this to guide your approach:
0.8+: Continue current approach
0.5-0.7: Consider minor adjustments
Below 0.5: Seriously consider backtracking and trying a different approach
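
The reward bands above can be read as a simple decision rule. The sketch below encodes them in a hypothetical helper, `next_action`. One assumption is labeled in the code: the prompt does not say what to do for scores strictly between 0.7 and 0.8, so the sketch folds that gap into the "minor adjustments" band.

```python
def next_action(reward: float) -> str:
    """Map a reflection reward in [0.0, 1.0] to the action the prompt suggests."""
    if not 0.0 <= reward <= 1.0:
        raise ValueError("reward must be between 0.0 and 1.0")
    if reward >= 0.8:
        return "continue current approach"
    # Assumption: scores in (0.7, 0.8) are not covered by the prompt;
    # they are treated here like the 0.5-0.7 band.
    if reward >= 0.5:
        return "consider minor adjustments"
    return "backtrack and try a different approach"


for score in (0.9, 0.6, 0.3):
    print(score, "->", next_action(score))
```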
Understand the Task: Grasp the main objective, goals, requirements, constraints, and expected output.
- Minimal Changes: If an existing prompt is provided, improve it only if it's simple. For complex prompts, enhance clarity and add missing elements without altering the original structure.
- Reasoning Before Conclusions: Encourage reasoning steps before any conclusions are reached. ATTENTION! If the user provides examples where the reasoning happens afterward, REVERSE the order! NEVER START EXAMPLES WITH CONCLUSIONS!
- Reasoning Order: Call out reasoning portions of the prompt and conclusion parts (specific fields by name). For each, determine the ORDER in which this is done, and whether it needs to be reversed.
  - Conclusion, classifications, or results should ALWAYS appear last.
- Examples: Include high-quality examples if helpful, using placeholders [in brackets] for complex elements.
  - What kinds of examples may need to be included, how many, and whether they are complex enough to benefit from p
import os
import asyncio
import subprocess
import time
from typing import List, Dict
import torch
from openai import AsyncOpenAI
from tqdm.asyncio import tqdm
import logging
from typing import Dict, List
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer


class ArmoRMPipeline:
    def __init__(self, model_id, device_map="auto", torch_dtype=torch.bfloat16, truncation=True, trust_remote_code=False, max_length=4096):
        self.model = AutoModelForSequenceClassification.from_pretrained(
            model_id,
            device_map=device_map,
import requests as r
from huggingface_hub import HfFolder
from tqdm import tqdm
from datasets import Dataset

headers = {"Authorization": f"Bearer {HfFolder.get_token()}"}
sess = r.Session()
sess.headers.update(headers)