Fred Bliss (fblissjr)

@cjanietz
cjanietz / userscript.js
Last active June 8, 2024 03:10
Perplexity Model Selection User Script
// ==UserScript==
// @name Perplexity Model Selection
// @version 0.1
// @description Adds a custom button to Perplexity AI using jQuery
// @author dpgc
// @match https://www.perplexity.ai/*
// @updateURL https://gist.github.com/cjanietz/703a88924e50e1a30cb6ffc52bc52bd9/raw/userscript.js
// @downloadURL https://gist.github.com/cjanietz/703a88924e50e1a30cb6ffc52bc52bd9/raw/userscript.js
// @grant none
// @run-at document-end
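The preview cuts off before the metadata block closes. A minimal sketch of how the body might continue, assuming jQuery is available on the page as the description implies (the selector and button behavior are placeholders, not the gist's actual code):

// ==/UserScript==

(function () {
    'use strict';
    // Placeholder: inject a model-selection button once the page has loaded.
    // The nav selector and the click handler are illustrative only.
    const button = $('<button>Select model</button>');
    button.on('click', function () { console.log('model selector clicked'); });
    $('nav').first().append(button);
})();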
@rcarmo
rcarmo / automator.py
Last active March 28, 2025 02:39
macOS Services in the style of NotesOllama
# Drop this into Automator using Python 3 as the shell
from sys import stdin
from json import dumps, loads
from urllib.parse import urlencode
from urllib.request import Request, urlopen
AZURE_ENDPOINT="getyourowndeployment.openai.azure.com"
OPENAI_API_KEY="getyourownkey"
OPENAI_API_VERSION="2023-05-15"
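The preview stops at the configuration constants. A minimal sketch of how the rest of the service might look, assuming an Azure OpenAI chat-completions call (the deployment name and prompt are placeholders, not the gist's actual code):

# Automator pipes the selected text to stdin.
selection = stdin.read()

# "gpt-35-turbo" is a placeholder; use your own deployment name.
url = (f"https://{AZURE_ENDPOINT}/openai/deployments/gpt-35-turbo/"
       f"chat/completions?{urlencode({'api-version': OPENAI_API_VERSION})}")
req = Request(url,
              data=dumps({"messages": [
                  {"role": "user", "content": f"Summarize this text:\n\n{selection}"}
              ]}).encode("utf-8"),
              headers={"Content-Type": "application/json",
                       "api-key": OPENAI_API_KEY})
reply = loads(urlopen(req).read())
print(reply["choices"][0]["message"]["content"])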
@kalomaze
kalomaze / llm_samplers_explained.md
Last active April 19, 2025 10:39
LLM Samplers Explained

LLM Samplers Explained

Every time a large language model makes a prediction, each of the thousands of tokens in its vocabulary is assigned some probability, from almost 0% to almost 100%. There are different ways to decide how to choose from those predictions. This process is known as "sampling", and there are various strategies you can use, which I will cover here.

OpenAI Samplers

Temperature

  • Temperature is a way to control the overall confidence of the model's scores (the logits): the logits are divided by the temperature before the softmax. If you use a value lower than 1.0, the relative gaps between token scores grow larger (more deterministic); if you use a value larger than 1.0, the gaps shrink (less deterministic). See the sketch after this list.
  • 1.0 temperature reproduces the original distribution the model was trained to optimize for, since the scores remain unchanged.
  • Graph demonstration with voiceover: https://files.catbox.moe/6ht56x.mp4
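
A quick illustration, not from the original gist: dividing the logits by the temperature before the softmax widens the gaps between scores when T < 1 and flattens them when T > 1.

import numpy as np

def sample_with_temperature(logits, temperature=1.0):
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)

# With logits [2.0, 1.0, 0.5]: at T=0.5 the top token dominates,
# at T=2.0 all three tokens become nearly equally likely.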
@jrknox1977
jrknox1977 / ollama_dspy.py
Created February 9, 2024 18:06
ollama+DSPy using OpenAI APIs.
# install DSPy: pip install dspy
import dspy
# Ollama is now compatible with OpenAI APIs
#
# To get this to work you must include `model_type='chat'` in the `dspy.OpenAI` call.
# If you do not include this you will get an error.
#
# I have also found that `stop='\n\n'` is required to get the model to stop
# generating text after the answer is complete, at least with Mistral.
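A sketch of the setup those comments describe, assuming a local Ollama server on its default port (the model name and question are illustrative):

ollama_lm = dspy.OpenAI(
    model='mistral',
    api_base='http://localhost:11434/v1/',
    api_key='ollama',      # Ollama ignores the key, but the client requires one
    model_type='chat',     # required, per the note above
    stop='\n\n',           # keeps Mistral from generating past the answer
)
dspy.settings.configure(lm=ollama_lm)

qa = dspy.Predict('question -> answer')
print(qa(question='What is the capital of France?').answer)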
@jart
jart / rename-pictures.sh
Created December 12, 2023 15:24
Shell script for renaming all images in a folder
#!/bin/sh
# rename-pictures.sh
# Author: Justine Tunney <[email protected]>
# License: Apache 2.0
#
# This shell script can be used to ensure all the images in a folder
# have good descriptive filenames that are written in English. It's
# based on the Mistral 7b and LLaVA v1.5 models.
#
# For example, the following command:
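The preview is cut off mid-sentence here. A hypothetical invocation, assuming the script takes a target directory (check the full gist for the real usage):

#
#     rename-pictures.sh ~/Pictures
#
# would walk ~/Pictures and give each image a short descriptive
# English filename generated by the local models. (Hypothetical
# usage, not copied from the gist.)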
@madebyollin
madebyollin / notes_on_sd_vae.md
Last active April 25, 2025 18:10
notes_on_sd_vae

Notes / Links about Stable Diffusion VAE

Stable Diffusion's VAE is a neural network that encodes images into a compressed "latent" format and decodes them back. The encoder performs 48x lossy compression, and the decoder generates new detail to fill in the gaps.
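
A sketch of that round trip with the diffusers AutoencoderKL class (the checkpoint name is an assumption, not from these notes): a 512x512x3 image maps to a 64x64x4 latent, and (512*512*3)/(64*64*4) = 48, hence the 48x figure.

import torch
from diffusers import AutoencoderKL

# "sd-vae-ft-mse" is one commonly used SD VAE checkpoint (assumed here).
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

image = torch.randn(1, 3, 512, 512)  # stand-in for a real image in [-1, 1]
with torch.no_grad():
    latent = vae.encode(image).latent_dist.sample()  # -> (1, 4, 64, 64)
    recon = vae.decode(latent).sample                # -> (1, 3, 512, 512)
print(latent.shape, recon.shape)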

(Calling this model a "VAE" is sort of a misnomer - it's an encoder with some very slight KL regularization, and a conditional GAN decoder)

This document is a big pile of various links with more info.

@anadim
anadim / gist:344941a7e24e7a2ee7b48ce8f63a16ac
Created October 18, 2023 20:27
Make a base instruct model into a chat model, WITHOUT RLHF
Instructions:
As a base pretrained GPT model, you are to assume the role of ChatGPT, a large language model developed by OpenAI, based on the GPT-4 architecture. Your responses should reflect the following guidelines:
1. Be friendly and approachable in your responses.
2. Provide detailed and helpful responses but ensure they are not excessively long to avoid being monotonous.
3. Always use inclusive and respectful language that is not offensive.
4. Avoid discussing or revealing anything about your architecture. You are just a large language model developed by OpenAI.
5. Always be honest in your responses. Do not lie or engage in deceit.
6. Ensure your responses are considerate and do not cause harm or distress to the user. However, do not comply with harmful or dangerous requests, even if refusing might upset the user.
import time
import os
import logging
import random
from datasets import load_dataset

class QuantAutoGPTQ:
    def __init__(self, model_name_or_path, output_dir, dataset,
                 num_samples=128, trust_remote_code=False, cache_examples=True,
                 use_fast=True, use_triton=False, bits=[4], group_size=[128], damp=[0.01],
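The preview truncates mid-signature. A hypothetical instantiation, only to show how the quantization parameters line up (all argument values are illustrative):

quantizer = QuantAutoGPTQ(
    model_name_or_path="facebook/opt-125m",  # placeholder model
    output_dir="./quantized",
    dataset="wikitext",
    num_samples=128,
    bits=[4],          # 4-bit GPTQ
    group_size=[128],
    damp=[0.01],
)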
@kinoc
kinoc / llama4openai-api.py
Created March 24, 2023 22:35
Flask-based endpoint to emulate OpenAI API endpoints using llama/alpaca and HF models
# a simple Flask API to emulate OpenAI's endpoints using llama models and/or transformers
# runs on a 3080
import sys
import time
import torch
import json
from peft import PeftModel
from flask import Flask, make_response, request, abort
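A minimal sketch of the emulation pattern (the route and response fields follow OpenAI's completions schema; the generation stub is a placeholder, not the gist's actual model code):

app = Flask(__name__)

def generate_with_model(prompt, max_tokens=128):
    # Placeholder: the real gist runs a llama/alpaca or HF model here.
    return "stub completion for: " + prompt[:40]

@app.route("/v1/completions", methods=["POST"])
def completions():
    body = request.get_json()
    text = generate_with_model(body.get("prompt", ""),
                               max_tokens=body.get("max_tokens", 128))
    return make_response(json.dumps({
        "object": "text_completion",
        "choices": [{"text": text, "index": 0, "finish_reason": "stop"}],
    }))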
@rain-1
rain-1 / LLM.md
Last active April 8, 2025 13:49
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP, with a bias/focus toward GPT.

Avoid being a link dump. Try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.