Cameron Pfiffer cpfiffer

@cpfiffer
cpfiffer / streaming-turing.jl
Created March 3, 2020 14:19
Stream Turing samples onto disk as they come.
using Turing, MCMCChains
using AbstractMCMC
using JLD2, FileIO
import Random: GLOBAL_RNG
# Create a model.
@model model(y) = begin
    μ ~ Normal(0, 1)
    s ~ InverseGamma(2, 3)
Formula:
returns ~ BITX + lag_returns + log_active + lag_active + log_avg_size + log_med_size + log_mode_size + native_transactions + :(fe(yearmonth))
Fixed Effect Model
=======================================================================================
Number of obs:      1066    Degrees of freedom:    44
R2:                0.157    R2 Adjusted:        0.122
F Statistic:     12.4004    p-value:            0.000
R2 within:         0.106    Iterations:             1
Converged:          true
@cpfiffer
cpfiffer / poly-test.jl
Created January 13, 2021 06:54
A super simple estimation of a function using its derivative.
# Set up environment
import Pkg; Pkg.activate(".")
# Imports
using Distributions
using Plots
using Polynomials
using Optim
using ForwardDiff
@cpfiffer
cpfiffer / mdc.sh
Created June 17, 2024 20:41
Quick julia script to display markdown
#!/usr/bin/env bash
program="
import Markdown
map(ARGS) do f
    c = if isfile(f)
        read(f, String)
    else
        f
    end
    display(Markdown.parse(c))
@cpfiffer
cpfiffer / coinflip.py
Created September 10, 2024 21:11
Coin flipping with outlines
import outlines
from transformers import BitsAndBytesConfig
# Load the model
model = outlines.models.transformers(
    "microsoft/Phi-3-mini-4k-instruct",
    model_kwargs={
        'quantization_config': BitsAndBytesConfig(
            # Load the model in 4-bit mode
            load_in_4bit=True,
@cpfiffer
cpfiffer / text-to-sql.py
Created October 17, 2024 18:06
Outlines code to generate text-to-sql from a context-free grammar
import outlines
sql_grammar = """
start: set_expr -> final
set_expr: query_expr
    | set_expr "UNION"i ["DISTINCT"i] set_expr -> union_distinct
    | set_expr "UNION"i "ALL"i set_expr -> union_all
    | set_expr "INTERSECT"i ["DISTINCT"i] set_expr -> intersect_distinct
    | set_expr "EXCEPT"i ["DISTINCT"i] set_expr -> except_distinct
@cpfiffer
cpfiffer / pdf-to-structure.py
Last active January 22, 2025 18:14
Get structured output from PDFs. Goes through a PDF one page at a time -- it is not currently built for multiple pages, but could be extended as needed.
"""
pip install outlines torch==2.4.0 transformers accelerate typing-extensions pillow pdf2image rich requests
may need to install tkinter: https://stackoverflow.com/questions/25905540/importerror-no-module-named-tkinter
sudo apt-get install poppler-utils
"""
from enum import Enum
from io import BytesIO
@cpfiffer
cpfiffer / text-to-sql.md
Created November 15, 2024 17:41
text to sql with outlines

Text to SQL

!!! note This example was adapted from Morgan Giraud on our Discord. You can find their twitter here. Thank you Morgan!

Outlines provides experimental support for context-free grammars (CFGs) for text generation. Future versions will provide more comprehensive support for structured outputs.

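SQL's syntax can be described by a context-free grammar: each rule expands the same way wherever it appears, so the structure of a query can be specified independently of its content (table names, column names, and literal values).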
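
To make the idea of structure versus content concrete, here is a minimal sketch of a recognizer for a tiny, hypothetical SQL-like grammar. It uses only the Python standard library and does not use Outlines or the grammar from the gist above. The grammar, the `Parser` class, and the `is_valid` helper are all assumptions made for illustration.

```python
import re

# Hypothetical toy grammar (a heavily reduced SQL subset, for illustration only):
#   set_expr : query (set_op query)*
#   query    : "SELECT" columns "FROM" NAME
#   columns  : "*" | NAME ("," NAME)*
#   set_op   : "UNION" ["ALL" | "DISTINCT"] | "INTERSECT" | "EXCEPT"

KEYWORDS = {"SELECT", "FROM", "UNION", "ALL", "DISTINCT", "INTERSECT", "EXCEPT"}
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z_0-9]*|\*|,)")


def tokenize(text):
    """Split text into words, '*' and ','; raise ValueError on anything else."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent recognizer with one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Keywords are matched case-insensitively, so compare in upper case.
        if self.pos < len(self.tokens):
            return self.tokens[self.pos].upper()
        return None

    def accept(self, symbol):
        if self.peek() == symbol:
            self.pos += 1
            return True
        return False

    def expect(self, symbol):
        if not self.accept(symbol):
            raise ValueError(f"expected {symbol!r} at token {self.pos}")

    def name(self):
        # Any identifier that is not a keyword is accepted: the grammar
        # constrains where a name may appear, not what the name is.
        token = self.peek()
        if token is None or token in KEYWORDS or token in {"*", ","}:
            raise ValueError(f"expected a name at token {self.pos}")
        self.pos += 1

    def columns(self):
        if self.accept("*"):
            return
        self.name()
        while self.accept(","):
            self.name()

    def query(self):
        self.expect("SELECT")
        self.columns()
        self.expect("FROM")
        self.name()

    def set_op(self):
        if self.accept("UNION"):
            if not self.accept("ALL"):
                self.accept("DISTINCT")  # optional modifier
            return True
        return self.accept("INTERSECT") or self.accept("EXCEPT")

    def set_expr(self):
        self.query()
        while self.set_op():
            self.query()


def is_valid(text):
    """Return True if text matches the toy grammar in full."""
    try:
        parser = Parser(tokenize(text))
        parser.set_expr()
    except ValueError:
        return False
    return parser.pos == len(parser.tokens)


# Same structure, different content: both are accepted.
print(is_valid("select id, name from users UNION ALL select id, name from admins"))  # True
print(is_valid("SELECT a, b FROM t UNION ALL SELECT c, d FROM u"))  # True
# Broken structure: rejected regardless of the names used.
print(is_valid("SELECT FROM users"))  # False
```

The two accepted queries differ only in their identifiers, while the rejected one fails on structure alone. Grammar-constrained generation relies on the same property: the grammar restricts which shapes of output are allowed and leaves the names and values to the model.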

@cpfiffer
cpfiffer / stop-sign.py
Created December 4, 2024 21:59
Reading road signs
"""
pip install outlines torch==2.4.0 transformers accelerate pillow rich
sudo apt-get install poppler-utils
"""
from enum import Enum
from PIL import Image
import outlines
import torch
@cpfiffer
cpfiffer / highlights.py
Created January 15, 2025 21:17
Extract highlights from a piece of content.
import outlines
from pydantic import BaseModel
from transformers import AutoTokenizer
from rich import print
model_string = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
model = outlines.models.transformers(model_string)
tokenizer = AutoTokenizer.from_pretrained(model_string)
class Highlights(BaseModel):