Peter pszemraj
@pszemraj
pszemraj / wsl_clipboard.md
Created September 16, 2025 15:35
CLI/'programmatically' copy to clipboard from wsl

copy to clipboard from wsl

An adapted version of the function I use on Ubuntu; it leverages WSL exposing clip.exe to get content onto the Windows clipboard.

# Copy file contents or stdin to clipboard
# Usage: cz [file]
#   cz file.txt  - copy file to clipboard
#   cmd | cz     - copy stdin to clipboard
# Fails on: binary files, files >10MB, non-existent files
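The preview above is just the usage comment; the gist's actual cz is a shell function. As a minimal sketch of the same clip.exe trick in Python (the size/binary checks are omitted and the helper name is made up):

import subprocess
import sys
from pathlib import Path

def copy_to_clipboard(text: str) -> None:
    # clip.exe is the Windows clipboard tool that WSL exposes on PATH;
    # non-ASCII text may need re-encoding (e.g. UTF-16), which this sketch skips
    subprocess.run(["clip.exe"], input=text.encode(), check=True)

if __name__ == "__main__":
    # cz-style behavior: argument = file to copy, otherwise read stdin
    payload = Path(sys.argv[1]).read_text() if len(sys.argv) > 1 else sys.stdin.read()
    copy_to_clipboard(payload)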
@pszemraj
pszemraj / clock.html
Last active September 12, 2025 04:37
simple html clock
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1" />
  <title>Clock AI</title>
  <style>
    :root {
      --bg: #0a0b0d;
      --panel: #141518;
@pszemraj
pszemraj / lfm_1b6.py
Created September 8, 2025 07:59
LFM2-VL inference with recommended params
from transformers import AutoProcessor, AutoModelForImageTextToText
from transformers.image_utils import load_image
# Load model and processor
model_id = "LiquidAI/LFM2-VL-1.6B"
model = AutoModelForImageTextToText.from_pretrained(
    model_id, device_map="auto", torch_dtype="bfloat16", trust_remote_code=True
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
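The preview stops after loading the processor. A hedged sketch of the generation step follows; the image URL, prompt, and max_new_tokens are placeholders, and the gist's recommended sampling params are not reproduced here.

image = load_image("https://example.com/sample.jpg")  # placeholder URL

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe this image."},
        ],
    },
]

inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])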
@pszemraj
pszemraj / alacritty.toml
Created August 29, 2025 03:00
config for alacritty+0xProto
# Font configuration - all font settings in ONE section
[font]
size = 15.0
builtin_box_drawing = false # 0xProto has its own box drawing chars
[font.normal]
family = "0xProto"
style = "Regular"
[font.bold]
@pszemraj
pszemraj / emoji_search.py
%%writefile emoji_search.py
#!/usr/bin/env python3
"""
Emoji Semantic Search CLI
reqs:
pip install fire sentence-transformers pandas numpy
Usage:
python emoji_search.py "that is flames"
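Only the docstring shows in the preview. Below is a minimal sketch of the core lookup (encode descriptions with sentence-transformers, rank by cosine similarity); the model name and the tiny emoji table are placeholders, not the gist's data.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model; the gist may use another

# placeholder emoji -> description table; the real script loads a full dataset
table = {"🔥": "fire flames hot lit", "😂": "laughing tears funny", "🎉": "party popper celebration"}
emojis = list(table)
corpus = model.encode(list(table.values()), normalize_embeddings=True)

query = model.encode("that is flames", normalize_embeddings=True)
scores = util.cos_sim(query, corpus)[0]
print(emojis[int(scores.argmax())])  # -> 🔥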
@pszemraj
pszemraj / llm-foundry-config-reference.md
Last active July 3, 2025 21:02
config reference for mosaicml/llm-foundry by opus-4
@pszemraj
pszemraj / test_gemma3n.py
Created June 29, 2025 21:29
test inference with gemma-3n-e2b-it
# -*- coding: utf-8 -*-
"""gemma-3n-test
pip install -U -q git+https://github.com/huggingface/transformers.git
pip install -U -q git+https://github.com/huggingface/pytorch-image-models.git
"""
from transformers import pipeline
import torch
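The preview ends at the imports. A hedged sketch of the pipeline call follows; the task name, hub id, and prompt are assumptions rather than the gist's exact values.

pipe = pipeline(
    "image-text-to-text",            # gemma-3n is multimodal; text-only chat prompts also work
    model="google/gemma-3n-E2B-it",  # assumed hub id for gemma-3n-e2b-it
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

messages = [
    {"role": "user", "content": [{"type": "text", "text": "Summarize what WSL is in one sentence."}]},
]
out = pipe(text=messages, max_new_tokens=64)
# chat-style return format; adjust the indexing if your transformers version returns a plain string
print(out[0]["generated_text"][-1]["content"])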
@pszemraj
pszemraj / slice_image.py
Created June 28, 2025 19:53
Slice a tall image into chunks.
#!/usr/bin/env python3
"""
Slice a (possibly very tall) image into fixed-height chunks.
Creates a sibling directory called <image stem>_slices/
and writes slice_000.png, slice_001.png, … inside it.
"""
import argparse
from pathlib import Path
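The preview stops at the imports. A minimal sketch of the slicing loop the docstring describes, assuming Pillow and a placeholder chunk height:

from PIL import Image

def slice_image(path: Path, chunk_height: int = 1024) -> Path:
    img = Image.open(path)
    out_dir = path.parent / f"{path.stem}_slices"
    out_dir.mkdir(exist_ok=True)
    width, height = img.size
    for i, top in enumerate(range(0, height, chunk_height)):
        # crop box is (left, upper, right, lower); the last slice may be shorter
        img.crop((0, top, width, min(top + chunk_height, height))).save(out_dir / f"slice_{i:03d}.png")
    return out_dir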
@pszemraj
pszemraj / push_dataset_from_text.py
Last active June 27, 2025 02:56
aggregate and push an hf dataset from text files
"""
Create & save an hf dataset with train/test/val splits from dir w/ text files
Ideal structure:
root / section_name_1 / file 1
root / section_name_1 / file 2
root / section_name_1 / file YYY
root / section_name_2 / file 1
root / section_name_2 / file ZZZ
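Only the docstring is shown above. A hedged sketch of the aggregation it describes, assuming the datasets library; the repo id, split sizes, and column names are placeholders.

from pathlib import Path
from datasets import Dataset, DatasetDict

def build_dataset(root: str) -> DatasetDict:
    # one record per text file, tagged with its section (parent dir) name
    records = [
        {"text": f.read_text(encoding="utf-8"), "section": f.parent.name}
        for f in Path(root).rglob("*.txt")
    ]
    ds = Dataset.from_list(records)
    split = ds.train_test_split(test_size=0.1, seed=42)
    holdout = split["test"].train_test_split(test_size=0.5, seed=42)
    return DatasetDict(
        {"train": split["train"], "validation": holdout["train"], "test": holdout["test"]}
    )

# build_dataset("root").push_to_hub("user/my-text-dataset")  # placeholder repo id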
@pszemraj
pszemraj / run_ocr_nanonets.py
Last active June 18, 2025 01:52
Standalone Asynchronous Nanonets-OCR-s Inference Script using vLLM and PyMuPDF.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Standalone Asynchronous Nanonets-OCR-s Inference Script using vLLM and PyMuPDF.
This script processes PDF files from an input directory using the
nanonets/Nanonets-OCR-s model served locally by vLLM via its OpenAI-compatible API.
It renders each page, sends API requests concurrently for OCR, extracts the
structured markdown/HTML text, and saves the combined text for each PDF into a
corresponding .txt file in the specified output directory.
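A hedged sketch of the per-page flow the docstring describes (render with PyMuPDF, send to the local vLLM OpenAI-compatible endpoint); the base URL, prompt, and token limit are placeholders.

import base64

import fitz  # PyMuPDF
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed local vLLM server

async def ocr_page(pdf_path: str, page_no: int) -> str:
    # render one page to PNG bytes and send it as a data URL to the OCR model
    page = fitz.open(pdf_path)[page_no]
    png = page.get_pixmap(dpi=150).tobytes("png")
    b64 = base64.b64encode(png).decode()
    resp = await client.chat.completions.create(
        model="nanonets/Nanonets-OCR-s",
        messages=[{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
                {"type": "text", "text": "Extract this page as structured markdown."},  # placeholder prompt
            ],
        }],
        max_tokens=4096,
    )
    return resp.choices[0].message.content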