wassname (Michael J Clark)
@wassname
wassname / argparse_in_jupyter.py
Last active March 5, 2025 06:41
argparse in jupyter?
"""
sometimes you want to run or adapt a cli script from jupyter, here a decent way to do it
# https://gist.github.com/wassname/f1c23636cc04b39176bd82f45c6e398a
"""
def jupyter_argparse(s: str) -> list:
"""
Usage
s = '''
--rank 16
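The snippet above is cut off by the gist preview. One plausible stdlib sketch of the same idea (an assumption about the gist's intent, not the author's actual code) is to turn a multiline CLI-style string into an argv list with `shlex`, then hand it to `parser.parse_args`:

```python
import shlex

def jupyter_argparse(s: str) -> list:
    """Turn a multiline CLI-style string into an argv list, e.g. for
    parser.parse_args(jupyter_argparse(s)) inside a notebook.

    Sketch only; the original gist's implementation is truncated above.
    """
    # shlex handles quoting; comments=True lets you annotate options with '#'
    return shlex.split(s, comments=True)
```

Used as `args = parser.parse_args(jupyter_argparse("--rank 16"))`, which avoids argparse trying to read the kernel's own `sys.argv`.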
@wassname
wassname / gpt4v_on_public_eng_docs.ipynb
Created November 7, 2023 00:29
gpt4v on public domain engineering docs
@wassname
wassname / emojis.json
Created November 4, 2023 01:00
Emojis and their uses, according to an uncensored llama model
{
  "🦆": {
    "tags": [
      "Waterfowl",
      "Bird",
      "Quack"
    ],
    "usage": [
      "🦆🌊: swimming duck",
      "🦆🍞: feeding ducks"
    ]
  }
}
@wassname
wassname / STOP_DOING_MATH.md
Last active January 29, 2026 22:31
It turns out LLMs can generate the STOP DOING MATH meme https://knowyourmeme.com/memes/stop-doing-math

STOP DOING MATH

  • NUMBERS WERE NOT SUPPOSED TO BE GIVEN NAMES
  • YEARS OF COUNTING yet NO REAL-WORLD USE FOUND for going higher than your FINGERS
  • Wanted to go higher anyway for a laugh? We had a tool for that: It was called "GUESSING"
  • "Yes please give me ZERO of something. Please give me INFINITE of it" - Statements dreamed up by the utterly Deranged

LOOK at what Mathematicians have been demanding your Respect for all this time, with all the calculators & abacus we built for them

@wassname
wassname / split_by_token.py
Last active October 7, 2023 01:28
Perfect text splitter for LLMs
"""
When splitting text for Language Models, aim for two properties:
- Limit tokens to a maximum size (e.g., 400)
- Use natural boundaries for splits (e.g. ".")
Many splitters don't enforce a token size limit, causing errors like "device assert" or "out of memory"; others measure character length rather than token length. To address both issues:
- Use RecursiveCharacterTextSplitter from the langchain library
- Set the last separator to an empty string '' so there is always a splitting point, keeping every chunk under the token limit
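The two properties above can be sketched in pure Python. The gist itself recommends langchain's `RecursiveCharacterTextSplitter`; this standalone version is illustrative only, and `token_len` is a crude whitespace proxy you would replace with a real tokenizer's length function:

```python
# Illustrative sketch, not the gist's code: recursively split on natural
# boundaries first, falling back to a hard character split ('' separator)
# so the token limit is always enforceable.
def split_recursive(text, max_tokens=400, separators=(". ", " ", "")):
    def token_len(s):
        return len(s.split())  # stand-in for e.g. len(tokenizer.encode(s))

    if token_len(text) <= max_tokens:
        return [text] if text else []
    sep, rest = separators[0], separators[1:] or separators
    if sep == "":
        # last resort: hard split in the middle of the text
        mid = len(text) // 2
        return (split_recursive(text[:mid], max_tokens, separators)
                + split_recursive(text[mid:], max_tokens, separators))
    chunks = []
    for part in text.split(sep):
        chunks.extend(split_recursive(part, max_tokens, rest))
    return chunks
```

The ordering of `separators` encodes the preference for natural boundaries: sentences first, then words, then raw characters only when nothing else fits.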
@wassname
wassname / cuda_11.8_installation_on_Ubuntu_22.04
Last active October 6, 2023 04:20 — forked from MihailCosmin/cuda_11.8_installation_on_Ubuntu_22.04
Instructions for CUDA v11.8 and cuDNN 8.7 installation on Ubuntu 22.04 for PyTorch 2.0.0
#!/bin/bash
### steps ####
# verify the system has a cuda-capable gpu
# download and install the nvidia cuda toolkit and cudnn
# setup environmental variables
# verify the installation
###
### to verify your gpu is cuda enabled, check:
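The "setup environmental variables" step usually amounts to a few exports appended to `~/.bashrc`; a sketch assuming the default install prefix (adjust the version and path to your own install):

```shell
# assumes CUDA 11.8 installed at the default location /usr/local/cuda-11.8
export CUDA_HOME=/usr/local/cuda-11.8
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

After sourcing these, `nvcc --version` should report the toolkit version and `nvidia-smi` the driver.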
@wassname
wassname / lightning_start.py
Last active April 9, 2025 02:40
This is my cheatsheet of current best practices for pytorch lightning (`lightning_start.py`). It is verbose so that I can delete what is not needed. I mainly log to csv to keep things simple.
"""
This is a template for starting with pytorch lightning; it includes many extra things because it's easier to delete than reinvent.
It is written for these versions:
- lightning==2.0.2
- pytorch-optimizer==2.8.0
"""
import torch
import torch.nn as nn
@wassname
wassname / lightning_utils.py
Created May 8, 2023 05:38
lightning_utils.py to read from the csv logger
import pytorch_lightning as pl
from pytorch_lightning.loggers import CSVLogger, WandbLogger
from pathlib import Path
import pandas as pd
def read_metrics_csv(metrics_file_path):
    """Read a CSVLogger metrics file and average the logged values per epoch."""
    df_hist = pd.read_csv(metrics_file_path)
    # the epoch column is only written on some rows; forward-fill it
    df_hist["epoch"] = df_hist["epoch"].ffill()
    df_histe = df_hist.set_index("epoch").groupby("epoch").mean()
    return df_histe
@wassname
wassname / tsai_inception_causal.py
Last active May 8, 2023 08:10
InceptionTimePlus from tsai modified to be causal
"""
Modified from https://github.com/timeseriesAI/tsai/blob/main/tsai/models/InceptionTimePlus.py
I've added my favourite modifications
- causal padding in the conv
- causal padding the max pool
- coord=True by default
- dilation so it can work over larger context lengths
- specify kernels as [3, 13, 39] instead of the default [13, 19, 39]. This ensures we also have a small kernel, which can help performance.
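The core of the causal-padding change can be sketched in pure Python: output[t] must depend only on inputs at times ≤ t, which is achieved by left-padding with `(kernel_size - 1) * dilation` zeros. This is illustrative only; the gist patches tsai's conv layers, and `causal_conv1d` below is a hypothetical helper, not from the gist:

```python
# Sketch of causal 1-D convolution via left padding (pure Python, no torch).
def causal_conv1d(x, kernel, dilation=1):
    """Convolve x with kernel so that out[t] depends only on x[0..t]."""
    pad = (len(kernel) - 1) * dilation
    padded = [0.0] * pad + list(x)  # zeros on the left, never the right
    out = []
    for t in range(len(x)):
        # kernel[0] reads the current step, higher taps read further into the past
        acc = 0.0
        for k, w in enumerate(kernel):
            acc += w * padded[t + pad - k * dilation]
        out.append(acc)
    return out
```

With a kernel of `[0.0, 1.0]` the output is the input delayed by one step, confirming nothing leaks from the future.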
@wassname
wassname / mcdropout.py
Last active December 26, 2022 00:15
pytorch lightning minimal training loop
from typing import Callable
from loguru import logger
import torch
from torch import nn
def convert_layers(model: nn.Module, original: nn.Module, value: bool):
"""
Turn dropout on