Meet Shah (meetps)
@benjaminshafii
benjaminshafii / basel.js
Created August 10, 2023 00:40
Auto-generate OpenAPI spec w/ Anthropic Claude from any programming language
const Anthropic = require('@anthropic-ai/sdk');
const path = require('path');
const YAML = require('yaml');
const fs = require('fs');
// Initialize Anthropic SDK
const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
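The preview above stops right after initializing the SDK. As a rough, hedged illustration of the same idea (not the gist's actual code, and in Python rather than the gist's Node.js), the sketch below sends a source file to Claude and asks for an OpenAPI YAML spec; the model name, prompt wording, and file path are placeholders.

# Sketch only (not the gist's code): ask Claude to emit an OpenAPI spec for a source file.
# Model name, prompt, and file path are placeholders.
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def generate_openapi_spec(source_path: str) -> str:
    with open(source_path, "r", encoding="utf-8") as f:
        source = f.read()
    message = client.messages.create(
        model="claude-3-haiku-20240307",  # placeholder; use whichever model you have access to
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": "Generate an OpenAPI 3.0 spec in YAML for the HTTP routes "
                       "defined in this code:\n\n" + source,
        }],
    )
    return message.content[0].text

if __name__ == "__main__":
    print(generate_openapi_spec("app.py"))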
@rain-1
rain-1 / llama-home.md
Last active November 9, 2024 03:49
How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

LLaMA is a text-prediction model, similar to GPT-2 or to the version of GPT-3 that has not been fine-tuned yet. It should also be possible to run fine-tuned versions (like Alpaca or Vicuna) with this; those versions are more focused on answering questions.

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is now possible to run LLaMA 13B with a 6GB graphics card (e.g. an RTX 2060), thanks to the amazing work on llama.cpp. The latest change is CUDA/cuBLAS support, which lets you pick an arbitrary number of transformer layers to run on the GPU. This is perfect for low VRAM; a hedged sketch of the layer-offload idea follows the step below.

  • Clone llama.cpp from git; I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.
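The gist itself drives the llama.cpp command line; as a hedged illustration of the same layer-offloading idea, here is a minimal sketch using the llama-cpp-python bindings (a different wrapper than the gist uses). The model path, quantization format, and layer count are assumptions to adjust for your hardware.

# Sketch of partial GPU offload using the llama-cpp-python bindings,
# not the llama.cpp CLI the gist describes. Paths and layer counts are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-13b.Q4_0.gguf",  # placeholder path to a quantized 13B model
    n_gpu_layers=18,   # number of transformer layers to offload to the GPU; tune to fit 6GB VRAM
    n_ctx=2048,        # context window
)

out = llm("The capital of France is", max_tokens=16)
print(out["choices"][0]["text"])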

ImageNet validation set fix:

  1. The training set is organized in directories, with each directory matching a class, e.g. "n01751748" matching "sea snake". However, the validation set is a flat directory of JPEGs. The ImageNet labels provided in the devkit for the validation set (ILSVRC2012_validation_ground_truth.txt) are not consistent with the class ordering used by PyTorch/TF/Keras/MXNet/Caffe, etc. for pre-trained models. For example, in the above ground-truth label file "sea snake" is 490, but in PyTorch/TF it is 65.
     Proof:
  2. Untar the validation set file; you will get a flat directory of JPEGs.
  3. Pull the unflattening script into the directory where the val images were unpacked; a hedged sketch of what such a script does follows this list.
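As a hedged sketch of the unflattening step (not the actual script, which is commonly a shell script such as valprep.sh), the following assumes you already have a mapping file listing each validation JPEG and its synset ID, e.g. lines like "ILSVRC2012_val_00000001.JPEG n01751748"; the mapping file name and format here are assumptions.

# Sketch only: move flat validation JPEGs into per-synset directories,
# assuming a mapping file "val_map.txt" with lines "<filename> <synset_id>".
# The mapping file name/format is an assumption, not something from the gist.
import os
import shutil

def unflatten_val(val_dir: str, mapping_path: str) -> None:
    with open(mapping_path) as f:
        for line in f:
            filename, synset = line.split()
            class_dir = os.path.join(val_dir, synset)
            os.makedirs(class_dir, exist_ok=True)
            shutil.move(os.path.join(val_dir, filename),
                        os.path.join(class_dir, filename))

unflatten_val("ILSVRC2012_img_val", "val_map.txt")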
# link to package: https://github.com/lucidrains/slot-attention
import torch
from torch import nn

class Residual(nn.Module):
    # Wraps a module and adds a skip connection around its output
    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x) + x
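A quick usage sketch of the Residual wrapper above (the MLP and tensor sizes here are arbitrary illustrations, not part of the gist):

# Usage sketch: wrap a small MLP in the Residual block defined above.
mlp = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
block = Residual(mlp)
x = torch.randn(8, 64)
y = block(x)  # same shape as x; equals mlp(x) + x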
@mikhailov-work
mikhailov-work / turbo_colormap.py
Created August 8, 2019 23:31
Turbo Colormap Look-up Table
# Copyright 2019 Google LLC.
# SPDX-License-Identifier: Apache-2.0
# Author: Anton Mikhailov
turbo_colormap_data = [[0.18995,0.07176,0.23217],[0.19483,0.08339,0.26149],[0.19956,0.09498,0.29024],[0.20415,0.10652,0.31844],[0.20860,0.11802,0.34607],[0.21291,0.12947,0.37314],[0.21708,0.14087,0.39964],[0.22111,0.15223,0.42558],[0.22500,0.16354,0.45096],[0.22875,0.17481,0.47578],[0.23236,0.18603,0.50004],[0.23582,0.19720,0.52373],[0.23915,0.20833,0.54686],[0.24234,0.21941,0.56942],[0.24539,0.23044,0.59142],[0.24830,0.24143,0.61286],[0.25107,0.25237,0.63374],[0.25369,0.26327,0.65406],[0.25618,0.27412,0.67381],[0.25853,0.28492,0.69300],[0.26074,0.29568,0.71162],[0.26280,0.30639,0.72968],[0.26473,0.31706,0.74718],[0.26652,0.32768,0.76412],[0.26816,0.33825,0.78050],[0.26967,0.34878,0.79631],[0.27103,0.35926,0.81156],[0.27226,0.36970,0.82624],[0.27334,0.38008,0.84037],[0.27429,0.39043,0.85393],[0.27509,0.40072,0.86692],[0.27576,0.41097,0.87936],[0.27628,0.42118,0.89123],[0.27667,0.43134,0.90254],[0.27691,0.44145,0.913
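The gist ships the full 256-entry table (truncated in this preview) plus helper functions for indexing it. As a hedged sketch of how such a look-up table is typically used (not the gist's own interpolation helper), the following maps a scalar in [0, 1] to an RGB triple by linear interpolation between neighbouring table entries.

# Sketch: map a value in [0, 1] to an RGB colour by linearly interpolating
# the turbo_colormap_data table defined above (assumes the full 256-entry table).
def turbo(value: float) -> tuple:
    value = min(max(value, 0.0), 1.0)            # clamp to [0, 1]
    x = value * (len(turbo_colormap_data) - 1)   # fractional index into the table
    lo = int(x)
    hi = min(lo + 1, len(turbo_colormap_data) - 1)
    frac = x - lo
    return tuple(a + (b - a) * frac
                 for a, b in zip(turbo_colormap_data[lo], turbo_colormap_data[hi]))

print(turbo(0.5))  # mid-range colour, roughly green for the Turbo colormap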

Repo

Haven't decided if I am going to submit code to it yet

Questions

👍 - Unlocked & Done
😊 - Locked & Done
🔒 - Locked & Not Done
[] - Not yet Done

@mcarilli
mcarilli / gist_replay.md
Last active July 31, 2020 00:00
Example of batch replay with Amp opt_level=O1 + dynamic gradient scaling

This example is based on main_amp.py from the Apex imagenet amp examples and can be used with the same example commands. It demonstrates batch replay (instead of batch skipping) with the dynamic gradient scaling used by Amp.

Batch replay requires a bit of user-side control flow, but is fairly straightforward.

Ctrl+f "added for batch replay" in main_amp_replay.py below to see what was changed. There should only be 5 instances, found entirely in this section.

Vimdiffing main_amp_replay.py and main_amp.py from the Apex example directory is also instructive. Again, there should be few differences.

See the "Batch replay" example in the Automatic Mixed Precision RFC for a preview of how I plan this will wor

@dojoteef
dojoteef / profile.py
Last active June 5, 2023 11:44
A CUDA memory profiler for pytorch
'''
Memory profiling utilities
'''
import gc
import inspect
import linecache
import os.path
import sys
import time
import threading
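The preview cuts off after the imports. As a hedged sketch of the underlying mechanism such a profiler builds on (not the gist's actual API), PyTorch exposes per-device allocation counters that can be sampled around a block of code:

# Sketch: measure the change in allocated CUDA memory around a block of code.
# This only illustrates the PyTorch counters a profiler like this builds on.
import contextlib
import torch

@contextlib.contextmanager
def cuda_mem_delta(label="block"):
    torch.cuda.synchronize()
    before = torch.cuda.memory_allocated()
    yield
    torch.cuda.synchronize()
    after = torch.cuda.memory_allocated()
    print(f"{label}: {(after - before) / 1024**2:.1f} MiB allocated")

with cuda_mem_delta("forward pass"):
    x = torch.randn(64, 3, 224, 224, device="cuda")
    y = torch.nn.Conv2d(3, 64, 3).cuda()(x)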
@Mahedi-61
Mahedi-61 / cuda_11.8_installation_on_Ubuntu_22.04
Last active November 16, 2024 10:21
Instructions for CUDA v11.8 and cuDNN 8.9.7 installation on Ubuntu 22.04 for PyTorch 2.1.2
#!/bin/bash
### steps ####
# Verify the system has a cuda-capable gpu
# Download and install the nvidia cuda toolkit and cudnn
# Set up environment variables
# Verify the installation
###
### to verify your gpu is cuda enabled, check
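One of the listed steps is verifying the installation; a quick hedged check from the PyTorch side (assuming PyTorch 2.1.2 built for CUDA 11.8 is already installed) is:

# Quick sanity check that PyTorch sees the CUDA 11.8 toolkit and the GPU.
import torch

print(torch.__version__)               # e.g. 2.1.2+cu118
print(torch.version.cuda)              # expected: 11.8
print(torch.cuda.is_available())       # True if the driver/toolkit are set up correctly
print(torch.backends.cudnn.version())  # cuDNN version, e.g. 8907 for 8.9.7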
@SSS135
SSS135 / batch_renormalization.py
Last active June 4, 2018 20:54
Testing Batch Renormalization with Tensor Comprehensions
import time
import tensor_comprehensions as tc
import tensor_comprehensions.tc_unit as tcu
import torch
import torch.cuda
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable, Function
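The preview ends at the imports. As a hedged reference for what the kernel being tested computes (written in plain PyTorch rather than Tensor Comprehensions, so this is not the gist's code), batch renormalization corrects the batch statistics toward the running statistics using clipped factors r and d (Ioffe, 2017):

# Sketch of the batch renormalization forward pass in plain PyTorch
# (not the Tensor Comprehensions kernel the gist benchmarks).
import torch

def batch_renorm(x, running_mean, running_std, r_max=3.0, d_max=5.0, eps=1e-5):
    # x: (N, C); running statistics have shape (C,)
    batch_mean = x.mean(dim=0)
    batch_std = x.std(dim=0, unbiased=False) + eps
    r = (batch_std / running_std).clamp(1.0 / r_max, r_max).detach()
    d = ((batch_mean - running_mean) / running_std).clamp(-d_max, d_max).detach()
    return (x - batch_mean) / batch_std * r + d

x = torch.randn(32, 16)
y = batch_renorm(x, running_mean=torch.zeros(16), running_std=torch.ones(16))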