Abdullah Mohammed (abodacs)

@abodacs
abodacs / whisper-static-cache.ipynb
Created June 3, 2024 09:53 — forked from huseinzol05/whisper-static-cache.ipynb
example of whisper static cache
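The notebook preview does not render on the gist page. As a rough illustration of the technique in the title (a sketch of mine, not the notebook's code), recent transformers releases let generation use a fixed-size KV cache via generation_config.cache_implementation, which makes the decode loop friendly to torch.compile:

import numpy as np
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")
# opt into a static (fixed-size) KV cache; assumes a transformers version in
# which Whisper generation supports cache_implementation="static"
model.generation_config.cache_implementation = "static"

audio = np.zeros(16000, dtype=np.float32)  # one second of silence at 16 kHz
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
ids = model.generate(inputs.input_features, max_new_tokens=32)
print(processor.batch_decode(ids, skip_special_tokens=True))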
@abodacs
abodacs / 00_install_fbsd_14_0_hetzner.md
Created August 12, 2024 11:35 — forked from ctsrc/00_install_fbsd_14_1_hetzner.md
Install FreeBSD 14.0 on Hetzner

Install FreeBSD 14.0 on a Hetzner server

Hetzner no longer offers direct install of FreeBSD, but we can do it ourselves. Here is how :)

Boot the Hetzner server into Hetzner's Debian-based rescue mode, SSH into it, then:

wget https://mfsbsd.vx.sk/files/iso/14/amd64/mfsbsd-14.0-RELEASE-amd64.iso

qemu-system-x86_64 \
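# (the preview cuts the command off here; in the source gist qemu boots the
#  mfsbsd ISO in a VM with the server's physical disk attached, so the FreeBSD
#  installer running inside the VM can write directly to that disk)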
@abodacs
abodacs / testRegex.js
Created August 18, 2024 11:02 — forked from hanxiao/testRegex.js
Regex for chunking by using all semantic cues
// Updated: Aug. 15, 2024
// Run: node testRegex.js testText.txt
// Used in https://jina.ai/tokenizer
const fs = require('fs');
const util = require('util');
// Define variables for magic numbers
const MAX_HEADING_LENGTH = 7;
const MAX_HEADING_CONTENT_LENGTH = 200;
const MAX_HEADING_UNDERLINE_LENGTH = 200;
@abodacs
abodacs / gist:2dd057c8611145f00a77e8685cfa9d05
Created August 27, 2024 12:48 — forked from LukasKriesch/gist:e75a0132e93ca989f8870c4f95be734b
Python translation of the Jina AI chunking regex
import regex as re
import requests
MAX_HEADING_LENGTH = 7
MAX_HEADING_CONTENT_LENGTH = 200
MAX_HEADING_UNDERLINE_LENGTH = 200
MAX_HTML_HEADING_ATTRIBUTES_LENGTH = 100
MAX_LIST_ITEM_LENGTH = 200
MAX_NESTED_LIST_ITEMS = 6
MAX_LIST_INDENT_SPACES = 7
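The preview stops at the constants. As a minimal sketch of how such limits are interpolated into the chunking pattern (a hypothetical sub-pattern of mine, not the gist's full regex):

import re

MAX_HEADING_LENGTH = 7
MAX_HEADING_CONTENT_LENGTH = 200

# a markdown ATX heading: 1-7 '#' characters, then up to 200 chars of content
heading = re.compile(
    rf"^#{{1,{MAX_HEADING_LENGTH}}}\s.{{1,{MAX_HEADING_CONTENT_LENGTH}}}$",
    re.MULTILINE,
)
print(heading.findall("# Title\nbody text\n## Section two"))
# -> ['# Title', '## Section two']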
@abodacs
abodacs / llama_torchao_compile.py
Created August 28, 2024 16:35 — forked from SunMarc/llama_torchao_compile.py
`transformers` + `torchao` quantization + `torch.compile` on Llama3.1 8B
# REQUIRES torchao, torch nightly (or torch 2.5) and transformers
from transformers import AutoTokenizer, AutoModelForCausalLM, TorchAoConfig
from transformers import TextStreamer
import torch
from tqdm import tqdm
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false" # To prevent long warnings :)
torch.set_float32_matmul_precision('high')
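The preview ends after the setup. A minimal sketch of how the pieces named in the title typically combine (model id, quantization type, and group size here are illustrative, reusing the imports above):

quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B")
# compile the forward pass; "reduce-overhead" captures CUDA graphs to cut
# per-step launch overhead during decoding
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)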
@abodacs
abodacs / debugHelper.js
Created September 18, 2024 08:36 — forked from karenpayneoregon/debugHelper.js
CSS Helper
var $debugHelper = $debugHelper || {};
$debugHelper = function () {
    var href = "lib/debugger.css";
    var addCss = function () {
        if (styleStyleIsLoaded(href) === true) {
            return;
        }
        const head = document.head;
        const link = document.createElement("link");
        link.type = "text/css";
@abodacs
abodacs / intercom_export.js
Created October 28, 2024 16:50 — forked from satyrius/intercom_export.js
Export the entire history of Intercom conversations/chats
require("dotenv").config();
const H = require("highland");
const axios = require("axios");
const fs = require("fs").promises;
const exportDirectory = "./export";
const apiUrl = "https://api.intercom.io";
// config axios for the authorized API request
const apiClient = axios.create({
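A rough Python sketch of the same export idea (the endpoint and the pages.next cursor shape are assumptions about Intercom's REST API, not taken from the gist):

import os
import requests

API_URL = "https://api.intercom.io"
headers = {
    "Authorization": f"Bearer {os.environ['INTERCOM_ACCESS_TOKEN']}",
    "Accept": "application/json",
}

url = f"{API_URL}/conversations"
while url:
    data = requests.get(url, headers=headers).json()
    for conversation in data.get("conversations", []):
        print(conversation["id"])  # fetch each conversation's parts here
    url = (data.get("pages") or {}).get("next")  # assumed: URL of the next page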
@abodacs
abodacs / flux_infer.py
Created November 20, 2024 23:33 — forked from gau-nernst/flux_infer.py
FLUX CPU offload
import torch
from diffusers import FluxPipeline
from torch import nn
class ModelOffloaderV2:
    def __init__(self, model: nn.Module, record_stream: bool = False):
        # move model to pinned memory. keep a model copy in CPU pinned memory.
        for p in model.parameters():
            p.data = p.data.cpu().pin_memory()
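The preview shows only the pinning step. A minimal sketch of why pinned host memory matters for this offloader (my simplification of the idea; requires a CUDA device):

import torch
from torch import nn

layer = nn.Linear(4096, 4096)
for p in layer.parameters():
    p.data = p.data.cpu().pin_memory()  # pinned pages allow truly async copies

copy_stream = torch.cuda.Stream()
with torch.cuda.stream(copy_stream):
    for p in layer.parameters():
        # non_blocking=True only overlaps with compute when the source is pinned
        p.data = p.data.to("cuda", non_blocking=True)
torch.cuda.current_stream().wait_stream(copy_stream)  # sync before using weights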
@abodacs
abodacs / wrapper.py
Created November 23, 2024 11:56 — forked from charlesfrye/wrapper.py
Train GPT-2 in five minutes -- for free!
# Train GPT-2 in five minutes -- for free
#
# ```bash
# pip install modal
# modal setup
# modal run wrapper.py
# ```
#
# Note that the end-to-end latency the first time is more like 25 minutes:
# - five minutes to install Torch (rip)
import multiprocessing
manager = multiprocessing.Manager()
all_hashes_set = manager.dict()
def deduplicate(examples, all_hashes_set):
    print(len(all_hashes_set))
    input_ids = examples['input_ids']
    hashes = [
        hash(tuple(input_ids[i]))
        for i in range(len(input_ids))
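    ]
    # (the gist preview ends above; what follows is a guessed completion, not
    #  the gist's actual code: drop rows whose hash was already seen, and
    #  record new hashes in the Manager dict shared across worker processes)
    keep = []
    for h in hashes:
        if h in all_hashes_set:
            keep.append(False)
        else:
            all_hashes_set[h] = True
            keep.append(True)
    return {key: [v for v, flag in zip(vals, keep) if flag]
            for key, vals in examples.items()}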