Abdullah Mohammed abodacs
@abodacs
abodacs / jserv_hf_fast.py
Created July 5, 2021 09:38 — forked from kinoc/jserv_hf_fast.py
Run HuggingFace converted GPT-J-6B checkpoint using FastAPI and Ngrok on local GPU (3090 or Titan)
# So you want to run GPT-J-6B using HuggingFace+FastAPI on a local rig (3090 or TITAN) ... tricky.
# special help from the Kolob Colab server https://colab.research.google.com/drive/1VFh5DOkCJjWIrQ6eB82lxGKKPgXmsO5D?usp=sharing#scrollTo=iCHgJvfL4alW
# Conversion to HF format (12.6GB tar image) found at https://drive.google.com/u/0/uc?id=1NXP75l1Xa5s9K18yf3qLoZcR6p4Wced1&export=download
# Uses GDOWN to get the image
# You will need 26 GB of space, 12+GB for the tar and 12+GB expanded (you can nuke the tar after expansion)
# Near Simplest Language model API, with room to expand!
# runs GPT-J-6B on 3090 and TITAN and serves it using FastAPI
# change "seq" (which is the context size) to adjust footprint
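As a rough sanity check on the sizes quoted above, the weights-only footprint can be estimated from the parameter count. This is a sketch under two assumptions that are not taken from the script: roughly 6 billion parameters (the "6B" in the model name) and 2 bytes per weight (half precision). Activations and the context window ("seq") add to this figure.

```python
# Back-of-the-envelope estimate of weights-only memory.
# ASSUMPTIONS (not from the gist): ~6e9 parameters, 2 bytes each in fp16.

def weights_gb(num_params: float, bytes_per_param: int) -> float:
    """Return the weight footprint in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


fp16_gb = weights_gb(6e9, 2)  # half precision
fp32_gb = weights_gb(6e9, 4)  # single precision, for comparison
print(f"fp16: {fp16_gb:.1f} GB, fp32: {fp32_gb:.1f} GB")
```

Under these assumptions the half-precision estimate is in line with the 12+ GB expanded checkpoint mentioned above.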
@abodacs
abodacs / shuffle_tfrecords.py
Created September 6, 2021 23:49 — forked from kingoflolz/shuffle_tfrecords.py
A quick script for shuffling tfrecord datasets
import tensorflow as tf
from tqdm import tqdm
index = open("data/openwebtext2_new_inputs.train.index").read().splitlines()
dataset = tf.data.Dataset.from_tensor_slices(index)
dataset = dataset.interleave(tf.data.TFRecordDataset, cycle_length=128, num_parallel_calls=tf.data.experimental.AUTOTUNE)
d = dataset.shuffle(10000).prefetch(100)
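`dataset.shuffle(10000)` does not shuffle the whole dataset at once: it keeps a fixed-size buffer and emits a random element from it as new elements stream in. The pure-Python sketch below illustrates that idea only; it is not TensorFlow's implementation, and `buffered_shuffle` and its `seed` parameter are hypothetical names introduced for illustration.

```python
import random
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def buffered_shuffle(items: Iterable[T], buffer_size: int, seed: int = 0) -> Iterator[T]:
    """Yield items in approximately shuffled order using a bounded buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


shuffled = list(buffered_shuffle(range(20), buffer_size=5))
print(shuffled)
```

A small buffer only mixes nearby elements, which is one reason the script interleaves many TFRecord files before shuffling.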
@abodacs
abodacs / double_checked_lock_iterator.py
Created January 23, 2022 18:43 — forked from adamchainz/double_checked_lock_iterator.py
double_checked_lock_iterator.py
# refactor of https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/
# untested
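from django.db import transaction

def double_checked_lock_iterator(queryset):
    # First pass: list candidate primary keys without taking any locks.
    for item_pk in queryset.values_list("pk", flat=True):
        with transaction.atomic():
            try:
                # Second check under a row lock; rows locked elsewhere are skipped.
                yield queryset.select_for_update(skip_locked=True).get(pk=item_pk)
            except queryset.model.DoesNotExist:
                pass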
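The same check, lock, re-check pattern can be illustrated without a database. The sketch below is an in-memory analogue, not Django code: `Row`, `double_checked_iter`, and `matches` are hypothetical names, and a `threading.Lock` per row stands in for the row lock that `select_for_update(skip_locked=True)` takes.

```python
import threading
from typing import Callable, Dict, Iterator


class Row:
    """Hypothetical stand-in for a database row with its own lock."""

    def __init__(self, pk: int, pending: bool = True) -> None:
        self.pk = pk
        self.pending = pending
        self.lock = threading.Lock()


def double_checked_iter(rows: Dict[int, Row], matches: Callable[[Row], bool]) -> Iterator[Row]:
    # First check: snapshot candidate keys without holding any lock.
    candidate_pks = [pk for pk, row in rows.items() if matches(row)]
    for pk in candidate_pks:
        row = rows.get(pk)
        if row is None:
            continue  # row disappeared since the snapshot
        if not row.lock.acquire(blocking=False):
            continue  # locked by another worker: skip instead of waiting
        try:
            # Second check: the row may have changed since the snapshot.
            if matches(row):
                yield row
        finally:
            row.lock.release()


rows = {1: Row(1), 2: Row(2), 3: Row(3)}
rows[2].lock.acquire()  # simulate another worker holding row 2
processed = []
for row in double_checked_iter(rows, lambda r: r.pending):
    processed.append(row.pk)
    row.pending = False
    rows[3].pending = False  # simulate another worker finishing row 3
rows[2].lock.release()
print(processed)
```

Row 2 is skipped because it is locked, and row 3 is skipped because it no longer matches when re-checked under the lock.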
@abodacs
abodacs / README-Template.md
Created January 31, 2022 12:05 — forked from PurpleBooth/README-Template.md
A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

@abodacs
abodacs / understanding-word-vectors.ipynb
Created February 1, 2022 09:20 — forked from aparrish/understanding-word-vectors.ipynb
Understanding word vectors: A tutorial for "Reading and Writing Electronic Text," a class I teach at ITP. (Python 2.7) Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
@abodacs
abodacs / dotnetlayout.md
Created June 26, 2022 16:35 — forked from davidfowl/dotnetlayout.md
.NET project structure
$/
  artifacts/
  build/
  docs/
  lib/
  packages/
  samples/
  src/
  tests/
const routes = {
  home: '/',
  transactions: '/transactions',
  transactionDetails: '/transactions/:uuid',
}
const urls: Record<
  keyof typeof routes,
  { get: (params?: any) => string; route: string }
> = new Proxy(routes, {
@abodacs
abodacs / video-subtitles-via-whisper.py
Created September 25, 2022 19:26 — forked from rasbt/video-subtitles-via-whisper.py
Script that creates subtitles (closed captions) for all MP4 video files in your current directory
# Sebastian Raschka 09/24/2022
# Create a new conda environment and packages
# conda create -n whisper python=3.9
# conda activate whisper
# conda install mlxtend -c conda-forge
# Install ffmpeg
# macOS & homebrew
# brew install ffmpeg
# Ubuntu
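The preview stops before the script's main logic. As a hedged illustration of what "subtitles for all MP4 files in the current directory" involves, the sketch below finds the MP4 files and formats one caption entry. It assumes SRT output, which may differ from the format the script actually writes, and `srt_timestamp`, `srt_block`, and `subtitle_targets` are hypothetical helpers, not functions from the script or from Whisper.

```python
from pathlib import Path
from typing import List, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_block(index: int, start: float, end: float, text: str) -> str:
    """Build one numbered SRT caption entry."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"


def subtitle_targets(directory: str = ".") -> List[Tuple[Path, Path]]:
    """Pair each MP4 in the directory with the subtitle path next to it."""
    return [(p, p.with_suffix(".srt")) for p in sorted(Path(directory).glob("*.mp4"))]


for video, subtitle in subtitle_targets("."):
    print(f"{video} -> {subtitle}")
print(srt_block(1, 0, 2.5, " Hello "))
```

In a real run, the start time, end time, and text for each entry would come from the transcription step rather than being hard-coded.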
@abodacs
abodacs / gsoc_2022_work_product.md
Created October 11, 2022 22:33 — forked from yuroitaki/gsoc_2022_work_product.md
This document summarises the work that I have done as part of Google Summer of Code 2022.

Google Summer of Code 2022 Work Product

This document summarises the work that I have done as part of Google Summer of Code 2022 (GSoC).

Summary

@abodacs
abodacs / whisper-transcribe.bash
Created November 9, 2022 08:49 — forked from DaniruKun/whisper-transcribe.bash
Transcribe (and translate) any VOD (e.g. from Youtube) using Whisper from OpenAI and embed subtitles!
#!/usr/bin/env bash
# Small shell script to more easily automatically download and transcribe live stream VODs.
# This uses YT-DLP, ffmpeg and the CPP version of Whisper: https://github.com/ggerganov/whisper.cpp
# Use `./transcribe-vod help` to print help info.
# MIT License
# Copyright (c) 2022 Daniils Petrovs