import json
import time

import torch
from transformers import pipeline

# Whisper large-v3 via the ASR pipeline, in fp16 on Apple Silicon (MPS).
pipe = pipeline(
    "automatic-speech-recognition",
    "openai/whisper-large-v3",
    torch_dtype=torch.float16,
    device="mps",
)
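The gist is truncated here; a minimal usage sketch, assuming a hypothetical local file sample.mp3 and plausible chunking parameters:

# "sample.mp3" is a made-up file name; any ffmpeg-readable audio works.
start = time.time()
outputs = pipe("sample.mp3", chunk_length_s=30, batch_size=8, return_timestamps=True)
print(json.dumps(outputs, indent=2))
print(f"Transcription took {time.time() - start:.2f}s")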
import json
import argparse

import torch
from transformers import pipeline

parser = argparse.ArgumentParser(description="Automatic Speech Recognition")
parser.add_argument(
    "--file-name",
    required=True,
    type=str,
    help="Path of the audio file to transcribe.",
)
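The rest of the gist is cut off; a minimal sketch of how the parsed argument could drive the pipeline (the model and device choices below are assumptions, not the gist's):

args = parser.parse_args()

# Assumed model/device; the original gist's choices are not visible here.
pipe = pipeline(
    "automatic-speech-recognition",
    "openai/whisper-large-v3",
    torch_dtype=torch.float16,
    device="cuda:0" if torch.cuda.is_available() else "cpu",
)
print(json.dumps(pipe(args.file_name), indent=2))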
Vaibhavs10 / zephyr-7b-beta-gptq-transformers.py
Created November 13, 2023 21:55
zephyr-7b-beta-gptq-transformers
!pip install transformers optimum
!pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "TheBloke/zephyr-7B-beta-GPTQ"
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",
    trust_remote_code=False,
)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
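The snippet ends mid-call above; once the model is loaded, a minimal generation sketch using Zephyr's chat template (the prompt is made up):

# Hypothetical prompt, purely to exercise the model.
messages = [{"role": "user", "content": "What is automatic speech recognition?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))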
Vaibhavs10 / how_to_use_cv11.py
Created February 13, 2023 16:13
How to use Common Voice 11 with 🤗 Datasets
# Load the dataset (locally)
from datasets import load_dataset

cv_11 = load_dataset("mozilla-foundation/common_voice_11_0", "hi", split="train")

# Stream the dataset (no full download needed)
cv_11_stream = load_dataset("mozilla-foundation/common_voice_11_0", "hi", split="train", streaming=True)
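A quick sketch of pulling a single example off the stream (field names follow the Common Voice 11 schema):

sample = next(iter(cv_11_stream))
print(sample["sentence"])                 # transcription text
print(sample["audio"]["sampling_rate"])   # decoded audio metadata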
# pip install git+https://github.com/huggingface/transformers.git
import datetime
import sys

from transformers import pipeline
from transformers.pipelines.audio_utils import ffmpeg_microphone_live

pipe = pipeline("automatic-speech-recognition", model="openai/whisper-base", device=0)
sampling_rate = pipe.feature_extractor.sampling_rate
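The gist stops after the setup; a minimal live-transcription loop. The chunk lengths below are plausible defaults, not necessarily the original's values:

mic = ffmpeg_microphone_live(
    sampling_rate=sampling_rate,
    chunk_length_s=10.0,   # assumed window size
    stream_chunk_s=1.0,    # assumed streaming granularity
)
print(f"[{datetime.datetime.now()}] Listening...")
for item in pipe(mic):
    sys.stdout.write("\033[K")   # clear the current line before rewriting it
    print(item["text"], end="\r")
    if not item["partial"][0]:
        print()                  # chunk finalized; move to a new line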
# Fill these in before running.
user_name=""
ssh_key=""

cd /home
sudo useradd -m "$user_name"
sudo mkdir /home/"$user_name"/.ssh
echo "$ssh_key" | sudo tee -a /home/"$user_name"/.ssh/authorized_keys
# sshd rejects keys with loose permissions, so lock the directory down.
sudo chown -R "$user_name":"$user_name" /home/"$user_name"/.ssh
sudo chmod 700 /home/"$user_name"/.ssh
sudo chmod 600 /home/"$user_name"/.ssh/authorized_keys
sudo chsh -s /usr/bin/bash "$user_name"

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs

Hey hey!

We are on a mission to democratise speech recognition, increase the language coverage of current SoTA models, and push the limits of what is possible. Come join us from December 5th to 19th for a community sprint powered by Lambda Labs. Through this sprint, we'll cover 70+ languages and model sizes from 39M to 1550M parameters, and evaluate our models on real-world evaluation datasets.

Register your interest via the Google form here.

What is the sprint about ❓

The goal of the sprint is to fine-tune Whisper in as many languages as possible and make the resulting models accessible to the community. We hope that low-resource languages in particular will benefit from this event.

Vaibhavs10 / robust-asr.md
Last active May 17, 2022 10:12
Robust ASR: An applied survey of current SoTA ASR architectures

Motivation

Whilst the current ASR landscape is very promising, much of it is benchmarked on rather "clean" datasets. This often creates a false sense of confidence in an architecture that might not translate to the real world.

Types of Noise

  1. Gaussian White Noise (see the sketch after this list)
  2. Real World Noise
  3. Choppy audio (random 1-2s removed from the audio snippet)
  4. Speed up (random 10s snippets sped up relative to the rest)
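As a concrete example of the first noise type, a minimal sketch of injecting Gaussian white noise at a target SNR (the helper and its parameters are illustrative, not from the survey):

import numpy as np

def add_white_noise(audio: np.ndarray, snr_db: float) -> np.ndarray:
    # Corrupt a mono waveform with Gaussian white noise at the given SNR (dB).
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise

# e.g. noisy = add_white_noise(clean_waveform, snr_db=10)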

Evaluation

%matplotlib inline
# Credits: https://scipython.com/book/chapter-8-scipy/additional-examples/the-sir-epidemic-model/
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

# Total population, N.
N = 1339200000
# Initial number of infected and recovered individuals, I0 and R0.
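The gist is cut off right after that comment; a sketch of the remainder, following the credited scipython SIR example (the initial values and rates are that example's defaults, assumed rather than taken from this gist):

I0, R0 = 1, 0
# Everyone else is initially susceptible.
S0 = N - I0 - R0
# Contact rate, beta, and mean recovery rate, gamma (in 1/days); assumed values.
beta, gamma = 0.2, 1.0 / 10
# A grid of time points (in days).
t = np.linspace(0, 160, 160)

# The SIR model differential equations.
def deriv(y, t, N, beta, gamma):
    S, I, R = y
    dSdt = -beta * S * I / N
    dIdt = beta * S * I / N - gamma * I
    dRdt = gamma * I
    return dSdt, dIdt, dRdt

# Integrate the SIR equations over the time grid.
ret = odeint(deriv, (S0, I0, R0), t, args=(N, beta, gamma))
S, I, R = ret.T

# Plot the three compartments (scaled to millions of people).
plt.plot(t, S / 1e6, label="Susceptible")
plt.plot(t, I / 1e6, label="Infected")
plt.plot(t, R / 1e6, label="Recovered")
plt.xlabel("Days")
plt.ylabel("People (millions)")
plt.legend()
plt.show()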