vb Vaibhavs10

Hey hey!

We are on a mission to democratise speech, increase the language coverage of current SoTA speech recognition and push the limits of what is possible. Come join us from December 5th - 19th for a community sprint powered by Lambda Labs. Through this sprint, we'll cover 70+ languages, 39M - 1550M parameters & evaluate our models on real-world evaluation datasets.

Register your interest via the Google form here.

What is the sprint about ❓

The goal of the sprint is to fine-tune Whisper in as many languages as possible and make them accessible to the community. We hope that especially low-resource languages will profit from this event.

Motivation

Whilst the current ASR landscape is really promosing a lot of it is currently benchmarked on rather "clean" datasets. This often creates a false sense of confidence in the Architecture which might not translate to the real world.

Types of Noises

Gaussian White Noise
Real World Noise
Choppy audio (random 1-2s removed from the audio snippet)
Speed up (random 10s snippets sped up than the rest)

	from huggingface_hub import HfApi
	from huggingface_hub import logging

	logging.set_verbosity_info()

	api = HfApi()

	api.upload_folder(folder_path="<FOLDER NAME>",
	repo_id="<DATASET NAME>",
	repo_type="dataset",

	import transformers

	model_name = 'Intel/neural-chat-7b-v3-1'
	model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
	tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)


	def generate_response(system_input, user_input):

	# Format the input using the provided template

	import json
	import time
	import torch
	from transformers import pipeline

	pipe = pipeline(
	"automatic-speech-recognition",
	"openai/whisper-large-v3",
	torch_dtype=torch.float16,
	device="mps",

	import json

	import argparse
	import torch
	from transformers import pipeline

	parser = argparse.ArgumentParser(description="Automatic Speech Recognition")
	parser.add_argument(
	"--file-name",
	required=True,

	!pip install transformers optimum
	!pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name_or_path = "TheBloke/zephyr-7B-beta-GPTQ"

	model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
	device_map="auto",
	trust_remote_code=False,

	# Load the dataset (locally)

	from datasets import load_dataset

	cv_11 = load_dataset("mozilla-foundation/common_voice_11_0", "hi", split="train")

	# Stream the dataset

	from datasets import load_dataset

	#pip install git+https://github.com/huggingface/transformers.git

	import datetime
	import sys
	from transformers import pipeline
	from transformers.pipelines.audio_utils import ffmpeg_microphone_live

	pipe = pipeline("automatic-speech-recognition", model="openai/whisper-base", device=0)
	sampling_rate = pipe.feature_extractor.sampling_rate

	user_name=
	ssh_key=""

	cd /home
	sudo useradd -m "$user_name"
	sudo mkdir /home/"$user_name"/.ssh
	echo "$ssh_key" \| sudo tee -a /home/"$user_name"/.ssh/authorized_keys
	sudo chsh -s /usr/bin/bash "$user_name"
	curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh \| sudo bash
	sudo apt-get install git-lfs

vb Vaibhavs10

What is the sprint about ❓

Motivation

Types of Noises

Evaluation