vb Vaibhavs10

Motivation

Whilst the current ASR landscape is really promising, much of it is benchmarked on rather "clean" datasets. This often creates a false sense of confidence in an architecture that might not carry over to the real world.

Types of Noises

Gaussian White Noise
Real World Noise
Choppy audio (random 1-2s segments removed from the audio snippet)
Speed-up (random 10s snippets played faster than the rest)
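
To make two of the noise types above concrete, here is a minimal sketch of Gaussian white noise and choppy audio on a raw waveform. Everything in it is an assumption for illustration rather than the gist's actual evaluation code: the helper names `add_white_noise` and `drop_random_chunk` are hypothetical, the noise level is set via a target signal-to-noise ratio (the list does not specify one), the 16 kHz sampling rate is assumed, and a synthetic sine wave stands in for real speech.

```python
import numpy as np


def add_white_noise(audio, snr_db, rng):
    """Add Gaussian white noise scaled to roughly the requested SNR (in dB)."""
    signal_power = np.mean(audio ** 2)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise


def drop_random_chunk(audio, sampling_rate, rng, min_s=1.0, max_s=2.0):
    """Remove one random chunk lasting min_s..max_s seconds (choppy audio)."""
    chunk_len = int(rng.uniform(min_s, max_s) * sampling_rate)
    chunk_len = min(chunk_len, len(audio))
    # Pick a start index such that the chunk fits entirely inside the clip.
    start = rng.integers(0, len(audio) - chunk_len + 1)
    return np.concatenate([audio[:start], audio[start + chunk_len:]])


rng = np.random.default_rng(0)
sampling_rate = 16_000  # assumed sampling rate

# Five seconds of a 440 Hz tone as a stand-in for a speech clip.
t = np.arange(5 * sampling_rate) / sampling_rate
clean = 0.5 * np.sin(2 * np.pi * 440.0 * t)

noisy = add_white_noise(clean, snr_db=10.0, rng=rng)
choppy = drop_random_chunk(clean, sampling_rate, rng)
```

Here `noisy` keeps the original length, while `choppy` comes out 1-2s shorter than `clean`. Sweeping `snr_db` downwards would be one way to see how quickly a model's transcription quality degrades.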

Evaluation

Hey hey!

We are on a mission to democratise speech, increase the language coverage of current SoTA speech recognition and push the limits of what is possible. Come join us from December 5th - 19th for a community sprint powered by Lambda Labs. Through this sprint, we'll cover 70+ languages, 39M - 1550M parameters & evaluate our models on real-world evaluation datasets.

Register your interest via the Google form here.

What is the sprint about ❓

The goal of the sprint is to fine-tune Whisper in as many languages as possible and make them accessible to the community. We hope that especially low-resource languages will profit from this event.

	#!/usr/bin/env python
	# -*- coding: utf-8 -*-
	"""This module's docstring summary line.
	This is a multi-line docstring. Paragraphs are separated with blank lines.
	Lines conform to 79-column limit.
	Module and package names should be short, lower_case_with_underscores.
	Notice that this is not PEP8-cheatsheet.py
	Seriously, use flake8. Atom.io with https://atom.io/packages/linter-flake8
	is awesome!
	See http://www.python.org/dev/peps/pep-0008/ for more PEP-8 details

	import pandas as pd
	from sqlalchemy import create_engine
	import cx_Oracle

	oracle_connection_string = (
	    'oracle+cx_oracle://{username}:{password}@' +
	    cx_Oracle.makedsn('{hostname}', '{port}', service_name='{service_name}')
	)

	engine = create_engine(

	# Credits: https://scipython.com/book/chapter-8-scipy/additional-examples/the-sir-epidemic-model/

	import numpy as np
	from scipy.integrate import odeint
	import matplotlib.pyplot as plt
	import seaborn as sns

	# Total population, N.
	N = 1339200000

	%matplotlib inline

	import numpy as np
	from scipy.integrate import odeint
	import matplotlib.pyplot as plt

	# Total population, N.
	N = 1339200000

	# Initial number of infected and recovered individuals, I0 and R0.

	user_name=""
	ssh_key=""

	cd /home
	sudo useradd -m "$user_name"
	sudo mkdir /home/"$user_name"/.ssh
	echo "$ssh_key" | sudo tee -a /home/"$user_name"/.ssh/authorized_keys
	sudo chsh -s /usr/bin/bash "$user_name"
	curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
	sudo apt-get install git-lfs

	#pip install git+https://github.com/huggingface/transformers.git

	import datetime
	import sys
	from transformers import pipeline
	from transformers.pipelines.audio_utils import ffmpeg_microphone_live

	pipe = pipeline("automatic-speech-recognition", model="openai/whisper-base", device=0)
	sampling_rate = pipe.feature_extractor.sampling_rate

	# Load the dataset (locally)

	from datasets import load_dataset

	cv_11 = load_dataset("mozilla-foundation/common_voice_11_0", "hi", split="train")

	# Stream the dataset

	from datasets import load_dataset

	!pip install transformers optimum
	!pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name_or_path = "TheBloke/zephyr-7B-beta-GPTQ"

	model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
	                                             device_map="auto",
	                                             trust_remote_code=False,