
vb Vaibhavs10

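# Fill in the new account's login name and its public SSH key.
user_name=""
ssh_key=""
cd /home
sudo useradd -m "$user_name"
sudo mkdir -p /home/"$user_name"/.ssh
echo "$ssh_key" | sudo tee -a /home/"$user_name"/.ssh/authorized_keys
# Hand the .ssh directory to the new user and restrict its permissions.
sudo chown -R "$user_name": /home/"$user_name"/.ssh
sudo chmod 700 /home/"$user_name"/.ssh
sudo chmod 600 /home/"$user_name"/.ssh/authorized_keys
sudo chsh -s /usr/bin/bash "$user_name"
# Add the git-lfs package repository, then install git-lfs.
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install -y git-lfs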

Hey hey!

We are on a mission to democratise speech, increase the language coverage of current SoTA speech recognition and push the limits of what is possible. Come join us from December 5th to 19th for a community sprint powered by Lambda Labs. Through this sprint, we'll cover 70+ languages and model sizes from 39M to 1550M parameters, and evaluate our models on real-world evaluation datasets.

Register your interest via the Google form here.

What is the sprint about ❓

The goal of the sprint is to fine-tune Whisper in as many languages as possible and make the resulting models accessible to the community. We especially hope that low-resource languages will benefit from this event.

Vaibhavs10 / robust-asr.md
Last active May 17, 2022 10:12
Robust ASR: An applied survey of current SoTA ASR architectures

Motivation

Whilst the current ASR landscape is really promising, much of it is benchmarked on rather "clean" datasets. This often creates a false sense of confidence in an architecture, which might not translate to the real world.

Types of Noises

  1. Gaussian White Noise
  2. Real World Noise
  3. Choppy audio (random 1-2s removed from the audio snippet)
  4. Speed-up (random 10s snippets played faster than the rest)
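
Three of the perturbations listed above can be sketched with NumPy alone. This is a minimal illustration, not the survey's actual pipeline: the function names, the 10 dB SNR, the 1.5x speed factor, and the synthetic tone standing in for speech are all assumptions. Real-world noise is left out because it needs a recorded noise clip to mix in.

```python
import numpy as np


def add_white_noise(audio, snr_db, rng):
    """Add Gaussian white noise at a target signal-to-noise ratio (in dB)."""
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise


def drop_chunk(audio, sample_rate, rng, min_s=1.0, max_s=2.0):
    """Remove one random chunk of min_s to max_s seconds (choppy audio)."""
    drop_len = int(rng.uniform(min_s, max_s) * sample_rate)
    drop_len = min(drop_len, len(audio) - 1)
    start = rng.integers(0, len(audio) - drop_len + 1)
    return np.concatenate([audio[:start], audio[start + drop_len:]])


def speed_up_chunk(audio, sample_rate, rng, chunk_s=10.0, factor=1.5):
    """Speed up one random chunk by resampling it with linear interpolation.

    Plain resampling also raises the pitch of the chunk; a pitch-preserving
    time stretch would need a dedicated audio library.
    """
    chunk_len = min(int(chunk_s * sample_rate), len(audio))
    start = rng.integers(0, len(audio) - chunk_len + 1)
    chunk = audio[start:start + chunk_len]
    new_len = max(1, int(chunk_len / factor))
    positions = np.linspace(0, chunk_len - 1, num=new_len)
    fast = np.interp(positions, np.arange(chunk_len), chunk)
    return np.concatenate([audio[:start], fast, audio[start + chunk_len:]])


# Hypothetical input: a 30 s, 440 Hz tone at 16 kHz standing in for speech.
rng = np.random.default_rng(0)
sample_rate = 16_000
t = np.arange(30 * sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 440.0 * t)

noisy = add_white_noise(clean, snr_db=10, rng=rng)
choppy = drop_chunk(clean, sample_rate, rng)
sped_up = speed_up_chunk(clean, sample_rate, rng)
```

Passing the random generator in explicitly keeps each perturbation reproducible, which matters when the same corrupted clips are used to compare several models.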

Evaluation

%matplotlib inline
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
# Total population, N.
N = 1339200000
# Initial number of infected and recovered individuals, I0 and R0.
# Credits: https://scipython.com/book/chapter-8-scipy/additional-examples/the-sir-epidemic-model/
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import seaborn as sns
# Total population, N.
N = 1339200000
Vaibhavs10 / oracle_pandas.py
Last active November 12, 2019 17:24
Script to connect with Oracle and create a pandas dataframe
import pandas as pd
from sqlalchemy import create_engine
import cx_Oracle
oracle_connection_string = (
'oracle+cx_oracle://{username}:{password}@' +
cx_Oracle.makedsn('{hostname}', '{port}', service_name='{service_name}')
)
engine = create_engine(
Vaibhavs10 / pep8-cheatsheet.py
Created November 7, 2019 07:59
PEP 8 Cheatsheet
#! /usr/bin/env python
# -*- coding: utf-8 -*-
"""This module's docstring summary line.
This is a multi-line docstring. Paragraphs are separated with blank lines.
Lines conform to 79-column limit.
Module and package names should be short, lower_case_with_underscores.
Notice that this is not PEP8-cheatsheet.py
Seriously, use flake8. Atom.io with https://atom.io/packages/linter-flake8
is awesome!
See http://www.python.org/dev/peps/pep-0008/ for more PEP-8 details