🌍
world, hello

Yehor Smoliakov egorsmkv
WhisperForConditionalGeneration(
  (model): WhisperModel(
    (encoder): WhisperEncoder(
      (conv1): Conv1d(128, 1280, kernel_size=(3,), stride=(1,), padding=(1,))
      (conv2): Conv1d(1280, 1280, kernel_size=(3,), stride=(2,), padding=(1,))
      (embed_positions): Embedding(1500, 1280)
      (layers): ModuleList(
        (0-31): 32 x WhisperEncoderLayer(
          (self_attn): WhisperSdpaAttention(
import torchaudio
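from speechbrain.pretrained import VAD
# Load the pretrained CRDNN voice activity detector; the instance gets its own
# name so that it does not shadow the imported VAD class.
vad = VAD.from_hparams(source="speechbrain/vad-crdnn-libriparty", savedir="pretrained_models/vad-crdnn-libriparty")
test_file = 'a.wav'
# Detect the start/end boundaries of speech in the file.
boundaries = vad.get_speech_segments(test_file)
# Cut the audio into the detected speech segments.
segments = vad.get_segments(boundaries, test_file)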
"""
Python implementation of Viterbi algorithm for word segmentation
A clean-up of this: http://norvig.com/ngrams/ch14.pdf
-
You also need 'unigrams.txt' and 'bigrams.txt' to run the segmentation. The n-grams
used in this implementation are taken from the 'count_1w.txt' and 'count_2w.txt' files
provided here: http://norvig.com/ngrams/
-
Usage:
>>> from segment import viterbi
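The preview above is cut off before the algorithm itself, so here is a minimal sketch of the general idea, not the gist's actual code. It assumes a unigram-only model (the gist also uses bigrams), a tiny hand-made count table instead of 'unigrams.txt', and a length-based penalty for unseen words. The names `viterbi_segment`, `TOY_COUNTS`, and `max_word_len` are hypothetical.

```python
import math

# Toy counts for illustration only; the gist reads real counts from files.
TOY_COUNTS = {"this": 50, "is": 80, "a": 100, "test": 30}


def viterbi_segment(text, unigram_counts, max_word_len=20):
    """Split `text` into the word sequence with the highest unigram log-probability."""
    total = sum(unigram_counts.values())

    def log_prob(word):
        count = unigram_counts.get(word)
        if count is None:
            # Assumed heuristic: unseen words are penalized more the longer they are.
            return math.log(10.0 / (total * 10 ** len(word)))
        return math.log(count / total)

    n = len(text)
    # best_score[i] is the best log-probability of any segmentation of text[:i].
    best_score = [0.0] + [-math.inf] * n
    # back[i] is the start index of the last word in that best segmentation.
    back = [0] * (n + 1)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            score = best_score[start] + log_prob(text[start:end])
            if score > best_score[end]:
                best_score[end] = score
                back[end] = start

    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    words.reverse()
    return words


print(viterbi_segment("thisisatest", TOY_COUNTS))
```

The dynamic program considers every candidate last word for each prefix, so the running time is proportional to the text length times `max_word_len`.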
egorsmkv / flashlight-coreweave.md
Last active July 9, 2022 12:59
Installing Facebook's Flashlight (formerly wav2letter++) and CUDA 10 on Ubuntu 18.04; tested with a Tesla V100 on CoreWeave GPU Cloud
  • GPU: Tesla V100
  • Ubuntu 18.04
apt update
apt install cmake gcc-7 liblzma-dev libbz2-dev

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
egorsmkv / mphdict_words_forms.py
Created April 28, 2021 14:39
mphdict word forms generator in Python
"""
Generator of word forms for LinguisticAndInformationSystems/mphdict
Source code: https://github.com/LinguisticAndInformationSystems/mphdict/blob/master/src/mphdict/mphDb.cs#L214
License: https://github.com/LinguisticAndInformationSystems/mphdict/blob/master/LICENSE.txt
Copyright: uSofTrod
Output is like the following:
egorsmkv / quantum_resources.md
Last active March 21, 2021 10:13
This is my list of resources on quantum technologies. You can suggest links in the Comments section.

Quantum Resources

Websites

  • [Full-Stack Quantum Computation][3]
  • [Quantum Computing on Stack Exchange][15]

Social Groups

  • [Quantum Computing][4]
egorsmkv / algo.py
Last active August 31, 2020 10:15
A Python algorithm that searches for the longest date intervals in a list of dates (currently it searches for the longest intervals in months)
from datetime import datetime
from typing import List
def solve(date_items: List[datetime]):
    """
    Get the longest date interval from the date_items list.
    :param date_items:
    :return:
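The body of `solve` is not visible in this preview. As a hedged illustration of one possible reading of the description, the sketch below sorts the dates and returns the pair of neighbouring dates that are furthest apart, measured in whole calendar months. That interpretation, and the names `months_between` and `longest_gap_in_months`, are assumptions and not taken from the gist.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def months_between(start: datetime, end: datetime) -> int:
    """Calendar-month difference between two dates, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def longest_gap_in_months(
    date_items: List[datetime],
) -> Optional[Tuple[datetime, datetime, int]]:
    """Return (earlier, later, months) for the widest gap between neighbouring dates.

    Returns None when fewer than two dates are given.
    """
    ordered = sorted(date_items)
    best = None
    # Compare each date with the one that follows it in chronological order.
    for earlier, later in zip(ordered, ordered[1:]):
        gap = months_between(earlier, later)
        if best is None or gap > best[2]:
            best = (earlier, later, gap)
    return best


dates = [
    datetime(2020, 12, 5),
    datetime(2020, 1, 15),
    datetime(2020, 10, 20),
    datetime(2020, 3, 1),
]
print(longest_gap_in_months(dates))
```

Counting calendar months rather than days keeps the arithmetic simple, at the cost of treating January 31 to February 1 as a one-month gap.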
import sys
ETC_SYSCONFIG_NE = 'OPTIONS="--web.listen-address=:{port} --collector.textfile.directory ' \
'/var/lib/node_exporter/textfile_collector --collector.systemd --collector.processes"'
ETC_SYSTEMD_NE = '''[Unit]
Description=Node Exporter
After=network.target
[Service]
#!/usr/bin/bash
mkdir node_exporter
cd node_exporter/
wget https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
tar xf node_exporter-1.0.1.linux-amd64.tar.gz
cd node_exporter-1.0.1.linux-amd64
mv node_exporter /usr/bin
<!-- Main Header -->
<header class="main-header">
<!-- Logo -->
<a href="/" class="logo">
<span class="logo-mini"><b>A</b></span>
<span class="logo-lg"><b>ADMIN</b></span>
</a>
<nav class="navbar navbar-static-top" role="navigation">
<a href="#" class="sidebar-toggle" data-toggle="push-menu" role="button">