Yehor Smoliakov egorsmkv

@egorsmkv
egorsmkv / logs.txt
Created July 30, 2025 17:56
decoding frames: logs of read-video-rs run
[swscaler @ 0x120008000] No accelerated colorspace conversion found from yuv420p to rgb24.
best_video_stream_index: 0
Width: 1920, height: 1080
Duration: 494.30734451395824 seconds
FPS: 30.001577377319336
Successfully sought to frame near index 0.
Frame time: 0
Successfully sought to frame near index 30.
Frame time: 8.332895221071446
Successfully sought to frame near index 60.
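The log above comes from a Rust tool, but the probe/seek/decode pattern it traces is easy to reproduce. Below is a minimal Python sketch using PyAV as a stand-in (this is not the read-video-rs code; the file path, function name, and frame indices are assumptions) that prints the same kind of fields.

# Minimal sketch of the probe/seek/decode loop traced in the log above.
# Uses PyAV (pip install av) as a stand-in; read-video-rs itself is a Rust tool.
import av

def probe_and_seek(path, frame_indices=(0, 30, 60)):
    container = av.open(path)
    stream = container.streams.video[0]  # the "best" video stream, index 0 in the log

    fps = float(stream.average_rate)
    duration_s = float(stream.duration * stream.time_base) if stream.duration else None
    print(f"Width: {stream.width}, height: {stream.height}")
    print(f"Duration: {duration_s} seconds")
    print(f"FPS: {fps}")

    for idx in frame_indices:
        # Convert a frame index to a timestamp in stream time_base units and seek to it.
        target_pts = int(idx / fps / stream.time_base)
        container.seek(target_pts, stream=stream)
        for frame in container.decode(stream):
            print(f"Successfully sought to frame near index {idx}.")
            print(f"Frame time: {frame.time}")
            break

probe_and_seek("input.mp4")  # hypothetical input file

Container-level seeks land on the nearest keyframe by default, which is presumably why the log reports a frame "near" the requested index rather than exactly at it.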
@egorsmkv
egorsmkv / function-calling-LLMs.md
Last active July 30, 2025 14:57
What does function calling look like in training data for LLMs?

For Large Language Models (LLMs) to learn function calling, the training data typically includes examples that demonstrate the calling pattern. A common approach is a structured dialogue in which a "user" asks for something and an "assistant" invokes a function with specific arguments. The exact representation varies with the model and its intended application, but here is a general idea of how it might look (a structured variant is sketched after the text example below):

Example Format for Training Data

  1. Text-based representation: In this format, function calls are represented as text that the model learns to predict or generate. For instance, if the task involves calling a function get_weather(city), the training data might include examples like:

    User: What is the weather like in Paris?
    Assistant: I need to call get_weather(city="Paris"). The weather in Paris is sunny.
    
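Beyond plain text, many instruction-tuning datasets store the same pattern as structured messages with an explicit tool-call field. Here is a hypothetical sketch of one such training record; the field names ("tools", "messages", "tool_calls") and the schema layout are illustrative, not tied to any specific dataset.

import json

# Hypothetical structured training record for function calling.
# Field names are illustrative; real datasets differ in the details.
sample = {
    "tools": [
        {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    "messages": [
        {"role": "user", "content": "What is the weather like in Paris?"},
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [{"name": "get_weather", "arguments": {"city": "Paris"}}],
        },
        {"role": "tool", "name": "get_weather", "content": '{"condition": "sunny"}'},
        {"role": "assistant", "content": "The weather in Paris is sunny."},
    ],
}

print(json.dumps(sample, indent=2))

At training time a record like this is rendered through the model's chat template into a single token sequence, so the model learns to emit the tool-call block whenever a tool is appropriate and to continue with plain text once the tool result is available.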
@egorsmkv
egorsmkv / cv-rust.md
Created July 25, 2025 17:53
Computer Vision links mentioned in This Week in Rust materials

The sources mention several links related to computer vision, graphics, and image processing within the Rust ecosystem. These include libraries, projects, and discussions:

Graphics & Rendering Libraries/Frameworks:

  • Piston: A graphics library. Its image library has been rewritten in pure Rust, and it also features skeletal animation demos and supports "duck typing".
  • glium: A safe wrapper for OpenGL, used in the simple polar dodging game, Rusty_Dodge. Its project updates and a post-mortem have been noted.
  • gfx / gfx-rs / gfx-hal: A crate designed to display content on a screen across various platforms. Updates from its development have been highlighted. It also has a hardware abstraction layer (gfx-hal) and an ecosystem overview.
  • luminance: A type-safe, type-level, and stateless Rust graphics framework.
  • wgpu: A cross-platform graphics and compute library based on WebGPU, with multi-threading capabilities.
  • Speedy2D: A crate offering cross-platform, hardware-accelerated 2D drawing of shapes, images, and text.
@egorsmkv
egorsmkv / speed_batch_eval_flores.py
Created July 16, 2025 17:55
Vibe-coded evaluation
import evaluate
import polars as pl
import time
import torch
from tqdm import tqdm
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset
# --- 1. SETUP ---
model_id = '/home/smlkw/en-uk-t/final-checkpoints/kulyk-en-uk'
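The preview stops after the setup block. A rough sketch of how the rest of such a batch-evaluation script typically continues is below; the FLORES dataset id and config, prompt handling, batch size, and generation settings are assumptions, not the gist's actual code.

import time

import evaluate
import torch
from datasets import load_dataset
from tqdm import tqdm
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed continuation: model loading, batched generation, timing, and BLEU scoring.
model_id = '/home/smlkw/en-uk-t/final-checkpoints/kulyk-en-uk'  # path from the gist preview
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = 'left'  # decoder-only models need left padding for batched generation
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map='cuda')
model.eval()

flores = load_dataset('facebook/flores', 'eng_Latn-ukr_Latn', split='devtest')  # assumed dataset/config
sources = flores['sentence_eng_Latn']
references = flores['sentence_ukr_Latn']

predictions, batch_size = [], 16
start = time.time()
for i in tqdm(range(0, len(sources), batch_size)):
    batch = sources[i:i + batch_size]
    inputs = tokenizer(batch, return_tensors='pt', padding=True).to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    new_tokens = out[:, inputs['input_ids'].shape[1]:]  # strip the prompt tokens
    predictions.extend(tokenizer.batch_decode(new_tokens, skip_special_tokens=True))
elapsed = time.time() - start

bleu = evaluate.load('sacrebleu')
scores = bleu.compute(predictions=predictions, references=[[r] for r in references])
print(f'BLEU: {scores["score"]:.2f}, {len(sources) / elapsed:.2f} sentences/sec')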
@egorsmkv
egorsmkv / convert_lfm2.py
Created July 16, 2025 15:53
Convert LFM2 model to BF16 and remove `lm_head.weight`
import torch
from safetensors.torch import load_file, save_file
import os
# --- Configuration ---
# Specify the path to your input .safetensors file
input_filepath = "model.safetensors"
# Specify the path for the new BF16 output file
output_filepath = "model_bf16.safetensors"
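The preview ends at the configuration block; the conversion itself is a short loop over the checkpoint tensors. A sketch of what it likely looks like follows; the metadata handling and the tied-embeddings rationale in the comment are assumptions.

import torch
from safetensors.torch import load_file, save_file

input_filepath = "model.safetensors"
output_filepath = "model_bf16.safetensors"

# Load all tensors from the input checkpoint into CPU memory.
state_dict = load_file(input_filepath)

converted = {}
for name, tensor in state_dict.items():
    if name == "lm_head.weight":
        # Drop the output head, as the gist description says; for models with tied
        # embeddings it can be re-created from the embedding weights at load time.
        continue
    # Cast floating-point tensors to bfloat16; leave integer tensors untouched.
    converted[name] = tensor.to(torch.bfloat16) if tensor.is_floating_point() else tensor

save_file(converted, output_filepath, metadata={"format": "pt"})
print(f"Wrote {len(converted)} tensors to {output_filepath}")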
@egorsmkv
egorsmkv / dedup.py
Last active July 11, 2025 15:16
Deduplicate large text datasets using https://github.com/beowolx/rensa
import pandas as pd
from datasets import load_dataset
from rensa import CMinHash, RMinHash
from tqdm import tqdm
COLUMN = "source"
SPLIT = "train"
ALGORITHM = "CMinHash"
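A sketch of how the MinHash pass might continue from this setup. It assumes rensa's documented interface (a num_perm/seed constructor, update() over a list of tokens, digest() returning the signature); the dataset id and num_perm value are placeholders, and rows with identical signatures are simply treated as duplicates rather than going through LSH banding.

from datasets import load_dataset
from rensa import CMinHash, RMinHash
from tqdm import tqdm

COLUMN = "source"
SPLIT = "train"
ALGORITHM = "CMinHash"
NUM_PERM = 128                             # assumed signature size
DATASET_ID = "your-namespace/your-dataset"  # hypothetical dataset id

dataset = load_dataset(DATASET_ID, split=SPLIT)
MinHashCls = CMinHash if ALGORITHM == "CMinHash" else RMinHash

seen_signatures = set()
keep_indices = []
for idx, row in enumerate(tqdm(dataset, desc="hashing")):
    m = MinHashCls(num_perm=NUM_PERM, seed=42)
    m.update(row[COLUMN].split())        # hash whitespace tokens of the text column
    signature = tuple(m.digest())        # MinHash signature as a hashable key
    if signature not in seen_signatures:  # identical signatures are treated as duplicates
        seen_signatures.add(signature)
        keep_indices.append(idx)

deduped = dataset.select(keep_indices)
print(f"Kept {len(deduped)} of {len(dataset)} rows")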
@egorsmkv
egorsmkv / create_ds.py
Created March 21, 2025 13:32
Upload MP3 to HF
import json
from glob import glob
from os.path import basename
files_all = glob("data/*.mp3")
results = []
for idx, filename in enumerate(files_all):
    duration = 0
    results.append({'file_name': basename(filename), 'duration': duration, 'transcription': '-'})
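The preview builds a metadata list with placeholder durations. To actually get the MP3s onto the Hugging Face Hub, the usual pattern is to write that list as metadata.jsonl next to the files and push an "audiofolder" dataset. A sketch under those assumptions (the repo id is hypothetical, and durations would normally be measured rather than left at 0):

import json
from glob import glob
from os.path import basename

from datasets import load_dataset

files_all = glob("data/*.mp3")
results = [
    {"file_name": basename(filename), "duration": 0, "transcription": "-"}
    for filename in files_all
]

# The "audiofolder" loader expects a metadata.jsonl (or .csv) next to the audio
# files, keyed by file_name; extra keys become dataset columns.
with open("data/metadata.jsonl", "w", encoding="utf-8") as f:
    for row in results:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

ds = load_dataset("audiofolder", data_dir="data")
ds.push_to_hub("your-username/your-mp3-dataset", private=True)  # hypothetical repo id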
@egorsmkv
egorsmkv / main.rs
Last active March 14, 2025 14:00
Fast Inverse Square Root written in Rust, translated to Coq using https://github.com/formal-land/coq-of-rust
fn q_rsqrt(number: f32) -> f32 {
    let threehalfs: f32 = 1.5;
    let x2: f32 = number * 0.5;
    let mut y: f32 = number;
    let i: u32 = y.to_bits(); // safely get the bit representation of the float
    let i: u32 = 0x5f3759df - (i >> 1); // what the heck?
    y = f32::from_bits(i); // safely convert bits back to float
    y * (threehalfs - (x2 * y * y)) // 1st iteration
}

WhisperForConditionalGeneration(
  (model): WhisperModel(
    (encoder): WhisperEncoder(
      (conv1): Conv1d(128, 1280, kernel_size=(3,), stride=(1,), padding=(1,))
      (conv2): Conv1d(1280, 1280, kernel_size=(3,), stride=(2,), padding=(1,))
      (embed_positions): Embedding(1500, 1280)
      (layers): ModuleList(
        (0-31): 32 x WhisperEncoderLayer(
          (self_attn): WhisperSdpaAttention(
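This module tree is what transformers prints for a Whisper checkpoint; the 128-channel conv1 and the 32 encoder layers match the large-v3 configuration. A minimal way to reproduce such a dump, with the checkpoint id assumed from those numbers:

from transformers import WhisperForConditionalGeneration

# Loading the checkpoint and printing the module tree reproduces a dump like the one above.
# "openai/whisper-large-v3" is an assumption based on the 128 mel bins and 32 encoder layers shown.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")
print(model)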
import torchaudio
from speechbrain.pretrained import VAD

# Load the pretrained CRDNN voice-activity-detection model from the SpeechBrain hub
VAD = VAD.from_hparams(source="speechbrain/vad-crdnn-libriparty", savedir="pretrained_models/vad-crdnn-libriparty")

test_file = 'a.wav'
# Detect speech boundaries (start/end times) in the audio file
boundaries = VAD.get_speech_segments(test_file)
# Crop the detected speech segments out of the audio file
segments = VAD.get_segments(boundaries, test_file)
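A possible continuation that persists the VAD output: save_boundaries() writes the detected start/end times to a text file, and the cropped segments can be written back out with torchaudio. The output file names are hypothetical, and the segment tensors are assumed to be channel-first waveforms as returned by torchaudio loading.

# Continuation sketch: persist the VAD output.
sample_rate = torchaudio.info(test_file).sample_rate

# Write the detected speech boundaries to a text file (hypothetical path).
VAD.save_boundaries(boundaries, save_path="vad_boundaries.txt")

# Save each cropped speech segment as its own WAV file.
for i, segment in enumerate(segments):
    torchaudio.save(f"segment_{i}.wav", segment, sample_rate)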