@diatche
diatche / format_transcript.py
Created January 10, 2025 20:37
Split Transcript into Paragraphs
"""
Script: Split Transcript into Paragraphs
Description:
This script processes a transcript from a text file and splits it into paragraphs.
It uses a word count threshold to determine where paragraphs should end, ensuring
natural breaks at the end of sentences. The processed output is saved to the same
directory as the input file, with a suffix added to the base name.
Usage:
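
The preview above is cut off before the implementation, so the following is only a minimal sketch of the approach the description outlines: accumulate sentences until a word count threshold is reached, then end the paragraph at that sentence boundary. The function name `split_into_paragraphs`, the `min_words` parameter and its default, and the regex used to detect sentence ends are all assumptions for illustration, not the gist's actual code.

```python
import re


def split_into_paragraphs(text: str, min_words: int = 100) -> list[str]:
    """Group sentences into paragraphs of at least `min_words` words.

    Hypothetical helper: the name, the default threshold and the
    sentence-splitting rule are assumptions, not taken from the gist.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    paragraphs: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        current.append(sentence)
        count += len(sentence.split())
        # Close the paragraph only at a sentence end, once the threshold is met.
        if count >= min_words:
            paragraphs.append(" ".join(current))
            current, count = [], 0

    # Keep any trailing sentences that never reached the threshold.
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    sample = "One two three. Four five six. Seven eight nine. Ten."
    print("\n\n".join(split_into_paragraphs(sample, min_words=5)))
```

A simple regex split like this will mis-handle abbreviations such as "e.g." or "Dr."; a real transcript tool may need a more careful sentence tokenizer.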
@diatche
diatche / extract_slides.py
Last active November 22, 2024 01:03
Extract Slides from Video Script
"""
Extract Slides from Video Script
This script extracts frames from an MP4 video file at regular intervals and deduplicates them to produce unique slides.
It crops the center 50% of each frame to avoid any overlays in the corners, such as picture-in-picture (PIP) or logos.
Usage:
python extract_slides.py <video_file>
Arguments:
@diatche
diatche / ollama-langchain.py
Last active May 30, 2024 08:09
Ask a URL using Llama 2 running locally
# Ask questions about a document using Ollama and Langchain.
# This leverages a local LLM to provide insights into any document.
# It's far from perfect, as its power is limited by the LLM size.
# It leverages the langchain library to perform tasks like document loading,
# text splitting, embedding generation, and using the Ollama client for generating answers.
# You need to have Ollama installed (with model llama2:13b available). See https://github.com/jmorganca/ollama
# Usage:
@diatche
diatche / transcribe.py
Last active May 30, 2024 08:11
Audio Transcriber using OpenAI Whisper
import argparse
import subprocess
import os
import math
from openai import OpenAI
MAX_SIZE = 26214400 # Maximum file size (in bytes)
# Parse command line arguments
parser = argparse.ArgumentParser(
@diatche
diatche / revolut-csv.py
Last active October 27, 2020 21:20
Converts Revolut CSV format into a CSV table format which is easier for automated processing.
#!/bin/python3
# Converts Revolut CSV format into a CSV table
# format which is easier for automated processing.
#
# Removes the following categories automatically:
# general, transfers
#
# Usage:
#
@diatche
diatche / swedbank-csv.py
Last active October 22, 2020 21:03
Converts Swedbank CSV format into a CSV table format which is easier for automated processing.
#!/bin/python3
# Converts Swedbank CSV format into a CSV table
# format which is easier for automated processing.
#
# Usage:
#
# 1. Save file as swedbank-csv.py and open terminal in folder location.
# 2. In terminal: python3 swedbank-csv.py -i <path to source CSV>
# 3. For more info, run: python3 swedbank-csv.py --help
@diatche
diatche / recall-test.html
Created September 17, 2020 21:26
Test Gun recall function
<script src="https://cdn.jsdelivr.net/gh/amark/gun@master/gun.js"></script>
<script src="https://cdn.jsdelivr.net/gh/amark/gun@master/sea.js"></script>
<!-- <script src="https://cdn.jsdelivr.net/npm/gun/gun.js"></script>
<script src="https://cdn.jsdelivr.net/npm/gun/sea.js"></script> -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/lodash.min.js"></script>
<script>
;(() => {
// Reset by uncommenting the next line, run once and comment out again.
// sessionStorage.clear();
localStorage.clear();
@diatche
diatche / slack_cleaner.py
Created September 2, 2020 05:40
Deletes old bot Slack messages using slack_cleaner2
# Deletes old bot Slack messages using slack_cleaner2
# https://github.com/sgratzl/slack-cleaner
# Usage:
# 1. Install slack_cleaner2 and Arrow
# 2. Run with token as CLI arg. E.g. `python slack_cleaner.py <token xoxp-...>`
import arrow
import time
import sys
from slack_cleaner2 import *
@diatche
diatche / li.html
Last active June 17, 2020 03:59 — forked from amark/li.html
<html><body>
<style>
html, body {
background: rgb(245, 245, 245);
margin: 0;
padding: 0;
}
div {
position: relative;
overflow: hidden;