
Martin Milan martin12333

@Donavan
Donavan / segmentation-101_part-1.md
Last active October 6, 2025 08:16
Segmentation 101, Part 1: Why your strategy matters

Segmentation 101, part 1: Why your strategy matters

I recently did some more exploring with a local LLM tool that imports your documents into a vector store. Given the promising initial results with a handful of docs, I wanted to see how it handled more, and different, data. I decided to copy over the text files containing Expanse trivia and answers that I use as a regression suite to test my own "Q&A over documents" process. I wanted to see what types of questions it could answer from that content...

The Problem With Generic Segmentation

This tool used double newlines as its segmentation boundary. That strategy works well for many types of content, but it was a terrible choice for this content, because the text in the files is formatted as numbered questions followed by their answers, like this:

1. Long winded question with establishing context
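
To make the failure mode concrete, here is a minimal sketch comparing the two strategies. The sample text, the function names, and the assumption that a blank line separates each question from its answer are all hypothetical illustrations; this is not the tool's actual code.

```python
import re

# Hypothetical sample in the numbered question/answer layout described above.
# Assumption: a blank line separates each question from its answer.
SAMPLE = (
    "1. First question with establishing context\n"
    "\n"
    "Answer to the first question.\n"
    "\n"
    "2. Second question with establishing context\n"
    "\n"
    "Answer to the second question.\n"
)


def split_on_blank_lines(text):
    """Generic strategy: every blank-line-separated block becomes a segment."""
    return [block.strip() for block in re.split(r"\n\s*\n", text) if block.strip()]


def split_on_numbered_questions(text):
    """Content-aware strategy: start a new segment at each line beginning with 'N. '."""
    parts = re.split(r"(?m)^(?=\d+\.\s)", text)
    return [part.strip() for part in parts if part.strip()]


generic_segments = split_on_blank_lines(SAMPLE)
aware_segments = split_on_numbered_questions(SAMPLE)

print(len(generic_segments), "generic segments")
print(len(aware_segments), "question-aware segments")
```

With the generic split, each question and each answer lands in its own segment, so a retrieved question no longer carries its answer. Splitting at the numbered lines keeps each pair together.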

 

@taruma
taruma / feidlambda_v0_4.scala
Last active February 24, 2025 14:12
Official GIST feidlambda (feid) v0.4
/*
feidlambda v0.4.0 - LOGIC / UTILITIES FUNCTIONS BY FIAKO ENGINEERING
OFFICIAL GIST (feidlambda v0.4.x):
https://gist.github.com/taruma/b4df638ecb7af48ab63691951481d6b2
REPOSITORY:
https://github.com/fiakoenjiniring/feidlambda
CONTRIBUTOR: @taruma, @iingLK
TESTED: Microsoft Excel 365 v2304
*/
@taybenlor
taybenlor / write.rs
Created September 12, 2022 10:30
This program writes whatever you type into STDIN to the file specified in arg1.
use std::io::Read;
use std::{env, fs, io};
fn main() {
let filename = env::args().nth(1).expect("no filename provided");
let stdin = io::stdin();
let mut buf = String::new();
stdin
.lock()

Datasette tutorial written by GPT3

Prompt: a step by step tutorial for getting started with Datasette

This is a guide for getting started with Datasette. Datasette is a tool for creating and publishing data-driven websites. It is designed to make it easy to publish structured data, such as the results of a database query, in a way that is highly visible and interactive.

Datasette can be used to create websites that allow users to explore and visualize data, or to build applications that expose data via APIs. It can also be used as a static site generator, creating a completely static HTML website that can be deployed anywhere.

This guide will cover the basics of how to install and use Datasette. It will also show you how to create a simple data-driven website using Datasette.

@OlivierLDff
OlivierLDff / Readme.md
Last active October 17, 2025 16:43
🚀 Git Bash Emojis (Windows)

Open Git Bash with administrator privileges.

cd "C:/Program Files/Git/usr/share/mintty"
mkdir -p emojis
cd emojis
curl https://raw.githubusercontent.com/wiki/mintty/mintty/getemojis > getemojis
./getemojis -d
@kepano
kepano / obsidian-web-clipper.js
Last active September 24, 2025 01:32
Obsidian Web Clipper Bookmarklet to save articles and pages from the web (for Safari, Chrome, Firefox, and mobile browsers)
javascript: Promise.all([import('https://unpkg.com/[email protected]?module'), import('https://unpkg.com/@tehshrike/[email protected]'), ]).then(async ([{
default: Turndown
}, {
default: Readability
}]) => {
/* Optional vault name */
const vault = "";
/* Optional folder name such as "Clippings/" */
@Denbergvanthijs
Denbergvanthijs / game_of_life_keras.py
Last active April 13, 2024 12:44
Conway's Game of Life using a neural network with Keras and Tensorflow in Python
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from matplotlib.animation import FuncAnimation
from tensorflow.keras.layers import Conv2D, InputLayer, Layer
from tensorflow.keras.models import Sequential
size = 128
n_frames = 240
full_size = (1, size, size, 1)
@Potherca
Potherca / compile-lynx-wasm.sh
Last active April 13, 2024 16:05
Steps to compile the Lynx web browser to Web Assembly
# Sources used in figuring this out:
#
# - https://developer.mozilla.org/en-US/docs/WebAssembly/C_to_wasm
# - https://emscripten.org/docs/compiling/Building-Projects.html#building-projects
# Set up Emscripten SDK
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk/
./emsdk install latest
./emsdk activate latest
@mbigras
mbigras / nested-code-blocks.md
Last active August 17, 2025 16:27
Nested code blocks in technical documentation

Nested code blocks

Nested code blocks organize technical documentation, but rendering them in Markdown is tricky. Luckily, GitHub worked around the vague specification by making 4-space indentation "just work".

Overview

Google Cloud Platform (GCP) technical documentation is formatted well enough; however, emulating that format in Markdown is tricky.

The following are the main organizational and formatting features of GCP documentation that I like: