Richard Soni richsoni

@dannguyen
dannguyen / shakespeare-ngrams-cli-ack.md
Last active May 30, 2023 16:04
How to tokenize and create n-grams in Shakespeare from the command-line

Creating Shakespearean n-grams with just the command-line and regexes

This is a quick example showing how to use regexes to find tri-grams in Shakespeare...well, 570,872 of them, anyway, if we do some basic filtering of non-dialogue.

Though tokenization and n-gram generation should typically be done with a proper natural language processing framework, both can be done in a jiffy from the command-line, using standard Unix tools and ack, the better-than-grep utility.
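
To make the idea concrete, here is a minimal sketch of word-level tri-gram generation using only `tr` and `awk`. It is not the ack-and-regex approach this gist goes on to describe. The sample sentence and the `trigrams` function name are assumptions made up for illustration, and the tokenization rule (letters and apostrophes only, lowercased) is a deliberately crude choice.

```sh
#!/bin/sh
# Hypothetical sample text standing in for a line of dialogue.
text='To be, or not to be, that is the question'

# trigrams (illustrative helper): read text on stdin and print one
# lowercase word tri-gram per line.
trigrams() {
  tr -cs "[:alpha:]'" '\n' |      # turn every run of non-word characters into a newline
    tr '[:upper:]' '[:lower:]' |  # normalize case
    awk 'NF { if (a != "" && b != "") print a, b, $0; a = b; b = $0 }'
}

printf '%s\n' "$text" | trigrams
```

The `awk` step keeps a sliding window of the two previous tokens in `a` and `b`, so each new token completes one tri-gram. Piping the result through `sort | uniq -c | sort -rn` would give a frequency count.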

What are n-grams?

@dvdbng
dvdbng / vim-heroku.sh
Last active October 16, 2024 17:15
Run vim in heroku updated 2017
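# Create a directory for the static vim build (no error if it already exists)
mkdir -p ~/vim
cd ~/vim
# Statically linked vim version compiled from https://github.com/ericpruitt/static-vim
# Compiled on Jul 20 2017
curl 'https://s3.amazonaws.com/bengoa/vim-static.tar.gz' | tar -xz
# Tell vim where its runtime files live and put the binary on PATH
export VIMRUNTIME="$HOME/vim/runtime"
export PATH="$HOME/vim:$PATH"
# Return to the previous working directory
cd -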