Jeroen Van Goey BioGeek

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models


| Features | AWS | GCP | Azure | Databricks |
| --- | --- | --- | --- | --- |
| Data pipeline | Data Pipeline | Dataflow | Data Factory | Spark |
| Feature Store | Feature Store | --- | --- | Feature Store |
| Model Monitoring | Model Monitor | --- | [Azure Monitor](https://docs.microsoft.com/en-us/azure/machine-learning/monitor-azure-machine-learnin

My setup

I'm using an NVIDIA Jetson Nano.

Quad-core ARM® Cortex®-A57 MPCore processor

NVIDIA Maxwell™ architecture with 128 NVIDIA CUDA® cores

4 GB 64-bit LPDDR4 1600MHz - 25.6 GB/s

Ubuntu 18.04 LTS

Genomics - A programmer's guide.

Andy Thomason is a Senior Programmer at Genomics PLC. He has been writing graphics systems, games and compilers since the '70s and specialises in code performance.

https://www.genomicsplc.com

How many times shouldn't it happen...

-- https://news.ycombinator.com/item?id=11396045

SELECT count(*)
FROM (SELECT id, repo_name, path
        FROM [bigquery-public-data:github_repos.sample_files]
 ) AS F

	#VERBOSE=0 torchrun --nproc_per_node 3 self_contained_pp_LOC.py
	import os, random, numpy as np, torch, torch.nn as nn, torch.distributed as dist, torch.nn.functional as F
	from torch.optim import AdamW
	from torch.utils.data import DataLoader, DistributedSampler
	from datasets import load_dataset
	from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

	STEP, local_rank, world_size, verbose = 0, int(os.environ["LOCAL_RANK"]), int(os.environ["WORLD_SIZE"]), os.environ.get("VERBOSE", "0") == "1"

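	def set_all_seed(seed):
	    # Assumed body: seeds every RNG imported above so each rank starts from the same state.
	    for module in [random, np.random]: module.seed(seed)
	    torch.manual_seed(seed)
	    if torch.cuda.is_available(): torch.cuda.manual_seed_all(seed)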

	#!/bin/sh
	set -x

	# == Swarm training (alpha release) ==

	# Setup:
	#
	# git clone https://github.com/shawwn/gpt-2
	# cd gpt-2
	# git checkout dev-shard

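	function venv {
	    # Create the virtualenv if it is missing, then activate it (defaults to ".env").
	    default_envdir=".env"
	    envdir=${1:-$default_envdir}

	    if [ ! -d "$envdir" ]; then
	        python -m venv "$envdir"
	        # Use the virtualenv's own pip; a bare `pip` here would install outside it.
	        "$envdir/bin/pip" install ipython black flake8
	        echo -e "\x1b[38;5;2m✔ Created virtualenv $envdir\x1b[0m"
	    fi
	    source "$envdir/bin/activate"
	}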

	1) Read an image from file
	2) Display an image that you read from file
	3) Capture Video using your webcam and display the feed
	4) Display black and white live stream from your webcam.
	5) Have a slider to change brightness of the webcam live stream. Display.
	6) Have a slider to change contrast of the webcam live stream. Display.
	7) Capture a snapshot from your webcam. Then display difference between live video stream and this snapshot. (Background subtraction)
	8) Display Canny edge image from your live webcam stream
	9) Have a slider to change smoothness / sharpness of image from live webcam stream.
	10) Display histogram of colors (RGB) from your live webcam stream
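Several of the exercises above (4, 5, 6, 7 and 10) come down to simple per-pixel arithmetic. The sketch below shows that arithmetic on a tiny in-memory image built from plain Python lists, assuming 8-bit RGB pixels. It leaves out file reading, webcam capture, sliders and Canny edges, which need an imaging library such as OpenCV. All function names are illustrative helpers, not part of any library.

```python
# Per-pixel arithmetic behind exercises 4, 5, 6, 7 and 10, on a tiny 8-bit RGB image.
# An image is a list of rows; each row is a list of (r, g, b) tuples.
# All function names are hypothetical helpers chosen for this sketch.

def clamp(value):
    """Round and clip a value to the valid 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))

def to_gray(image):
    """Exercise 4: luma-weighted grayscale (ITU-R BT.601 weights)."""
    return [[clamp(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in image]

def adjust(gray, contrast=1.0, brightness=0):
    """Exercises 5 and 6: out = contrast * pixel + brightness, clipped to 0..255."""
    return [[clamp(contrast * p + brightness) for p in row] for row in gray]

def abs_diff(frame, snapshot):
    """Exercise 7: per-pixel absolute difference between a frame and a snapshot."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame, snapshot)]

def rgb_histogram(image):
    """Exercise 10: one 256-bin count list per colour channel."""
    hist = {"r": [0] * 256, "g": [0] * 256, "b": [0] * 256}
    for row in image:
        for r, g, b in row:
            hist["r"][r] += 1
            hist["g"][g] += 1
            hist["b"][b] += 1
    return hist

# A 2x2 test image: red, green, blue and white pixels.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]

gray = to_gray(image)
brighter = adjust(gray, brightness=50)
diff = abs_diff(brighter, gray)
hist = rgb_histogram(image)
```

In a live version, the slider values would feed `contrast` and `brightness`, and `abs_diff` would run on every new frame against the stored snapshot.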

	#
	# read/write access to python's memory, using a custom bytearray.
	# some code taken from: http://tinyurl.com/q7duzxj
	#
	# tested on:
	# Python 2.7.10, ubuntu 32bit
	# Python 2.7.8, win32
	#
	# example of correct output:
	# inspecting int=0x41424344, at 0x0228f898
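The idea in the header comment above, reading and writing Python's own memory, can be sketched with the standard `ctypes` module. This is a minimal Python 3 sketch, not the original Python 2.7 script: it assumes CPython on a little-endian machine, and `peek` is a hypothetical helper name. The in-memory layout of an `int` differs between Python 2 and Python 3, so the raw bytes will not match the Python 2 output exactly.

```python
# Read and write raw process memory from CPython using only the standard library.
# Assumptions: CPython, little-endian machine; `peek` is a hypothetical helper name.
import ctypes
import sys

def peek(address, size):
    """Return `size` raw bytes starting at `address` in this process."""
    return ctypes.string_at(address, size)

# Read: in CPython, id(obj) is the object's memory address.
value = 0x41424344
raw = peek(id(value), sys.getsizeof(value))

# Write: expose a bytearray's buffer as a ctypes array and modify it in place.
buf = bytearray(b"hello")
view = (ctypes.c_ubyte * len(buf)).from_buffer(buf)
address = ctypes.addressof(view)
view[0] = ord("J")                  # write through the ctypes view
snapshot = peek(address, len(buf))  # read the same bytes back by address
del view                            # release the buffer export on `buf`
```

Writing through a `bytearray` keeps the example safe, because a `bytearray` is meant to be mutable. Overwriting the memory of immutable objects such as `int` or `bytes` can corrupt the interpreter.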