Thierry Moudiki thierrymoudiki

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

R to python data wrangling snippets

The dplyr package in R makes data wrangling significantly easier. The beauty of dplyr is that, by design, the options available are limited. Specifically, a set of key verbs forms the core of the package. Using these verbs you can solve a wide range of data problems effectively and in less time. Whilst transitioning to Python I have greatly missed the ease with which I can think through and solve problems using dplyr in R. The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data using Python (with the pandas package).

dplyr is organised around six key verbs:
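As a rough illustration, assuming the six verbs meant here are the ones usually listed for dplyr (filter, select, arrange, mutate, summarise and group_by), each has a fairly direct pandas counterpart. The data frame and its column names below are hypothetical and exist only for the example; this is a minimal sketch, not the only way to write any of these steps.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "length": [1.0, 3.0, 2.0, 4.0],
    "width": [0.5, 1.5, 1.0, 2.0],
})

# filter(df, length > 1.5): keep rows that satisfy a condition
filtered = df[df["length"] > 1.5]

# select(df, species, length): keep a subset of columns
selected = df[["species", "length"]]

# arrange(df, desc(length)): sort rows
arranged = df.sort_values("length", ascending=False)

# mutate(df, area = length * width): add a derived column
mutated = df.assign(area=df["length"] * df["width"])

# summarise(df, mean_length = mean(length)): collapse to one value
mean_length = df["length"].mean()

# group_by(df, species) %>% summarise(mean_length = mean(length))
by_species = (
    df.groupby("species", as_index=False)
      .agg(mean_length=("length", "mean"))
)
```

Because `assign`, `sort_values` and `groupby(...).agg(...)` each return a new data frame, they can be chained in much the same way as a dplyr pipeline.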

	.PHONY: build buildsite check clean cleanvars coverage docs getwd initialize install installcranpkg installgithubpkg installedpkgs load removepkg render setwd start test usegit
	.DEFAULT_GOAL := help

	# The directory where R files are stored
	R_DIR = ./R

	define BROWSER_PYSCRIPT
	import os, webbrowser, sys
	from urllib.request import pathname2url

	To install chruby and ruby-install:
	brew install chruby ruby-install

	To install Ruby using ruby-install:
	ruby-install ruby 2.7.1

	NOTE: You can find the latest stable version of Ruby here: https://www.ruby-lang.org/en/downloads/
	If you have issues installing Ruby then try the following:
	brew install openssl@3
	ruby-install 3.2.2 -- --with-openssl-dir=$(brew --prefix openssl@3)