Goals: add links to reasonable, well-written explanations of how things work. No hype, and no vendor content where avoidable. Practical first-hand accounts of models in production are eagerly sought.

// Apps Script (e.g. a Google Sheets custom function): translate text via the Hugging Face Inference API.
function TRANSLATE(text, repo_id = "Helsinki-NLP/opus-mt-en-es") {
  const endpoint = "https://api-inference.huggingface.co/pipeline/translation/" + repo_id;
  const payload = JSON.stringify({
    "inputs": text,
    "options": {"wait_for_model": true}  // wait for the model to load instead of erroring
  });
  // Add your token from https://huggingface.co/settings/token
  const options = {
    "method": "post",
    "contentType": "application/json",
    "headers": {"Authorization": "Bearer <YOUR HUGGINGFACE API KEY>"},
    "payload": payload
  };
  const response = UrlFetchApp.fetch(endpoint, options);
  return JSON.parse(response.getContentText())[0]["translation_text"];
}
Code: https://colab.research.google.com/drive/1vltPI81atzRvlALv4eCvEB0KdFoEaCOb?usp=sharing
Can these scores be improved? YES!
By rerunning with more training data, training for more epochs, or using other libraries to set a learning rate and other hyperparameters before training (a sketch of those knobs follows below).
The point of a benchmark is to run these models through a reasonable, identical process; you can tweak hyperparameters on any model to improve its results.
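For example, with the Hugging Face transformers Trainer API these knobs live in TrainingArguments. A minimal sketch, assuming a transformers-based setup like the linked Colab; the values are illustrative, not the benchmark's actual settings:

from transformers import TrainingArguments

# Illustrative hyperparameters one could change on a rerun.
args = TrainingArguments(
    output_dir="out",
    learning_rate=2e-5,              # set explicitly (or sweep) instead of the default
    num_train_epochs=5,              # more epochs than a quick benchmark pass
    per_device_train_batch_size=16,
    weight_decay=0.01,
)
# `args` is then passed to transformers.Trainer together with the model and datasets.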
The list will not be updated for now; please don't leave comments.
The count of contributions (the sum of pull requests, opened issues, and commits) to public repos on GitHub.com from Wed, 21 Sep 2022 to Thu, 21 Sep 2023.
Because of GitHub search limitations, only the first 1000 users by follower count are included. If you are not in the list, you don't have enough followers. See the raw data and source code. The algorithm, in pseudocode:
githubUsers.sortBy(followers, desc).take(1000)
    .map(user => countContributions(user, from, till)).sortBy(contributions, desc)
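The pseudocode above is reconstructed from the description, and countContributions is a hypothetical helper. For flavor, a rough Python sketch of the described counting against the GitHub REST search API; this is not the list's actual source code, the token is a placeholder, rate-limit handling is omitted, and user search is capped at 1000 results, which is exactly the limitation mentioned above:

import requests

API = "https://api.github.com"
HEADERS = {
    "Authorization": "Bearer <YOUR GITHUB TOKEN>",
    "Accept": "application/vnd.github+json",
}

def top_users_by_followers():
    """First 1000 users by follower count (the search API caps results at 1000)."""
    logins = []
    for page in range(1, 11):  # 10 pages x 100 per page = the 1000-result cap
        r = requests.get(f"{API}/search/users", headers=HEADERS, params={
            "q": "followers:>=1", "sort": "followers", "order": "desc",
            "per_page": 100, "page": page,
        })
        logins += [u["login"] for u in r.json()["items"]]
    return logins

def count_contributions(login, since="2022-09-21", until="2023-09-21"):
    """PRs + opened issues + commits authored in the window, via three searches."""
    window = f"{since}..{until}"
    searches = [
        ("/search/issues",  f"author:{login} type:pr created:{window}"),
        ("/search/issues",  f"author:{login} type:issue created:{window}"),
        ("/search/commits", f"author:{login} author-date:{window}"),
    ]
    total = 0
    for path, query in searches:
        r = requests.get(API + path, headers=HEADERS,
                         params={"q": query, "per_page": 1})
        total += r.json().get("total_count", 0)  # only the count is needed
    return total

ranking = sorted(top_users_by_followers(), key=count_contributions, reverse=True)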