Goals: add links that are reasonable and give good explanations of how stuff works. No hype, and no vendor content if possible. Practical first-hand accounts of models in prod are eagerly sought.

```python
# Fetch some text content in two different categories
from wikipediaapi import Wikipedia

wiki = Wikipedia('RAGBot/0.0', 'en')
docs = [{"text": x, "category": "person"}
        for x in wiki.page('Hayao_Miyazaki').text.split('\n\n')]
docs += [{"text": x, "category": "film"}
         for x in wiki.page('Spirited_Away').text.split('\n\n')]
```
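A sketch of the retrieval step this snippet appears to set up (my addition, not part of the original; it assumes the sentence-transformers package as the embedder):

```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode([d["text"] for d in docs], normalize_embeddings=True)

def search(query, k=3):
    # On normalized vectors, cosine similarity reduces to a dot product.
    q = model.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(embeddings @ q)[::-1][:k]
    return [docs[i] for i in top]
```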
```python
from fastapi import Request, HTTPException
from pydantic import BaseModel, HttpUrl
from typing import Optional, List
from modal import Secret, App, web_endpoint, Image
from example import proposal
import os

app = App(
    name="circleback",
    image=Image.debian_slim().pip_install("openai", "pydantic", "fastapi"),
)

class Attendee(BaseModel):
    # The snippet is truncated here; these fields are illustrative only.
    name: str
    email: Optional[str] = None
```
```python
import base64
import tempfile
from typing import Optional

from pydantic import BaseModel
from modal import Image, Secret, Stub, build, enter, gpu, web_endpoint

whisper_image = (
    Image.micromamba()
    # The snippet is truncated here; a typical continuation installs ffmpeg
    # and the Whisper dependencies before closing the chain.
    .apt_install("ffmpeg")
    .pip_install("openai-whisper")
)
```
Yoav Goldberg, April 2023.
With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (reinforcement learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language-model terminology, "instruction fine-tuning": learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much spells out that argument.
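To make the distinction concrete, here is a minimal sketch (mine, not from the post or the talk) of the two training signals, assuming a HuggingFace-style causal LM and a hypothetical `reward_model`:

```python
import torch

def sft_loss(model, prompt_ids, demo_ids):
    # Supervised fine-tuning: push up the probability of the human-written answer.
    input_ids = torch.cat([prompt_ids, demo_ids], dim=-1)
    out = model(input_ids, labels=input_ids)  # cross-entropy over the sequence
    return out.loss                           # (prompt masking omitted for brevity)

def rl_step(model, prompt_ids, reward_model):
    # RLHF-style step: score the model's *own* sample, so it is never rewarded
    # for imitating assertions it cannot back up.
    sample = model.generate(prompt_ids, do_sample=True, max_new_tokens=64)
    logprob = -model(sample, labels=sample).loss  # mean token log-likelihood
    reward = reward_model(sample)                 # hypothetical preference score
    return -(reward * logprob)                    # REINFORCE-style surrogate
```

The key difference: in the supervised case the model is graded on text a human wrote, while in the RL case it is graded on text it sampled itself.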
Instructions on how to install the CMake tool on macOS.
The first step should be to uninstall any previous CMake installation; this step can be skipped if no CMake version was previously installed.
To uninstall any previous CMake installation, use the following commands:
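(The commands themselves did not survive in this copy; assuming a Homebrew install, this would be `brew uninstall cmake`, and for the official .dmg distribution, removing `/Applications/CMake.app` along with the `cmake` symlinks under `/usr/local/bin`.)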
package | note
---|---
transformers | Transformer-based (masked) language model algorithms; also ships pretrained models
tokenizers | Train and use the tokenizers that transformers consumes; distributed as a separate package from transformers
nlp | Provides datasets and evaluation metrics
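A minimal sketch (mine) of the first two packages in use; note that the `nlp` package has since been renamed `datasets`:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# The tokenizer object wraps a fast tokenizer from the `tokenizers` package.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

batch = tok("Masked language models are trained like [MASK].", return_tensors="pt")
out = model(**batch)  # out.logits: vocabulary scores for every position
```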
```python
from collections import defaultdict, Counter
from operator import add
from functools import reduce
import numpy as np
from sklearn.cluster import KMeans

def dict_of_list(keys, values):
    # Group values by key: parallel lists -> {key: [values with that key]}
    assert len(keys) == len(values)
    d = defaultdict(list)
    for k, v in zip(keys, values):
        d[k].append(v)
    return dict(d)
```
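Given the KMeans import above, a likely use (my sketch) is grouping samples by their cluster label:

```python
X = np.random.rand(100, 8)
labels = KMeans(n_clusters=5, n_init="auto").fit_predict(X)
clusters = dict_of_list(labels.tolist(), X.tolist())  # {cluster_id: [samples]}
```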
```python
from collections import defaultdict
from functools import reduce, partial
import numpy as np
from itertools import chain

def flatten(l):
    # Flatten one level of nesting: [[a, b], [c]] -> [a, b, c]
    return list(chain.from_iterable(l))
```
"calculate PMI(A,B)=P(A,B)/P(A)P(B) for every token A and B in a window" | |
from itertools import tee, combinations | |
from collections import Counter | |
def count_bigram(sentence, window=5): | |
# ['A','B','C','D', 'E', 'F', 'G'], 4 -> | |
# [['A', 'B', 'C', 'D'], | |
# ['B', 'C', 'D', 'E'], | |
# ['C', 'D', 'E', 'F'], |
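The snippet stops at the counting step; a sketch of the remaining PMI computation (my completion, following the definition in the docstring) could look like:

```python
import math

def pmi(sentences, window=5):
    # PMI(A, B) = log(P(A, B) / (P(A) * P(B))), estimated from co-occurrence counts.
    pair_counts, token_counts = Counter(), Counter()
    for s in sentences:
        pair_counts.update(count_bigram(s, window))
        token_counts.update(s)
    total_pairs = sum(pair_counts.values())
    total_tokens = sum(token_counts.values())
    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / total_pairs
        p_a = token_counts[a] / total_tokens
        p_b = token_counts[b] / total_tokens
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return scores
```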