wolfecameron · August 12, 2023 06:41
diff --git a/llm_preso_links.txt b/llm_preso_links.txt
 Summaries and Overviews:
 - GPT and GPT-2: https://cameronrwolfe.substack.com/p/language-models-gpt-and-gpt-2
 - Scaling Laws and GPT-3: https://cameronrwolfe.substack.com/p/language-model-scaling-laws-and-gpt
 - OPT-175B (Open-Source GPT-3): https://cameronrwolfe.substack.com/p/understanding-the-open-pre-trained-transformers-opt-library-193a29c14a15
 - Modern LLMs: https://cameronrwolfe.substack.com/p/modern-llms-mt-nlg-chinchilla-gopher
 - Specialized LLMs: https://cameronrwolfe.substack.com/p/specialized-llms-chatgpt-lamda-galactica
 - Why does ChatGPT work?: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
 - Orca: https://cameronrwolfe.substack.com/p/orca-properly-imitating-proprietary
 - LLaMA: https://cameronrwolfe.substack.com/p/llama-llms-for-everyone
 - MPT: https://cameronrwolfe.substack.com/p/democratizing-ai-mosaicmls-impact
 - Falcon: https://cameronrwolfe.substack.com/p/falcon-the-pinnacle-of-open-source

 Papers:
 - The Transformer: https://arxiv.org/abs/1706.03762
 - GPT: https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf
 - GPT-2: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
 - Scaling Laws for Neural Language Models: https://arxiv.org/abs/2001.08361
 - GPT-3: https://arxiv.org/abs/2005.14165
 - OPT-175B: https://arxiv.org/abs/2205.01068
 - MT-NLG 530B: https://arxiv.org/abs/2201.11990
 - Gopher: https://arxiv.org/abs/2112.11446
 - Jurassic-1: https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf
 - Chinchilla: https://arxiv.org/abs/2203.15556
 - Codex: https://arxiv.org/abs/2107.03374
 - LaMDA: https://arxiv.org/abs/2201.08239
 - InstructGPT: https://arxiv.org/abs/2203.02155
 - Sparrow: https://arxiv.org/abs/2209.14375
 - Galactica: https://arxiv.org/abs/2211.09085
 - Dramatron: https://arxiv.org/abs/2209.14958
 - OPT-IML: https://arxiv.org/abs/2212.12017

 Other:
 - Cold Start Streaming Learning: https://arxiv.org/abs/2211.04624
 - Foundation Models: https://crfm.stanford.edu
 - GPT-2 Demo: https://transformer.huggingface.co/doc/gpt2-large
 - OPT Code: https://github.com/facebookresearch/metaseq/tree/main/projects/OPT
 - ChatGPT: https://openai.com/blog/chatgpt/
 - BioMedLM: https://www.mosaicml.com/blog/introducing-pubmed-gpt