YYYoung YaoYYoung

Anti-hype LLM reading list

Goals: Add links to reasonable, well-written explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

Purpose

Bootstrap knowledge of LLMs ASAP, with a bias toward GPT.

Avoid being a link dump. Try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

Chat GPT "DAN" (and other "Jailbreaks")

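FAQ for AUTOMATIC1111/stable-diffusion-webui

This FAQ is not meant to provide all the necessary information in one place; it only gives the essential pointers. The miscellaneous section at the end is a treasure trove, so do browse through all of it!

For the latest news, see SD RESOURCE GOLDMINE 2 (English) or sudoskys/StableDiffusionBook (Chinese)

Preface

It is recommended to read first the official documentation, SD RESOURCE GOLDMINE, VOLDY RETARD GUIDE, or としあきdiffusion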

	# October 30, 2025

	# port: 7890 # HTTP(S) proxy server port
	# socks-port: 7891 # SOCKS5 proxy port
	mixed-port: 10801 # mixed HTTP(S) and SOCKS proxy port
	redir-port: 7891 # transparent proxy port, for Linux and macOS

	# Transparent proxy server port for Linux (TProxy TCP and TProxy UDP)
	tproxy-port: 1536

	"""
	Copyright (C) 2024 Rachel030219

	This program is free software; you can redistribute it and/or
	modify it under the terms of the GNU General Public License
	as published by the Free Software Foundation; either version 2
	of the License, or (at your option) any later version.

	This program is distributed in the hope that it will be useful,
	but WITHOUT ANY WARRANTY; without even the implied warranty of

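	# AFF
	# If you would like to support me, you can buy a proxy subscription through my referral links
	# Thanks for your support
	# 1. 倾城极速 invite code: 0jiB5uAA https://qcjs.ovh/#/register?code=0jiB5uAA
	# 2. ssLinks invite code: fSo2OhzH https://98a6251b6cd7471da86cca993b6dbe6f.36d.biz/#/register?code=fSo2OhzH

	# Please be sure to enter my invite code; if you don't, I'll cry 😭

	# mihomo (Clash Meta) lazy config
	# Version V1.22-250718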

	/**
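	* Batch-download the e-books and personal documents you have purchased
	* Requirement: at least one Kindle device.
	* Open https://www.amazon.cn/hz/mycd/myx/ , then press F12 to open the Console, copy and paste all of the code into the console, and press Enter.
	* Then enter download("ebook") to download all e-books
	* To download personal documents, enter download("pdoc") instead
	* If a file fails to download, you can call the function you just ran (that is, download() or download("pdoc") ) again to restart the download. As long as the page has not been closed, the program skips files that have already been downloaded.
	* Do not close the page while the script is running, and allow the page to download multiple files automatically
	* If the page was closed but you happened to save the list of successfully downloaded files returned by the previous download task,
	* you can copy all the text in that list and pass it as the second argument to download (e.g. download("ebook",["something","something else"]) ), so that the program likewise skips files that have already been downloaded.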