YYYoung (YaoYYoung)
😇 The limits of my language mean the limits of my world.
@rain-1
rain-1 / LLM.md
Last active October 20, 2025 07:02
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP, with a bias/focus toward GPT.

Avoid being a link dump; try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

@ninehills
ninehills / chatpdf-zh.ipynb
Last active July 8, 2025 20:09
ChatPDF-zh.ipynb

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much
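To make the contrast concrete, here is a minimal, hypothetical sketch (not from Goldberg's post) of the two training signals: supervised instruction fine-tuning maximizes the likelihood of a human-written demonstration token by token, while RLHF samples a whole answer, scores it with a learned reward model, and pushes the policy toward higher-reward outputs. The HF-style `model` (with `.logits` and `.generate`) and the `reward_model` below are illustrative stand-ins, not any particular implementation.

```python
# Illustrative sketch only: contrasts an SFT loss with a simple
# REINFORCE-style RLHF update. Model and reward classes are stand-ins.
import torch
import torch.nn.functional as F

def sft_loss(model, prompt_ids, answer_ids):
    """Supervised fine-tuning: imitate the human-written answer."""
    input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
    logits = model(input_ids).logits[:, :-1, :]   # next-token predictions
    targets = input_ids[:, 1:]
    # (real SFT usually masks the prompt tokens out of the loss; omitted here)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))

def rlhf_step(model, reward_model, prompt_ids, max_new_tokens=64):
    """RL: sample an answer, score it as a whole, reinforce high-reward samples."""
    sampled = model.generate(prompt_ids, do_sample=True,
                             max_new_tokens=max_new_tokens)
    answer = sampled[:, prompt_ids.size(1):]
    reward = reward_model(prompt_ids, answer)      # one scalar per sequence
    # log-probability of the sampled answer under the current policy
    logits = model(sampled).logits[:, :-1, :]
    logprobs = torch.log_softmax(logits, dim=-1)
    token_lp = logprobs.gather(-1, sampled[:, 1:].unsqueeze(-1)).squeeze(-1)
    answer_lp = token_lp[:, prompt_ids.size(1) - 1:].sum(dim=1)
    # REINFORCE: raise the log-prob of answers in proportion to their reward
    return -(reward.detach() * answer_lp).mean()
```

In practice RLHF pipelines use PPO with a KL penalty against the SFT model rather than plain REINFORCE, but the shape of the signal is the point of the contrast: a whole-answer reward versus per-token imitation of a demonstration.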

@veekaybee
veekaybee / normcore-llm.md
Last active November 10, 2025 17:26
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts


Pre-Transformer Models

@liuran001
liuran001 / config.yaml
Last active November 11, 2025 02:07
mihomo (Clash Meta) ready-to-use config
# AFF
# If you want to support me, you can buy an "airport" (proxy subscription) through my referral links
# Thanks for your support
# 1. 倾城极速 referral code: 0jiB5uAA https://qcjs.ovh/#/register?code=0jiB5uAA
# 2. ssLinks referral code: fSo2OhzH https://98a6251b6cd7471da86cca993b6dbe6f.36d.biz/#/register?code=fSo2OhzH
# Please do use my referral code, or I'll cry 😭
# mihomo (Clash Meta) ready-to-use config
# Version V1.22-250718
@Rachel030219
Rachel030219 / picture_to_cie_diagram.py
Last active May 6, 2025 21:46
Converts an image to a CIE 1931 chromaticity diagram
"""
Copyright (C) 2024 Rachel030219
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
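As a rough illustration of what such a conversion involves (a hypothetical sketch, not Rachel030219's implementation): each pixel's sRGB value is linearized, converted to CIE XYZ with the standard sRGB/D65 matrix, and projected to xy chromaticity coordinates, which can then be scattered over the CIE 1931 horseshoe. The function and file names below are assumptions.

```python
# Hypothetical sketch: map an image's pixels to CIE 1931 xy chromaticity
# coordinates (not the original gist's code).
import numpy as np
from PIL import Image

# sRGB (linear) -> CIE XYZ matrix, D65 white point
SRGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])

def srgb_to_linear(rgb):
    """Undo the sRGB transfer function (values in [0, 1])."""
    return np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)

def image_to_xy(path):
    """Return an (N, 2) array of xy chromaticity coordinates for an image."""
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64) / 255.0
    linear = srgb_to_linear(rgb.reshape(-1, 3))
    xyz = linear @ SRGB_TO_XYZ.T
    total = xyz.sum(axis=1, keepdims=True)
    total[total == 0] = 1.0            # avoid division by zero for black pixels
    return xyz[:, :2] / total          # x = X/(X+Y+Z), y = Y/(X+Y+Z)

# xy = image_to_xy("photo.jpg")  # then scatter-plot over a CIE 1931 diagram
```

Drawing the horseshoe outline itself additionally needs the CIE standard-observer data (available, for example, through the colour-science package), which is omitted here.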
# 2025-10-30
# port: 7890 # HTTP(S) proxy server port
# socks-port: 7891 # SOCKS5 proxy port
mixed-port: 10801 # mixed HTTP(S) and SOCKS proxy port
redir-port: 7891 # transparent proxy (redir) port, for Linux and macOS
# Transparent proxy server port for Linux (TProxy TCP and TProxy UDP)
tproxy-port: 1536
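For context, the mixed-port above accepts both HTTP(S) and SOCKS clients. A minimal, hypothetical check (assuming mihomo is running locally with this config; the port 10801 is taken from the snippet, the echo endpoint is just an example) could route a request through it with the requests library:

```python
# Hypothetical check that traffic goes through the local mihomo mixed port.
# Assumes mihomo is running with the config above (mixed-port: 10801).
import requests

PROXY = "http://127.0.0.1:10801"
proxies = {"http": PROXY, "https": PROXY}

# Any HTTPS endpoint works; this one simply echoes the egress IP it sees.
resp = requests.get("https://api.ipify.org?format=json", proxies=proxies, timeout=10)
print(resp.json())   # should show the proxy's exit IP, not your direct IP
```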