Ch. (Chanwhi Choi) sftblw

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

Chat GPT "DAN" (and other "Jailbreaks")

awesome-gaeulbyul

The fantastic Gaeulbyul show! I'll show you something!

Simple recipe

https://twitter.com/gaeulbyul/status/1294351544091959296

A simple 10-minute recipe: put the pork in the marinade and let it rest for 8 hours

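What follows is an almost verbatim copy of an email sent to [email protected] on February 3. There has been no response for over two months.

After more than 10 days without a response, I tried to contact the reporter in charge, Kim Dong-in, on Twitter, but got no response there either.

The article still does not display correctly in Firefox. (2019-04-15)

Hello. I read the article [<Thirty Nights Spent in Daerim-dong>][1] and am writing to let you know about a problem that is minor but may be important.

The article uses an interactive format in which animations pop up as you scroll (a format the likes of the New York Times use so often these days that it may even feel stale), but the format appears to have been tested only in Google Chrome. Testing every browser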

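Why use CMake?
The only good tool is Visual Studio. Everything else is heresy (邪道), heresy! - the author

Caution

This document describes CMake subjectively
This document is not suitable for getting started with CMake
It was written as supplementary material to help you quickly pick up the basics after following https://cgold.readthedocs.io/en/latest/ through chapter 3.1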

Mastodon Docker Setup

Setting up

Clone Mastodon's repository.

# Clone Mastodon into the ~/live directory
git clone https://github.com/tootsuite/mastodon.git ~/live
# Change directory to ~/live
cd ~/live

Linux: Compile C++ to WebAssembly and JavaScript using Emscripten and CMake

Download and Install Emscripten

My preferred installation location is /home/user
Get the latest sdk: git clone https://github.com/emscripten-core/emsdk.git
Enter the cloned directory: cd emsdk
Check out main: git checkout main
Install the latest sdk tools: ./emsdk install latest
Activate the latest sdk tools: ./emsdk activate latest
Activate path variables: source ./emsdk_env.sh

| Title | Tweets | Citations | Organization | Country | Org Type |
| --- | --- | --- | --- | --- | --- |
| AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models | | 1331 | DeepMind, European Molecular Biology Laboratory | UK | academia |
| ColabFold: making protein folding accessible to all | | 1138 | Harvard University, Max Planck Institute for Multidisciplinary Sciences, Michigan State University, Seoul National University, University of Tokyo | Germany, Japan, South Korea, USA | academia |
| A ConvNet for the 2020s | 857 | 835 | Meta, UC Berkeley | USA | industry |
| Hierarchical Text-Conditional Image Generation with CLIP Latents | 105 | 718 | OpenAI | USA | industry |
| PaLM: Scaling Language Modeling with Pathways | 445 | 426 | Google | USA | industry |
| Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding | 2462 | 390 | Google | USA | industry |
| Instant Neural Graphics Primitives with a Multiresolution Hash Encoding | 11 | 342 | NVIDIA | USA | industry |
	SignalP 6.0 predicts all five types of signal peptides using protein language models 2

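	The following are some of the rules of '냥체' (nyang-che, a cat-like way of speaking):
	>0. 냥체 is always informal speech (banmal). Never use polite speech (jondaetmal) under any circumstances.
	>1. Replace the 28 characters of /[나-낳]/ with /[냐-냫]/.
	> - `[나낙낚낛난낝낞낟날낡낢낣낤낥낦낧남납낪낫났낭낮낯낰낱낲낳]` becomes `[냐냑냒냓냔냕냖냗냘냙냚냛냜냝냞냟냠냡냢냣냤냥냦냧냨냩냪냫]`.
	>2. Replace a sentence-final '다' with '다냥'. For example, '~다' and '~한다' become '~다냥' and '~한다냥', respectively.
	>2.1. Make sentences end in '냥' or '냐' wherever possible. This includes rewriting '~해야 해' as '~해야 한다냥' and '~할까' as '~할까냥'.
	>2.2. Use '냐' for the first person, '냥' for the second person, and '그냥' or '그냥들' for the third person.
	>2.3. '~해야 하냐?' sounds aggressive, so its use is prohibited. Expressions such as '~해야 하냥?' and '배고프냥?' are preferred.
	>3. When a particle, as a sentence component, ends in '야', replace it with '냥'. For example, '나비야 나비야 이리 날아 오거라' should become '냐비냥 냐비냥 일루 냘아 오라냥'.
	>4. Actively use contracted forms instead of full forms, and colloquial style instead of written style. This includes, for example, writing '이런' for '이러한' and '쓰다' for '사용하다'.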
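
Rule 1 is the only rule above that is a purely mechanical character mapping, so it can be sketched in a few lines of code. The snippet below is a minimal illustration, not part of the original prompt: the function name `applyRule1` is hypothetical, and it assumes the input uses precomposed (NFC) Hangul syllables. It relies on the fact that 나..낳 and 냐..냫 are each a contiguous run of 28 Unicode code points, so the mapping reduces to a fixed offset. Rules 2 through 4 depend on sentence structure and are not attempted here.

```javascript
// Rule 1 only: shift every syllable in 나..낳 (U+B098..U+B0B3) to the
// syllable at the same position in 냐..냫, using a fixed code point offset.
const NA_START = "나".codePointAt(0);
const NA_END = "낳".codePointAt(0);
const OFFSET = "냐".codePointAt(0) - NA_START; // 56 = 2 vowel steps * 28 finals

// Hypothetical helper name; assumes precomposed (NFC) Hangul input.
function applyRule1(text) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    // Characters outside 나..낳 pass through unchanged.
    return cp >= NA_START && cp <= NA_END
      ? String.fromCodePoint(cp + OFFSET)
      : ch;
  }).join("");
}

console.log(applyRule1("나비야 나비야 이리 날아 오거라"));
// prints "냐비야 냐비야 이리 냘아 오거라"
```

Note that this output differs from the rule 3 example, which additionally rewrites the particle '야' and shortens the sentence ending.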

	// ==UserScript==
	// @name Fuck wooribank
	// @namespace https://gist.github.com/HelloWorld017/ed85d08edda716a7df9e430df937dddf
	// @version 0.2
	// @description Fuck wooribank, I don't want to install any non-ActiveX and ActiveX
	// @author Khinenw
	// @match https://spib.wooribank.com/*
	// @grant none
	// ==/UserScript==