Shuyue Jia (Bruce) SuperBruceJia

Some remarks on Large Language Models

Yoav Goldberg, January 2023

Audience: I assume you have heard of ChatGPT, maybe played with it a little, and were impressed by it (or tried very hard not to be). And that you have also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective on this (and similar) models, and on where we stand with respect to language understanding.

Intro

Around 2014-2017, right within the rise of neural-network based methods for NLP, I was giving a semi-academic-semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labour costs" to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We

pretty_csv/pretty_tsv

These are some simple bash functions and scripts for making CSV/TSV files prettier on the command line

see http://stefaanlippens.net/pretty-csv.html for more information.
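The bash functions themselves are not shown in this preview. As a rough illustration of the general idea (aligning delimited columns for reading in a terminal), here is a minimal Python sketch. It is not the linked implementation: the function name `pretty_csv`, the two-space column gap, and the padding of short rows are all assumptions made for the example.

```python
import csv
import io


def pretty_csv(text, delimiter=","):
    """Return delimited text with columns padded to equal width (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ""
    # Pad short rows so every row has the same number of cells.
    n_cols = max(len(row) for row in rows)
    rows = [row + [""] * (n_cols - len(row)) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(n_cols)]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(pretty_csv("name,score\nalice,10\nbob,7"))
    # For TSV input, pass delimiter="\t".
```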

	{
	"Dataset": [
	"multimedqa",
	"medmcqa",
	"medqa_4options",
	"mmlu_anatomy",
	"mmlu_clinical_knowledge",
	"mmlu_college_biology",
	"mmlu_college_medicine",
	"mmlu_medical_genetics",

	from transformers import AutoTokenizer, TextGenerationPipeline
	from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
	import logging

	logging.basicConfig(
	format="%(asctime)s %(levelname)s [%(name)s] %(message)s", level=logging.INFO, datefmt="%Y-%m-%d %H:%M:%S"
	)

	"""
	Download https://huggingface.co/liuhaotian/llava-llama-2-13b-chat-lightning-preview to local

	# coding=utf-8
	# Copyright 2023 The HuggingFace Inc. team. All rights reserved.
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.
	# You may obtain a copy of the License at
	#
	# http://www.apache.org/licenses/LICENSE-2.0
	#
	# Unless required by applicable law or agreed to in writing, software

	"""
	inference_openai.py - text generation with OpenAI API

	See https://platform.openai.com/docs/quickstart for more details.

	Usage:
	python inference_openai.py --prompt "The quick brown fox jumps over the lazy dog." --model "gpt-3.5-turbo" --temperature 0.5 --max_tokens 256 --n 1 --stop "."

	Detailed usage:
	python inference_openai.py --help

	#!/usr/bin/env python
	# -*- coding: utf-8 -*-
	from argparse import ArgumentParser

	import torch
	import torch.distributed as dist
	from torch.nn.parallel import DistributedDataParallel as DDP
	from torch.utils.data import DataLoader, Dataset
	from torch.utils.data.distributed import DistributedSampler
	from transformers import BertForMaskedLM

	# first install Pygments (which provides the pygmentize command) on Mac OS X / macOS using the built-in python
	sudo easy_install Pygments

	# then add alias to your ~/.bash_profile or ~/.bashrc or ~/.zshrc etc.
	alias pcat='pygmentize -f terminal256 -O style=native -g'

	""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
	import numpy as np
	import pickle  # Python 3: cPickle was merged into pickle
	import gym

	# hyperparameters
	H = 200 # number of hidden layer neurons
	batch_size = 10 # every how many episodes to do a param update?
	learning_rate = 1e-4
	gamma = 0.99 # discount factor for reward
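The preview stops at the hyperparameters, so the rest of the script is not shown here. To make the role of `gamma` concrete, the following is a minimal sketch of how a discount factor is commonly applied to a sequence of per-step rewards. The function name `discount_rewards` and the plain-list input are assumptions for illustration, and the sketch leaves out any game-specific handling the original script may contain.

```python
gamma = 0.99  # discount factor for reward, as in the hyperparameters above


def discount_rewards(rewards, gamma=gamma):
    """Return discounted returns G_t = r_t + gamma * G_{t+1} (hypothetical helper)."""
    discounted = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the discounted future reward.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        discounted[t] = running
    return discounted


if __name__ == "__main__":
    # A reward arriving only at the last step is worth less the earlier you look.
    print(discount_rewards([0.0, 0.0, 1.0], gamma=0.5))
```

With `gamma` close to 1, rewards far in the future still count almost fully; with smaller values, the agent is credited mostly for near-term outcomes.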