Audience: I assume you have heard of ChatGPT, maybe played with it a little, and were impressed by it (or tried very hard not to be). And that you have also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective of my thoughts on this (and similar) models, and where we stand with respect to language understanding.
Around 2014-2017, right at the rise of neural-network-based methods for NLP, I was giving a semi-academic, semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labour costs", to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We have come a long way since then.
''' Script for downloading all GLUE data.
Note: for legal reasons, we are unable to host MRPC.
You can either use the version hosted by the SentEval team, which is already tokenized,
or you can download the original data from (https://download.microsoft.com/download/D/4/6/D46FF87A-F6B9-4252-AA8B-3604ED519838/MSRParaphraseCorpus.msi) and extract the data from it manually.
For Windows users, you can run the .msi file. For Mac and Linux users, consider an external library such as 'cabextract' (see below for an example).
You should then rename and place specific files in a folder (see below for an example).
'''
mkdir MRPC
cabextract MSRParaphraseCorpus.msi -d MRPC
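The rename-and-place step mentioned above can be scripted as well. A minimal sketch, assuming cabextract unpacked the text files into an intermediate subdirectory of MRPC/ (the exact extracted layout may differ, so the glob patterns below are an assumption):

import glob
import shutil

# Assumption: the .txt files landed in a subdirectory of MRPC/; adjust the
# patterns if your archive unpacks differently.
for name in ("msr_paraphrase_train.txt", "msr_paraphrase_test.txt"):
    for src in glob.glob(f"MRPC/*/{name}"):
        shutil.move(src, f"MRPC/{name}")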
- Use nvidia-smi to check current GPU memory usage, and watch -n 1 nvidia-smi to monitor memory usage every second.
- Often, extra Python processes can stay running in the background, maintaining a hold on the GPU memory, even if nvidia-smi doesn't show it.
- This is probably due to running Keras in a notebook and then re-running the cell that starts the processes, since this forks the current process, which has a hold on GPU memory. In the future, restart the kernel first, and stop all processes before exiting (even though they are daemons and should stop automatically when the parent process ends).
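On the "stop all processes before exiting" point, here is a minimal sketch, assuming the background work is started via Python's multiprocessing module; the worker bookkeeping is illustrative, not part of the original notes:

import atexit
import multiprocessing as mp

# Illustrative registry of background workers (an assumption, not from the notes).
workers = []

def start_worker(target, *args):
    # daemon=True so workers die with the parent, but we still stop them explicitly.
    p = mp.Process(target=target, args=args, daemon=True)
    p.start()
    workers.append(p)
    return p

def stop_all_workers():
    # Terminate and reap every live worker so nothing keeps a hold on GPU memory.
    for p in workers:
        if p.is_alive():
            p.terminate()
            p.join()

# Safety net: run the cleanup on normal interpreter exit as well.
atexit.register(stop_all_workers)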
class AttentionLSTM(LSTM):
    """LSTM with attention mechanism.

    This is an LSTM incorporating an attention mechanism into its hidden states.
    Currently, the context vector calculated from the attended vector is fed
    into the model's internal states, closely following the model by Xu et al.
    (2016, Sec. 3.1.2), using a soft attention model following
    Bahdanau et al. (2014).

    The layer expects two inputs instead of the usual one: the standard
    sequence input, plus the vector to attend over.
    """
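For intuition, here is a minimal NumPy sketch of the Bahdanau-style soft attention the docstring refers to: an alignment score per attended timestep, a softmax over the scores, and a weighted sum producing the context vector. The weight names (W_a, U_a, v_a) are illustrative, not the actual parameters of this class:

import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of scores.
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(h_t, annotations, W_a, U_a, v_a):
    """Compute a context vector from a hidden state and attended vectors.

    h_t:         (d_h,)   current hidden state
    annotations: (T, d_a) attended vectors, one per timestep
    """
    # Alignment score per timestep: e_i = v_a . tanh(W_a h_t + U_a a_i)
    scores = np.array([v_a @ np.tanh(W_a @ h_t + U_a @ a) for a in annotations])
    alpha = softmax(scores)        # attention weights over timesteps
    context = alpha @ annotations  # weighted sum of the attended vectors
    return context, alpha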
import numpy as np
import matplotlib.pyplot as plt

# Helper function to plot a decision boundary.
# If you don't fully understand this function don't worry, it just generates the contour plot below.
# Note: it relies on the dataset (X, y) being defined in the enclosing scope.
def plot_decision_boundary(pred_func):
    # Set min and max values and give it some padding
    x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
    y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
    h = 0.01
    # Generate a grid of points with distance h between them
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict the function value for the whole grid
    Z = pred_func(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the contour and the training examples
    plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral)
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Spectral)
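A hedged usage sketch: assuming some fitted classifier clf with a predict method (e.g. from scikit-learn), trained on the same (X, y) the helper reads:

# clf is a placeholder for any fitted classifier with a .predict method.
plot_decision_boundary(lambda x: clf.predict(x))
plt.title("Decision boundary")
plt.show()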
' Build the folder hierarchy for c:\programdata\, creating each level if it is missing.
Set fso = CreateObject("Scripting.FileSystemObject")
sep = "\"
getpath = Split("c:\programdata\", sep)
For i = 1 To UBound(getpath)
    ' Append the next path component and create that level if it doesn't exist yet.
    path = path & sep & getpath(i)
    If Not fso.FolderExists(getpath(0) & path) Then
        fso.CreateFolder(getpath(0) & path)
    End If
Next
On Error Resume Next
strComputer = "."
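For comparison, the same create-each-missing-level logic is a single call in Python, since os.makedirs creates all intermediate directories (the path is the one from the script above):

import os

# exist_ok=True makes this a no-op for levels that already exist,
# mirroring the FolderExists check in the VBScript above.
os.makedirs(r"c:\programdata", exist_ok=True)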
#!/usr/bin/env python3
import argparse

import lxml.html


class RSS:
    """Holds the URL of a page to be turned into an RSS feed."""

    def __init__(self, url):
        # Fail fast on an empty URL rather than on a later network call.
        assert url != "", "url must be non-empty"
        self.url = url
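The class is truncated here. As one hedged sketch of where it might go, the lxml.html import suggests fetching the page and extracting elements to build feed items; the URL handling and the XPath below are assumptions, not the original code:

import urllib.request

import lxml.html

def fetch_titles(url):
    # Download the page and parse it with lxml.html.
    with urllib.request.urlopen(url) as resp:
        doc = lxml.html.fromstring(resp.read())
    # Illustrative XPath: collect <h2> headings as candidate item titles.
    return [h.text_content().strip() for h in doc.xpath("//h2")]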