Audience: I assume you have heard of ChatGPT, maybe played with it a little, and were impressed by it (or tried very hard not to be). And that you have also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective of my thoughts on this (and similar) models, and where we stand with respect to language understanding.
Around 2014-2017, right at the rise of neural-network-based methods for NLP, I was giving a semi-academic, semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labour costs", to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We have come a long way since then.
''' Script for downloading all GLUE data.
Note: for legal reasons, we are unable to host MRPC.
You can either use the version hosted by the SentEval team, which is already tokenized,
or you can download the original data from (https://download.microsoft.com/download/D/4/6/D46FF87A-F6B9-4252-AA8B-3604ED519838/MSRParaphraseCorpus.msi) and extract the data from it manually.
For Windows users, you can run the .msi file. For Mac and Linux users, consider an external library such as 'cabextract' (see below for an example).
You should then rename and place specific files in a folder (see below for an example).
'''
mkdir MRPC
cabextract MSRParaphraseCorpus.msi -d MRPC
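The rename-and-place step mentioned above can be scripted as well. A minimal sketch, assuming cabextract unpacked the text files into an intermediate subdirectory of MRPC/ (the exact extracted layout may differ, so the glob patterns below are an assumption):

import glob
import shutil

# Assumption: the .txt files landed in a subdirectory of MRPC/; adjust the
# patterns if your archive unpacks differently.
for name in ("msr_paraphrase_train.txt", "msr_paraphrase_test.txt"):
    for src in glob.glob(f"MRPC/*/{name}"):
        shutil.move(src, f"MRPC/{name}")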
- Use nvidia-smi to check current GPU memory usage, and watch -n 1 nvidia-smi to monitor memory usage every second.
- Often, extra Python processes can stay running in the background, maintaining a hold on the GPU memory, even if nvidia-smi doesn't show it.
- This is probably due to running Keras in a notebook and then re-running the cell that starts the processes, since this forks the current process, which has a hold on GPU memory. In the future, restart the kernel first, and stop all processes before exiting (even though they are daemons and should stop automatically when the parent process ends).
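On the "stop all processes before exiting" point, here is a minimal sketch, assuming the background work is started via Python's multiprocessing module; the worker bookkeeping is illustrative, not part of the original notes:

import atexit
import multiprocessing as mp

# Illustrative registry of background workers (an assumption, not from the notes).
workers = []

def start_worker(target, *args):
    # daemon=True so workers die with the parent, but we still stop them explicitly.
    p = mp.Process(target=target, args=args, daemon=True)
    p.start()
    workers.append(p)
    return p

def stop_all_workers():
    # Terminate and reap every live worker so nothing keeps a hold on GPU memory.
    for p in workers:
        if p.is_alive():
            p.terminate()
            p.join()

# Safety net: run the cleanup on normal interpreter exit as well.
atexit.register(stop_all_workers)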
class AttentionLSTM(LSTM):
    """LSTM with attention mechanism.

    This is an LSTM incorporating an attention mechanism into its hidden states.
    Currently, the context vector calculated from the attended vector is fed
    into the model's internal states, closely following the model by Xu et al.
    (2016, Sec. 3.1.2), using a soft attention model following
    Bahdanau et al. (2014).

    The layer expects two inputs instead of the usual one: the standard
    sequence input, plus the vector to attend over.
    """
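For intuition, here is a minimal NumPy sketch of the Bahdanau-style soft attention the docstring refers to: an alignment score per attended timestep, a softmax over the scores, and a weighted sum producing the context vector. The weight names (W_a, U_a, v_a) are illustrative, not the actual parameters of this class:

import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of scores.
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(h_t, annotations, W_a, U_a, v_a):
    """Compute a context vector from a hidden state and attended vectors.

    h_t:         (d_h,)   current hidden state
    annotations: (T, d_a) attended vectors, one per timestep
    """
    # Alignment score per timestep: e_i = v_a . tanh(W_a h_t + U_a a_i)
    scores = np.array([v_a @ np.tanh(W_a @ h_t + U_a @ a) for a in annotations])
    alpha = softmax(scores)        # attention weights over timesteps
    context = alpha @ annotations  # weighted sum of the attended vectors
    return context, alpha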
import numpy as np
import matplotlib.pyplot as plt

# Helper function to plot a decision boundary.
# If you don't fully understand this function don't worry, it just generates the contour plot below.
# Note: it relies on the dataset (X, y) being defined in the enclosing scope.
def plot_decision_boundary(pred_func):
    # Set min and max values and give it some padding
    x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
    y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
    h = 0.01
    # Generate a grid of points with distance h between them
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict the function value for the whole grid
    Z = pred_func(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the contour and the training examples
    plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral)
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Spectral)
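A hedged usage sketch: assuming some fitted classifier clf with a predict method (e.g. from scikit-learn), trained on the same (X, y) the helper reads:

# clf is a placeholder for any fitted classifier with a .predict method.
plot_decision_boundary(lambda x: clf.predict(x))
plt.title("Decision boundary")
plt.show()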
' Build the folder hierarchy for c:\programdata\, creating each level if it is missing.
Set fso = CreateObject("Scripting.FileSystemObject")
sep = "\"
getpath = Split("c:\programdata\", sep)
For i = 1 To UBound(getpath)
    ' Append the next path component and create that level if it doesn't exist yet.
    path = path & sep & getpath(i)
    If Not fso.FolderExists(getpath(0) & path) Then
        fso.CreateFolder(getpath(0) & path)
    End If
Next
On Error Resume Next
strComputer = "."
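For comparison, the same create-each-missing-level logic is a single call in Python, since os.makedirs creates all intermediate directories (the path is the one from the script above):

import os

# exist_ok=True makes this a no-op for levels that already exist,
# mirroring the FolderExists check in the VBScript above.
os.makedirs(r"c:\programdata", exist_ok=True)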
#!/usr/bin/env python3
import argparse

import lxml.html


class RSS:
    """Holds the URL of a page to be turned into an RSS feed."""

    def __init__(self, url):
        # Fail fast on an empty URL rather than on a later network call.
        assert url != "", "url must be non-empty"
        self.url = url
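The class is truncated here. As one hedged sketch of where it might go, the lxml.html import suggests fetching the page and extracting elements to build feed items; the URL handling and the XPath below are assumptions, not the original code:

import urllib.request

import lxml.html

def fetch_titles(url):
    # Download the page and parse it with lxml.html.
    with urllib.request.urlopen(url) as resp:
        doc = lxml.html.fromstring(resp.read())
    # Illustrative XPath: collect <h2> headings as candidate item titles.
    return [h.text_content().strip() for h in doc.xpath("//h2")]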