Skip to content

Instantly share code, notes, and snippets.

@mahadirz
Last active March 15, 2023 06:52
Show Gist options
  • Save mahadirz/6953b9924f14dd4f13bb4583d90187f1 to your computer and use it in GitHub Desktop.
Save mahadirz/6953b9924f14dd4f13bb4583d90187f1 to your computer and use it in GitHub Desktop.
ChatGPT-4 Live Streaming subtitle from https://www.youtube.com/watch?v=outcGtbnMuQ
foreign
did the gpd4 developer demo live stream
honestly it's kind of hard for me to
believe that this day is here open AI
has been building this technology really
since we started the company but for the
past two years we've been really focused
on delivering gpt4
that started with rebuilding our entire
training stack actually training the
model
and then seeing what it was capable of
trying to figure out its capabilities
its risks working with Partners in order
to test it in real world scenarios
really tuning Its Behavior optimizing
the model getting it available
so that you can use it and so today our
goal is to show you a little bit of how
to make gbto4 shine
how to really get the most out of it you
know where it's kind of you know
weaknesses are where we're still working
on it and just how to really use it as a
good tool a good partner
um so if you're interested in
participating in the Stream uh that if
you go to our Discord so that's
discord.gg openai there's comments in
there and we'll take a couple of
audience suggestions
so the first thing I want to show you is
the first task that gpd4 could do that
we never really got 3.5 to do
and the way to think about this is all
throughout training that you know you're
constantly doing all this work it's 2
A.M the pager goes off you fix the model
and you're always wondering is it gonna
work
is all this effort actually going to pan
out and so we all had a pet task that we
really liked and that we would all
individually be trying to see is the
model capable of it now
and I'm going to show you the first one
that we had a success for four but never
really got there for 3.5
so I'm just going to copy the top of our
blog post from today going to paste it
into our Playground now this is our new
chat completions playground that came
out two weeks ago I'm going to show you
first with GPT 3.5 4 has the same API to
it the same playground
the way that it works is you have a
system message where you explain to the
model what it's supposed to do and we've
made these models very steerable so you
can provide it with really any
instruction you want whatever you dream
up and the model will adhere to it
pretty well and in the future it will
get increasingly increasingly powerful
at steering the model very reliably
you can then paste whatever you want as
a user the model will return messages as
an assistant and the way to think of it
is that we're moving away from sort of
just raw text in raw text out where you
can't tell where different parts of the
conversation come from but towards this
much more structured format that gives
the model the opportunity to know well
this is the user asking me to do
something that the developer didn't
attend I should listen to the developer
here
all right so now time to actually show
you the task that I'm referring to so
everyone's familiar with summarize
this let's say article into a sentence
okay getting a little more specific uh
but where every word begins with G
so this is 3.5 let's see what it does
yeah it kind of didn't even try
just gave up on the task this is pretty
typical for 3.5 trying to do this
particular kind of task if it's you know
sort of a very kind of stilted article
or something like that maybe it can
succeed but for the most part 3.5 just
gives up
but let's try the exact same prompt
the exact same system message
in gbt4
so kind of borderline whether you want
to count AI or not but so let's say AI
doesn't count
that's cheating
so fair enough the model happily accepts
my feedback
so now to make sure it's not just good
for G's I'd like to turn this over to
the audience I'll take a suggestion on
what letter to try next in the meanwhile
while I'm waiting for our moderators to
pick the lucky lucky letter I will give
a try with a
um but in this case I'll say gpd4 is
fine
why not
also pretty good summary
so I'll hop over to our Discord
all right
wow if people are are being a little
ambitious here I'm really trying to put
the model through the paces we're going
to try Q uh which if you think about
this for a moment I want the audience to
really think about how would you do a
summary of this article that all starts
with Q it's not easy
it's pretty good that's pretty good
all right so I've shown you summarizing
an existing article I want to show you
how you can flexibly combine ideas
between different articles so I'm going
to take this article that was on Hacker
News yesterday
copy paste it
into the same conversation so it has all
the context of what we're just doing I'm
going to say find one common theme
between this article and the gpd4 blog
so this is an article about Pinecone
which is a python web app development
framework and it's making the technology
more accessible user friendly if you
don't think that was insightful enough
you can always give some feedback and
say that was not insightful
enough
please no I'll just even just leave it
there leave it up to the model to decide
so Bridging the Gap between powerful
technology and practical applications
seems not bad and of course you can ask
for any other kind of task you want
using its flexible language
understanding and synthesis you can ask
for something like
now turn the GT4 blog post into a
rhyming poem
picked up on open AI evalues open source
for all helping to guide answering the
call which by the way if you'd like to
contribute to this model please give us
evals we have an open source evaluation
framework that will help us guide and
all of our users understand what the
model is capable of and to take it to
the next level
so there we go this is consuming
existing content using gpt4 with with a
little bit of creativity on top
but next I want to show you how to build
with gpt4 what it's like to create with
it as a partner
and so the thing we're going to do
is we're going to actually build a
Discord bot
I'll build it live and show you the
process show you debugging show you what
the model can do where its limitations
are and how to work with with them in
order to sort of achieve New Heights so
the first thing I'll do is tell the
model that this time it's supposed to be
an AI programming assistant
its job is to write things out in
pseudocode first and then actually write
the code and this approach is very
helpful so that the model break down the
problem into smaller pieces and then
that way you're not kind of asking it to
just come up with a super hard solution
to a problem all in one go
it also makes it very interpretable
because you can see exactly what the
model was thinking and you can even
provide Corrections if you'd like
so here is the prompt that we're going
to ask it this is the kind of thing that
3.5 would totally choke on if you've
tried anything like it but so we're
going to ask for a Discord bot that uses
the gpd4 API to read images and texts
now there's one problem here which is
this model's training cutoff is in 2021
which means it has not seen our new chat
completions format so I literally just
went to the blog post from two weeks ago
copy pasted from the blog post including
the response format it has not seen the
new image extension to that and so I
just kind of wrote that up and you know
just
very minimal detail about how to include
images so and now the model can actually
leverage the doc that documentation that
it did not have memorized that it does
not know
okay
and in general these models are very
good at using information that it's been
trained on in new ways and synthesizing
new content and you can see that right
here that it actually wrote an entirely
new bot
now let's
actually see if this bot is going to
work in practice so you should always
look through the code to get a sense of
what it does don't run untrusted code
from humans or from AIS
and one thing to note is that the
Discord API has changed a lot over time
and particularly that there's one
feature that has changed a lot since
this model was trained
give it a try in fact yes we are missing
the intense keyword this is something
that came out in 2020
. so the model does know it exists but
it doesn't know which version of the
Discord API we're using so are we out of
luck well not quite we can just simply
paste to the model exactly the error
message not even going to say hey this
is from running your code could you
please fix it
we'll just let it run
and the model says oh yeah whoops the
intense argument here's the correct
here's the correct code
now let's give this a try once again
kind of making sure that we understand
what the code is doing
now a second issue that can come up is
it doesn't know what environment I'm
running in and if you notice it says hey
here's this inscrutable error message
which if you've not used jupyter
notebook a lot with async IO before you
probably have no idea what this means
but fortunately
once again you can just sort of say to
the model hey
I am using Jupiter
and would like to make this work
can you fix it
and the specific problem is that there's
already an event Loop running so you
need to use this Nest async i o Library
you need to call Net Nest I sync IO dot
apply the model knows all of this
correctly instantiates all of these
these pieces into the bot it even helps
hopefully tells you oh you're running in
Jupiter well you can do this bang pip
install in order to install the package
if you don't already have it that was
very helpful
so now we'll run and it looks like
something happened
so the first thing I'll do
is
go over to our Discord
and I will paste in
a screenshot
of our Discord itself so remember gpt4
is not just a language model it's also a
vision model in fact it can flexibly
accept inputs that intersperse images
and text arbitrarily kind of like a
document now the image feature is in
preview so this is going to be a little
sneak peek it's not yet publicly
available it's something we're working
with one partner called be my eyes in
order to really start to develop it and
get it ready for prime time
but you can ask anything you like for
example I can't you know I'll say gp4
hello world
can you describe this image
and painstaking detail
all right which first of all think of
how you would do this yourself there's a
lot of different things you could latch
onto a lot of different pieces of the
system you could describe and we can go
over to the actual code and we can see
that yep we in fact received the message
have formatted an appropriate request
for our API
and now we wait
um because you know one of the things we
have to do is we have to make the system
faster that's one of the things that
we're working on optimizing in the
meanwhile I just want to say to the
audience that's watching we'll take an
audience request next so if you have an
image and a task you'd like to
accomplish please submit that to the
Discord our moderators will pick one
that will run
so we can see that the Discord oh it
looks like we have a response perfect
so it's a screenshot of a Discord
application interface pretty good did
not even describe it it knows that it's
Discord it's probably Discord written
there somewhere where it just kind of
knows this from from prior experience
server icon label gpd4 describes the
interface in great detail talks about uh
all the people telling me that I'm
supposed to do Q uh very very kind
audience
and describes a much of the uh the
notification messages and the users that
are in the channel and so there you go
that's some that's some pretty good
understanding now this next one if you
notice first of all we got a post but
the model did not actually see the
message so is this a failure of the
model or of the system around the model
well we can take a look
and if you notice here content is an
empty string we received a blank message
contents
the reason for this is a dirty trick
that we played on the AI
so if you go to the Discord
documentation
and you scroll through it all the way
down to uh I can see it hard for me to
even find honestly to the message
content
intent you'll see this was added as of
September 2022 as a required field so in
order to receive a message that does not
explicitly tag you you now have to
include this new intent in your code
remember I said intensive change a lot
over time this is much newer than the
model as possible is possibly able to
know so maybe we're out of luck we have
to debug this by hand but once again we
can try to use gpd4's language
understanding capabilities
to solve this now keep in mind this is a
document of like I think this is like
ten thousand fifteen thousand words
something like that it's not formatted
very well this is literally a command a
copy paste like this is what it's
supposed to parse through to find in the
middle of that document that oh yeah
message contents that's required now but
let's see if it can do it
so we will ask for I I am receiving
blank message contents
can you
why could this be happening
how do I fix it
so one thing that's new about gpd4 is
context length
32 000 tokens is kind of the upper limit
that we support right now and the model
is able to flexibly use long documents
it's something we're still optimizing so
we recommend trying it out but not
necessarily sort of really really
scaling it up just yet unless you have
an application that really benefits from
it so if you're really interested in
Long context please let us know we want
to see what kinds of applications it
unlocks but if you see
it says oh yeah message content intent
was not enabled and so you can either
ask the model to write some code for you
or you could
I actually just you know do it the
old-fashioned way
either way is fine
I think this is a augmenting tool makes
you much more productive but it's still
important that you are in the driver's
seat and are the manager and knows
what's what's going on so now we're
connected once again
and uh Boris would you like to rerun the
message
once again we can see that we have
received it even though the bot was not
explicitly tagged
seems like a pretty good
pretty good description interesting this
is an interesting image actually looks
like it's a dolly generated one and
let's actually try this one as well
so what's funny about this image oh it's
already been submitted
so once again we can verify this making
the right API calls
squirrels do typically eat nuts we don't
expect them to use a camera or act like
a human so I think that's that's a
pretty good explanation of why that
image is funny
so I'm going to show you one more
example of what you can do with this
model
so I have here a nice hand-drawn mock-up
of a joke website definitely worthy of
being put up on my refrigerator
so I'm just going to take out my phone
literally take a photo
of this mock-up
and then I'm going to send it
to our Discord
all right going to send it to our
Discord
and this is of course the rockiest part
making sure that we actually send it to
the right Channel
which in fact I think maybe I did not
sent it to the wrong Channel
it's funny it's always the uh the sort
of non-ai parts of these demos that are
the hardest part to do
and here we go
technology is now solved
and now we wait
so the thing that's amazing in my mind
is that
what's going on here is we're talking to
a neural network
and this neural network was trained to
predict what comes next right it played
this like this game of sort of being
shown a partial document and then
predicted what comes next across an
unimaginably large amount of content and
from there it learns all of these skills
that you can apply and all these very
flexible ways and so we can actually
take now this output so literally we
just said to
output the HTML from that picture
and here we go
actual working JavaScript
filled in the jokes
for comparison
this was the original
of our mock-up
and so there you go going from
hand-drawn
beautiful art
if I do say so myself to working website
and this is all just potential right we
you can see lots of different
applications we ourselves are still
figuring out new ways to use this so
we're going to work with our partner
we're going to scale up from there but
please be patient because it's going to
take us some time to really make this
available for everyone
so I have one last thing to show you
I've shown you reading existing content
I've shown you how to
build with the system as a partner the
last thing I'm going to show
is how to work with the system to
accomplish a task that none of us like
to do but we all have to
so you may have guessed the thing we're
going to do is taxes
now note that GPT is not a certified tax
professional nor am I so you should
always check with your your Tax Advisor
but it can be helpful to understand some
dense content to just be able to empower
yourself to to be able to sort of solve
problems and get a get a handle on
what's Happening when you could not
otherwise so once again I'll do a system
message in this case I'm going to tell
it that it's tax GPT which is not a
specific thing that we've trained into
this model you can be very creative if
you want with the system message to
really get the model in the mood of what
is your job what are you supposed to do
so I pasted in
the tax code this is about 16 Pages
worth of of tax code and there's this
question about Allison Bob they got
married at one point uh and that here
are their their incomes and they take a
standard deduction they're filing
jointly so first question what is their
standard deduction for 2018
. so while the model is chugging I'm
going to solve this problem by hand to
show you what's involved so the standard
deduction is the basic standard
deduction plus the additional the basic
one is 200 percent for a joint return of
subparagraph C which is here okay so
additional doesn't apply the limitation
doesn't apply
um okay now these apply oh wait special
rules for taxable year 2018 which is the
one we care about through 2025 you have
to substitute twelve thousand for three
thousand so two hundred percent of
twelve thousand twenty four thousand is
the final answer
if you notice the model got to the same
conclusion
and you can actually read through its
explanation
and to tell you the truth the first time
I tried to approach this problem myself
I could not figure it out I spent half
an hour reading through the tax code
trying to figure out this like back
reference and why there's some program
like just what's even going on it was
only by asking the model to spell out
its reasoning and then I followed along
that I was like oh I get it now I
understand how this works and so that I
think is where the power of the system
lies it's not perfect but neither are
you and together is this amplifying tool
that lets you just reach New Heights
and you can go further you can say okay
now calculate their total liability
and here we go it's doing the
calculation
honestly I every time it does it it's
just it's amazing this model is so good
at Mental Math it's way way better than
I am at Mental Math it's not hooked up
to a calculator like that's another way
that you could really try to enhance
these systems but it has these raw
capabilities that are so flexible it
doesn't care if it's code it doesn't
care if it's language it doesn't care if
it's tax all of these capabilities in
one system that can be applied
towards the problem that you care about
towards your application towards
whatever you build
and so to end it the final thing that I
will show is I a little other dose of
creativity which is now summarize this
problem into a rhyming poem
and there we go a beautiful beautiful
poem about doing your taxes so thank you
everyone for tuning in I hope you
learned something about what the model
can do how to work with it and honestly
we're just really excited to see what
you're going to build I I've talked
about openai evals please contribute we
think that this model improving it bring
it to the next level is something that
everyone can contribute to and that we
think it can really benefit a lot of
people and we want your help to do that
so thank you very much we're so excited
to see what you're going to build
foreign
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment