Last active
March 15, 2023 06:52
-
-
Save mahadirz/6953b9924f14dd4f13bb4583d90187f1 to your computer and use it in GitHub Desktop.
ChatGPT-4 Live Streaming subtitle from https://www.youtube.com/watch?v=outcGtbnMuQ
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
foreign | |
did the gpd4 developer demo live stream | |
honestly it's kind of hard for me to | |
believe that this day is here open AI | |
has been building this technology really | |
since we started the company but for the | |
past two years we've been really focused | |
on delivering gpt4 | |
that started with rebuilding our entire | |
training stack actually training the | |
model | |
and then seeing what it was capable of | |
trying to figure out its capabilities | |
its risks working with Partners in order | |
to test it in real world scenarios | |
really tuning Its Behavior optimizing | |
the model getting it available | |
so that you can use it and so today our | |
goal is to show you a little bit of how | |
to make gbto4 shine | |
how to really get the most out of it you | |
know where it's kind of you know | |
weaknesses are where we're still working | |
on it and just how to really use it as a | |
good tool a good partner | |
um so if you're interested in | |
participating in the Stream uh that if | |
you go to our Discord so that's | |
discord.gg openai there's comments in | |
there and we'll take a couple of | |
audience suggestions | |
so the first thing I want to show you is | |
the first task that gpd4 could do that | |
we never really got 3.5 to do | |
and the way to think about this is all | |
throughout training that you know you're | |
constantly doing all this work it's 2 | |
A.M the pager goes off you fix the model | |
and you're always wondering is it gonna | |
work | |
is all this effort actually going to pan | |
out and so we all had a pet task that we | |
really liked and that we would all | |
individually be trying to see is the | |
model capable of it now | |
and I'm going to show you the first one | |
that we had a success for four but never | |
really got there for 3.5 | |
so I'm just going to copy the top of our | |
blog post from today going to paste it | |
into our Playground now this is our new | |
chat completions playground that came | |
out two weeks ago I'm going to show you | |
first with GPT 3.5 4 has the same API to | |
it the same playground | |
the way that it works is you have a | |
system message where you explain to the | |
model what it's supposed to do and we've | |
made these models very steerable so you | |
can provide it with really any | |
instruction you want whatever you dream | |
up and the model will adhere to it | |
pretty well and in the future it will | |
get increasingly increasingly powerful | |
at steering the model very reliably | |
you can then paste whatever you want as | |
a user the model will return messages as | |
an assistant and the way to think of it | |
is that we're moving away from sort of | |
just raw text in raw text out where you | |
can't tell where different parts of the | |
conversation come from but towards this | |
much more structured format that gives | |
the model the opportunity to know well | |
this is the user asking me to do | |
something that the developer didn't | |
attend I should listen to the developer | |
here | |
all right so now time to actually show | |
you the task that I'm referring to so | |
everyone's familiar with summarize | |
this let's say article into a sentence | |
okay getting a little more specific uh | |
but where every word begins with G | |
so this is 3.5 let's see what it does | |
yeah it kind of didn't even try | |
just gave up on the task this is pretty | |
typical for 3.5 trying to do this | |
particular kind of task if it's you know | |
sort of a very kind of stilted article | |
or something like that maybe it can | |
succeed but for the most part 3.5 just | |
gives up | |
but let's try the exact same prompt | |
the exact same system message | |
in gbt4 | |
so kind of borderline whether you want | |
to count AI or not but so let's say AI | |
doesn't count | |
that's cheating | |
so fair enough the model happily accepts | |
my feedback | |
so now to make sure it's not just good | |
for G's I'd like to turn this over to | |
the audience I'll take a suggestion on | |
what letter to try next in the meanwhile | |
while I'm waiting for our moderators to | |
pick the lucky lucky letter I will give | |
a try with a | |
um but in this case I'll say gpd4 is | |
fine | |
why not | |
also pretty good summary | |
so I'll hop over to our Discord | |
all right | |
wow if people are are being a little | |
ambitious here I'm really trying to put | |
the model through the paces we're going | |
to try Q uh which if you think about | |
this for a moment I want the audience to | |
really think about how would you do a | |
summary of this article that all starts | |
with Q it's not easy | |
it's pretty good that's pretty good | |
all right so I've shown you summarizing | |
an existing article I want to show you | |
how you can flexibly combine ideas | |
between different articles so I'm going | |
to take this article that was on Hacker | |
News yesterday | |
copy paste it | |
into the same conversation so it has all | |
the context of what we're just doing I'm | |
going to say find one common theme | |
between this article and the gpd4 blog | |
so this is an article about Pinecone | |
which is a python web app development | |
framework and it's making the technology | |
more accessible user friendly if you | |
don't think that was insightful enough | |
you can always give some feedback and | |
say that was not insightful | |
enough | |
please no I'll just even just leave it | |
there leave it up to the model to decide | |
so Bridging the Gap between powerful | |
technology and practical applications | |
seems not bad and of course you can ask | |
for any other kind of task you want | |
using its flexible language | |
understanding and synthesis you can ask | |
for something like | |
now turn the GT4 blog post into a | |
rhyming poem | |
picked up on open AI evalues open source | |
for all helping to guide answering the | |
call which by the way if you'd like to | |
contribute to this model please give us | |
evals we have an open source evaluation | |
framework that will help us guide and | |
all of our users understand what the | |
model is capable of and to take it to | |
the next level | |
so there we go this is consuming | |
existing content using gpt4 with with a | |
little bit of creativity on top | |
but next I want to show you how to build | |
with gpt4 what it's like to create with | |
it as a partner | |
and so the thing we're going to do | |
is we're going to actually build a | |
Discord bot | |
I'll build it live and show you the | |
process show you debugging show you what | |
the model can do where its limitations | |
are and how to work with with them in | |
order to sort of achieve New Heights so | |
the first thing I'll do is tell the | |
model that this time it's supposed to be | |
an AI programming assistant | |
its job is to write things out in | |
pseudocode first and then actually write | |
the code and this approach is very | |
helpful so that the model break down the | |
problem into smaller pieces and then | |
that way you're not kind of asking it to | |
just come up with a super hard solution | |
to a problem all in one go | |
it also makes it very interpretable | |
because you can see exactly what the | |
model was thinking and you can even | |
provide Corrections if you'd like | |
so here is the prompt that we're going | |
to ask it this is the kind of thing that | |
3.5 would totally choke on if you've | |
tried anything like it but so we're | |
going to ask for a Discord bot that uses | |
the gpd4 API to read images and texts | |
now there's one problem here which is | |
this model's training cutoff is in 2021 | |
which means it has not seen our new chat | |
completions format so I literally just | |
went to the blog post from two weeks ago | |
copy pasted from the blog post including | |
the response format it has not seen the | |
new image extension to that and so I | |
just kind of wrote that up and you know | |
just | |
very minimal detail about how to include | |
images so and now the model can actually | |
leverage the doc that documentation that | |
it did not have memorized that it does | |
not know | |
okay | |
and in general these models are very | |
good at using information that it's been | |
trained on in new ways and synthesizing | |
new content and you can see that right | |
here that it actually wrote an entirely | |
new bot | |
now let's | |
actually see if this bot is going to | |
work in practice so you should always | |
look through the code to get a sense of | |
what it does don't run untrusted code | |
from humans or from AIS | |
and one thing to note is that the | |
Discord API has changed a lot over time | |
and particularly that there's one | |
feature that has changed a lot since | |
this model was trained | |
give it a try in fact yes we are missing | |
the intense keyword this is something | |
that came out in 2020 | |
. so the model does know it exists but | |
it doesn't know which version of the | |
Discord API we're using so are we out of | |
luck well not quite we can just simply | |
paste to the model exactly the error | |
message not even going to say hey this | |
is from running your code could you | |
please fix it | |
we'll just let it run | |
and the model says oh yeah whoops the | |
intense argument here's the correct | |
here's the correct code | |
now let's give this a try once again | |
kind of making sure that we understand | |
what the code is doing | |
now a second issue that can come up is | |
it doesn't know what environment I'm | |
running in and if you notice it says hey | |
here's this inscrutable error message | |
which if you've not used jupyter | |
notebook a lot with async IO before you | |
probably have no idea what this means | |
but fortunately | |
once again you can just sort of say to | |
the model hey | |
I am using Jupiter | |
and would like to make this work | |
can you fix it | |
and the specific problem is that there's | |
already an event Loop running so you | |
need to use this Nest async i o Library | |
you need to call Net Nest I sync IO dot | |
apply the model knows all of this | |
correctly instantiates all of these | |
these pieces into the bot it even helps | |
hopefully tells you oh you're running in | |
Jupiter well you can do this bang pip | |
install in order to install the package | |
if you don't already have it that was | |
very helpful | |
so now we'll run and it looks like | |
something happened | |
so the first thing I'll do | |
is | |
go over to our Discord | |
and I will paste in | |
a screenshot | |
of our Discord itself so remember gpt4 | |
is not just a language model it's also a | |
vision model in fact it can flexibly | |
accept inputs that intersperse images | |
and text arbitrarily kind of like a | |
document now the image feature is in | |
preview so this is going to be a little | |
sneak peek it's not yet publicly | |
available it's something we're working | |
with one partner called be my eyes in | |
order to really start to develop it and | |
get it ready for prime time | |
but you can ask anything you like for | |
example I can't you know I'll say gp4 | |
hello world | |
can you describe this image | |
and painstaking detail | |
all right which first of all think of | |
how you would do this yourself there's a | |
lot of different things you could latch | |
onto a lot of different pieces of the | |
system you could describe and we can go | |
over to the actual code and we can see | |
that yep we in fact received the message | |
have formatted an appropriate request | |
for our API | |
and now we wait | |
um because you know one of the things we | |
have to do is we have to make the system | |
faster that's one of the things that | |
we're working on optimizing in the | |
meanwhile I just want to say to the | |
audience that's watching we'll take an | |
audience request next so if you have an | |
image and a task you'd like to | |
accomplish please submit that to the | |
Discord our moderators will pick one | |
that will run | |
so we can see that the Discord oh it | |
looks like we have a response perfect | |
so it's a screenshot of a Discord | |
application interface pretty good did | |
not even describe it it knows that it's | |
Discord it's probably Discord written | |
there somewhere where it just kind of | |
knows this from from prior experience | |
server icon label gpd4 describes the | |
interface in great detail talks about uh | |
all the people telling me that I'm | |
supposed to do Q uh very very kind | |
audience | |
and describes a much of the uh the | |
notification messages and the users that | |
are in the channel and so there you go | |
that's some that's some pretty good | |
understanding now this next one if you | |
notice first of all we got a post but | |
the model did not actually see the | |
message so is this a failure of the | |
model or of the system around the model | |
well we can take a look | |
and if you notice here content is an | |
empty string we received a blank message | |
contents | |
the reason for this is a dirty trick | |
that we played on the AI | |
so if you go to the Discord | |
documentation | |
and you scroll through it all the way | |
down to uh I can see it hard for me to | |
even find honestly to the message | |
content | |
intent you'll see this was added as of | |
September 2022 as a required field so in | |
order to receive a message that does not | |
explicitly tag you you now have to | |
include this new intent in your code | |
remember I said intensive change a lot | |
over time this is much newer than the | |
model as possible is possibly able to | |
know so maybe we're out of luck we have | |
to debug this by hand but once again we | |
can try to use gpd4's language | |
understanding capabilities | |
to solve this now keep in mind this is a | |
document of like I think this is like | |
ten thousand fifteen thousand words | |
something like that it's not formatted | |
very well this is literally a command a | |
copy paste like this is what it's | |
supposed to parse through to find in the | |
middle of that document that oh yeah | |
message contents that's required now but | |
let's see if it can do it | |
so we will ask for I I am receiving | |
blank message contents | |
can you | |
why could this be happening | |
how do I fix it | |
so one thing that's new about gpd4 is | |
context length | |
32 000 tokens is kind of the upper limit | |
that we support right now and the model | |
is able to flexibly use long documents | |
it's something we're still optimizing so | |
we recommend trying it out but not | |
necessarily sort of really really | |
scaling it up just yet unless you have | |
an application that really benefits from | |
it so if you're really interested in | |
Long context please let us know we want | |
to see what kinds of applications it | |
unlocks but if you see | |
it says oh yeah message content intent | |
was not enabled and so you can either | |
ask the model to write some code for you | |
or you could | |
I actually just you know do it the | |
old-fashioned way | |
either way is fine | |
I think this is a augmenting tool makes | |
you much more productive but it's still | |
important that you are in the driver's | |
seat and are the manager and knows | |
what's what's going on so now we're | |
connected once again | |
and uh Boris would you like to rerun the | |
message | |
once again we can see that we have | |
received it even though the bot was not | |
explicitly tagged | |
seems like a pretty good | |
pretty good description interesting this | |
is an interesting image actually looks | |
like it's a dolly generated one and | |
let's actually try this one as well | |
so what's funny about this image oh it's | |
already been submitted | |
so once again we can verify this making | |
the right API calls | |
squirrels do typically eat nuts we don't | |
expect them to use a camera or act like | |
a human so I think that's that's a | |
pretty good explanation of why that | |
image is funny | |
so I'm going to show you one more | |
example of what you can do with this | |
model | |
so I have here a nice hand-drawn mock-up | |
of a joke website definitely worthy of | |
being put up on my refrigerator | |
so I'm just going to take out my phone | |
literally take a photo | |
of this mock-up | |
and then I'm going to send it | |
to our Discord | |
all right going to send it to our | |
Discord | |
and this is of course the rockiest part | |
making sure that we actually send it to | |
the right Channel | |
which in fact I think maybe I did not | |
sent it to the wrong Channel | |
it's funny it's always the uh the sort | |
of non-ai parts of these demos that are | |
the hardest part to do | |
and here we go | |
technology is now solved | |
and now we wait | |
so the thing that's amazing in my mind | |
is that | |
what's going on here is we're talking to | |
a neural network | |
and this neural network was trained to | |
predict what comes next right it played | |
this like this game of sort of being | |
shown a partial document and then | |
predicted what comes next across an | |
unimaginably large amount of content and | |
from there it learns all of these skills | |
that you can apply and all these very | |
flexible ways and so we can actually | |
take now this output so literally we | |
just said to | |
output the HTML from that picture | |
and here we go | |
actual working JavaScript | |
filled in the jokes | |
for comparison | |
this was the original | |
of our mock-up | |
and so there you go going from | |
hand-drawn | |
beautiful art | |
if I do say so myself to working website | |
and this is all just potential right we | |
you can see lots of different | |
applications we ourselves are still | |
figuring out new ways to use this so | |
we're going to work with our partner | |
we're going to scale up from there but | |
please be patient because it's going to | |
take us some time to really make this | |
available for everyone | |
so I have one last thing to show you | |
I've shown you reading existing content | |
I've shown you how to | |
build with the system as a partner the | |
last thing I'm going to show | |
is how to work with the system to | |
accomplish a task that none of us like | |
to do but we all have to | |
so you may have guessed the thing we're | |
going to do is taxes | |
now note that GPT is not a certified tax | |
professional nor am I so you should | |
always check with your your Tax Advisor | |
but it can be helpful to understand some | |
dense content to just be able to empower | |
yourself to to be able to sort of solve | |
problems and get a get a handle on | |
what's Happening when you could not | |
otherwise so once again I'll do a system | |
message in this case I'm going to tell | |
it that it's tax GPT which is not a | |
specific thing that we've trained into | |
this model you can be very creative if | |
you want with the system message to | |
really get the model in the mood of what | |
is your job what are you supposed to do | |
so I pasted in | |
the tax code this is about 16 Pages | |
worth of of tax code and there's this | |
question about Allison Bob they got | |
married at one point uh and that here | |
are their their incomes and they take a | |
standard deduction they're filing | |
jointly so first question what is their | |
standard deduction for 2018 | |
. so while the model is chugging I'm | |
going to solve this problem by hand to | |
show you what's involved so the standard | |
deduction is the basic standard | |
deduction plus the additional the basic | |
one is 200 percent for a joint return of | |
subparagraph C which is here okay so | |
additional doesn't apply the limitation | |
doesn't apply | |
um okay now these apply oh wait special | |
rules for taxable year 2018 which is the | |
one we care about through 2025 you have | |
to substitute twelve thousand for three | |
thousand so two hundred percent of | |
twelve thousand twenty four thousand is | |
the final answer | |
if you notice the model got to the same | |
conclusion | |
and you can actually read through its | |
explanation | |
and to tell you the truth the first time | |
I tried to approach this problem myself | |
I could not figure it out I spent half | |
an hour reading through the tax code | |
trying to figure out this like back | |
reference and why there's some program | |
like just what's even going on it was | |
only by asking the model to spell out | |
its reasoning and then I followed along | |
that I was like oh I get it now I | |
understand how this works and so that I | |
think is where the power of the system | |
lies it's not perfect but neither are | |
you and together is this amplifying tool | |
that lets you just reach New Heights | |
and you can go further you can say okay | |
now calculate their total liability | |
and here we go it's doing the | |
calculation | |
honestly I every time it does it it's | |
just it's amazing this model is so good | |
at Mental Math it's way way better than | |
I am at Mental Math it's not hooked up | |
to a calculator like that's another way | |
that you could really try to enhance | |
these systems but it has these raw | |
capabilities that are so flexible it | |
doesn't care if it's code it doesn't | |
care if it's language it doesn't care if | |
it's tax all of these capabilities in | |
one system that can be applied | |
towards the problem that you care about | |
towards your application towards | |
whatever you build | |
and so to end it the final thing that I | |
will show is I a little other dose of | |
creativity which is now summarize this | |
problem into a rhyming poem | |
and there we go a beautiful beautiful | |
poem about doing your taxes so thank you | |
everyone for tuning in I hope you | |
learned something about what the model | |
can do how to work with it and honestly | |
we're just really excited to see what | |
you're going to build I I've talked | |
about openai evals please contribute we | |
think that this model improving it bring | |
it to the next level is something that | |
everyone can contribute to and that we | |
think it can really benefit a lot of | |
people and we want your help to do that | |
so thank you very much we're so excited | |
to see what you're going to build | |
foreign |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment