
@ek
Created July 31, 2024 20:26
A crash course in AI with Mike Taylor - Full Episode S5 Ep7

Key Points from AI Discussion with Mike Taylor

YouTube recording

Evolution of AI and Deep Learning 00:29

  • Deep learning uses neural networks loosely modeled on the human brain
  • More data and compute power leads to predictable improvements in AI capabilities
  • NVIDIA GPUs provide massive parallel processing power, crucial for AI development

Major AI Players and Models 27:00

  • OpenAI: GPT-3, GPT-4, DALL-E
  • Anthropic: Claude
  • Google: Gemini
  • Meta: LLaMA
  • Midjourney, Stable Diffusion (for image generation)
  • Emerging video generation models: Sora, Luma AI

How Large Language Models (LLMs) Work 13:56

  • Trained on vast amounts of internet data
  • Learn to predict next words/tokens, developing understanding of language and world knowledge
  • Newer models becoming "multimodal" - handling text, images, audio inputs/outputs

Prompt Engineering Principles 58:03

  1. Give direction (role-playing, context)
  2. Specify format
  3. Provide examples
  4. Evaluate quality (run multiple times)
  5. Divide labor (break complex tasks into steps)
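The five principles can be sketched as a prompt template. This is a minimal illustration of how the pieces slot together; the role, format instruction, and task below are hypothetical examples, not from the episode, and nothing here calls a real API:

```python
# Hypothetical prompt template illustrating the five principles above.
# It only assembles prompt text; sending it to a model is up to you.

def build_prompt(task: str, examples: list[str]) -> str:
    direction = "You are a senior marketing copywriter."        # 1. give direction
    fmt = "Respond with a numbered list of exactly 3 options."  # 2. specify format
    shots = "\n".join(f"Example: {e}" for e in examples)        # 3. provide examples
    return f"{direction}\n{fmt}\n{shots}\nTask: {task}"

prompt = build_prompt(
    "Write taglines for a budgeting app.",
    ["Save smarter, not harder."],
)
print(prompt)

# 4. Evaluate quality: run the prompt several times and score the outputs.
# 5. Divide labor: chain smaller prompts (brainstorm -> shortlist -> polish)
#    instead of one giant prompt.
```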

AI Applications and Considerations 39:45

  • Can automate many tasks previously done by humans
  • Potential for bias in evaluation - humans may underestimate AI capabilities
  • Useful for tasks that would be impractical/impossible to do manually
  • Privacy concerns when using sensitive data with AI models

Future of AI 33:32

  • Rapid improvements expected to continue
  • Integration of different modalities (text, image, video, audio)
  • Increasing accessibility and ease of use for non-technical users

Book: "Prompt Engineering for Generative AI" 1:08:45

  • Focuses on enduring principles rather than specific techniques
  • Aims to provide foundational knowledge for working with AI
  • Extensive editing and review process to ensure quality

Transcript

Speaker 0 (00:00 - 00:04): Welcome to the show for the 3rd time. Mike is becoming,

Speaker 1 (00:04 - 00:07): yes, trying to try to make myself a regular guest

Speaker 0 (00:08 - 00:20): back by popular demand. But, yeah, look, today we're exploring AI. I don't think there's been a conference I've been to in the last 2 or 3 years that hasn't mentioned AI in some respect.

Speaker 1 (00:20 - 00:20): Mhmm. You know,

Speaker 0 (00:20 - 00:32): it's really funny, you watch those on-stage speeches by, like, you know, Google and the like, and it's like every single day. There's a meme going around with somebody counting it. It's like AI, AI.

Speaker 1 (00:34 - 00:34): Yeah. So

Speaker 0 (00:34 - 00:47): they're all like, God. So I'm a bit conscious of, like, doing an episode on AI, because I think it's got a lot of buzz already. I've started using it the last couple of years. You had just written a book that's about to come out in June, I believe. I'm

Speaker 1 (00:47 - 00:49): talking to you right now. Weeks away.

Speaker 0 (00:49 - 00:50): Speaking circuit.

Speaker 1 (00:51 - 01:12): Yeah. I did, like, a book signing at the LeadDev conference and have a couple of events coming up as well, which is fun. So, yeah, I kinda feel like I'm finally famous, even though my rock band failed when I was in high school. I'm now signing AI books, which is, you know, great for my ego.

Speaker 0 (01:12 - 01:51): A rock star of other types now. Perhaps a rock star in the AI world. But yeah. Look. I've been following you, like, the last couple of years, and you've been doing the speaking circuit.

Who better to talk to than you? I mean, last time we talked more about attribution and everything, which you were, at that time, pretty deep into doing a project on. So we've covered that. But yeah. Look.

I just think there's a lot of hype around this. Everyone I meet has trouble using it, and it's like they'll put something in and they go, oh, it doesn't work, which is very easy to do. And I know we've talked about this, but can we just wind back to the history? Because I think this new wave of AI, because, you know, AI has been around for a while, but I think, you know, Geoffrey Hinton. What's his name?

Speaker 1 (01:52 - 01:53): Oh, Jeff Hinton. Yeah.

Speaker 0 (01:53 - 02:28): Yeah. Yeah. His name. I think he sort of, like, started the new neural net wave, really, which then has really compounded and grown in a nonlinear fashion and improved, like, so much in just such a short space of time. Just yesterday, I messaged you after using Luma.

They released their new, what is it? Dream something? This text-to-video. So it's gone from, like, text to images to now video, which is pretty nuts, and, you know, it's not too bad. So take it back.

Like, where has this new wave come from? Where did it start? And we can go from there.

Speaker 1 (02:29 - 05:42): Yeah. So there's a lot of gatekeeping in this industry, just like in an industry like marketing. So it depends who you talk to, but I would say there's, like, a relatively big leap in terms of performance from traditional machine learning or statistics to what we call today, like, AI, like, generative AI. And, to explain the difference, essentially, you know, when we first had computers, you'd have to build a computer specifically to do that one thing. Right?

Like, if you've seen The Imitation Game, you have Alan Turing, and they built a computer that specifically calculated this one equation. Right? And it was the size of a room. Mhmm. And, eventually, we got to the point where we had general purpose computers, and we got them small enough that you could have one yourself.

And, with a normal computer, you're writing an algorithm, or, more specifically, the programmer's writing an algorithm, a series of steps that says, do this, then do this, then do this, and then your company accounts are filed. And machine learning is this kind of, you know, it's the newer thing, where, you know, over the past, say, 20 years or so, it's really risen to prominence, where the programmer doesn't write the rules anymore. You know, the machine learns the rules based on patterns in the data, and the machine is, you know, testing the rules. If those rules work, then they're kept. If they don't work, then they're not kept.

And over enough iterations, it finally figures out a whole host of millions of these rules that, you know, would otherwise be completely unintelligible to a human. Now, where deep learning comes in, which is kind of interesting, is that they try to model how the human brain writes those rules. So, you know, they have artificial neurons, right, and neural networks, just like the brain has neural networks. So it's not a perfect simulation of a human brain, but it's, like, surprisingly effective considering how, you know, how poor it is compared to how our brain does it. Right?

So, what they found, and I think this is, like, the real breakthrough, is, this is back in the day when deep learning wasn't very popular. Most people thought it wouldn't work. Geoffrey Hinton and then a few of the people that he then kind of, like, taught and mentored made a few breakthroughs, and they figured out that, hey, if you use this neural network architecture, it can handle arbitrarily large amounts of data. And it seems like the more data you throw at it, the better it gets.

Like, and it's a pretty predictable curve. So we've just been on that curve. And, you know, everyone in the AI labs noticed that curve, and they just went, if we can get more GPUs, then we can, you know, just follow that curve. And, predictably, it's just gonna go from, you know, high school level to, like, college level to, you know, eventually better than any human at any task. Right?

And they're on the curve, they think. Okay.
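The "machine tests rules and keeps the ones that work" idea above can be sketched in a few lines. This is a toy search over threshold rules on made-up data, not a neural network; it just shows the keep-or-discard loop:

```python
import random

# Toy illustration of "the machine learns the rules": propose random
# threshold rules on labeled data, test each one, and keep a candidate
# only if it scores better than the best so far. Data is made up.

data = [(0.1, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]

def accuracy(threshold):
    """Fraction of examples the rule 'predict 1 if x > threshold' gets right."""
    return sum((x > threshold) == bool(label) for x, label in data) / len(data)

random.seed(0)
best_rule, best_score = 0.0, accuracy(0.0)
for _ in range(1000):
    candidate = random.random()     # propose a random rule
    score = accuracy(candidate)     # test it against the data
    if score > best_score:          # keep it only if it works better
        best_rule, best_score = candidate, score

print(best_rule, best_score)  # the kept rule separates the two classes
```

A real model does this over millions of parameters with gradients instead of random search, but the feedback loop is the same shape: test against data, keep what works.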

Speaker 0 (05:42 - 05:43): Let's just stop there. So this

Speaker 1 (05:43 - 05:47): is yeah. There's a lot to unpack there, but, yeah, that's kind of the story of where we're at today.

Speaker 0 (05:48 - 06:06): I like it, because you used this word compute, which is like the new oil. And I think it explains why NVIDIA is worth so much: because they're controlling a lot of the world's compute power. And compute just means, like, the power of a CPU or some kind of chip to, like, do these calculations and do this learning

Speaker 1 (06:07 - 06:08): Exactly. Yeah. Yeah.

Speaker 0 (06:08 - 06:09): As quickly as possible. Correct?

Speaker 1 (06:09 - 08:16): Or, yeah. Yeah. And what NVIDIA figured out really early on, and they did it for graphics cards initially, was if you were a gamer back in the day, you would buy an NVIDIA chip for your computer, because then it could process all these kinds of renderings, and you could play video games without lag. Now what they realized, and the analogy I use for this is like a checkout line in a supermarket.

Right? A normal computer is like having one checkout line, so there's a massive queue of people waiting to check out. Right? So every time the computer is running an instruction, that's a person waiting to check out, and it processes them one at a time.

Sure. Well, an NVIDIA GPU gives you more staff, essentially. Right? So, you know, you have maybe 10 lines, 16 lines, a 100 lines, a million lines. And, you know, that means that more people can check out at the same time.

So you can run more computations at the same time. And that turns out to be, you know, really useful. It's surprisingly useful. And, actually, NVIDIA were pretty early pioneers like that. A lot of the early breakthroughs from Geoffrey Hinton and the people that he mentored came about using NVIDIA GPUs.

If you want to listen to the Acquired podcast, there's a really interesting backstory. They dove really deep into it. They did, like, a 3-part series, which I really recommend.

The reason why they got into this really early on is that, like, some scientist was talking to his son, who had an NVIDIA GPU to play games. And he was like, the specs on this are better than what I have in my lab. So he went out and bought a bunch of GPUs from Best Buy on his credit card, and they helped him finish his research, like, 4 months early. So he even contacted NVIDIA and said, like, look. This is amazing.

Thank you so much. You helped my research. And I think that kind of early signal meant that, you know, Jensen, the CEO, massively doubled down on scientific computing before anyone else was really thinking about that. And now he owns the whole thing, which is making NVIDIA so valuable.
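The supermarket-checkout analogy reduces to simple arithmetic: with many lanes working in parallel, the same queue clears much faster. A minimal sketch, with made-up numbers purely for illustration:

```python
import math

# Toy model of the checkout-line analogy: total time to serve a queue
# drops as you add lanes, just as a GPU's many cores finish a batch of
# identical computations faster than a CPU's few cores.

def time_to_serve(customers: int, lanes: int, minutes_per_customer: float = 1.0) -> float:
    """Each lane serves one customer at a time; lanes work in parallel."""
    return math.ceil(customers / lanes) * minutes_per_customer

print(time_to_serve(1000, 1))     # one CPU-like lane: 1000.0 minutes
print(time_to_serve(1000, 16))    # a few cores: 63.0 minutes
print(time_to_serve(1000, 1000))  # GPU-like parallelism: 1.0 minute
```

This is why the workload matters: the speedup only appears when the "customers" (computations) are independent enough to be served in parallel, which is exactly the case for neural network math.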

Speaker 0 (08:16 - 09:08): Yeah. It's insane, because in the first wave, NVIDIA, like, I knew them from gaming, way back when the first ones came out, and we were like, oh my god, I can play 3D renderings. And what was happening at that time is that the CPU was constrained by processing power, so you needed a GPU to do all this crazy next-level graphics. So they've been doing image rendering for yonks, right? So obviously they have a step up in knowing how to specialize, with everything designed around just that. And the question comes up, like, why can't you just use a couple of CPUs instead? And it's because these are specialized; there are layers of stuff built into them that are just specialized for that one thing, right? Yeah.

Like, I just find it really interesting that that curve of compute power is basically brute force. It's like strength of compute that allowed us to get to this next level in the AI race. Yeah.

Speaker 1 (09:08 - 09:51): Yeah. It's an engineering breakthrough rather than, really, that many scientific breakthroughs. I mean, the transformer, you would call that, like, a scientific breakthrough. There's a famous paper, "Attention Is All You Need". But, you know, there have been scientific breakthroughs along the way, but, really, it's a story of, hey,

let's just throw more GPUs at it. Right? Which is why Silicon Valley is doing so well. You know? Like, they're just great at that.

You know? Everyone else is thinking, okay, what am I gonna use all this compute for?

Whereas a lot of the guys in Silicon Valley are already talking about, when are we gonna build the first $100 billion cluster of GPUs? When are we gonna build the first $1 trillion cluster of GPUs? So, you know, they're thinking, you know, 5, 10

Speaker 0 (09:51 - 09:51): years ahead.

Speaker 1 (09:52 - 09:54): Yeah. It's, they're pretty far ahead of everyone. Yeah.

Speaker 0 (09:55 - 10:12): And that's why you're probably seeing some geopolitical issues here, with, like, neon gas out of Ukraine, and the whole chip production being a bit of an important resource that wants to be protected at all costs. And then the whole Taiwan thing, they wanna get, what is it? SMRC or what's the Yeah.

Speaker 1 (10:12 - 10:13): T t s m c.

Speaker 0 (10:13 - 10:15): T r c. T s m c or something?

Speaker 1 (10:15 - 10:16): Yeah. Yeah. Yeah. Yeah.

Speaker 0 (10:16 - 10:48): They're like now, t TSMC. Funnily, I have the story that, he was Taiwanese, guy in America, and he couldn't get enough funding and they didn't wanna, like, create ships there. So he had to go back to Taiwan, do it, and now it's, like, become such a big thing. And ironically, it come full circle. So now the US is like, hey.

Build these, manufacturing facilities back in the US because we need them. They were, like, at the core of everything this next wave. And now it's basically a fight between China, and and the US in terms of who's gonna get AI first and control sort of the whole world's power, which is really interesting.

Speaker 1 (10:48 - 11:02): Well, yeah. Yeah. I mean, the lesson from history is that America America always, you know, does the right thing eventually. It just takes them a while. And when they when they do when they do decide to do something, they catch up very quick.

You know?

Speaker 0 (11:02 - 11:21): They do. Yeah. Yeah. It's like a strategic resource. There's always some kind of war or, like, a forceful acquisition because it's like we need that and we'll pay whatever.

Yeah. Yes. So I think that's explaining some of the geopolitical issues right now, but, I think the first wave in video just to backtrack a little, with their GPUs was, for Bitcoin mining and and

Speaker 1 (11:21 - 11:24): Yeah. Yeah. Yeah. With with weirdly there. Yeah.

Speaker 0 (11:24 - 11:26): And that's the next

Speaker 1 (11:26 - 11:48): one. Yeah. Weirdly, they, you know, gaming PCs is one thing, and then, they started to to sell out GPUs for crypto mining. And then, when crypto crashed, they had all this excess capacity just at the right time and and feel like, the early pioneers in AI probably massively benefited from that. Right?

But it means that they were so far ahead of everybody else in in, in production, right, because they had already scaled up for crypto.

Speaker 0 (11:49 - 12:31): Yeah. Okay. So we've hit this new wave. The compute power is coming, NVIDIA is, like, rolling in it, and then we've got the first, I would say, there's been a lot of people working on these projects, like university things, private AI things. I know there was a big wave where, like, every startup was AI, and then sort of that died off a bit, because they found out how hard it was.

And then, you know, sort of SaaS took over a bit, and that was big. And now I find, what was it, 2019 or something, that GPT sort of 3 came out, or 2, or whatever. Like, can you run us through, like, the next wave? Because now it's becoming sort of accessible to consumers, whereas before it was very much, initially, sort of a B2B thing or, like, in research labs. So when did that start, and who started

Speaker 1 (12:32 - 13:29): it? Yeah. So, GPT 2 is when I started to hear about AI being, like, worth looking into. And, and then, you know, I was I was I was at the time I was leaving my agency. You know, I I left, in in 2020, and and, you know, that year, I got access to CopyAI was the first kind of experience I had with GPT 3.

They had been building it on GPT 2, which could bear the string some sentences together, and was open source. GPT 3 was the first one that was closed source. The OpenAI was like, oh, we've got something here. And, and they started to, you know, sell API access and things. So, those guys got API access really quickly.

I got access to that, and I was like, I remember I was on holiday, and my wife works in journalism. And I was showing her, and she was like, her reaction was not like, wow. This is cool. She was like, oh, this is terrifying. It's like, I've gotta see, this, like, what am I gonna do in 5 years, you know, if this keeps getting better?

Speaker 0 (13:29 - 13:55): There's also, like, Jarvis dotai. I remember using that early as well. It was like Copy AI and then Jarvis, and then there was a couple of other iterations, but a very quick succession. And so this is, because this comes up a lot like what is an LLM, and there's different sort of, there's open source ones now that are a bit like smaller, sort of models. And then for the bigger ones, you've got to pay money because it costs more.

So, like, can you explain the difference there? Because there's a couple of lots of models going around. What's the difference between these different ones?

Speaker 1 (13:56 - 15:16): Yeah. So because we're on this curve, the models basically get, like, twice as good and half half the cost, in every 6 months. Right? So, so the models that, you know, are coming out today that, run are small enough to run-in your phone, like the new one that Apple announced, you know, they they are already better than, you know, the the best possible model was 2 years ago. You know?

So so so things that were, like, state of the art 2 years ago, essentially, you can assume that, like, it'll be small enough and cheap enough to run-in your phone, you know, a few years later. So, that's kind of the curve that we're on. And, you know, so GPT 3, I remember I used I used a lot and, you know, started to automate a lot of my work. And then when gpt 3.5 came out, which was basically the same model but really trained to behave, trained to be a bit more useful as, like, a an assistant rather than, you know, just a raw model, then, that was what powered chat gpt. And then and then we're off to the races because that was the first major consumer application where it wasn't just an API that us nerds were playing around with, you know, people could program.

Yeah. Like, suddenly people who could you know, there was it was a a model that actually did a good job, that, like, regular people could try. So, you know, they quickly got to a 100,000,000 users very quickly. Yeah. Yeah.

Yeah.
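The "twice as good at half the cost every 6 months" rule of thumb compounds quickly. A minimal sketch of that compounding (the rate itself is the speaker's rough estimate, not a measured constant, and the starting units are arbitrary):

```python
# Compounding the speaker's rule of thumb: capability doubles and cost
# halves every 6 months. Starting values are illustrative units, not data.

def after(months: int, start_capability: float = 1.0, start_cost: float = 1.0):
    periods = months / 6                       # number of 6-month doublings
    capability = start_capability * 2 ** periods
    cost = start_cost * 0.5 ** periods
    return capability, cost

cap, cost = after(24)     # two years later
print(cap, cost)          # 16.0 0.0625 -> 16x better at 1/16th the cost
```

This is why yesterday's state-of-the-art model fitting on a phone a couple of years later is exactly what the curve predicts.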

Speaker 0 (15:16 - 16:29): Those are nuts. And I remember, prior to that, because, yeah, I also had access. Right? Like, API access. I remember that that's sort of what killed really social media because suddenly, a lot of the tech people or people who had access to the stuff could create, like, you know, a 100 viral Twitter tweets that I could then queue into the system and know that'd be pretty successful, you know, and then you had these sort of, these content niches like Tapilio and, like, tweet hunter and stuff, you know, aggregating the stuff and then you can kind of do iterations of it at a push of a button, like basically automate your social media.

I remember, people using that and then all these other users going why are they so popular? Like, how do they, you know, tweak so many successful things in a row? I'm like, yeah. Well, there's there's some tricks happening in the background. You know what I mean?

Like, I remember that happening and then that really killed social media algorithms And they really had to, they really had to put, the quality filter parameters really high, which has killed all the reach, organic reach. Right? But I remember, like, them having to counteract that force because it was just so good, and there was, like, this this separation between the 2, you know, people. Okay. So ChargePD comes out, and then it becomes consumer facing.

I was like, oh my god. This is so good. I can kinda do anything. Yeah. What is an LLM, though, before we sort of go to the next thing, which is GPT 4 or whatever?

Speaker 1 (16:29 - 19:13): Yeah. It's an LLM. Yeah. So so the yeah. There's there's 2 2 major kind of, infrastructures that that are being used.

Right? So, when you talk about, text models, that's a large language model, LLM. You also have diffusion models like, Midjourney, DALLE, stable diffusion. They're the ones that generate images. And, and actually everything's kind of converging on LLMs because the newest version of DALI is supposedly, it's a diffusion model like embedded within an LLM.

Right? Same thing for stable diffusion, the newest model of that. So so everything's kind of converging. So what is an LLM? It's really just a, an architecture, that they've figured out.

They can throw a lot of data at, and it just, you know, can take it. And, and and the the way the way it works is, you know, they chopped up the whole Internet, like, basically scraped the entire Internet. They, you know, the the the the the problem with, these models is, like, how do you know whether it's doing a good job or not? Right? Like, how do you how do you teach it, oh, that you did this well, you did this badly.

It's too expensive to do that for for early training, because you'd have to hire millions of humans to teach it. Right? So what the trick they they pulled was, hey, if we just like take every sentence on the on the Internet and we we just chop off, say, like the last word, and we and it's done a good job if it predicts the last word. And, this is like a really elegant solution because now you don't need to really do any you don't have to have a human review that data. Right?

Because, you know, it can it can automatically, you know, tell whether it's done a good job because you know what that last word actually was. Right? So, now, the really interesting thing that happened, and this was the unexpected part, is that just by telling it to predict that last word, it it it basically had to learn a lot about how the world works. Right? So it seems very simple, like, oh, it just predicts the last word, like, you know, LeBron James, threw a and then it would be like, you know, a bowl.

Right? Because he's a he's a basketball player. Right? That's a pretty easy prediction. But, in order to make that prediction, you have to know who LeBron James is.

You have to know he's a basketball player. You know, you have to kind of know a little bit about how the world works, in order to make that prediction. So, you know, by giving us a quite simple task, and you do that, you know, 100 of 1,000,000,000,000,000 of times, then, then it it just reverse engineers, physics, language, you know, it can translate between any, you know, between any language, your data structures, it can learn how to code. You know, it's it's kind of crazy how many, different things it's already capable of just from that one trick.
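The "chop off the last word" trick is self-supervised: the hidden word itself is the training label, so no human annotation is needed. A toy sketch of that idea, using word-pair counts on a made-up corpus instead of a neural network:

```python
from collections import Counter, defaultdict

# Self-supervised next-word prediction in miniature. A real LLM learns
# billions of parameters; here we just count which word most often
# follows each word. The corpus is made up for illustration.

corpus = [
    "lebron james threw a ball",
    "the pitcher threw a ball",
    "she threw a party",
    "he threw a ball",
]

follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1          # learn from every adjacent word pair

def predict_next(word: str) -> str:
    """Most frequent continuation seen in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("a"))        # 'ball' -- seen 3 times vs 'party' once
print(predict_next("threw"))    # 'a'
```

Even this tiny counter has to encode a sliver of "world knowledge" (balls get thrown more often than parties in this corpus); scale the same objective up by trillions of tokens and much richer models, and that's the surprising part the episode describes.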

Speaker 0 (19:13 - 19:18): Okay. Okay. So what's the diffusion model then? Because you said there's LMs and then diffusion model. So what's the difference?

Speaker 1 (19:19 - 19:55): Yeah. So diffusion models are used for image generation, and, they're they're smaller, and, they I don't like, they they're they're training on lots of images, but the, the way that they, they operate is slightly different, a similar type of concept. But what they do is because because obviously there's no words in an image, right, so they have pixels. So what they do is they say okay. If you take an image, and then, you add noise to it, so you make it fuzzy.

Right? So you have a high quality image and you make it fuzzy.

Speaker 0 (19:56 - 19:56): Yep.

Speaker 1 (19:56 - 20:30): And then you just see if it can predict the real image from the fuzzy image. So see if it can remove noise Okay. From the image. Right? So that's what it's learned to do.

So, so yeah. So so the, it turns out if you do that enough, you can eventually, get it so good that you can give it random noise and tell it this is an image of, you know, Donald Trump, on the moon, whatever. And, and and it will remove the noise until you know, so it's like, you know, that that famous, Michelangelo Michelangelo.

Speaker 0 (20:30 - 20:35): Same kind of thing instead of the last word. It's like it's like finding the next pixel kinda thing. Right? That's like

Speaker 1 (20:35 - 21:02): Yeah. Exactly. Yeah. Because it has some of the pixels because there's noise. Right?

But, yeah, it's like the Michelangelo, Michelangelo quote. You know, like, I think it was him anyway. You know, when, like, how how do you how do you, how do you carve statues? I take a lump of rock, and then I chop away anything that isn't the statue. You know?

That's kind of like what they're doing with denoising. You know? It's it's like you I just give it random noise. You just say, just chop away anything that isn't, you know, isn't the prompt.
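A minimal sketch of the add-noise-then-remove-it idea. This is not a real diffusion model, which trains a network to predict the noise over many steps; it just shows that even a trivial denoiser, a moving average, can partly recover a signal that noise has degraded. All numbers are illustrative:

```python
import random

# Corrupt a clean 1-D "image" with Gaussian noise, then denoise it with a
# 3-point moving average. A diffusion model learns a far better denoiser
# and applies it step by step, but clean -> noisy -> denoised is the same
# shape of problem.

random.seed(42)
clean = [0.5] * 64                                    # a flat toy signal
noisy = [x + random.gauss(0.0, 0.1) for x in clean]   # forward process: add noise

def moving_average(signal, window=3):
    """Trivial denoiser: each sample becomes the mean of its neighborhood."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        neighborhood = signal[max(0, i - half): i + half + 1]
        out.append(sum(neighborhood) / len(neighborhood))
    return out

denoised = moving_average(noisy)

def mse(a, b):
    """Mean squared error between two signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

print(mse(denoised, clean) < mse(noisy, clean))  # denoising reduced the error
```

Starting from pure random noise and denoising toward "whatever matches the prompt" is the generative trick the episode describes; the averaging filter here stands in for the learned network purely to make the round trip concrete.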

Speaker 0 (21:03 - 22:07): That's funny because I remember doing sculpture in art class. I think it was in high school or whatever, and, I found it really hard because you had to think backwards. You had to think, where's the point closest to the subject and then etch away. So it's basically, like, you know, painting and kind of adding layers and, like, you you're crafting something from inside out. And sculpture was the complete opposite, and I found it really hard because if you etch away too much, like, the the ends up being flat and then everything kind of looks weird you're losing the depth and, yeah, that's why I found it really interesting so completely get it.

Okay. So got diffusion models, got LMs. Now ChatChik d comes out. Everybody's using it. It's, like, $20 a month, whatever it is.

And people then using Dolly. So, can you just tell us all the different brand names out there? Because I think it's really good to get a lay of the land. There's some open source models, meaning they're kind of free and kind of build upon, like, I know Facebook's llama is somewhat open source. And then, there's more ones that are closed source like OpenAI, where you've gotta pay or there's a limit on the usage, that kind of thing.

So what are the big players right now as it stands?

Speaker 1 (22:07 - 23:38): Yeah. Yeah. So it's obviously changing every week, but, but the, the major kind of genesis of this is OpenAI was way ahead of everyone for a very long time, and they're really the only player in town. And then and that's that has remained true mostly in LLMs, actually. Like, it's only really recently that they're aware of the, you know, players that were serious enough to challenge them.

So I'll cover that in a sec, but, like, they're really an interesting one was the image space because with the fusion models, because I feel like the innovation happened a little bit quicker there. And and it is, like, you know, probably what's gonna happen with LLMs, and it's gonna play out the same way. So in the image space, OpenAI released DALL E. This was in 2022. I tried to get access.

Couldn't couldn't it was on the wait list. I'm not in Silicon Valley inside there. Right? So so I I I signed up to Midjourney, and, and it was kinda crazy because people were like, you know, DALL E, when we saw it, the images were unbelievable. It was like, wow, this is magic.

I can't believe you can type in like a picture of a cat and it generates a cat. Right? Like, this was crazy to me at the time. I thought this is I I really wanna get access to this. Me and Jenny came out, and everyone's like, there's no way that, like, you know, some random bootstrapped startup, you know, I think they're a tea a team of 14 people has made anything close to what OpenAI has made.

Right? Like, nobody expected that this would be possible. And and, the funny thing is, like, they opened their wait list up really quickly. I got access. I think it was made for for take first.

I just looked at it recently.

Speaker 0 (23:38 - 23:41): Because they're on they're on Discord as well.

Speaker 1 (23:41 - 24:21): Exactly. And they had a really weird model, like, they're on discord. So, you know, the only way you could use the model is if you, like, send it a message on discord, which just again made it extra nerdy. Yeah. So they kind of flew under the radar, but, it was one of the reasons why I thought, oh my god, this is gonna be a very big thing is that they very quickly like, I remember, like, every 6 months where they're adding, you know, 5, 6000000 people to that Discord.

So I I think it's well over, like, 15,000,000 people now, which is bigger than, say, the Fortnite Discord. You know? It's bigger than all the gaming communities. It's bigger than the crypto communities. So I was like, okay.

This is definitely a thing. Right?

Speaker 0 (24:21 - 24:21): Yep.

Speaker 1 (24:21 - 26:36): And then and then so long, there was a huge shock that, you know, someone without OpenAI's $1,000,000,000 of funding, could make something as good as as DALL E, but, they actually iterated quicker and became better than DALL E for a long time. And then the really surprising thing Mhmm. Was stable diffusion came out. And this was a real shock because, not only was it, like, as good as Midjourney and and DALL E at the time, but it was open source. It was free.

Right? I could download it and run it on my computer. Right? And and, I think that that's what really blew open a lot of interest. That's when all the, you know, people on Reddit were were playing around with these things.

Obviously, inevitably, people started making porn with it, you know, which is where a lot of the innovation came from, you know, which the the industry doesn't talk about that much. You know, and and, and and it just because it was open source, it just got better a lot quicker. It had a lot more features. People invented new techniques. So, in in a lot of ways, it's kind of instructive because we haven't had that stable diffusion moment yet in LLMs.

You know, in LLMs, you have OpenAI still has the lead. You know, Anthropic is the nearest competitor, like the midjourney of, you know, of of of, of the LLM space. You know, smaller team, you know, more personality, you know, less kind of corporate overload vibes, and, you know, not aligned with Microsoft. You know, so, so they were kind of almost there, and we have Llama 3, which is open source, but it's not it's not as good as, you know, Claude or which is their anthropics one, or or it's not it's not as good as Chatibiti. GBT.

Right? So, you know, Zuckerberg is, you know, is kind of the savior of the open source community. Like, people are praying, worshiping him, because, like, he he's, like, currently training a model that's gonna be 400,000,000,000 parameters. The the current one is 70,000,000,000, so it's gonna be much bigger. And, that version of llama might be the stable diffusion model, you know, like, equivalent.

It might be that moment where suddenly we have an open source model. We can run it on our own local computers potentially, you know, in a few years. And, and then we're really off to the races, I think.

Speaker 0 (26:36 - 26:45): Okay. Okay. So just to summarize then, we've got Stable Diffusion, which I know there's some, like, finance guy that's sort of, like, taken over that and, like, you know, there's some weird stuff. Yeah. But the company was

Speaker 1 (26:45 - 27:00): kind of imploded. Yeah. The actual company that pushed it forward has imploded, but, you know, that's the nice thing about open source: it doesn't make a difference, because we've all got it on our computers now, and we're improving it regardless of what the company does.

Speaker 0 (27:00 - 27:19): Okay. So we've got Stable Diffusion. We've got Midjourney. We've got Claude by Anthropic, we've got Llama by Facebook, we've got Yeah. ChatGPT, or GPT-3/4 or whatever, by OpenAI.

Did I miss any? Oh, and then we've got Google Gemini, I think it's called these days, among other names, or

Speaker 1 (27:19 - 27:20): whatever. Gemini. Yeah. They they

Speaker 0 (27:20 - 27:22): what do we make of them? Or not?

Speaker 1 (27:23 - 27:53): Yeah. Yeah. So, another one to to know about so so to be fair to Google, I think they they, you know, they have a lot of, latent, you know, kind of pent up abilities that haven't really been fully deployed yet, because they've been a bit slow to the slow to the mark. But, but, you know, they they have their own chips. They have, like, a separate type of chip that isn't the same as NVIDIA's.

They have, you know, so that's a huge advantage. Like, they're not gonna be constrained like other people might be.

Speaker 0 (27:53 - 27:54): Heaps of data.

Speaker 1 (27:54 - 28:25): Yeah. Heaps of data, heaps of money. So so, like, even though Google is behind, I suspect it won't be for long. Now, yeah, the other one to mention that is pretty interesting is, there's Mistral, which is a French startup, and they've also been open source. Mhmm.

And they made a big splash. Again, very small team. They released an open source model, and it was, like, suddenly almost as good as, like, the worst version of ChatGPT. Like, the I I

Speaker 0 (28:25 - 28:35): heard they just got some extra funding yesterday because they got some big, sponsors like Andreessen and Sequoia. I think they, like, raised that round recently. They just raised, like, 300 or something crazy. Like so they're one of the big European ones.

Speaker 1 (28:36 - 29:01): Yeah. Exactly. Yeah. And and I I feel like there might be I I mean, it's it's great that they exist because one of the biggest dangers of AI, might be what if the EU massively overregulates it. And, the nice thing about, Mistral well, I I'd say that's not necessarily a danger to AI.

It's a danger to the EU because, you know, can you imagine being left out of the Internet, or being left out of the printing

Speaker 0 (29:01 - 29:03): press or something? AI. Right.

Speaker 1 (29:13 - 29:41): You know, think think they're good. You know? So so it's like their their like, their vibes are immaculate. You know? So so so, like, they're actually really well respected and and, you know, they the chances are they'll release, like, a gpt 4 level open source model fairly soon as well.

So I think now the French have this national champion, and that's actually, like, probably one of the more beneficial things that could happen. Because they're doing a good job, maybe that will stay the hand of the EU and, you know, mean that Europe is along for the ride, which is,

Speaker 0 (29:42 - 29:42): it can

Speaker 1 (29:42 - 29:44): be really helpful, I think. Because we don't want it all to be just

Speaker 0 (29:44 - 30:36): dominated by the US and China. You know? No. No. I think it's a bad thing.

So we got, so we need compute power, which is the GPU or CPU kind of power, to train these models. Obviously, we need lots of datasets, which is where this, like, privacy thing comes into play, because and I think that's why Google's behind, because they have, like, more of the world's data than anybody else. Right? But they can't, like, exclusively use it because of, like, previous privacy policies and other things. So I think they have the most potential, but it probably helps explain some of the misses they've had recently, because they're actually trying to protect some of these agreements.

And then, obviously, Facebook has pictures and videos and text, so they're, like, on a winning wicket there with, like, crazy amounts of the world's data as well. So you need compute power, you need the data inputs, and then you train your models, and they get better over time. And you just basically need raw power to make them better. Now we've got text. We've got video.

Yeah.

Speaker 1 (30:36 - 32:11): Actually, you're gonna miss one out. Yeah. Sorry. There was one key component, though, which is the talent, actually. So that, like, kind of adds another dimension to this.

Mhmm. It's really interesting because today, like, as a developer, you know, you can make a lot of mistakes. Right? Like, if I'm building a web app, like, I'm building a website, I can code it. If there's errors, I can code it better.

I can run tests. If it passes the test, whatever. Yeah. That's fine. So coding right now is an iterative loop, and it's trial and error.

And it's very fast, very easy, very low cost relative to, like, you know, the past. Well, because AI is so resource intensive, because these GPUs are rare and expensive, and because of how hard it is to get access to those GPUs, like, you better have a good reason to be using them. Right? Because if you're messing up like, if you run your code and it breaks at the end of the training run, then you've just wasted all those cycles. So I think this is maybe an underappreciated part for people who aren't as technical.

OpenAI has all the talent. So when they do a training run, like, the people on their team just have better intuition for what's gonna work, because they're more experienced and because they're, like, really super ambitious and, you know, they came into this really early on. So they have better intuition about what's gonna work when they run it. And there's only a few places in the world where that type of talent congregates. It's OpenAI.

It's, you know, Facebook, and it's, like, Anthropic, and, you know, those sorts of places. So That's the key.

Speaker 0 (32:11 - 32:55): I saw that their revenue was, like, $3,500,000,000 ARR now, something like that like, making lots of money. So, obviously, some of the talent that's there have, like, shares or employee share options. So, like, there's big buy-in there, which kinda helps explain why perhaps Sam Altman, albeit for all his flaws and, like, Helen Toner's, you know, investigation. Well, she disclosed what actually happened. I was like, oh my god.

Yeah. I was like, he really launched ChatGPT without telling the board. I was like, what the like anyway. So he's obviously got, like, a lot of talent locked in with employee share options, that kind of thing, that vest over time. And then, obviously, there's the interest. They're making lots of money, like, billions in ARR, which is big, which feeds revenue, but they're not That's

Speaker 1 (32:55 - 32:57): why I'd be surprised if they're profitable yet, probably.

Speaker 0 (32:59 - 33:32): Yeah. No. No. No. Of course not.

But I mean, it does help when compared to an open source program where, like, you know, it's free or something. You know? Obviously, you're gonna attract that talent, with quicker cycles, better improvement, better quality. So cool. Okay.

And now we've, like, we've got Kling, that I saw the other day, a Chinese text-to-video thing. We've got Sora, which came out, like I said, 2 months ago or something. That was crazy, which is the video one. And then we've got Luma, that just released their text-to-video one as well. So is that, like, a diffusion model, just in motion?

Like, explain what what these are.

Speaker 1 (33:32 - 34:00): Yeah. So, you know, we did text, we did images, and now people are starting to explore other modalities. Right? So, you know, audio is another one, like Suno AI. I don't know if you played with that.

It's really fun, really good. Like, you can tell it what type of song you want, and it will Yeah. Come up with the lyrics. It will generate. So actually, I, like, use that one quite a bit, because I, like, make songs for my daughter.

Like, I've made, like she's like, she wants, like, a Taylor Swift song.

Speaker 0 (34:00 - 34:02): Oh, all for, I've seen it for, like, all

Speaker 1 (34:02 - 34:03): she did that day.

Speaker 0 (34:03 - 34:20): Like, speeches. Yeah. Yeah. I've seen people use on on speeches, like, before they come onto the stage just like the just create some kind of track or whatever or, like, for the start of podcast like this one, you know. Actually, let's do it.

Let's put AI into this thing. So, actually, I'll edit this later, but you'll probably hear, like, an intro track. So let's create some AI.

Speaker 1 (34:20 - 38:03): Yeah. Yeah. So, yeah, audio and then you also have text-to-speech.

So, you know, there's ElevenLabs, which is the one I've used, but OpenAI has something as well, where you can type in some text and give it a voice, and then it will start talking. So this is the you know, with OpenAI, they got in trouble for using Scarlett Johansson's voice. Or actually, technically, someone who sounded like her. Right? Yes.

Yeah. And then you also have audio transcription. So that's kind of a solved problem now you can take any kind of podcast and transcribe it into text, right, to get the text out. Yep. So is that what you're doing regularly?

Yeah. Exactly. Yeah. We're using all of these tools. Like, they're all kind of converging, and GPT-4o is, you know, the first glimpse of that, I think, from a strategic perspective.

GPT-4o isn't multiple models where, like, one transcribes the audio. Like, when you talk to ChatGPT, it used to be that you'd talk to it, it would listen to the audio, turn the audio into text, take the text and return more text, and then turn that text, you know, its response, into audio again, and speak in Scarlett Johansson's voice. Right? So that's kind of how you would build this type of tool.

You would string all these different models together. And then when you ask it for an image, it's actually calling a different model, the diffusion model, and saying, hey, make me an image of this. This is what they've asked for. So GPT-4o is the first step where they're all kind of converging into a multimodal model, meaning they take in any type of input, and then, you know, that kind of goes into what you can think of as the AI's brain, which will trigger some neurons somewhere.

And then those neurons can fire any output. Right? So you can take any, you know, image, text, sound, whatever in, and then give anything out. Right? So that's one of the trends.

Now the video models are still kind of a new modality. They're not quite integrated into that format yet, but they will be in the future. The way the video ones work is slightly different right now, in that if you think about it, a video is just, you know, lots of images stacked together. You know, if you have 150 frames a second, that literally just means 150 pictures every second. Right?

So it's a lot of data, but it's not, like, impossible to handle. So, again, you just scale up the GPUs. Eventually, it gets to the point where it can generate 150 images faster than we can see them. Right? And then it can create a video.

So, the way that it does it technically you know, LLMs have tokens, the text; the video models have patches. These patches are basically like parts of the screen, where, the way they do it, you can kind of track, like, okay, there's a car moving across these images, so it knows which patch the car's gonna get into next if it's moving from left to right. So that's really important. That was, like, kind of the last thing they needed to figure out for video, because it's not good enough to just generate 150 images if, like, you know, people are jumping around and it's very jerky motion.

Right? Like, you need it to be smooth motion. It needs to slightly understand physics. Right? Like, if you throw a ball in the video, the ball should come back down in the right place.

So
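The patches-and-frames idea above can be put into back-of-envelope arithmetic. This is a toy sketch, not any real model's scheme: the 16-pixel patch size and 256x256 resolution are illustrative assumptions, with the 150 frames-per-second figure taken from the conversation.

```python
# Toy sketch of the "patches" idea: a video is a stack of frames, and each
# frame is cut into a grid of patches, roughly analogous to how an LLM cuts
# text into tokens. All sizes here are illustrative assumptions.

def patch_count(width: int, height: int, frames: int, patch_size: int = 16) -> int:
    """Patches per frame, times the number of frames."""
    patches_per_frame = (width // patch_size) * (height // patch_size)
    return patches_per_frame * frames

# One second of 256x256 video at 150 frames per second (the figure mentioned):
print(patch_count(256, 256, 150))  # 256 patches/frame * 150 frames = 38400
```

A model predicting which patch the car moves into next is then working over this grid of patches across time, the same way an LLM predicts the next token in a sequence.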

Speaker 0 (38:03 - 38:07): Well, like, that was the last Yeah. The all the clothes, like

Speaker 1 (38:09 - 38:10): Yeah. Exactly. Okay. Well, like, it

Speaker 0 (38:10 - 38:12): was like a storm melting cup,

Speaker 1 (38:12 - 38:58): example where where they, you know, had a pirate ship in a in a in in a cup of tea. You know, and it looks realistic. Right? So, yeah, I think that that was kind of the last unlock they needed. And and, you know, in the similar to text and images now you have, like, OpenAI's model, Soarer, which isn't released yet, but it's supposed to be amazing, and then you have some of these kind of upstarts who have also cracked this at the same time.

And they're kind of going, hey, you know, you can use our model straight away. So Luma Labs in this analogy is like the Midjourney in the scenario. Right? Like, OpenAI hasn't given anyone access to Sora yet, but, like, you know, Luma Labs, you can use straight away.

So so I feel like, yeah, I'm I'm, kind of bullish on them. I if if they don't mess up, you know, I think they could do pretty well.

Speaker 0 (38:58 - 39:45): Obviously, to use these models let's bring this to the next thing. Before we get to that, though, there's this data privacy thing. So I know some businesses I'm talking to, like, legal firms, etcetera. They've got a lot of data, like, personal data, on storage, on cloud services, that kind of thing. And they're going, okay.

Can we use that, you know, especially, like, help desk or internal documents, whatever? Can we use an AI to then, like, easily query that, or repurpose that into articles or this or that or whatever? Right? But, obviously, there's some privacy considerations around, like, where does that data go if it's really sensitive, you know, which model to use, because I don't want, you know, these companies getting this really sensitive stuff. So what are those sort of use cases there for some of these larger businesses that might be sort of playing around with it?

And what would you sort of recommend there?

Speaker 1 (39:45 - 44:33): Yeah. Good question. So, when we start talking about use cases and and, like, using your data in the model, there's a lot of misconception. And, the way the way I describe it I I think some of the misconception is on purpose. Right?

Like, I've worked with vendors who say they're, like, training the model on your data, and they're not actually changing the model. They're still using the same OpenAI model underneath, but what they're doing is putting your data into the prompt that they send to OpenAI. Right? So they're not actually training anything. Right?

Like, they're just kind of misusing the word to make it sound, you know, fancier so they can charge more. But, no, there's a few different ways to think about this. Right? So, I like to give the analogy of humans, like, of people going through the workplace, because I feel like this is an easier way to understand it.

So you have pretraining, which is if you think about the millions of years of evolution that got us to this point, you know, there are certain things our bodies know how to do. Right? There's certain reactions that we have to our environment that are preprogrammed in. You know, we've learned them through millions of years of evolution, being alive on Earth. Right?

And those are kind of like biology. Right? So when you train the new version of OpenAI like, your GPT-4, GPT-5 there's a pretraining step where they take all the information from the Internet, and it learns how the environment works. So that is the equivalent of our biology. Then you're born, and then your parents raise you.

Right? And, what your parents do is they tell you, no, don't do this. Yes. Do more of this. Right?

And you do that enough times over 18 years and hopefully, like, a fully formed citizen pops out, right? One that isn't gonna go to jail and, you know, is gonna do good, helpful, useful things for society. Right?

And the equivalent in LLM terms is RLHF, Reinforcement Learning from Human Feedback, which is the difference between GPT-3 and ChatGPT. Right? ChatGPT is much better at following instructions than the raw base model, to the point where, like, they didn't even let you use the raw base model anymore.

Now, you know, whenever you use a version of GPT-4, you know, GPT-5 onwards, it's not a baby. It's an adult. It's been raised by good parents. You know? You can think about it that way.

And then you go out into the world of work and you get a job. And on your first day on the job, they say, okay, here's how we do things at our company. Right? Here are the standard operating procedures.

They might give you on-the-job training to say, here's how we do this task. Well, the equivalent in LLMs is the information you put in the prompt. So you're not actually training the model, right? Your DNA is not changing, right? You're still the same person, but you're getting more context on how to do that task.

And that is the way that most companies will and should interact with these models. Like, don't worry about making your own, you know, new model from scratch that's gonna cost billions of dollars. You know, don't even really worry about fine-tuning, training the model weights, because that's like the equivalent of raising a new child. It's still relatively difficult and expensive. What you need to worry about is, like, take an already well-raised child, which is ChatGPT, and put in your context for that task. Like, the thing it doesn't know it doesn't know anything about your company.

Right? So, if you can give it the standard operating procedure in a document and say, here's how we write, here's how we, you know, do this task, and it's got that in the prompt, then it can do that task much better according to your specifications. So all of the kind of hype around, you know, using ChatGPT with your data, etcetera, it's all about just stuffing that into the prompt. And the nice thing is we've had a lot of innovation in this space as well, where it used to be you could only fit 4,000 tokens into the prompt a token is about three-quarters of a word. You know, so it's like a normal blog post, right, you could fit in, and that was about it.

Whereas now, the standard GPT-4, I think, can fit 120,000 tokens, which is about the size of a Harry Potter book. So when you talk to ChatGPT, it could fit that much information, like a whole Harry Potter book's worth of information, into the prompt, and then give you a relatively good response from that. So that's the delineation there. Most AI work, you know, most prompt engineering, what I'm doing, is not, like, training a new model. It's really just finding clever ways to put the right information into the prompt.

Okay.
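The context-window arithmetic just described can be sketched in a few lines. This uses the stated rule of thumb that a token is about three-quarters of a word, and the approximate window sizes mentioned in the conversation; none of these numbers are exact model specs.

```python
# Rough sketch: does a document fit in a model's context window, using the
# "a token is about 3/4 of a word" rule of thumb? Window sizes below are the
# approximate figures from the conversation, not official limits.

TOKENS_PER_WORD = 1 / 0.75  # ~1.33 tokens per word

def estimate_tokens(word_count: int) -> int:
    return round(word_count * TOKENS_PER_WORD)

def fits_in_window(word_count: int, window_tokens: int) -> bool:
    return estimate_tokens(word_count) <= window_tokens

# A ~2,000-word blog post squeezes into an old 4,000-token window...
assert fits_in_window(2_000, 4_000)
# ...but a ~77,000-word novel (roughly the first Harry Potter book) does not,
assert not fits_in_window(77_000, 4_000)
# while a 120,000-token window handles it comfortably.
assert fits_in_window(77_000, 120_000)
```

In practice you'd use a real tokenizer for the exact count, but the heuristic is close enough for deciding how much context you can stuff into a prompt.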

Speaker 0 (44:33 - 44:55): Cool. So this brings me to prompt engineering. Yeah. I know you spoke about this recently, about the misconceptions of exactly what that is. And some people think it's like, oh, here's some queries that you can shove into ChatGPT, and it'll spit out a really good answer.

You see these people on Twitter and LinkedIn, like, hey, steal my prompt. Tell us from your perspective, what is prompt engineering?

Speaker 1 (44:55 - 45:32): So, if you keep going with that analogy, if you're hiring an employee, you need to give them a brief on how to do the task. Right? And the way you write that brief can make a massive difference to how well that human does the task. You know, like, if you're hiring an agency and you give them a really poor creative brief, you're gonna get really poor results. Right?

And the same is true with ChatGPT. The way that you word the brief, you know, makes a massive difference to the type of results you get. So you can think about this as getting good at brief writing. And, unfortunately, as we know, working in marketing and having been on the agency side, most people can't write

Speaker 0 (45:32 - 45:32): a brief.

Speaker 1 (45:32 - 46:44): No. They can't write a brief. Right? So that's what prompt engineering is, you know. I'm basically someone who's just really good at writing briefs for ChatGPT and DALL-E. So that's how you kind of think about it.

Now the difference and what I, you know, really love about this, and why I've actually kind of moved away from marketing and work full time in AI now, is that in marketing, you know, you get a chance to write a brief once, and you don't even know if it was your fault because the brief was wrong, or if it was the agency's fault because you hired the wrong agency, or they were having a bad day and just didn't read the instructions. You know, there's millions of reasons why something could go wrong, and you don't get to A/B test it. Right? You know, you don't get to, like, hire a 100 agencies and give 20 of them one brief, 20 of them another brief.

Right? Like, you know, you can't do that in real life. But with prompt engineering, you can. You know, I can hire 20 agencies, right, to do a task, but the agency is ChatGPT. And then I can give it 20 times one brief, 20 times another brief, 20 times another brief, and then I can just measure, okay, how often does it get it right if I word it this way versus if I word it that way.
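The "20 briefs" experiment just described can be sketched as a loop: run each prompt variant many times, grade each output against your check, and compare pass rates. This is a hypothetical sketch — `call_model` is a simulated stand-in for a real LLM API call, so the numbers it produces are illustrative only.

```python
# Hypothetical sketch of A/B testing two prompt wordings: run each variant
# 20 times and compare how often the output passes your evaluation.
# `call_model` simulates a model; a real version would call an LLM API.
import random

random.seed(0)

def call_model(prompt: str) -> str:
    # Simulated behavior: detailed, role-first briefs succeed more often.
    p_success = 0.9 if "You are" in prompt else 0.4
    return "on-brief output" if random.random() < p_success else "off-brief output"

def passes(output: str) -> bool:
    # Your evaluation metric: did the output meet the brief?
    return output.startswith("on-brief")

def pass_rate(prompt: str, runs: int = 20) -> float:
    return sum(passes(call_model(prompt)) for _ in range(runs)) / runs

vague = "Write a tagline."
detailed = "You are a senior copywriter. Write a tagline in under 8 words."

print(f"vague: {pass_rate(vague):.0%}  detailed: {pass_rate(detailed):.0%}")
```

The structure is the point: a defined pass/fail metric, many runs per variant, and a measured rate rather than a one-off impression.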

Speaker 0 (46:44 - 47:00): And that's kind of what you're describing by engineering, which is, like, this sort of test-and-learn architecture behind the actual, what we'd call, you know, text prompt that goes into it. So you're engineering that sort of structure and learning and developing and training these things. So that's Exactly.

Speaker 1 (47:00 - 49:03): Yeah. Yeah. And where it becomes engineering rather than just prompting, in my mind, is when you're doing it at scale. So if you're just chatting to ChatGPT and it's a task that you're gonna do once or twice, you don't really need to think about prompt engineering.

Where it becomes important is if you're building an AI application, like a feature in your product that uses AI, or if you're templatizing a task that, you know, you want the AI to do for your team every day, every couple of days. Then it's really important to know that, like, 1 in every 100 times it says something racist, or, you know, 10 in every 100 times it gets the brief wrong. You know? Yeah. So the key task that I'm doing as a prompt engineer is very similar to the types of things I was doing as a growth marketer, which is, you know, defining what does performance look like, how we're gonna measure performance. Because that is this whole thing, both in marketing and in AI, because, you know, most people don't know how to measure performance. Right?

So it's a lot of, like, hand-holding to get to that point of, like, what's our definition of whether this task has been done well or not. And then doing the task multiple ways, like testing all those different briefs, all those different instructions, and finding ones that perform better against our evaluation metrics. And then when you're really failing when you've tried multiple combinations and it's like, okay, it doesn't matter how I say it, it's still not doing a good job then finding clever ways to give it more context. So, you know, hooking it up to a search engine so it can do a Google search on your behalf to get the right context.

You know? Or, you know, hooking it up to a database which has all of your company documents in it, so it can search for which documents are most relevant to this query. Yeah. You know, that sort of work. So that's where it becomes engineering rather than, you know, just finding the right magic combination of words, which is just prompting.

Yeah. Okay.
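The "hook it up to a database of company documents" idea above can be sketched as: score each document against the query, and stuff the best match into the prompt. This is the toy version — real systems use embedding search rather than word overlap, and the document names and contents here are made up for illustration.

```python
# Minimal sketch of retrieving the most relevant company document and
# stuffing it into the prompt. Word overlap stands in for embedding search;
# document names/contents are illustrative placeholders.
import re

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def score(query: str, doc: str) -> int:
    return len(words(query) & words(doc))

def build_prompt(query: str, docs: dict[str, str]) -> str:
    best = max(docs, key=lambda name: score(query, docs[name]))
    return (f"Use this company document to answer.\n"
            f"--- {best} ---\n{docs[best]}\n---\n"
            f"Question: {query}")

docs = {
    "refund-policy.md": "Refund policy: refunds are issued within 14 days of purchase.",
    "style-guide.md": "Style guide: write blog posts in a friendly, concise tone.",
}
prompt = build_prompt("What is our refund policy?", docs)
assert "14 days" in prompt
```

Swapping the overlap score for embedding similarity and the dict for a vector database gives you the standard retrieval-augmented setup, but the shape — search, select, stuff into the prompt — stays the same.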

Speaker 0 (49:03 - 49:24): So for most people who aren't engineers listening to this, or who aren't taking these kinds of things to scale using APIs and that kind of stuff. Right? Most people would just be interacting with the default window or whatever, which is like a text prompt. Right? Now I've seen a lot of people using it for the first time, or, like, maybe not as experienced as you, going, oh, I put this in.

It came back with this rubbish. Like, oh, it doesn't work. Right?

Speaker 1 (49:25 - 49:25): I I've

Speaker 0 (49:25 - 49:44): had this a lot, especially for, like, Poomis. But I found that the quality of how you contextualize that really matters in terms of the output. For example, I noticed trick after, you know, reading some of your stuff is, like, tell it what it is first before you tell it what to do. So you are a graphic designer.

Speaker 1 (49:44 - 49:45): Yeah.

Speaker 0 (49:45 - 50:14): I am, hiring you to create me a logo. This logo I like these colors and this kind of style can you give me a couple of things and then we'll refine it from there. Very different to if you go create me a logo that's like this word in green, you know what I mean? It's going to come like something random that doesn't even have the words in it, so can you just tell us like what are some sort of basic rules here if we're doing prompting ourselves to to go through to get a better quality output?

Speaker 1 (50:14 - 50:43): The people who say that, I think it's like the equivalent of someone who, like, hired a new employee and then didn't give them any onboarding, didn't give them any training in your company. You didn't even tell them what you want, you know, from the task. Like, you didn't give them a brief. And then you go, they've done a bad job here. Therefore, hiring employees doesn't work. Right?

While completely ignoring the evidence that, you know, there are, like, functioning companies out there who have found hiring employees to

Speaker 0 (50:43 - 50:45): work. Managers in a free company.

Speaker 1 (50:45 - 54:50): Exactly. Yeah. Yeah. So, I mean, if it's working for other people but not working for you, then maybe the right take is it's not that AI doesn't work, but, like, maybe we're just not using it correctly. Right?

Or, you know, maybe it just doesn't work yet for your use case, because I've found plenty of things where I've struggled, spent a few days on something, didn't get it to really work up to my standard. But then, you know, 6 months later, the new version of the model comes out, and now it can do it without any faff. You know? But I think one of the big blind spots I'm seeing with people is self-preservation, and I saw it with myself as well. I actually only accidentally broke that loop, which is what really led me down this path.

So let me explain what I mean by that. You don't want AI to do a good job at the things that you do. Right? You know, because then if it does, then what's the point of you being there? Right?

So, whenever you're evaluating a task that you do yourself, and you're looking at AI and you go, no, it's not doing a good job like, maybe try and be a little bit more honest with yourself, and try to do, like, more of a blind test, with other people who aren't you, to see which ones they really prefer. So the reason why I started running these blind tests is I had this when I was delegating to employees, when I was building my agency.

I was, you know one of the major things that I thought I was really good at was writing blog posts to grow. Like, 60% of our leads came through our blog. And I was really resistant to, like, hand that over to someone. And what we did is we ran an A/B test. He wrote 6 blog posts.

I wrote 6 blog posts. And over the course of those 3 months, he got the same amount of traffic for his blog posts and the same amount of leads that I did. And I thought, God, I was so wrong about this. You know, even though I thought his posts were objectively worse because my ego was telling me that, I was ready to be replaced and, you know, great. Like, if I can stop writing blog posts, it's saving me a day every week.

So I fully handed everything over to him. And I had the same experience with AI, where at first I was like, I can't really use this to write my content, can't really use this to automate this task. And then I had an accidental situation where I was working on a blog post with a client, and they had access to the Google Doc. And I woke up in the morning, and they had commented on it all.

And I was like, oh, I hadn't even looked at this. I'd just generated it from ChatGPT, and I was copying and pasting stuff into this Google Doc. I didn't expect the client to review it. I was gonna go through it and, like, actually make it sound like me and, you know, actually, like, improve it.

They thought I'd written it. And the thing that bummed me out was, they actually gave me better comments. Like, they were happier with that work than they normally would be. So I had to tell them obviously what happened, but I was like, okay. Yeah.

Maybe I need to do more blind tests. So one of the things I do in workshops now is I'll ask them to give me some tasks that their team does every day that they're struggling with, that they can't get AI to do. And then I'll use my prompt engineering to try and replicate it like, can I get AI to do that task with the same context that the human was given? And then I do a blind test, and I ask them to say, okay, which one do you think was AI, and which one is better, which is key?

Because the funny thing is that a lot of people can tell which one's AI now, but usually it's because the AI is better. You know, but, like, it's really interesting for the people to see, like, oh, I thought my version was much better, but, like, actually, you know, 60% of people voted for the AI one, in some cases. You know? So I think once you see that, you

Speaker 0 (54:50 - 54:50): know, it

Speaker 1 (54:50 - 54:58): kind of creates a big unlock in your brain. You think, okay, maybe I'm just kind of being harsh on AI because I don't want it to be good. Okay. So

Speaker 0 (54:59 - 56:19): so step so step one is, like, this unconscious bias to against AI right and this is funny you mentioned this because there's that famous, not Erime Bass, but like Barnard Shaw is one of the 4 authors in it. I think Kennedy, Sharps, and 2 other people, they did this test of, to to experience marketers, they choose which of the 2 ads, produce the superior revenue results. Yeah. And basically, marketers experienced marketers only have 2% chance over 50%, meaning that that is, like, within the margin of error, meaning that, like, experienced marketers are no better than flipping a coin. Yeah.

And agencies were, like, negative, like, 47%. So they're, like, worse. Worse than a coin flip. And I'm like, hang on, this is, like, one of the biggest lies in the entire marketing industry, that somehow this agency or this experienced marketer does a better job in terms of, yeah, advertising creative than anybody else.

And that we must hire them, and we must hire that really good agency because they're better, and, like, you know, they're not. When it comes to revenue, maybe in terms of, you know, political optics and that kind of stuff, but when it comes to bottom-line results, no. So, you know, that's a really hard thing for a lot of marketers to think about, that actually my experience means basically nothing.

Speaker 1 (56:20 - 57:39): Yeah. Yeah. Yeah. A hundred percent. And we saw this with our agency as well.

Like, we used to keep track of every experiment we ran, because we're a performance agency, and we had a database of 7,000 experiments by the end, before I left. And what that told me, across all clients, we had 200 clients represented in there, you know, small startups as well as big companies like Booking.com, Nestle, was that we had a test success rate of 35%. So if you think about what needs to happen before you put a test live: it's a new version of a campaign, like, you know, a new creative, a new landing page, new email, whatever, that you think is gonna perform better than the one you have. And then you send it to your boss, the strategist, and the strategist thought it was gonna perform better, so they approved it.

And then the strategist pitched it to the client, and the client thought it was going to perform better, so they approved it. And in some cases the client's boss too, right, if they need approval. So all of those people thought this was gonna work, and, you know, even with the collective wisdom of that crowd, they still only got it right 35% of the time.

Speaker 0 (57:39 - 58:03): Yeah. And I love that. Okay. So, subconscious bias. Now, there's a bit of a learning phase, which is kind of the longer answer to what I asked about how not to prompt badly.

Right? So give it context, that kind of thing. Are there some, like, really quick pointers you can give the layman here? In terms of, like, there's a bit of a training phase, so expect that, and try to get rid of your bias. What are some other pointers we could give people to get more out of these programs?

Speaker 1 (58:03 - 59:30): So, we just wrote a book on prompt engineering for O'Reilly, and it's in print, you know, which is, like, maybe a terrible idea, because everything changes in AI every week. And we didn't want the book to be out of date the day it was published. Right? So what we did was we kind of zoomed out and said, okay, what are the principles?

Like, you know, forget about the tricks and hacks and tips that you see on social media. Let's zoom out. What are the main principles, the categories that worked four years ago, three years ago, are still working today, and we think will continue to work in the future? And it actually ends up being a pretty small list.

It's five things. And you can run down them like a checklist. Right? We based the whole book on this. The first one is give direction.

So that's the briefing part, the role-playing part. So, like, "As Steve Jobs, come up with a product name." Right? You're gonna get very different names

than if you say, you know, "As Elon Musk, come up with a product name." Right? So, giving it some direction of what's the type of style that you want. That's the first one. And then the second one is to specify the format.

So, you know, it makes a big difference what format you want this response in. Like, for image generation, do you want this in the format of, like, a painting, or do you want it as, like, a product photo, or do you want this to look like, you know, a video game?
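
The first two principles, give direction and specify the format, can be sketched as assembling a prompt from a role, the task, and a format spec. `build_prompt` is a hypothetical helper, not anything from the book, and the resulting string could be sent as a user message to any chat LLM API:

```python
# Minimal sketch of principles 1 and 2: give direction (a role or style
# to emulate) and specify the output format. All names here are
# illustrative assumptions, not a real library API.
def build_prompt(direction: str, task: str, output_format: str) -> str:
    """Assemble a prompt from a direction, the task, and a format spec."""
    return (
        f"{direction}\n\n"
        f"Task: {task}\n\n"
        f"Respond in this format: {output_format}"
    )

prompt = build_prompt(
    direction="You are Steve Jobs, known for simple, evocative product names.",
    task="Come up with a name for a smart water bottle.",
    output_format="a JSON list of 5 strings",
)
print(prompt)
```

Swapping the `direction` line (say, to an Elon Musk persona) is all it takes to steer the style, which is the point of the principle.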

Speaker 0 (59:30 - 59:32): Like, you know, there's all these different formats and things.

Speaker 1 (59:33 - 01:00:14): Yeah. Exactly. Yeah. And then, you know, same thing. If you're building AI applications, it becomes really important that the text that comes back is in, you know, a format that the developer can parse and use, like JSON.

Then the third thing we usually move on to is providing examples. And this is the biggest, most impactful thing you can do, but it takes a bit of time, because you have to find good examples of the task being done. So, you know, if you've given it some direction, you've specified what format you want it in, and it's still doing a bad job, the biggest thing you can do is, like, go and write some examples of this task. Like, do the task manually a few times, or find examples on the

Speaker 0 (01:00:14 - 01:00:16): Or upload the image of something.

Speaker 1 (01:00:16 - 01:02:42): Yeah. Or upload an image of something you wanna kind of replicate, and then it's gonna do a much, much better job. So, you know, actually, it makes such a big difference that in AI research they differentiate it. They say, how well does this model do zero-shot, which is no examples given, versus few-shot, which is a handful of examples of the task given. So that makes a big difference.
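
Few-shot prompting, as contrasted with zero-shot above, can be sketched as prepending worked examples to the prompt. The helper and the example pairs here are invented for illustration:

```python
# Sketch of few-shot prompt construction: worked input/output pairs are
# prepended so the model can imitate them. An empty example list would
# be the zero-shot case.
def few_shot_prompt(task: str, examples: list[tuple[str, str]]) -> str:
    """Prepend worked examples of the task, then pose the new input."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{shots}\n\nInput: {task}\nOutput:"

examples = [
    ("Announce our winter sale", "Winter Sale: up to 40% off, this week only."),
    ("Announce free shipping", "Free shipping on every order, no minimum."),
]
prompt = few_shot_prompt("Announce our new loyalty program", examples)
print(prompt)
```

The trailing bare `Output:` is a common convention: the model's continuation becomes the answer in the same style as the examples.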

And those are the three things you kind of do to improve the prompt. But then you're kind of thinking about how do we improve the system. So the fourth is evaluate quality. So don't just run it once or twice. Run it 10 times.

You can just hit regenerate in ChatGPT. You don't necessarily have to be a developer to, you know, run it a few times. Right. So run it 10 times and see how often it gets it right or wrong. Because it is probabilistic.
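
The "run it 10 times" advice can be sketched as a small evaluation loop. `call_model` is a hypothetical stub standing in for a real LLM API call, and the roughly 80% simulated pass rate is invented for illustration:

```python
# Sketch of principle 4, evaluate quality: sample the model N times and
# tally how often the output passes a simple check, because a single
# good answer tells you little about a probabilistic system.
import random

def call_model(prompt: str, seed: int) -> str:
    """Stub LLM: returns a usable answer most of the time, else a refusal."""
    random.seed(seed)  # deterministic stand-in for sampling variability
    return "OK: tagline draft" if random.random() < 0.8 else "refusal"

def pass_rate(prompt: str, check, n: int = 10) -> float:
    """Fraction of n runs whose output passes the check."""
    results = [check(call_model(prompt, seed=i)) for i in range(n)]
    return sum(results) / n

rate = pass_rate("Write a tagline", lambda out: out.startswith("OK"), n=10)
print(rate)
```

With a real model you would drop the seeding and let temperature do the work; the point is just that reliability is a rate, not a yes/no.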

You know, you don't get the same result every time. And sometimes it might fail, and you don't wanna, you know, rely on it then. Then the fifth principle is divide labor. So just like in our economy, right, like, you know, you work as a marketing consultant because you're very good at that. I mean, you're actually good at a lot of things.

So maybe it's a terrible analogy, but, you know, like, I probably wouldn't hire you. Like, I might hire you to tell me what champagne to drink, or you might be good at baking or whatever. But, like, maybe you're really terrible at pickleball. Right? So, you know, you don't play pickleball, and somebody else plays pickleball because they're maybe more interested in it.

So same thing with AI. You know, with different prompts, you don't wanna try and do everything in one prompt. Quite often, you wanna split it up into multiple tasks. So in generating blog content, for example, don't ask it to create a whole blog post with one prompt. You can ask it to write some titles, you know, and then you could ask it to write an intro.

And then after the intro, you can say, okay, write an outline for what the body copy should have. And then you say, okay, write section 1 of the outline. Write section 2 of the outline.

Write section 3 of the outline. And then you might have, like, okay, write an outro, generate an image that is relevant to this blog post. And then, you know, you might even say, okay, now rewrite it in my style.

You know, here's my style. You know? So you'd have multiple steps, you know, as part of that one thing. And typically, you get better results if you split things up.
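
The multi-step blog workflow just described can be sketched as a chain of prompts, where each step's output feeds the next. `generate` is a hypothetical stub standing in for a real LLM call:

```python
# Sketch of principle 5, divide labor: one prompt per step instead of
# one giant prompt for the whole blog post. Every name here is an
# illustrative assumption, not a real API.
def generate(prompt: str) -> str:
    """Stub LLM call: echoes a truncated prompt so the chain is traceable."""
    return f"<output for: {prompt[:40]}>"

def write_post(topic: str, style_sample: str) -> str:
    title = generate(f"Write a title for a blog post about {topic}")
    intro = generate(f"Write an intro for a post titled {title}")
    outline = generate(f"Outline the body of a post titled {title}")
    sections = [generate(f"Write section {i} of this outline: {outline}")
                for i in range(1, 4)]
    outro = generate(f"Write an outro for the post titled {title}")
    draft = "\n\n".join([title, intro, *sections, outro])
    # Final step: restyle the assembled draft in the author's voice.
    return generate(f"Rewrite in this style: {style_sample}\n\n{draft}")

post = write_post("home coffee roasting", "casual, punchy")
print(post)
```

Collapsing all of these into a single prompt is exactly the "do everything in one prompt" approach the speaker advises against.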

Speaker 0 (01:02:42 - 01:03:23): Yeah. I love it. And that sort of brings me to some work I was doing with someone, you know, around agents. So, you create specialized AI agents that are specialized in these tasks. In this example, it's, like, marketing research.

So we'd have, like, someone who is, like, the qual person or the quant person, someone who's the architect of the study, you know, someone who analyzes the results, and they specialize, you train them all together, and then you create this, like, hive of these agents that actually work together as a team, which is basically emulating what professional services does anyway. Like, we can't be good at everything. So, like, you specialize and then bring it together, like a hive. And the results from that were just, like, amazing. They were, like, way better than some of the best teams I've worked with, which is pretty scary.

Okay. So,

Speaker 1 (01:03:24 - 01:03:25): I actually like that point.

Speaker 0 (01:03:25 - 01:03:27): You've got a book that is now in print.

Speaker 1 (01:03:27 - 01:03:40): Yeah. Sorry. I just wanted to raise this, because I saw it and thought of you. Did you see that Mark Ritson came out massively in favor of AI yesterday? He's invested in synthetic data.

Speaker 0 (01:03:40 - 01:03:41): I think he's investing in the business.

Speaker 1 (01:03:42 - 01:04:55): Yeah. Yeah. Yeah. But, I mean, to be honest, it's actually a legit idea, because I've done this myself. But, yeah, what they do is they spin up a bunch of, you know, agents.

Right? And those agents represent the demographics of, maybe, the people in your audience, of, like, the type of people you're trying to target in marketing. And then they actually ask them to fill out a survey. Right? So rather than going out and recruiting real people in the real audience to do a survey, you can use these agents.

And I think that's a really good example of that principle. Right? Like, you're kind of splitting it up instead of just asking in one prompt, like, you know, is my product's tagline any good or whatever. Instead, you can spin up, like, you know, a thousand AIs, give them all different personalities that match the types of personalities you would see in your real audience, and then you have, like, you know, market research on tap.

Right? I think it's actually a pretty solid idea. But, anyway, it's just funny, because I think Ritson's probably the type of person that maybe would find a way to have hot takes against AI. So I'm kind of glad that he's on our side. He's gonna be having hot takes in favor of it now.
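
The synthetic-respondents idea described above can be sketched roughly like this. `ask` is a stub for a real LLM call, and the personas, question, and answers are all invented for illustration:

```python
# Sketch of synthetic survey respondents: spin up persona prompts that
# match your audience demographics and put the same question to each.
import random

personas = [
    {"age": 34, "role": "startup founder"},
    {"age": 52, "role": "retail manager"},
    {"age": 27, "role": "graduate student"},
]

def ask(persona: dict, question: str) -> str:
    """Stub: a real version would prompt an LLM as this persona."""
    random.seed(persona["age"])  # deterministic stand-in for a model response
    return random.choice(["like it", "dislike it", "neutral"])

question = "Is the tagline any good?"
responses = [ask(p, question) for p in personas]
tally = {answer: responses.count(answer) for answer in set(responses)}
print(tally)
```

Scaled to a thousand personas instead of three, the tally starts to look like the "market research on tap" the speaker describes, with the caveats about validating against real customers that come up later in the conversation.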

Speaker 0 (01:04:57 - 01:05:41): Well, he's like a brand for hire. Right? He's got to that point where he's built his personal brand up so much that, like, you know, one minute he's, like, advocating for TV, right, because he's getting paid to by certain interests, and then the next minute he's, like, an ambassador for YouTube, which is, like, the opposite.

And then he's, like, anti-AI, and now he's invested in a company and suddenly he's pro-AI. Like, people are looking for, like, this arbitrary truth. I'm like, he's just going where the money is, you know, and we'd all do it, but, like, you know, you gotta take some of what these people say with a grain of salt. There's a lot of vested interest behind it, you know. He's very good at creating his personal brand, but he's not someone I would, you know, blindly just listen to in an article and trust that there's not something behind it.

So

Speaker 1 (01:05:41 - 01:05:42): Yeah. But,

Speaker 0 (01:05:42 - 01:06:36): yeah, I like that. I was having a discussion today, though, with someone whose comment he deleted on this post. Because, you know, he put down at the end, like, sponsored, I have a financial interest in the startup, you know, blah blah. Because, like, Joe Lombardo and the other guy from, like, LinkedIn.

Right? Who started it, and he's, like, part of the business. Anyway, down the bottom, like, someone commented, and it was just, like, you're an idiot or whatever, like, now you're shilling this, whatever, and he deleted the comment and then blocked them. And I was like, okay.

Now we've both been blocked by Ritson, but I found that funny. Yeah. I think the thing that it probably won't replace would be primary research with customers. So it can be indicative, but there's certain insights you can only find by, like, creating that friction with customers, because it is so fluid. So I think it has some uses for maybe testing some things, or pretesting, or, like, getting an indicative result, but then you probably wanna validate with, like, real research on top of that if it's something quite risky,

Speaker 1 (01:06:36 - 01:06:37): I

Speaker 0 (01:06:37 - 01:06:37): would say.

Speaker 1 (01:06:37 - 01:08:06): Yeah. Yeah. Exactly. Yeah. I would say the best uses of AI are when, you know, you wanna do something, but it would just be basically impossible to do with your resources, you know, the real way.

Right? So, like, you know, if you can't afford to do market research, then doing it with AI is, you know, way better than not doing it at all. Right? And, you know, I had a project where we used vision AI and audio transcription to ingest, you know, hundreds of hours of video from Twitch streams of people playing video games. And we found some insights of, like, what sort of strategies do they use?

Like, when are they excited? When are they not excited? And that's the sort of thing that can be used for product development. It can be used for marketing. And, you know, nobody is gonna actually do that in real life.

Like, nobody's gonna sit there and watch a hundred hours of Twitch streams. Right? Even the most avid gamer isn't gonna carve out the time. And, like, even if they wanted to, imagine getting their boss to approve that. You know?

But the AI can do it, and, you know, so if we get any insights from that, that's a bonus. Right? And maybe it costs, you know, a few hundred dollars instead of tens of thousands to have staff do it. So I feel like that's kind of a nice niche: wherever there's a task where, like, you know, you're gonna get some benefit and otherwise you just couldn't have done it, then, you know, that's when you use AI, I think.
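
A rough sketch of the kind of pipeline described, with `transcribe` standing in for a real speech-to-text call and a deliberately naive keyword-based excitement score; the file names and keywords are invented for illustration:

```python
# Sketch: transcribe each stream recording (stubbed), then scan the
# transcripts for moments of excitement. A real pipeline would use a
# speech-to-text model and an LLM for the analysis step.
def transcribe(video_path: str) -> str:
    """Stub: a real version would call a speech-to-text model."""
    return f"wow this strategy is amazing ({video_path})"

EXCITED_WORDS = {"wow", "amazing", "insane", "love"}

def excitement_score(transcript: str) -> int:
    """Count excitement keywords, ignoring surrounding punctuation."""
    words = transcript.lower().split()
    return sum(w.strip("!.,()") in EXCITED_WORDS for w in words)

videos = ["stream_001.mp4", "stream_002.mp4"]
scores = {v: excitement_score(transcribe(v)) for v in videos}
print(scores)
```

The economics the speaker describes come from this loop being cheap to run over hundreds of hours of footage that no human would realistically sit through.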

Speaker 0 (01:08:07 - 01:08:44): Yeah. No. I agree. Okay. So, you've got a book coming out.

So why don't you just show us, you know, what you've got? Because I actually wanna read this book, because I found it ironic. I was like, what are you doing? And I've had this question before: like, what are you doing releasing a book on AI in printed book form?

Like, isn't this kind of somewhat ironic? And it kind of makes sense. Like, I like your strategy here. It's, like, Jeff Bezos's question: what doesn't change? And you've written the book now, but with the forethought of, hey, what will stay the same as AI progresses? And maybe some of it will, like, lose relevancy or whatever, but I like those five steps. So, how do we get the book?

What's it called? Etcetera.

Speaker 1 (01:08:45 - 01:09:24): Yeah. Yeah. So, yeah. I mean, it's interesting, because O'Reilly, I think, is one of those publishers still doing really well. Like, the publishing world in general isn't, but, because developers buy books, for some reason. You know, it's kind of mad, because you wouldn't think it, because developers are very tech-forward.

You'd think they'd do everything online, but actually what they really like is to have a reference manual. Right? Like, so they don't have to go and look something up online all the time, you know, check Stack Overflow, ask ChatGPT. Mhmm. Sometimes it's useful to get that core knowledge and, like, have it.

Like, you can pull it down off the shelf and flip through it. You know? So developers still buy books.

Speaker 0 (01:09:24 - 01:09:27): And that's, like, been vetted by some sort of quality process as well. Right?

Speaker 1 (01:09:27 - 01:11:42): Exactly. Yeah. And, you know, that was the case for us. It only took us about six months to write this, but it was a year of editing. So, you know, O'Reilly put us through, I think it was six rounds of technical review: three people that we chose, and then three people that they chose, and they went through line by line, actually twice each.

So, yeah, it was, like, 12 edits there. Plus we had a production editor who did three rounds of edits, and then a copy editor who just did all the kind of spelling, grammar, and stuff. That was two additional rounds of edits. So, yeah, we've gone through this with a fine-tooth comb, and I'm sure there's probably still one spelling mistake somewhere. But, yeah, I think that's the major difference doing a book with a real publisher, like, in print, because the quality bar is so much higher than, you know, anything else I've ever written.

Like, you know, I just pump stuff out there. I don't even edit my own work before I post it. So, you know, I just think, like, let's get the hot takes out there, and if there's something wrong, then someone will tell me.

But, you know, with the book, you can't do that. So I think, yeah, if you're looking for that condensed information (because all this information is out there, and you can learn this stuff by testing it yourself), you know, if you're working in AI, it's $80 US for the book, which is very pricey. But if it saves you, like, two or three hours, and you're, you know, a software engineer, a data scientist, or someone who's, like, kind of interested in learning more, then, you know, that's worth it.

Right? It's gonna save you, you know, much more than the $80 in terms of the value of your time. So that's the pitch. Right? Yeah.

It's called Prompt Engineering for Generative AI. I have a coauthor, James Phoenix, who did all the hard technical stuff, and I did all the pretty pictures and the, you know, business logic principles. So, you know, we kind of teamed up on that, which made it a little bit easier to write. But, yeah, it's out June 25th, 2024, depending on when you're listening to this. It's already out in the US and in the UK, and it's on Amazon.

So, yeah, have a look at that.

Speaker 0 (01:11:42 - 01:12:08): Okay. I'll just buy it off Amazon. They can ship it across the world. Cool. Okay.

I think this was really good. You know, we could probably go deeper on all of this stuff, obviously, but I think what I wanted from this episode was a bit of a crash course, and I think you delivered in spades. So thanks for filling in some of the gaps in my knowledge as well, which is really good. And, yeah, like, are you coming to South by Southwest Sydney in October?

Or, like, what's your next speaking gig? Like, if someone wants to see you, what's the next thing that you've got going on?

Speaker 1 (01:12:08 - 01:12:41): Yeah. Good question. Yeah. I'm trying to line up a few things. Yeah.

South by Southwest is a good shout. I've never been. So, yeah, I'll look into that. But, yeah, I'm going to an event in Chicago in October. I'm gonna try and do something in San Francisco, and I've got a few others in London.

But, yeah, if you follow me on Twitter, hammer_mt, that's where I post random updates and things like that. So it's probably the easiest place to find these things as they come up.

Speaker 0 (01:12:41 - 01:12:58): Great. I saw your post the other day about the Mario thing with the first iteration of Midjourney and then sort of the latest one, Midjourney 6 or 7 or whatever. And I was like, wow, okay, you can really see the difference with the same prompt.

It was a Mario image or something. And I was like, oh, wow, Mario drinking, because I know you've been using that for ages.

Speaker 1 (01:12:58 - 01:13:13): Yeah. It was Mario drinking a pint of beer. That was the first thing I ever tested with the first iteration. But, funnily enough, you know, you try that on OpenAI and it refuses. It says, sorry.

We're not allowed to do that. So

Speaker 0 (01:13:13 - 01:14:13): Yeah. Great. Okay. Well, thanks for your time, Mike. I really appreciate it.

And I'm sure if anyone has more questions about AI, whatever, let us know. We can always do a deep dive into some of these areas, depending on popularity. So thanks for your time. Buy Mike's book. I'm gonna buy one anyway, because I actually really wanna read it and get better at this stuff, because I think I made some of those mistakes, the ones I was kind of, like, complaining about when people say...

Like, hey, you gotta spend a bit more time contextualizing things and learning. And then the moment came, and I was like, okay. No.

This is really cool. Like, I was putting in Excel data, and I'm like, hey, give me a graph, and then, but I want the bars centered, and I want you to label this, and blah blah blah. Like, super detailed.

And then, you know, the wheel goes around, and I'm like, oh, it's gonna be shit, and it comes back and it's, like, perfect. So I'm like, oh, great, I'll post it to social media, and then the post went viral, and I'm like, oh, that was easy. I didn't have to go to Google Sheets or anything, like, insert a diagram and stuff around with the axes and everything. And I was like, oh, this is so good. So, yeah.

Speaker 1 (01:14:13 - 01:14:23): And this is the worst it's gonna be. Right? Like, you know, in 6 months, in 12 months, it's gonna be twice as good, four times as good. So, yeah, even if you're struggling now, it's, you know, it's gonna be amazing.

Speaker 0 (01:14:24 - 01:14:27): So cool. Okay. Well, thanks again, Mike. I really appreciate it. Talk again soon.

Speaker 1 (01:14:27 - 01:14:28): Alright. See you.
