Fear-Mongering & Forecasting: Assessing AI's Predictions About AI
The Cerebral Valley Podcast · November 19, 2024 · 00:49:07


Cerebral Valley is tomorrow! I’ve been listening to old interviews, brainstorming with Claude and ChatGPT, and talking to investors to prep for my conversations with Dario Amodei, Martin Casado, and Alexandr Wang.

We’ll be sharing those conversations here in the newsletter. Expect video highlights on our social media feeds, a detailed rundown of the biggest moments in the newsletter Thursday, and full-length conversations on our YouTube channel.

To satiate your AI appetites until then, give a listen to the latest edition of the Cerebral Valley Podcast with my friends and co-hosts Max Child and James Wilsterman. You’ve listened to us assess whether startups are underrated or overrated and make our draft picks. Now we’re looking to the future. We asked Claude and ChatGPT o1 to make some predictions about what will happen in artificial intelligence over the next year. And then we took the over or under on those predictions.
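If you want to keep score along with us, here's one rough way to settle the over/under picks — a hypothetical scoring rule of our own sketching, not anything we formalized on the show: a correct pick pays the inverse of the model's stated probability, so a contrarian call that lands is worth more than a safe one.

```python
def payout(prob: float, pick: str, outcome: bool) -> float:
    """Score an over/under pick against a model's stated probability.

    A correct "over" pays 1/prob and a correct "under" pays 1/(1 - prob),
    so betting against a confident prediction and winning earns more.
    Incorrect picks score zero.
    """
    if pick == "over":
        return 1.0 / prob if outcome else 0.0
    if pick == "under":
        return 1.0 / (1.0 - prob) if not outcome else 0.0
    raise ValueError("pick must be 'over' or 'under'")

# Taking the over on a 30% prediction that comes true pays ~3.3 points;
# the over on a 75% prediction that comes true pays only ~1.3.
```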

Brought to you by Brex

Brex knows runway is everything for venture-backed startups, so they built a banking solution that helps them take every dollar further. Unlike traditional banking solutions, Brex has no minimums and gives startups access to 20x the standard FDIC protection via program banks.

Plus, startups can earn industry-leading yield from their first dollar — while being able to access their funds anytime. If you want to make sure your portfolio companies have a place to save, spend, and grow their capital, check out Brex here.

Chapters

* 00:00 — Introduction to AI Predictions

* 02:48 — Exploring Predictions for AI in 2025

* 06:06 — AI Regulation in Healthcare

* 08:53 — Self-Driving Cars and Tesla's Future

* 12:04 — AI in News Media

* 14:55 — AI-Generated Films and Entertainment

* 17:53 — Anthropic’s Predictions and AI Co-Processors

* 20:59 — AI in Pharmaceutical Development

* 24:13 — International AI Treaties and Regulations

* 26:47 — Comparing AI Models: ChatGPT vs. Claude

* 30:06 — Future of AI and Human Systems

* 32:46 — Conclusion and Reflections on AI Predictions



Get full access to Newcomer at www.newcomer.co/subscribe

[00:00:01] Welcome to the Cerebral Valley Podcast. I'm Eric Newcomer. We're calling this Fear-Mongering and Forecasting. We're looking to the future. We're making predictions or really, I think we're going to be relying on AI and chatbots to make our predictions and then we will shit all over them. I'm here with Max Child.

[00:00:20] Hey, Eric.

[00:00:21] And James Wilsterman.

[00:00:22] Hey, guys. Ready to make some hot take predictions here.

[00:00:26] We're going to use both OpenAI's ChatGPT and Anthropic's Claude. So we'll also get sort of the meta sense of which bot is making better predictions. But yeah, excited.

[00:00:39] What is AI if not constantly looking to the future and charting out what could be given the rate of growth we've seen so far?

[00:00:48] So it's very fitting that in our last episode before the Cerebral Valley AI Summit on November 20th in San Francisco, we are going to fiendishly fantasize about what could be to come.

[00:01:03] This episode is presented by Brex, the financial stack that founders and VCs can truly bank on.

[00:01:09] Imagine what your founders could do with their runway if they had a banking solution that had no minimums, no transaction fees, and 20 times the standard FDIC protection.

[00:01:19] Plus, they could earn an industry-leading yield while maintaining access to funds whenever needed.

[00:01:25] Brex simplifies financial services for startups so they can focus on building.

[00:01:31] Connect your portfolio to the financial stack that one in three U.S. venture-backed startups already use.

[00:01:37] Check out brex.com/banking-solutions.

[00:01:42] Without further ado, James.

[00:01:44] So for this episode, I asked both ChatGPT and Claude the same question to come up with a list of predictions about AI for the year 2025.

[00:01:55] I also asked the predictions to be hyper-specific and falsifiable so that by the end of 2025, we could easily determine whether the predictions were correct or incorrect.

[00:02:06] The predictions I said could be spicy, medium, mild, however hot.

[00:02:13] I didn't get any pepper emojis in the output.

[00:02:17] Maybe I should have asked for that.

[00:02:20] But the predictions, I was pretty happy with how they came out.

[00:02:25] And I asked, you know, relevant to the AI industry.

[00:02:28] And then I also asked to get a probability estimate for each prediction.

[00:02:32] So then you guys are going to be able to provide me kind of the over-under take on each prediction.

[00:02:38] So we have our own prediction market basically.

[00:02:41] Exactly.

[00:02:41] We basically created a bunch of possible prediction markets around AI.

[00:02:46] Nice.

[00:02:47] And you guys have to decide whether you're short or long each prediction.

[00:02:52] Nice.

[00:02:52] How's that?

[00:02:52] I love it.

[00:02:53] Great.

[00:02:53] I'm excited.

[00:02:53] Is this o1 or is this 4o?

[00:02:57] Yeah, this is ChatGPT o1-preview and Claude 3.5 Sonnet.

[00:03:04] Okay.

[00:03:04] So the most thinking-oriented AI models.

[00:03:07] We're paying the big bucks here.

[00:03:09] We're premium subscribers for-

[00:03:12] We're bankrupt after running the predictions, guys.

[00:03:17] But no, these are the best AI can do in a consumer product.

[00:03:22] It feels- I was pretty impressed.

[00:03:24] Like I was kind of skeptical actually that this would work that well.

[00:03:27] You know, I think AI has a tendency to provide sort of generic takes in a lot of ways.

[00:03:35] But if you prompt it correctly, I was able to get these really specific predictions.

[00:03:41] So I think this is key for-

[00:03:42] This is for 12 months from now?

[00:03:44] Like what's the date?

[00:03:45] Just the year 25.

[00:03:47] By the end of 2025, December 31st.

[00:03:50] Okay.

[00:03:50] Cool.

[00:03:51] All right.

[00:03:51] Let's do it.

[00:03:52] All right.

[00:03:53] We'll start with o1-preview from OpenAI in ChatGPT.

[00:03:58] So here's the first prediction.

[00:04:04] So-

[00:04:05] We have not heard these before to be clear.

[00:04:07] Max and I are coming in blind.

[00:04:09] Unless you guys are co-founders, so maybe you're in cahoots.

[00:04:12] Nope.

[00:04:12] I got nothing.

[00:04:13] Running blind.

[00:04:14] All right.

[00:04:15] So first prediction: OpenAI will release GPT-5, meeting 10 trillion parameters.

[00:04:27] What do you guys think?

[00:04:28] How many parameters is four again?

[00:04:30] Yeah, five.

[00:04:30] One trillion?

[00:04:31] The bot's already smarter than me.

[00:04:32] What's that-

[00:04:33] You're like, what's a parameter?

[00:04:34] Isn't it like 400 billion?

[00:04:36] Can we get the number of the current?

[00:04:38] It's like four or 500 billion, right?

[00:04:41] 1.5 trillion?

[00:04:42] Okay.

[00:04:42] Let's add the elevator music here.

[00:04:44] 1.76 trillion is the rumors.

[00:04:46] Okay.

[00:04:47] One to two.

[00:04:48] Okay.

[00:04:48] So this would be almost, you know-

[00:04:49] What's it predicting to rate?

[00:04:51] How many?

[00:04:51] A 10X.

[00:04:53] Like a 7X basically or whatever.

[00:04:55] And we think the current is 1.7 something?

[00:04:57] Yeah.

[00:04:58] Also, the probability for this prediction is 50%.

[00:05:02] So, oh, over.

[00:05:04] Smash the over on that.

[00:05:05] Wait, wait, wait.

[00:05:06] Sorry.

[00:05:06] It's-

[00:05:07] Is it making a claim about the name of it?

[00:05:10] Because I feel like-

[00:05:11] I think-

[00:05:11] I was just listening to Dario on Lex Fridman.

[00:05:14] And like, I feel like half of his like hemming and hawing about what they're going to do is like not committing to the name of various models.

[00:05:22] Because, you know, they're doing all these like half updates and stuff.

[00:05:24] So how much is this prediction about what the model is going to be called?

[00:05:28] Yeah.

[00:05:29] I think we can decide there.

[00:05:31] But I would say both have to come true.

[00:05:33] The GPT-5 naming and the 10 trillion.

[00:05:36] I think, you know, if I'm reading the tea leaves of what open-

[00:05:40] What o1 was thinking and reasoning about here, maybe they're saying, hey, 10 trillion is a big step up to the point where it would have to almost be called, you know, GPT-5.

[00:05:49] I mean, I don't know. This is kind of pedantic, but I would smash the over on a model that has 10 trillion parameters, but I don't know if they're going to call it GPT-5.

[00:05:57] Exactly.

[00:05:58] I feel like Altman has been-

[00:06:00] You committed to over already.

[00:06:01] I think the name-

[00:06:02] Okay, fine. I'll commit.

[00:06:03] Why are you guys so skeptical that it's going to be called GPT-5?

[00:06:06] Because Altman keeps-

[00:06:07] They're just chaotic on the names.

[00:06:08] The names are bad at the moment.

[00:06:09] I got it.

[00:06:10] It's hysterical that you introduced it as like, OpenAI, ChatGPT, o1.

[00:06:16] Like, they need to like-

[00:06:17] I mean, Anthropic has more colorful names, but they're getting chaotic.

[00:06:22] Sure.

[00:06:22] So I'm just going to take the under on the basis of the name, not the parameters.

[00:06:30] Yeah, I'm probably bullish.

[00:06:32] I mean, thankfully it's a compound prediction, but yeah, the under.

[00:06:36] Okay. All right. Well, I will stick with my over, but yeah, I mean, I think they're basically running into like the iPhone 4S, 5S, 6S problem where Apple used to give the like in-between phones like, you know, an S instead of a new number.

[00:06:49] And then I think the marketing team realized like, people want to see the number go up.

[00:06:53] Like we don't see, we don't, we don't buy new iPhones unless the number goes from 14 to 15.

[00:06:57] So I think, I think this is why OpenAI is trying to move off these numbers because I think they're like, nobody wants the new model unless we give it a higher number. Right.

[00:07:06] And they're like, they don't even care if we introduce like 4o and voice mode and, you know, o1 and all this stuff.

[00:07:10] So I just, I feel like Altman is severely foreshadowing that they're going to drop these numbers.

[00:07:16] I do think that.

[00:07:17] You're contradicting yourself.

[00:07:18] No, I'm taking the over 50% is a good price.

[00:07:21] I just think I understand where they're coming from with the names.

[00:07:24] James, are you making predictions or you'll give us a quick, your quick take?

[00:07:28] Yeah, I'll just react to your guys' takes.

[00:07:31] Um, I think that, yeah, you guys are way too focused on the naming conventions.

[00:07:35] Um, I think the reason they are calling things like, you know, 4o and o1-preview is to try to, like, you know, downplay the model a little bit before the next evolution of the model.

[00:07:48] Like I think they will stick with the GPT-4, 5, 6 naming convention, uh, but they will need it to be kind of a step-change, you know, uh, improvement on the model.

[00:07:58] So I do think a 10 trillion parameter model will get that name, but there's an open question.

[00:08:02] I mean, this is going to be probably coming up throughout just like how monumental is a 10 trillion parameter model change, right?

[00:08:10] Like if, if, if they have consumed basically all the data there is to consume and synthetic data has its limits.

[00:08:18] Well, that's fair.

[00:08:19] Then if there's not a step change in the model capabilities, maybe they don't give the new name.

[00:08:23] Okay.

[00:08:24] All right.

[00:08:24] Let's move on.

[00:08:25] Um, good, good first discussion there.

[00:08:27] Are you over or under though?

[00:08:28] I never got that part.

[00:08:29] Um, I'm over.

[00:08:31] Okay.

[00:08:31] I mean, if you want me to make, make my own prediction, I think you've got to come in.

[00:08:34] Just say, I just put it on.

[00:08:35] Say what you are.

[00:08:36] I'll take the over.

[00:08:37] All right.

[00:08:38] Um, so, um, next prediction from ChatGPT.

[00:08:43] We have, uh, by the end of 2025, at least three countries will have implemented regulations at the national level, specifically governing the use of AI to diagnose disease in healthcare.

[00:09:00] And what's the price on that?

[00:09:02] 70% probability.

[00:09:04] Oh, short, way short.

[00:09:08] I just think it's really, it takes a long time to write legislation.

[00:09:12] I mean, you're saying like national level entities are going to write complex legislation about healthcare and AI in the next 14 months.

[00:09:20] Uh, yeah.

[00:09:20] Yeah.

[00:09:21] It seems interesting though.

[00:09:22] I'll grab.

[00:09:22] I'll let Eric answer, but you're just the over.

[00:09:27] I mean, I don't want to just do the opposite of max every time here, but I do.

[00:09:31] That's good.

[00:09:32] There are a lot of countries.

[00:09:34] We've actually seen aggressive AI regulation already.

[00:09:38] There's clearly a lot of appetite and I don't think this prediction requires some sort of comprehensive.

[00:09:43] They just need some regulation touching sort of health, health diagnostics.

[00:09:48] Right.

[00:09:49] Um, yeah, I guess, I guess the question is like, does some random EU regulator issuing a decree count as 35 countries?

[00:09:56] Yeah, exactly.

[00:09:57] I think it should.

[00:09:58] I think it should.

[00:09:58] Well, yeah, of course you think it should.

[00:10:01] Like, yeah, but I'm just saying, like, you know, it sounds like this is going to be about the nature of the predictions

[00:10:06] more than the underlying substance.

[00:10:10] You know, for every good game, we've realized that the rules are really what's most important to discuss, not the actual.

[00:10:18] Okay.

[00:10:19] All right.

[00:10:19] The gameplay.

[00:10:20] All right.

[00:10:20] Well, I'm still, I'm still grabbing the under here.

[00:10:23] But yeah, that just seems like a lot of work for three different nation states to execute.

[00:10:27] And then in the 12-month timeframe would be my take. Are we getting more points, like, does he get more return on his investment for predicting something that was a 70?

[00:10:35] Yeah, that's a good idea.

[00:10:37] I think that we could do something like that.

[00:10:39] Yeah.

[00:10:39] If you take the over on a low probability prediction, and then at the end of next year, you're proven correct.

[00:10:45] I think we'll give you, give you some more points.

[00:10:47] Yeah.

[00:10:48] I think we need ROI.

[00:10:49] Let's, let's, let's save the exact.

[00:10:52] Right.

[00:10:52] Exactly.

[00:10:53] Table, but scoring.

[00:10:54] Could we get into another rule discussion?

[00:10:56] Yeah.

[00:10:59] James, what are you picking?

[00:11:01] Okay.

[00:11:02] I am taking the over there and I think, you know, we're just going to see this come up a lot in the next year is like people using AI for their jobs, maybe in areas where they're not supposed to.

[00:11:12] And I just think that'll be a key kind of political discussion point is, Hey, why is, why is my doctor, you know, using chat GPT?

[00:11:20] Right.

[00:11:20] And is that, should that be legal?

[00:11:22] So I'm, I wouldn't be surprised by three countries.

[00:11:24] And I want to be, to be clear, like, as an assistant, I think it would be good.

[00:11:28] I'm not saying that's a bad thing, but maybe it should be a little bit regulated, right?

[00:11:31] There should be something.

[00:11:33] Okay.

[00:11:33] I might just be biased based on living in the United States where we have just epic legislative gridlock, but yeah.

[00:11:39] Yeah.

[00:11:39] They passed AI regulations in California.

[00:11:43] Well, that's true.

[00:11:45] I mean, you're saying the big one, but oh, you're saying the state, the state government.

[00:11:48] Yeah.

[00:11:49] Well, they didn't pass that one, but they passed other ones.

[00:11:53] They passed all the other things.

[00:11:54] Okay.

[00:11:55] This prediction specifically says national, national.

[00:11:57] Yes.

[00:11:58] Yes.

[00:11:59] All right.

[00:12:00] But you're up slightly anyway, sorry.

[00:12:02] Moving on.

[00:12:04] Prediction: by December 2025, Tesla will release fully self-driving cars that are legally approved

[00:12:12] for unsupervised operation on public roads in at least one U.S. state.

[00:12:19] Hmm.

[00:12:21] Hmm.

[00:12:21] What's the price.

[00:12:23] It's saying 40% probability.

[00:12:27] Hmm.

[00:12:28] Interesting.

[00:12:29] I'll go over.

[00:12:32] Actually, I, one of my predictions coming in was around Tesla AV capabilities.

[00:12:36] Um, I just think that Waymo has proved it's possible.

[00:12:41] Elon's got more chips than anyone.

[00:12:43] They've got more video data than anyone.

[00:12:45] They're obsessed with this problem.

[00:12:47] I mean, it's the entire future of the company and then coming down to the ticky tacky state

[00:12:51] approval bit.

[00:12:52] Uh, you know, he's got his boy in the white house and he's got friends in every Republican

[00:12:57] government in America.

[00:12:59] So like fucking Wyoming could approve this shit and he'd win the bet.

[00:13:03] So I think that, uh, I'll take the over at 40%.

[00:13:06] So it's, it has to be driving, like, without a driver behind the wheel?

[00:13:12] Yeah.

[00:13:13] That's I think unsupervised meaning you don't have, I still, I still think there's a very

[00:13:17] likely scenario where it's like in Texas or Wyoming or whatever you, you know, you can

[00:13:25] sort of allow it to drive, but people are effectively behind the wheel.

[00:13:28] Like I'm not very bullish on Tesla's like technology.

[00:13:32] I don't think, are you talking about like remote assistance, like remote drivers taking

[00:13:37] over or something or like, I think that would, that would count or not.

[00:13:41] That would count.

[00:13:42] Like that's what Waymo does.

[00:13:43] Waymo has remote takeover.

[00:13:45] If there's an issue, remote takeover.

[00:13:47] But it's a little, it's like, if remote counts, how is what they have right now

[00:13:52] not already passing that?

[00:13:54] Like people are allowed to use Autopilot.

[00:13:55] You're talking about Tesla. With Tesla, from a legal perspective, you have to

[00:14:01] have a driver in the car behind the wheel, ready to take over at a moment's notice.

[00:14:05] It also has to be supervised.

[00:14:07] Like, you know, you have to be looking at the road or it disengages.

[00:14:11] Like they have a camera on your eyes that checks if you're watching the

[00:14:15] road or not.

[00:14:16] And it gets mad at you if you're like on your phone.

[00:14:18] I've, I've tried this.

[00:14:21] Um, yeah, yeah.

[00:14:22] Yeah.

[00:14:23] It literally throws a warning up on the screen saying, like, I can tell you're looking at

[00:14:27] your phone and I'm like, well, what's the point of self driving if I can't look at my

[00:14:30] phone?

[00:14:31] Like that's the entire use case.

[00:14:34] Uh, I think I'm, I'm taking the over just because of all the sort of seeming cases where

[00:14:43] this would count as approved.

[00:14:45] I don't think they're close to Waymo.

[00:14:47] I'm not bullish on Tesla's approach, but I think there are lots of ways to sort of fudge

[00:14:53] this with humans still sort of behind the wheel.

[00:14:57] All right.

[00:14:57] For bonus points, which is the first state?

[00:15:00] Oh, Texas.

[00:15:03] Texas would be the obvious choice, but it's also really big and there's a lot of complicated

[00:15:07] roads and cities and stuff there.

[00:15:09] So I don't know.

[00:15:10] Does it have to be the whole state for the rules or?

[00:15:13] Yeah.

[00:15:13] Yeah.

[00:15:14] It's a state government, right?

[00:15:15] Yeah.

[00:15:15] Wait, the whole state?

[00:15:17] It says one state will approve it, right?

[00:15:19] That's interesting.

[00:15:19] You mean like Waymo is not in the whole state.

[00:15:21] It's in San Francisco.

[00:15:22] Right.

[00:15:22] Right.

[00:15:24] I think it would just have to be not the whole state, but you know, somewhere in the state.

[00:15:30] Yeah.

[00:15:30] The whole state I would change my whole, yeah.

[00:15:33] Yeah.

[00:15:34] I think it has to be Texas.

[00:15:35] I guess like I wasn't even thinking, you know, you'd be able to own a Tesla, but only use

[00:15:39] it for full self-driving if you're in one city.

[00:15:43] Yeah.

[00:15:43] Yeah.

[00:15:44] That's interesting.

[00:15:45] All right.

[00:15:45] Texas.

[00:15:46] All right.

[00:15:46] We'll see how that plays out.

[00:15:49] James.

[00:15:50] Oh yeah.

[00:15:50] My take.

[00:15:52] Taking the over as well.

[00:15:53] I think you guys are, you guys are spot on.

[00:15:57] And yeah, I think I will actually go a little contrarian and take California as the first

[00:16:01] state.

[00:16:02] Wow.

[00:16:03] Okay.

[00:16:03] Wow.

[00:16:04] All right, guys, let's move on to the next prediction.

[00:16:07] We have another prediction from chat GPT.

[00:16:10] By the end of 2025, AI generated content will constitute at least 50% of online news articles

[00:16:19] published by a major news outlet.

[00:16:23] Probability 30%.

[00:16:26] Hmm.

[00:16:27] What's major?

[00:16:28] Again, I think this is a key question here.

[00:16:29] Like how major.

[00:16:30] Litigated after the fact.

[00:16:35] All right.

[00:16:36] What would be capital M major?

[00:16:38] I mean, like watching the, I'm taking the over.

[00:16:41] I think Condé Nast is clearly major.

[00:16:44] I think Forbes is probably major.

[00:16:46] And I think if you're just talking articles, you know, we've seen a lot of media just be

[00:16:52] willing to produce a lot of articles to, you know, get, get gobbled up by SEO.

[00:16:58] And so it's not necessarily the majority of their homepage, but the majority of the articles,

[00:17:04] one outlet.

[00:17:05] I say yes.

[00:17:06] I think I, yeah, I think at 30, I'd grab the over.

[00:17:09] Um, I just think three to one odds on that or two to one on sounds pretty good.

[00:17:13] I think to your point, like, yeah, the SB nation network I could see or Vox or whatever.

[00:17:19] Like, um, just somebody is going to give into the SEO, uh, the SEO temptation to have a

[00:17:25] crap load of articles generated by AI that people are going to link to.

[00:17:28] So, so will they talk about that publicly?

[00:17:31] Do you think, like, will we know? I'm sure there'll be some journalist that gets leaked

[00:17:34] to in a bar where, where Eric lives in Brooklyn.

[00:17:37] And, uh, then, then somebody will write an angry story about it.

[00:17:40] I don't think Axios would necessarily do it, but there's definitely a breed of sort

[00:17:44] of business first media publication that would brag about this to seem cutting edge.

[00:17:49] I bet someone will write a cranky article before somebody brags, brags about it.

[00:17:54] Oh, before?

[00:17:55] Yeah.

[00:17:56] Yeah.

[00:17:57] We'll see.

[00:17:57] Will AI itself break the news?

[00:18:00] No.

[00:18:02] Okay.

[00:18:03] Um, not until it hooks up to the Bitcoin network and starts getting paid for it.

[00:18:09] I'll take, I'll, I'll take the under on 30%.

[00:18:11] Um, I'm sure, uh, we'll have to litigate what, uh, constitutes a major outlet, but, um,

[00:18:18] I'm skeptical that it'll hit 50% of articles.

[00:18:21] Like, I, I mean, um, at least in the next year, I think that it, I think that it'll maybe

[00:18:26] start, you know, kind of long form, like some, some indie blogger or something will be

[00:18:30] generating hundreds of thousands of articles or something, uh, with AI next year.

[00:18:34] But I'm skeptical that a major outlet will be doing that at that scale, uh, next year,

[00:18:39] but we'll see.

[00:18:40] I just, I just go with the, yeah, the Logan Roy attitude here, which is that money wins

[00:18:46] and there's a lot of private equity firms that own these now.

[00:18:49] And, uh, they're not doing this though.

[00:18:52] The, the, the, yeah, we'll get fired.

[00:18:53] This is an amazing way to cost cut.

[00:18:54] And it's also amazing way to get links, you know, from Google search.

[00:18:57] So I just think it's going to be very tempting for a lot of these companies.

[00:19:02] All right.

[00:19:03] Next prediction.

[00:19:03] Um, by December 31st, 2025, the first AI generated feature length film will be released in theaters

[00:19:13] with the script and visuals both entirely crafted by AI. Probability 25%. Under.

[00:19:24] I'm crushing the under on that.

[00:19:25] I just think that you have to do a deal with theaters that like get distribution and Hollywood's

[00:19:30] going to lose their effing minds.

[00:19:32] There's somebody who tries to do this in the next year.

[00:19:34] So I just think the, the gears of entertainment don't turn that quickly.

[00:19:38] Uh, I'm not saying somebody won't generate a feature film by the end of 25 that plausibly

[00:19:42] could be in a theater.

[00:19:43] Although I, I would actually debate that as well, but I think that the odds that it's like actually

[00:19:47] released in theaters, I'm, I'm just super under on that.

[00:19:51] They barely, they barely let Netflix into the Oscars.

[00:19:54] And your mistake is misunderstanding, uh, the entertainment industry, even though, uh,

[00:20:00] you're, you grew up in the LA area. But, um, you can pay to

[00:20:06] play in theaters any day.

[00:20:08] Like there are all these sorts of documentaries.

[00:20:10] There are all these films that sort of pay to get theater distribution so that they

[00:20:13] count for stuff.

[00:20:14] And you can absolutely do that.

[00:20:16] I'm taking the over. Uh, tons, tons of companies with an interest in having

[00:20:20] this achieved, and somebody will just buy the theater placement to make it happen.

[00:20:24] So you have the, like, buy-the-U.S.-Constitution angle here, which is somebody is just going

[00:20:28] to fucking do it because it's going to sound cool.

[00:20:30] Basically.

[00:20:31] Right.

[00:20:31] And pay to play the theaters.

[00:20:32] Okay.

[00:20:33] I mean, I'm, there's already, there's already a film that Sarah, my wife saw that I, you

[00:20:39] know, there are some boundaries, like it was shot by humans, but then it reorders based

[00:20:44] on AI, you know?

[00:20:45] So, so that was already in theaters.

[00:20:47] So I think this is a good, this is a good robust debate.

[00:20:50] I like that.

[00:20:50] You're like, you guys believe in the forces of change and I believe in the entrenched hating

[00:20:55] institutions.

[00:20:56] I haven't taken my, I haven't taken my, uh, well, I was referencing the, uh, the legislation

[00:21:01] discussion from earlier, but yeah, this is a meta point about how quickly the gears of

[00:21:05] power turn.

[00:21:06] And you guys are like really quick.

[00:21:07] And I'm like, I don't know.

[00:21:08] Uh, so, all right, James, what do you think?

[00:21:12] I am also taking the under on this one.

[00:21:15] Uh, and not, not for any remote reason around theater distribution, but just purely from

[00:21:21] model capability.

[00:21:22] Like the, the specification of this prediction is that the script and visuals have to entirely

[00:21:28] be created by AI.

[00:21:29] I think that we are not there on visuals.

[00:21:31] We're barely there on scripting.

[00:21:33] Um, I think I don't think we're there on scripting either, by the way, I haven't seen

[00:21:37] the script.

[00:21:37] But one thing, as you started the episode with, is that there's a lot of control over models

[00:21:42] through prompting.

[00:21:43] And so it's like, if they prompt, they could prompt a whole script into it and then output

[00:21:48] a script and say, Oh, it was AI.

[00:21:51] And that I just think there's such a strong incentive for the result to be, Oh, it was AI

[00:21:56] that like humans can put a lot of thought into it and get what they want.

[00:21:59] I don't think it'll be hard to do a script.

[00:22:01] I mean, it won't be a good script.

[00:22:02] Uh, but I, but I do think it'll be hard to create a full feature length.

[00:22:07] Does it say feature?

[00:22:07] It says feature length.

[00:22:08] So I don't know, that's probably 120 minutes or something of visuals, which, A, is

[00:22:13] extremely expensive to create.

[00:22:15] And also, B, it's like very hard to get any sort of consistency of the characters or anything.

[00:22:21] I'm just skeptical of the visual aspect.

[00:22:23] I think our, our guys at runway will personally make sure this happens.

[00:22:27] Yeah.

[00:22:27] Probably the guys, it's probably the way Eric wins is that the runway, our friends at runway

[00:22:32] just bribe a theater chain into doing it.

[00:22:35] And they're like, they're like, Eric's got some skin in the game and we're going to help

[00:22:43] them out here.

[00:22:43] We do this.

[00:22:44] Eric is going to brag about us.

[00:22:48] Yeah.

[00:22:49] I don't know.

[00:22:50] They haven't even released Sora, you know, yet. Or it's coming soon.

[00:22:54] Somebody, just one of the competitors,

[00:22:56] I think Runway maybe, was saying, we think Sora is about to come out.

[00:22:58] All right.

[00:22:59] Good discussion.

[00:23:00] Let's switch over to Claude and see how Anthropic's predictions do and how you guys

[00:23:06] respond to the probabilities.

[00:23:08] And then we can have a little discussion at the end of who's better at predicting, but

[00:23:14] Meta discussion.

[00:23:15] But starting with.

[00:23:17] We're already having a discussion about our opinions on their predictions.

[00:23:20] And then we're going to have a higher level of discussion about our opinions of the relative

[00:23:25] predictive power of the two models.

[00:23:27] And then we're going to have AI watch this episode and react to our predictions.

[00:23:32] It's all good.

[00:23:33] It's all good.

[00:23:34] Well, I guess on a certain level, you know, thinking about AI is all about intelligence.

[00:23:38] So it's like, how do we introspect on introspection even more?

[00:23:41] Anyway.

[00:23:41] Sure.

[00:23:42] Yeah.

[00:23:42] Somebody, uh, someday we'll, uh, be teaching a class in a university about this episode.

[00:23:49] All right.

[00:23:50] Too self-indulgent.

[00:23:51] All right.

[00:23:51] All right.

[00:23:52] Um, starting with the Claude's first prediction by December 31st, 2025, at least three major

[00:24:00] smartphone manufacturers among Apple, Samsung, Google, and Xiaomi will release phones with dedicated

[00:24:08] AI co-processors capable of running large language models with at least 7 billion parameters entirely

[00:24:16] on device. Probability 75%.

[00:24:20] Man, the parameters ones are where I'm like, I don't even know how many parameters they can

[00:24:25] run on device right now.

[00:24:26] Can't they run 7 billion locally already?

[00:24:28] Is that wrong?

[00:24:29] I think that the Apple Intelligence chip can run 7 billion, right?

[00:24:33] I thought it's at least three to four.

[00:24:35] I, you know, 3 billion, 3 billion.

[00:24:38] Okay.

[00:24:38] Three to four.

[00:24:38] All right.

[00:24:39] I got it.

[00:24:39] I nailed it.

[00:24:40] Apple's current, just to reiterate there, Apple's current on-device LLM is a hundred

[00:24:47] percent local with 3 billion parameters, and then can also connect to external servers

[00:24:54] for larger-parameter models.
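A back-of-envelope memory estimate shows why the jump from 3 to 7 billion parameters on device is the sticking point here. The figures below are illustrative arithmetic, not Apple's or anyone's actual specs:

```python
# Rough weight-memory math for running an LLM on a phone.
# Quantization levels and the phone-RAM comparison are illustrative
# assumptions, not vendor figures.

def model_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory in GB needed just for a model's weights."""
    bytes_per_weight = bits_per_weight / 8
    return params_billions * 1e9 * bytes_per_weight / 1e9

# A 3B model at 4-bit quantization needs ~1.5 GB for weights; a 7B
# model at the same precision needs ~3.5 GB, before the KV cache and
# the OS's own memory use, on phones that often ship with 8 GB of RAM.
for params in (3, 7):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: ~{model_memory_gb(params, bits):.1f} GB")
```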

[00:24:56] And what's the price on this?

[00:24:59] Over 75%.

[00:25:00] Ooh.

[00:25:01] Or probability 75%.

[00:25:03] Okay.

[00:25:04] Well, I'm going to grab the under on that.

[00:25:06] Cause I think a dedicated co-processor is really what's holding me back here.

[00:25:11] Cause that requires like a separate chip essentially to go into the hardware.

[00:25:15] I'm an avid follower of Apple rumors and I have not gotten any sense that they're interested

[00:25:19] in adding chips, like separate chips to their system to do this kind of thing.

[00:25:24] And so you'd basically have to get Xiaomi, Google and Samsung to all do this separate

[00:25:29] chip in the next year.

[00:25:30] And it's not really clear that that's necessary to run these models.

[00:25:35] So, um, yeah, I'll take, I'll take the under at 75 for sure.

[00:25:39] You sold me under hardware.

[00:25:44] Hardware, hardware timelines are really long and it, that that's only a slightly larger model

[00:25:48] than they can already run locally.

[00:25:50] So it's just not clear to me that you'd make it.

[00:25:52] Yeah.

[00:25:53] And it's also not clear all of these come on device. For Google,

[00:25:58] like, is that really in their mission?

[00:26:00] Like there's so much like, Oh, we'll send everything to the cloud.

[00:26:02] Like, are they really?

[00:26:03] Yeah.

[00:26:03] Apple's the one selling like everything will happen on device.

[00:26:06] I think I'd like grab the over at like 10% on this, but at 75, I think I'm under big

[00:26:14] time.

[00:26:15] Yeah.

[00:26:16] I mean, I guess I'm, I'm, I'm hearing what you're saying on the dedicated aspect, but like,

[00:26:22] I feel like you could quibble with that semantically of like, as long as it's a GPU at all, that

[00:26:29] is dedicated to running AI tasks.

[00:26:32] Um, like that's what they already have a GPU, right?

[00:26:36] I mean, yeah.

[00:26:36] Well, okay.

[00:26:37] So they have essentially a GPU and then there's a set of numbers.

[00:26:41] There's like a set of cores, right?

[00:26:44] Um, I guess it's just, what is the neural engine?

[00:26:49] I guess is what I'm saying.

[00:26:50] It's like, are we counting the current neural engine on there for like eight years already

[00:26:54] as, as a yes to this?

[00:26:56] Cause then it's like, sure.

[00:26:58] Like, I guess, I mean like, but I don't think that's the spirit of the prediction, right?

[00:27:02] Okay.

[00:27:03] Like that, the thing they've had on the phones for the last eight years is, I don't know.

[00:27:07] It counts now.

[00:27:07] So far, ChatGPT is beating Anthropic, because I feel like there's a gap in quality of prediction.

[00:27:12] I don't love this, this, whatever.

[00:27:14] Yeah.

[00:27:14] Anyway, James, just pick, pick and we'll litigate everything afterwards.

[00:27:18] Okay.

[00:27:18] Yeah.

[00:27:19] I'll take the over and I'll try to litigate my take that the neural engine.

[00:27:22] All right.

[00:27:23] Great, great.

[00:27:24] All right.

[00:27:25] Prediction number two.

[00:27:27] Anthropic will release a model that achieves a score above 90% on the Unified Benchmark for

[00:27:34] AI Reasoning, or "BAR," evaluation suite, which will become the first model to outperform the

[00:27:39] human expert baseline of 87%.

[00:27:44] Probability: 65%.

[00:27:45] Where are they at right now?

[00:27:47] Unified.

[00:27:48] Unified.

[00:27:48] What was the metric?

[00:27:50] It was unified.

[00:27:51] The benchmark for, unified benchmark for AI reasoning.

[00:27:55] We're all going to frantically Google for that.

[00:27:58] Yeah.

[00:27:58] It's funny that we have Anthropic predicting its own intelligence capabilities.

[00:28:03] I know.

[00:28:04] I mean, a key part of this is the claim that it's the first also.

[00:28:09] Right.

[00:28:11] It says first, right?

[00:28:13] Yeah.

[00:28:13] It'll be the first model for this, for the record.

[00:28:17] Is this benchmark real?

[00:28:19] I'm I can't find this benchmark.

[00:28:21] Is this, is this benchmark?

[00:28:23] Oh my God.

[00:28:24] That would be hallucinated.

[00:28:25] I think it's.

[00:28:26] Oh my God.

[00:28:27] Yeah.

[00:28:27] I don't see it.

[00:28:28] I can't find this unified benchmark for AI reasoning.

[00:28:31] Like, it's like, I was like, I've never heard of this benchmark.

[00:28:38] Like, uh, I love it.

[00:28:41] It's crazy.

[00:28:42] It's hallucinate a benchmark.

[00:28:44] I mean, can you find it?

[00:28:46] Like, it says something about our deference that at first I, we're all like,

[00:28:52] Oh, I did not defer.

[00:28:55] I did.

[00:28:55] No, no, no, no, no, no.

[00:28:58] Do not loop me into this deference.

[00:29:01] I feel like.

[00:29:02] I immediately Googled it.

[00:29:03] I was like, what?

[00:29:04] I've never heard of this.

[00:29:05] I've heard of like MMLU.

[00:29:07] I've heard of like, you know, a couple other evals, but I've never heard of the bar or like,

[00:29:11] or whatever.

[00:29:13] Maybe this is like internal to Anthropic.

[00:29:16] Maybe this is like, like we'll all be proven idiots when this becomes a benchmark.

[00:29:21] Right.

[00:29:22] And then they leaked this.

[00:29:24] Yeah.

[00:29:25] Okay.

[00:29:25] I guess I will.

[00:29:26] I will take the under, given I'm not sure this benchmark is real.

[00:29:30] I guess if it doesn't exist, we can create it and you know, you'll rule it.

[00:29:34] Yeah.

[00:29:34] You can make it true.

[00:29:36] Um, we'll have to do some more Googling later.

[00:29:40] Okay.

[00:29:40] But I don't think this benchmark exists for my own recent.

[00:29:43] Uh, just for a quick, for the listener to get to the substance.

[00:29:48] I know we're focused on technicalities.

[00:29:50] Do you think someone will surpass sort of human intelligence overall by the end of next

[00:29:58] year?

[00:29:58] And do you think it will be anthropic first?

[00:30:01] Just quick.

[00:30:02] In some, in some kind of evaluation you're saying basically like, um, yeah, for sure.

[00:30:07] I, I sort of think we're like already there, but I guess it depends what invented benchmark.

[00:30:13] That's the take I was going to get, which is we have sort of quietly passed humans on a

[00:30:18] lot of things already.

[00:30:19] And we're just this, you know, since people, you know, smartest in every domain still remains

[00:30:26] smarter than it.

[00:30:27] Like we've sort of under, under sold how exciting it is for it to be smarter than the average

[00:30:33] human on, on a lot of stuff.

[00:30:35] Totally.

[00:30:36] Next prediction from Claude.

[00:30:38] At least five Fortune 500 companies will replace more than 25% of their middle-management positions

[00:30:44] with AI systems for task delegation and performance monitoring, publicly acknowledging this transition

[00:30:51] in their annual reports.

[00:30:53] Probability: 25%.

[00:30:55] Hmm.

[00:30:57] 25% chance that 25% of middle management positions get axed.

[00:31:02] I'll take that over.

[00:31:04] If they become ICs, if they become individual contributors, but they don't get fired.

[00:31:08] Is this met?

[00:31:09] I think that this would be met because I think that we're saying task delegation and performance

[00:31:15] monitoring, meaning it's most, most of the management responsibilities.

[00:31:18] I'll, I'll, I'll take that over.

[00:31:20] And my reasoning is essentially just that there's a lot of companies.

[00:31:25] There are, in fact, 500 companies in the Fortune 500, and a lot.

[00:31:30] There's a number.

[00:31:31] Yeah.

[00:31:31] Yeah.

[00:31:31] There's, there's a pretty high number, somewhere around 500 of them.

[00:31:34] Um, and, and plenty of those companies are not growing or shrinking and need to cut talent,

[00:31:43] uh, in some way or another to manage the business.

[00:31:46] And this will be the greatest freaking like.

[00:31:50] Cover story.

[00:31:51] Yeah, exactly.

[00:31:51] Smoke and mirrors.

[00:31:52] Like, you know, look over here.

[00:31:55] Like the left hand's doing some AI.

[00:31:56] We're not doing well because our company sucks.

[00:31:59] Right.

[00:32:00] Right.

[00:32:00] This is going to be an amazing way to drop in your earnings report that like we had to

[00:32:04] lay off a bunch of people, you know, but it's cool because it's, it's about the future dog.

[00:32:08] We're adopting AI.

[00:32:09] So I think that I, yeah, I just totally buy that.

[00:32:13] Well, you know, five out of 500, uh, we'll need to, we'll need to do that in the next year.

[00:32:19] Um, and I, and, and AI is just an amazing cover story for doing layoffs.

[00:32:22] What's the percent, James.

[00:32:25] They need to replace more than 25% of their management positions.

[00:32:30] And the probability is 25% that this occurs.

[00:32:34] Oh, wow.

[00:32:35] So it's a much better return for me to say yes.

[00:32:37] Yeah.

[00:32:38] You get a, yeah, you get a 4X return on the yes.

[00:32:40] Yeah.

[00:32:41] Over.
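The "4X return" is just fair-odds arithmetic: betting "yes" against a stated probability p pays 1/p at fair odds. A quick sketch of the numbers behind the banter (the function name is ours, purely illustrative):

```python
# Fair-odds payout for taking the over/under on a stated probability.
# This is the arithmetic behind the "4X return" comment, not any real
# market's payout structure.

def fair_payout_multiple(probability: float) -> float:
    """Fair-odds payout multiple for a 'yes' bet at a given probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return 1 / probability

# A 25% prediction implies a 4x payout on "yes" and ~1.33x on "no".
print(fair_payout_multiple(0.25))      # 4.0
print(fair_payout_multiple(1 - 0.25))  # ~1.33
```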

[00:32:42] Okay.

[00:32:43] I mean, yeah, I think the, I'm not, I'm not as excited about AI in middle management necessarily.

[00:32:50] I, but I do think there are too many middle managers.

[00:32:54] Like we have this whole separate trend of like Facebook and companies saying, let's just

[00:32:59] like be flatter organizations overall.

[00:33:01] And I feel like we're in that trend.

[00:33:03] And I do agree that the AI trend is sort of going to get mixed in with that.

[00:33:08] Um, but I'm not convinced that AI is going to come for the middle managers first.

[00:33:14] I think it's going to supplement sort of the coders and frontline, you know, I, to me,

[00:33:19] the deployment model is more like people are actually doing the work, figuring out how

[00:33:23] AI makes them better and sort of adding it, uh, to their, just making themselves look better

[00:33:29] to their bosses.

[00:33:30] And I think that's how my model for how AI is getting deployed in workplaces, which is

[00:33:35] just sort of like people hacking shit together and delivering results.

[00:33:39] Um, but I'm still taking the over.

[00:33:43] Yeah.

[00:33:43] I, I don't know.

[00:33:44] You kind of convinced me to take the under there.

[00:33:46] I think that, um, I think that, uh, AI will have a much bigger impact on sort of IC roles.

[00:33:54] And, uh, you know, we've seen this already in some of the customer service oriented companies

[00:33:59] sort of announcing and bragging that they can hire less, uh, representatives.

[00:34:05] Um, so I think we're more likely to see that first before companies start taking this, uh,

[00:34:11] AI, uh, approach on management, but could be wrong.

[00:34:14] 25% is a big number at a Fortune 500 company.

[00:34:19] Exactly.

[00:34:20] Like, are, are they gonna, I, I think I'm going to, I've, my whole argument is the under,

[00:34:25] I feel like I should just like, no, you're not allowed to switch.

[00:34:28] You, you, you, you, you, you made me not switch on like two of them already.

[00:34:32] I mean, I was really on the fence.

[00:34:32] I could see the advantage.

[00:34:33] You're on the fence.

[00:34:34] You know, no, no, no, no.

[00:34:36] Committed or I'm over.

[00:34:37] Yeah.

[00:34:38] It's just funny that I made the whole case for the, I know.

[00:34:41] I know.

[00:34:41] Thank you.

[00:34:42] Thank you for.

[00:34:43] All right.

[00:34:44] All right.

[00:34:45] Once you say the words, you're locked.

[00:34:47] Yeah.

[00:34:47] Once you say the words, you're locked.

[00:34:48] All right.

[00:34:50] Let's go to the next prediction.

[00:34:53] Um, from Claude, this prediction is that DeepMind, or let's say Google broadly, will

[00:35:01] demonstrate an AI system capable of discovering at least one novel pharmaceutical compound that

[00:35:07] also passes phase one clinical trials, reducing the typical discovery-to-trial timeline by around 50%.

[00:35:16] We will just specifically say that it has to have discovered a compound and passed phase one trials, uh, within the next year.

[00:35:23] Probability: 20%.

[00:35:25] I'm going to take the max worldview, which is like a lot of these are predictions about society.

[00:35:31] And I definitely don't think drug approvals are going to sort of just get much faster.

[00:35:37] Because AI companies say they should.

[00:35:39] So I'm going to take the under who's running the FDA, I guess.

[00:35:44] I'm not.

[00:35:44] Yeah, exactly.

[00:35:45] I'm only conflicted because the Trump administration is about to take over, but they don't like Google.

[00:35:50] They're not going to, like, fast-track Google.

[00:35:51] Yeah.

[00:35:51] Google, that's the last company.

[00:35:52] That's true.

[00:35:53] That's the company they want to hate the most.

[00:35:54] Yeah.

[00:35:55] They don't like the FDA. Or, sorry.

[00:35:57] Yeah.

[00:35:57] They don't like Google.

[00:35:58] Um, they also like pretend they didn't do operation warp speed, even though that was

[00:36:02] like the coolest thing ever.

[00:36:04] I don't really know how long like phase one specifically.

[00:36:07] I know.

[00:36:07] I don't know a lot about it. I mean, I know the entire timeline, right?

[00:36:11] Yeah.

[00:36:11] So it has to be fast.

[00:36:12] I'm not, I think that's just like kind of an additive, less necessary component of the prediction

[00:36:19] and reducing the typical discovery.

[00:36:20] Okay.

[00:36:21] So, well, it just says passing phase one clinical trials within a year.

[00:36:25] It is reducing the typical.

[00:36:26] Which would be very fast.

[00:36:27] Would be a 50% improvement in speed, I guess.

[00:36:30] I know the entire time.

[00:36:31] Of discovery to passage.

[00:36:32] Right.

[00:36:33] So 10, 10 to 15 years.

[00:36:34] And so it's on the order of years for sure.

[00:36:37] Uh, for phase one.

[00:36:38] Um, gosh.

[00:36:40] Yeah.

[00:36:41] I think I just got to stick with my, I don't believe that the institutions change as fast as

[00:36:45] the technology here, uh, and, and take the under, even though 20%,

[00:36:49] with the new administration is, is appealing.

[00:36:51] Um, that just feels like a high bar, to pass phase one trials.

[00:36:55] It does seem like phase one involves actual experiments on.

[00:36:58] Yeah.

[00:36:59] Which AI cannot speed up.

[00:37:00] I just read Dario's essay.

[00:37:02] And like one of the main points that he makes in his bull case on AI is just a lot of the limits

[00:37:10] will be human systems and things where you have to run real world experiments.

[00:37:14] And even if it has good ideas, the experiments could take a while.

[00:37:17] Anyway.

[00:37:19] All right.

[00:37:19] All right.

[00:37:20] What'd you say?

[00:37:21] I'm taking the under too.

[00:37:22] I think that it just seems too fast, uh, to, to occur.

[00:37:26] Uh, would you guys take the over on Google demoing this, you know, entering phase one trials

[00:37:31] or something, with something?

[00:37:33] Yeah.

[00:37:33] I think there'll be a discovery or some.

[00:37:35] Yeah.

[00:37:36] Yeah.

[00:37:37] Registering for them or something.

[00:37:38] Over 20%.

[00:37:39] I would take the over on 20%.

[00:37:40] Obviously Demis at deep mind is quite passionate about biology as a major application for AI.

[00:37:47] Uh, so yeah, I mean, I think that would be a cool thing for them to demo say, Hey, look, we, we made a drug.

[00:37:52] Yeah.

[00:37:53] All right.

[00:37:54] Um, let's do one more, um, from Claude.

[00:37:57] All right.

[00:37:58] Prediction.

[00:37:59] The first international treaty specifically governing AI development and deployment will be ratified by 15 countries or more, including three of the following:

[00:38:10] the US, China, the EU, the UK, or Japan.

[00:38:15] What's the percentage on this?

[00:38:16] Is it 5%?

[00:38:17] Like, what is this like a 1%?

[00:38:19] Uh, yeah.

[00:38:20] Legally binding commitments on things like model evaluation and safety standards.

[00:38:26] No way.

[00:38:26] What is the percentage they gave?

[00:38:28] Yeah.

[00:38:28] 50%.

[00:38:29] Oh, my God.

[00:38:30] This is the, Claude's the, I feel like that's the, that's the easiest under, under, under.

[00:38:38] Claude's like dreaming of this fantasy safety world or something where, where every country has come together.

[00:38:44] Is it US and China or like, I didn't get the memo.

[00:38:48] You don't need the US or China because you only need three of these entities.

[00:38:53] So you could have EU, UK and Japan all entering some sort of treaty.

[00:38:58] Oh, okay.

[00:38:59] Interesting.

[00:38:59] EU, I would believe.

[00:39:01] I mean, UK though, is sort of, I mean, I, I definitely am picking the under.

[00:39:06] I just feel like the, yeah, the UK is a mess economically right now.

[00:39:10] Like, you know, is the Labour party really going to be like, yeah, guys, we're not fixing the economy, but we're, we signed this dumb AI treaty that none of you care about.

[00:39:18] Like, I don't know.

[00:39:19] I think they got bigger fish to fry there.

[00:39:22] Um, yeah, under, under a 50 for sure.

[00:39:26] Yeah.

[00:39:26] I agree with you guys taking the under on that.

[00:39:29] Um, yeah, I think this is, uh, Claude's own fever dream of hoping for a better place, a better world, a better world where Claude itself is highly regulated.

[00:39:40] Yeah.

[00:39:43] Um, okay.

[00:39:44] Great.

[00:39:45] Great work, everyone.

[00:39:46] Um, that was fun.

[00:39:48] How do you guys think chat GPT did compared to Claude?

[00:39:52] Um, what do you like about the different predictions?

[00:39:56] Well, it seemed like Claude made up some stuff.

[00:39:58] I did think it had a little bit of a safety-ism trend line.

[00:40:03] Um, so I, I don't know.

[00:40:05] I, they're, they're all blurred to me, but, uh, it felt like the first half were more interesting than the second.

[00:40:12] It seemed like GPT's predictions were more entertaining as well as somewhat more realistic, I guess.

[00:40:19] Um, so I think I would give GPT the clear, uh, victory in this round of entertaining and reasonably accurate predictions.

[00:40:27] Maybe a win for o1, for reasoning models.

[00:40:30] Yeah.

[00:40:31] You prompted exactly the same stuff or?

[00:40:33] Exact same prompt.

[00:40:34] Yeah.

[00:40:34] Yeah.

[00:40:35] All right.

[00:40:36] Well, that's a, that makes us more excited for the full version of o1 coming out.

[00:40:40] Yeah.

[00:40:40] That would, that's probably one of our, one of our predictions.

[00:40:42] I think that's coming out this year, right?

[00:40:44] Is, is the rumor.

[00:40:45] I mean, the, the zoom out thing that I took from this is just how much human systems interplay with the things we want AI to deliver.

[00:40:56] And even if we are bullish on rapid development of AI, there are a lot of things where we're like, well, the human systems will still effectively limit what it can do.

[00:41:08] That's self-driving, drug development, sort of everything under the sun.

[00:41:12] And so I think that's interesting to come out of this, how much, um, yeah, this, this sort of societal piece is still sort of a strong lever on what AI can accomplish.

[00:41:24] I feel like the way we get to that, uh, treaty is that something horrible goes on.

[00:41:30] Right.

[00:41:30] Yeah, exactly.

[00:41:32] Right.

[00:41:32] Like, um, Oh, that's true.

[00:41:34] We underestimated black swan.

[00:41:36] We'd have to narrowly escape from some horrible thing to get a binding treaty between a bunch of countries, I think.

[00:41:43] And then we'll look really dumb.

[00:41:44] Yeah.

[00:41:44] It's like, you know, AI does figure out how to, like, turn us all into paper clips or whatever, or send a nuclear bomb.

[00:41:51] We just had a pandemic and there was like, nothing came out of that.

[00:41:54] Like, Hey, move on.

[00:41:56] Like, whatever.

[00:41:57] I guess there are huge tragedies that bother us more than others.

[00:42:00] Yeah.

[00:42:00] But the US, China, and the EU didn't get together and be like, Hey, maybe we should, like, control pandemics a little more carefully.

[00:42:05] Like, you know, let's, let's regulate that a little bit.

[00:42:08] It's like, nah, we, we got through it.

[00:42:09] Don't worry about it.

[00:42:10] Let's just get on the record, then.

[00:42:13] Yeah.

[00:42:13] I mean, the core, just because it is a prediction episode and, like, we got into sort of the

[00:42:17] technicalities.

[00:42:17] Like, do you think the models will be significantly smarter next year?

[00:42:24] Like by the end of next year in that they will, we'll see the same basic rate of progress.

[00:42:30] Meaning these are like PhD level thinkers by the end of next year.

[00:42:35] I think next year, the overall take from this discussion, like you said, is the difference

[00:42:41] between things that can be facilitated by software, things that require hardware, and then

[00:42:47] things that require institutions and government and so on.

[00:42:50] Right.

[00:42:51] And I think next year, I think it's going to be a crazy year for leaps forward in software.

[00:42:55] I think that GPT-5 is going to come; I think o1, or o2, or whatever the GPT-5-

[00:43:01] level version of the thinking model is, is going to come.

[00:43:03] I think these computer use APIs, as I said in a previous episode, are just an incredibly

[00:43:07] powerful interface to the internet and the software world for these models.

[00:43:11] And so I think it's going to be like a crazy, crazy software year.

[00:43:17] I'm, I think it'll be the beginning of the impact on hardware and institutions.

[00:43:22] Um, but it feels like the old Intel, like, tick-tock strategy, where you have one year where

[00:43:27] you have huge progress and then one year of refinement.

[00:43:30] And that's, like, the tick and the tock, or whatever.

[00:43:32] I think '24 was kind of a tock year, where we didn't have really dramatic progress on the

[00:43:36] models, but we started to see the existing models eat away at all these, uh, you know,

[00:43:42] tools and software that we have.

[00:43:44] And now next year, I think we have a big, big tick year and that'll be the foundation for

[00:43:48] like, you know, societal transformation for years and decades to come.

[00:43:52] So I'm, I'm very long model progress in 25.

[00:43:55] I think that next year we'll, we'll be talking about this trend of AI getting slower and slower.

[00:44:04] So we started out.

[00:44:07] I'm sorry, define slower.

[00:44:10] All right.

[00:44:10] No, no, no, no, no, no, no.

[00:44:11] I'm not talking about a reasoning capability.

[00:44:14] I'm talking about inference time essentially.

[00:44:17] And basically, um, what we've seen already is, like, with o1, right,

[00:44:22] it now takes, you know, five to 30 seconds to return an answer, but it can be better.

[00:44:27] What we're hearing is that, you know, internally, at least at OpenAI, they feel like that's a whole

[00:44:32] new scaling paradigm: the way to get better and better reasoning is inference-time compute.
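That inference-time-compute idea can be illustrated with a toy best-of-n sampler: spend more compute per query by drawing several candidate answers and keeping the best one under some scorer. Everything in this sketch is a stand-in; `generate` and its quality score are hypothetical, not a real model API:

```python
import random

# Toy illustration of inference-time compute scaling: more samples
# per query means more latency, but a better expected answer.
# `generate` fakes a model call; real systems would use an actual
# model plus a verifier or reward model as the scorer.

def generate(prompt: str, rng: random.Random) -> tuple[str, float]:
    """Stand-in for one model sample: returns (answer, quality score)."""
    quality = rng.random()
    return f"answer-{quality:.3f}", quality

def best_of_n(prompt: str, n: int, seed: int = 0) -> tuple[str, float]:
    """Draw n candidates and keep the highest-scoring one."""
    rng = random.Random(seed)
    return max((generate(prompt, rng) for _ in range(n)), key=lambda t: t[1])

# Expected quality rises with n; wall-clock cost rises linearly with it.
for n in (1, 4, 16):
    answer, quality = best_of_n("some hard question", n)
    print(f"n={n:2d}  best quality={quality:.3f}")
```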

[00:44:37] And, um, what I think, you know, we'll see is like, like Max said, a lot of these computer

[00:44:42] use models, uh, are opening up opportunities for kind of agentic systems.

[00:44:48] And we'll start to see, you know, people like ourselves using these models that take hours

[00:44:54] to return us information.

[00:44:55] Right.

[00:44:56] So Eric could go talk to an agent and just be like, Hey, could you go research this thing

[00:45:00] for an upcoming, uh, article I'm writing or, you know, Max and I could have one of these

[00:45:06] agents go off and kind of research, uh, some game genre that we're looking to build.

[00:45:11] Right.

[00:45:11] Um, but it will just take longer and longer to return.

[00:45:14] And, and, you know, I think that'll be a really interesting trend.

[00:45:16] And I think that, you know, maybe that's the prediction for our, our own internal usage

[00:45:23] of how things are, how we're going to be changing, how we work with these AI systems.

[00:45:27] But I also think more specific prediction for like industry or, or, or enterprise would be

[00:45:33] that, um, we're going to see a lot of use cases in, in sort of the finance space for these

[00:45:38] agents, right?

[00:45:39] Like there will be new forms of hedge funds or investment, uh, groups that are like using these

[00:45:45] agents to actually trade.

[00:45:47] And, you know, maybe we'll see that primarily in crypto, but I could also see that occurring

[00:45:50] in sort of traditional assets.

[00:45:52] But I do think we're going to see a lot of like adoption of these agentic models to like

[00:45:56] trade their own portfolios, uh, on behalf of their, you know, human kind of, uh, owners

[00:46:02] or, uh, overseers, I guess.

[00:46:04] I mean, this isn't a prediction and I think it's sort of, I mean, I think one thing we are

[00:46:09] seeing is that intelligence can be out there and most, a lot of people don't know how

[00:46:15] to use it.

[00:46:16] Right.

[00:46:16] So I think we're going to see obviously this uptick in applications, taking existing, you

[00:46:22] know, the intelligence that has been proven with OpenAI, Anthropic, and others, and making

[00:46:26] it much more straightforward for people to use it in particular use cases.

[00:46:30] And I think there's going to be a ton of money there.

[00:46:33] So that's applications, which we've talked a lot about.

[00:46:36] And then I think the other piece that you're sort of touching on is that they're going to

[00:46:39] be these sort of specialists that appreciate how intelligent these models are.

[00:46:44] And they're going to figure out either how to use it and make money on it directly or become

[00:46:49] service providers for people who haven't really figured it out and do things better and say,

[00:46:54] all right, just pay me.

[00:46:55] I'll solve this problem secretly.

[00:46:56] I'm using AI or not so secretly.

[00:46:58] I'm using AI so I can do it cheaper than you.

[00:47:00] And I think that's how we're going to see these two models, the sort of two ways people

[00:47:05] are going to implement AI.

[00:47:06] And I think so we'll see more AI applications and AI service companies.

[00:47:10] And, you know, the AI applications will almost certainly lag the actual power, the thinking

[00:47:16] power of AI.

[00:47:18] So it'll be an interesting sort of push and pull there.

[00:47:22] For sure.

[00:47:24] Cool.

[00:47:25] Yeah, super fun.

[00:47:27] Yeah.

[00:47:29] Well, I feel I don't know, James, you'll try and come up with a score of the predictions.

[00:47:34] I don't know.

[00:47:35] I think to me, they were more like the draft.

[00:47:40] I am invested, ego invested in the outcome.

[00:47:43] I feel like these predictions were more thought experiment than necessarily to me, the sort

[00:47:48] of perfect proxy for what we think, how accurately we see what's coming in AI, but certainly enjoyed

[00:47:55] it.

[00:47:56] That was a fun, fun concept.

[00:47:58] Fun episode.

[00:47:59] Yeah.

[00:48:00] Great.

[00:48:00] Well, they really, really demonstrated that AI was better at creating the grist for the

[00:48:05] content than we are.

[00:48:07] Yeah, exactly.

[00:48:07] I mean, we were supposed to come up with predictions.

[00:48:09] It's like, uh, I got a couple ideas.

[00:48:12] I mean, it shows that sometimes like just having throwing shit at the wall to react to

[00:48:16] you is better than like overthinking my perfect prediction.

[00:48:19] Um, and getting, getting your hot takes live, right?

[00:48:23] Getting, uh, having to think through it, uh, seeing the thinking that's something that,

[00:48:28] you know, AI is just starting to learn how to do. You know, uh, humans still win on

[00:48:33] showing our thought process.

[00:48:35] Anyway, we will be back after Cerebral Valley, and we will be publishing, uh, all the talks.

[00:48:42] We're going to be throwing you our favorite clips.

[00:48:45] And then we will be back with Max and James after the conference reacting, uh, to everything

[00:48:51] that happened on stage.

[00:48:52] So, uh, come back and see us at Newcomer.

[00:48:56] All right.

[00:48:56] Thanks so much.

[00:48:57] Thank you.