What's brewing in AI
🧙🏼 OpenAI and Google playing leapfrog

Also: DeepSeek's new reasoning AI challenges o1-preview

Dario Chincha
November 22, 2024

Howdy, wizards.

Here’s what’s brewing in AI this Friday.

DARIO’S PICKS

1. Gemini reclaims the spot as the #1 model

Google just released a new model, Gemini Exp-1121, but…

…this needs context:

1) It started last week, when Google released an upgraded model, Gemini Exp-1114, which climbed to the top of the lmarena leaderboard.

2) Apparently, OpenAI wasn’t having it, because yesterday they put out an improved GPT-4o version (particularly better at creative writing) which quickly dethroned Google’s new model.

3) However, it looks like Google knew what was coming: they launched yet another new model today, Gemini Exp-1121, which has already climbed to the #1 rank on lmarena.

As a side note, while lmarena’s leaderboard is by far the most cited one, the alternative (“contamination-free”) benchmark at LiveBench tells a different story. There, Gemini Exp-1121 ranks below five other models, including o1-preview/mini and Claude 3.5 Sonnet.

The release timings and lack of detail about all these new models are noteworthy. Some say Google played OpenAI by throwing the first, slightly better model out as bait, and waiting for them to launch something marginally better, only to come back even stronger. But – this is an ongoing story and we don’t know what’s going to be released in the next few weeks.

Why it matters: In practice, I think we’ll see little difference to our AI applications with each of these updates. I think there’s a bigger, more comprehensive, and better explained model release coming from both OpenAI and Google in the near future – which will be far more important. For now, it’s mostly entertaining to watch them play leapfrog with each other.

DARIO’S PICKS

2. DeepSeek launches a reasoning model that rivals OpenAI’s o1

🌟 Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!
— DeepSeek (@deepseek_ai)
11:40 AM • Nov 20, 2024

Chinese AI research lab DeepSeek launched a powerful reasoning model this week called R1-Lite-Preview. It matches OpenAI’s o1 on certain benchmarks like AIME and MATH, and shows you even more of its chain-of-thought process as it’s thinking. The R1-Lite-Preview model is accessible through DeepSeek Chat and is capped at 50 messages per day.

Why it matters: Just over two months after OpenAI released o1-preview, a lesser-known lab has developed a competitive alternative; AI is really moving at lightning speed. This might put some extra pressure on OpenAI to release the full o1 version soon.

THAT’S ALL FOLKS!

This newsletter is written & curated by Dario Chincha.

Affiliate disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes link to products and other newsletters. Please assume these are affiliate links. If you choose to subscribe to a newsletter or buy a product through any of my links then THANK YOU – it will make it possible for me to continue to do this.