What's brewing in AI
🧙🏼 Google's new model takes the lead

Also: Microsoft's biggest AI spenders

Dario Chincha
November 18, 2024

Happy Monday, wizards.

Here’s what’s brewing in AI today.

DARIO’S PICKS

1. Google's new model takes the lead

Massive News from Chatbot Arena🔥
@GoogleDeepMind's latest Gemini (Exp 1114), tested with 6K+ community votes over the past week, now ranks joint #1 overall with an impressive 40+ score leap — matching 4o-latest in and surpassing o1-preview! It also claims #1 on Vision… x.com/i/web/status/1…
— lmarena.ai (formerly lmsys.org) (@lmarena_ai)
5:17 PM • Nov 14, 2024

Say hello to Google’s new frontier model, which goes by the tongue-twisting name Gemini Exp-1114. Four days after its release, it has taken joint first place on the Chatbot Arena leaderboard (where users vote for the best models), ranking level with GPT-4o and above o1-preview. Apparently, it leads the rankings in areas like math and creative writing, but lags behind the o1 series when it comes to coding.

You can easily test the new model yourself inside Google AI Studio (just use the model picker dropdown on the right).

Fun facts:

I tested it briefly and noticed it takes several seconds to answer. It seems slow in the same way as o1-preview and o1-mini, but doesn’t give you details about its process like they do.
Ethan Mollick tested it for reviewing academic papers and was impressed. He points out that we now have multiple models that can understand research and methods at a PhD level.
It has a notably small context window of only 32k tokens (Gemini Pro has up to 2 million), yet its reasoning capabilities seem to outperform other models on several tasks.
Apparently, it has no idea what date it is.

Why it matters: Chatbot rankings tend to be fleeting, especially when models score close to one another. For Gemini Exp-1114, note the wider confidence interval compared with OpenAI’s models, which comes from having fewer votes. This means it could slip in the rankings over the next couple of weeks as more users test it.

Also, OpenAI seems to be responding by testing a new 4o version. Something tells me Google’s time in the “top model” spotlight will be short-lived this time around.
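
For the curious, here is a rough sketch of why fewer votes mean a wider confidence interval. It is not how Chatbot Arena computes its scores (the leaderboard's rating method is more involved); it only applies a textbook normal approximation to a plain win rate. The 55% win rate and the 60,000-vote comparison are made-up numbers for illustration, and `win_rate_interval` is a hypothetical helper name.

```python
import math


def win_rate_interval(wins: int, votes: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for a win rate (normal approximation)."""
    if votes <= 0:
        raise ValueError("votes must be positive")
    p = wins / votes
    # The half-width shrinks with the square root of the number of votes.
    half_width = z * math.sqrt(p * (1 - p) / votes)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def interval_width(wins: int, votes: int) -> float:
    """Width of the interval, to compare how uncertain two estimates are."""
    low, high = win_rate_interval(wins, votes)
    return high - low


# Illustrative assumption: the same 55% win rate at two different vote counts.
for votes in (6_000, 60_000):
    wins = round(0.55 * votes)
    low, high = win_rate_interval(wins, votes)
    print(f"{votes:>6} votes: {low:.3f} to {high:.3f}")
```

With ten times the votes, the interval is roughly a third as wide, which is why a newcomer's rank can still shift as votes come in.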

DARIO’S PICKS

2. Microsoft's biggest AI spenders: TikTok, Meta and Adobe

Earlier this year, it was reported that ByteDance (owner of TikTok) was Microsoft’s largest AI customer, spending nearly $20 million per month to access OpenAI’s advanced language models through Azure. TikTok is still the top customer, but according to The Verge, Meta and Adobe are also among the biggest, each spending over $1 million in September alone on the same Azure service.

Microsoft has reportedly managed to diversify its base of large customers, with ByteDance now representing less than 15% of its Azure OpenAI revenues.

Why it matters: Despite the reports of a cooling relationship between Microsoft and OpenAI, it looks like Microsoft’s cash registers are still happily ringing.

The expanding base of big-spending customers also highlights enterprises’ growing commitment to GenAI.

THAT’S ALL FOLKS!

This newsletter is written & curated by Dario Chincha.

Affiliate disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes link to products and other newsletters. Please assume these are affiliate links. If you choose to subscribe to a newsletter or buy a product through any of my links then THANK YOU – it will make it possible for me to continue to do this.