The Sunday recap✨
Your weekly AI catch-up is here
Howdy, wizards.
⏪ I’m back with your weekly recap email on Sundays — all the best links I’ve shared during the week which you might have missed. No fluff or unnecessary details included.
🤷🏻♂️ Don’t need the Sunday recap? You can update your subscriber preferences at the end of this email to pick what emails you’d like to receive.
Get comfy, pour yourself a nice brew of your choice, and let’s recap!
TOGETHER WITH WRITER
The fastest way to build AI apps
Writer Framework: build Python apps with drag-and-drop UI
API and SDKs to integrate into your codebase
Intuitive no-code tools for business users
THE SUNDAY RECAP
THE MOST IMPORTANT DEVELOPMENTS IN AI THIS WEEK
Over the past week, OpenAI and Google traded AI model releases in rapid succession, each briefly capturing the top spot in language model rankings.
Here’s what unfolded:
First, Google took the lead on the LLM leaderboard last week with a new model. Gemini Exp-1114, as the model is called, climbed to first place in the Chatbot Arena leaderboard—surpassing GPT-4o and o1-preview. It was the leading model in areas like math and creative writing although lagging behind in coding. → Read the full newsletter here
Then, GPT-4o got a small upgrade earlier this week — reclaiming the #1 spot. OpenAI might’ve seen this coming, and responded by upgrading GPT-4o. It’s clear the move aimed to dethrone Gemini Exp-1114. The latest GPT-4o reclaimed the number 1 overall rank, excelling in creative writing and coding. → Read the full newsletter here
Finally, the next day, Google struck back with yet another model, Gemini Exp-1121, which climbed to #1 again. It’s interesting to note that alternative benchmarks like LiveBench tell a different story, with Gemini Exp-1121 ranking below other models. → Read the full newsletter here
You can test the new models from Google right inside Google AI Studio. The latest GPT-4o is available inside ChatGPT as the default when you select GPT-4o.
In practice, I think we’ll see little difference to our AI applications with each of these updates. I think there’s a bigger, more comprehensive, and better explained model release coming from both OpenAI and Google in the near future – which will be far more important. For now, it’s mostly entertaining to watch them play leapfrog with each other.
Allie K. Miller just posted a tutorial on how to easily set up a Claude agent on your coffee break, so you can start testing the Computer Use feature! This means you’ll be one of the first people in the world to try AI agents. Everything is explained very simply, with no technical skills needed. I’ve read so much about this feature, but testing it in real life (seeing the steps it takes, the process, and that it actually works) is truly impressive. Of course, it’s still a bit clumsy, but it’s an early look at what’s coming.
→ Read the full newsletter here
FROM OUR PARTNERS
HubSpot provides a comprehensive customer relationship management platform to help you grow. With powerful features to manage leads and improve customer relationships, HubSpot’s CRM is completely free, with no restrictions on users or data, making it ideal for businesses at any stage.
→ Try HubSpot
Big news today from Perplexity. The AI chatbot, positioned as an “answer engine,” is taking things one step further by enabling shopping directly on its platform:
You can now purchase products with one click without leaving Perplexity, with free shipping included.
For shopping-related queries, Perplexity now researches and curates a comparison of alternatives, including key features, pricing, reviews and, of course, the “Buy with Pro” button.
“Snap to Shop” is an additional feature that lets you take a picture of a product, and gives you more information about it, including where to buy.
Perplexity is also introducing a Merchant Program to make it easier for businesses to bring their products right into Perplexity’s search results.
Currently, the feature is only available for Pro members based in the US, but will be expanded internationally soon, according to Perplexity’s CEO.
Perplexity is doing something smart, focusing 100% on end-to-end UI and actions, including providing organized, real-time info on important industries (Sports, Finance, Elections++) and now, enabling easy transactions.
→ Read the full newsletter here
4. Mistral drops a new flagship model + chatbot with image generation, web search and document analysis
Mistral has released Pixtral Large, a 124B-parameter open-weights multimodal model built on Mistral Large 2. It also powers their chatbot, le Chat, which now features image generation (via Black Forest Labs’ Flux Pro), web search, and a canvas feature. The model boasts superior image understanding, reportedly outperforming Gemini 1.5 Pro and GPT-4 on chart and document analysis. Mistral differs from the leading AI companies: it’s based outside the US, and its models are more open than those of OpenAI, Anthropic and Google. It’s great to see more diversity coming to the model layer of LLMs.
→ Read the full newsletter here
Perplexity just launched a feature that lets you read live transcripts from earnings calls, as well as key highlights. Transcripts and highlights remain accessible inside the company’s dedicated dashboard in Perplexity. They’re starting with NVIDIA, which held its Q3 earnings call yesterday (more on that below), but will support all the major stocks soon. The feature comes on top of the upgrades to finance-related queries that Perplexity made last month. I think the earnings call transcripts will be widely used for high-level research by investors and researchers. Paired with Perplexity’s finance dashboards, they also make it much easier to digest the information in context.
→ Read the full newsletter here
Chinese AI research lab DeepSeek launched a powerful reasoning model this week called R1-Lite-Preview. It matches OpenAI’s o1 on certain benchmarks like AIME and MATH, and shows you even more information about its chain-of-thought process as it’s thinking. The R1-Lite-Preview model is accessible through DeepSeek Chat and is capped at 50 messages per day. This might put some extra pressure on OpenAI to release the full o1 version soon.
→ Read the full newsletter here
THAT’S ALL FOLKS!
Was this email forwarded to you? Sign up here. Want to get in front of 13,000 AI enthusiasts? Work with me. This newsletter is written & curated by Dario Chincha.
Affiliate disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes link to products and other newsletters. Please assume these are affiliate links. If you choose to subscribe to a newsletter or buy a product through any of my links then THANK YOU – it will make it possible for me to continue to do this.