🧙🏼 OpenAI debuts o3 and friends
Also: Claude built my Meta ads strategy
Morning light bathes the café in amber hues as the wizard sips thoughtfully on what can only be described as a damn fine cup of coffee. He scrolls the news headlines; this o3 model apparently "wipes the floor with humans". "What a peculiar claim," he jots in his notebook, pausing occasionally to consider the implications. "I'll need to run some more tests... right after my meditation break".
Howdy wizards,
It's time to settle into your preferred sitting apparatus. Baptise your neurons in a velvety shot of espresso.
Here's what's brewing in AI.

DARIO'S PICKS
1. OpenAI drops the full o3 model and o4-mini (plus a couple of treats)
OpenAI launched o3 and o4-mini this week, two powerful state-of-the-art reasoning models whose naming closely follows the company's signature blend of chaos and confusion.
Here are the details:
o3 is the new top-tier reasoning model. It pushes benchmarks across almost all disciplines: coding, math, science, visual perception, and more. Early testers also saw 20% fewer errors compared to the o1 model in areas like programming, business/consulting, and creative ideation.
Sam Altman quotes a well-known immunologist saying the new model is "at or near genius level", a statement that seems supported if we look at traditional IQ tests (but keep in mind that benchmarks are gameable, not even Pokémon is safe, and they are not necessarily good indicators of actual usefulness):
Chart source: https://trackingai.org/IQ
Something to note if you're building apps with AI. At similar performance levels, o3 is 4x more expensive to use than Gemini 2.5 Pro. So depending on your budget you might want to look into the latter.
The second model they launched is o4-mini, a small, cheap and fast version of OpenAI's upcoming flagship reasoning model which will presumably be called o4 (although, with OpenAI's naming game, we don't really know its name until it's released). It outperforms the previous o3-mini in evaluations and has higher usage limits, making it a good choice for apps that need smart responses at massive scale.
The new models have access to all tools within ChatGPT and, importantly, are trained to know when and how to use them, including improved output formats (hello, nice table instead of long messy list). The tools we're talking about here are web search, analysing uploaded files and data analysis with Python.
Ability to "think with images": both of the new models can blend visual and textual reasoning in a new way. They no longer simply "see" the image you upload, but use it directly in their chain of thought. They can also edit the image as part of their reasoning: zooming, cropping, transforming. Combining its new tools and visual perception, I reckon o3 might now be the world's best geoguesser.
Thinking longer = better results. Along with the launch, OpenAI also reported that letting models think for longer still yields performance gains. The "throw more compute at it" approach that worked well for pretraining is also working for reinforcement learning.
Alongside the launch of these two new models, OpenAI also shipped two other things this week:
A new model, GPT-4.1, in the API. This model has a massive 1M context window. While they've released three versions of it (nano, mini and full), apparently the mini model is the star of the show (it outperforms GPT-4o at 80% less cost).
Codex CLI: A coding agent available through the terminal that supports o3 and o4-mini. It's an experimental, minimal interface for connecting OpenAI's models to your computer.
Why it matters: I've only had the chance to test the o3 model for a few days, and my first impression is that it's highly useful. Definitely my newest thought partner, sharing the podium with Claude. As someone who constantly drags and drops screenshots into ChatGPT while working, I also can't wait to test it on more visual tasks.
Quick battle report: I tested it back to back with Claude on a classic data nightmare: matching records across two CSV files with inconsistent naming conventions. Great task for AI. Anyway, while Claude delivered a partially correct solution with some mistaken records in the results, o3 pulled it off without error. A bit more woosah to my workflow.

IN PARTNERSHIP WITH DATABUTTON
Want to build business software & apps but aren't a dev?
Meet Databutton.
Built on top of the world's first reasoning AI-developer, Databutton will turn your new app idea into reality.
Here's why you should give Databutton a go:
It breaks everything down into a series of tasks and works with you through the entire build process
The agent isn't a command taker; it actively collaborates with you to make sure you land on the right solution (kinda like your own personal CTO)
It manages the full stack, including frontend, backend, and deployment
Or as one user said, "Databutton is like being in possession of a nuclear weapon equivalent of unlimited creative potential."

DARIO'S PICKS
Anthropic launched Research in early beta, a new feature that lets Claude run systematic web searches, exploring different angles of your question automatically and answering with citations. It's only available on the Max plan and upwards ($100+/month) for users in the US, Japan, and Brazil.
Claude also has a brand new Google Workspace integration that connects Claude to your email, calendar, and Google Drive documents. It can now do things like search your emails, use your documents and see what's in your calendar. It also provides inline citations so you can check the sources. This eliminates the need to paste information back and forth between these platforms and Claude, which saves time and lets you focus on more important stuff. In Anthropic's own words it can "pull together meeting notes from last week, identify action items from follow-up email threads, and search relevant documents for additional context".
Why it matters:
Of the major AI players, Anthropic is a latecomer in adding this type of deep research feature to their AI. They call it just "Research", but it's a similar feature to what ChatGPT, Gemini, Perplexity and others have added over the last months. It can help you go deep on a topic with minimal work, but before firing your researcher you should consider that existing "deep research" features have several interesting, non-obvious problems.
AI + a knowledge base is a powerful combination, and probably the most popular employee agent that businesses build with AI. This feature is cool but also isn't really something new to the market: Gemini already integrates tightly with Google Workspace, and a Google Drive integration also exists for ChatGPT through connected apps.

UP CLOSE
How Claude helped me build a testing framework for running Meta ads for this newsletter
In this mini-series I break down different ways I use AI from week to week. Previously I've covered how I use AI on my phone, how it helps me run this business by myself (part 1 and part 2), and using ChatGPT's new image generator.
This newsletter is nearly at 14,000 subscribers now, a wizard gathering that has amassed organically, helped by your fine referral work and word-of-mouth. I've decided to up things a notch by implementing the same growth strategy as all the major newsletters: paid advertising. And I'm currently diving deep into running and optimising ads on Meta.
Setting up ad campaigns is an area I have some, albeit quite limited, experience in. I chose Claude to help me with this as it's not only amazing at creative writing, but also great at creating structured plans and tables, and I knew this would require me to move back and forth between the two as I refine the output.
I started by asking Mr. 3.7 Sonnet some general questions about how to craft an effective ads strategy for gaining subscribers via Meta ads (I also watched some YT videos on the topic beforehand). After I learned the basics of how the ad manager platform works, Claude proposed a detailed experimental setup so I can test and optimise different campaigns, ad formats, messaging... something that will allow me to control different components, then tweak and combine them for best results (getting the highest-quality subscribers at the lowest cost).

Talk systematic to me
Claude helped me brainstorm creative ideas and come up with different ad concepts for testing in my campaign. It then detailed the full campaign structure, complete with ad sets and specific ad variations, set up in a way that will allow me to get a clear overview of what's working and what's not, and optimise towards my goal.

My work-in-progress in Figma with different Meta ad variations. Claude helped on the specific messaging for each one, and suggested a structure with consistent naming.
Finally, Claude helped me navigate the process of setting up granular tracking (Google Tag Manager and Meta Pixel) in beehiiv (my email software) so that I can evaluate exactly which subscribers came from which ad variations.
I should add that I wouldn't have done this without first watching some solid YouTube videos on driving subscribers through Meta Ads; however, AI was incredibly helpful when adapting these learnings to my specific scenario. I saved myself hours of setup time. More importantly, because I've implemented a proper experimental structure from the beginning, I'll be able to optimise quickly, identifying winning combinations and cutting the losers, without getting lost in a maze of poorly organized campaigns.
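To make the "consistent naming plus per-variation tracking" idea a bit more concrete, here is a small Python sketch. It is an illustration under stated assumptions, not my actual setup: the concept and format names, the `ad_name` and `tagged_url` helpers, the example.com landing page and the use of UTM query parameters are all hypothetical (the real tracking runs through Google Tag Manager and Meta Pixel). The idea is simply that every ad variation gets a predictable name, and that same name travels with the click so a subscriber can be traced back to the ad.

```python
from itertools import product
from urllib.parse import urlencode, urlsplit, urlunsplit


def ad_name(concept: str, fmt: str, variant: int) -> str:
    """Build a predictable ad name such as 'wizard_image_v1' (hypothetical scheme)."""
    return f"{concept}_{fmt}_v{variant}"


def tagged_url(base_url: str, campaign: str, ad: str) -> str:
    """Attach UTM parameters so each signup can be attributed to one ad variation."""
    parts = urlsplit(base_url)
    query = urlencode({
        "utm_source": "meta",
        "utm_medium": "paid_social",
        "utm_campaign": campaign,
        "utm_content": ad,
    })
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


# Hypothetical test matrix: every concept is paired with every format.
concepts = ["wizard", "coffee"]
formats = ["image", "video"]

urls = {
    ad_name(c, f, 1): tagged_url("https://example.com/subscribe", "subs_test", ad_name(c, f, 1))
    for c, f in product(concepts, formats)
}

for name, url in urls.items():
    print(name, url)
```

Because the ad name doubles as the tracking value, a report grouped by that value lines up one-to-one with the ad variations in the campaign structure.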

CONTEXT WINDOWS
contextwindows.ai just hit 771 case studies with fresh additions from companies like Canva, Notion, Deutsche Bank, and Capgemini this week.
Find out how companies actually implement AI instead of obsessing over which model beats which on paper.
FLASH SALE: 50% OFF. This week only -> use code WIZARD-HEIST at checkout

THAT'S ALL FOLKS!
Was this email forwarded to you? Sign up here. Want to get in front of 14,000 AI enthusiasts? Work with me. This newsletter is written & curated by Dario Chincha.
Affiliate disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes link to products and other newsletters. Please assume these are affiliate links. If you choose to subscribe to a newsletter or buy a product through any of my links then THANK YOU. It will make it possible for me to continue to do this.