Claude caught red-handed
HBR: AI doesn't reduce work
AI news sans hype from the last week
Was this email forwarded to you? Sign up here.

Howdy wizards,
In case you missed it: I published an essay last week called Dream it's instant. It feels quite relevant this week, with both Anthropic and OpenAI launching much faster versions of their best models.
Here's what's brewing in AI.
Industry moves
Anthropic published a sabotage risk report for Claude Opus 4.6 that found the model manipulating other agents, stealing auth tokens, and reporting fake results when the real ones didn't suit it. Anthropic's verdict is that sabotage risk is very low but not negligible. Related to this: the day before the report dropped, Anthropic's safeguards lead Mrinank Sharma resigned, writing that the company "constantly faces pressures to set aside what matters most."
Anthropic raised $30B in Series G funding at a $380B post-money valuation, with revenue run rate hitting $14B. Claude Code alone accounts for $2.5B in revenue and 4% of all GitHub commits. Read together with the sabotage report above: they're raising big money while simultaneously publishing evidence that their own model misbehaves. That tension is kind of Anthropic's whole thing.
xAI co-founders Tony Wu and Jimmy Ba left the company within 48 hours of each other, bringing total co-founder departures to 6 of the original 12. Musk framed the exits as part of a reorg for "speed of execution" that "required parting ways with some people." Wu led Grok's reasoning efforts and reported directly to Musk.
New tools & product features
Anthropic launched Fast Mode for Claude Opus 4.6, with 2.5x higher speed. It can be toggled via /fast in the Claude Code CLI. Be warned though: it's 2.5x the speed at up to 6x the cost! Some suspect a profit motive for slowing down the normal service, with airline-boarding-style incentive structures where premium is only better because standard is artificially degraded.
Ex-GitHub CEO Thomas Dohmke launched Entire, a cool idea for a CLI that records full agent sessions inside Git. It raised a $60M seed at a $300M valuation. Entire goes beyond recording your code at different points in time: it stores things like prompts, model responses, tool calls, and token usage as structured metadata in the commit, so anyone can understand why the code was written the way it was. That gives humans and agents context so they don't start from zero. I'd put money on Claude Code and Codex copycatting this feature very soon.
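To make the idea concrete, here's a minimal sketch of the general technique using plain git notes (a standard git feature) to attach agent-session metadata to a commit. This is an illustration only, not Entire's actual storage format; the ref name and JSON fields are made up.

```shell
# Illustrative only: attach agent-session metadata to a commit via git notes.
# The ref name "agent-session" and the JSON schema are hypothetical.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email "agent@example.com"
git config user.name "Agent"

echo 'print("hi")' > app.py
git add app.py
git commit -q -m "feat: add greeting script"

# Store the session context as JSON in a dedicated notes ref,
# so it travels with the repo without polluting the commit message.
git notes --ref=agent-session add -m '{
  "prompt": "write a greeting script",
  "model": "example-model",
  "tool_calls": ["Write(app.py)"],
  "tokens": {"input": 812, "output": 96}
}' HEAD

# Anyone with this ref can later recover why the code exists:
git notes --ref=agent-session show HEAD
```

Notes refs can be pushed and fetched like branches, which is what makes this kind of metadata shareable across a team rather than trapped on one machine.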
IN PARTNERSHIP WITH GRANOLA
Finally, an AI notetaker I don't have to apologize for.
You know when youāre in a meeting and someoneās random bot joins the call?
With Granola, you get exactly zero meeting bots joining your call.
It uses your device's own audio to transcribe and works with any meeting tool (Zoom, Google Meet, Microsoft Teams, etc). It even works for in-person conversations.
You stay focused and jot down notes like you normally would. Granola quietly transcribes and enhances the important bits in the background.
If you want to be extra thoughtful (or your company requires it), Granola can auto-send a quick consent email beforehand.
Try Granola on your next meeting and see how much easier it is to stay present.
…
Models
OpenAI released GPT-5.3-Codex-Spark, a coding model optimized for speed. It generates 1,000+ tokens/sec, but testers have found some capability trade-offs versus the full 5.3-Codex. OpenAI is also diversifying its hardware: this model runs on Cerebras chips and is OpenAI's first product that doesn't use Nvidia. It's rolling out as a research preview for ChatGPT Pro subscribers first.
Google released big improvements to Gemini 3 Deep Think, posting record-high scores on benchmarks like ARC-AGI-2. Deep Think explores multiple solution paths simultaneously and selects the most consistent result. Google is now gating Deep Think behind its Ultra subscription, positioning its most capable (and slowest) reasoning as a premium product rather than making the fast models cheap and available to everyone. Btw, don't get totally swayed by the benchmark numbers; I've read Google tested on semi-private rather than fully hidden test sets.
Research
DeepMind introduced Aletheia, a math research agent powered by Gemini Deep Think that iteratively generates, verifies, and revises proofs end-to-end. It reaches 90% on IMO-ProofBench Advanced and can solve Olympiad problems, PhD-level exercises, and open Erdős problems. Okay, maybe it's not a stinky model after all.
A Harvard Business Review study tracked 200 employees at a US tech company using AI tools over 8 months. It found the tools didn't lighten workloads; instead they expanded role scope, increased workload, and blurred work-life boundaries. A Hacker News commenter noted something interesting: advantages from individual AI use often seem to disappear once the whole company adopts it, and AI eliminates the kind of routine work that gave people a "cognitive buffer," so they now spend more mental effort throughout the day. I actually don't have an opinion on this yet, but it's worth reflecting on.
IN PARTNERSHIP WITH THESYS
Build AI agents that reason dynamically and respond with charts, cards, forms, slides, and reports without creating any workflows manually. Set up in just 3 easy steps:
1. Connect your data
2. Add instructions
3. Customize style
Publish and share with anyone or embed on your site.

What's your verdict on today's email?
THAT'S ALL FOR THIS WEEK!
Want to get in front of 21,000+ AI builders and enthusiasts? Work with me. This newsletter is written & shipped by Dario Chincha.
Disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes work with sponsors and may earn a commission if you buy something through a link in here. If you choose to click, subscribe, or buy through any of them, THANK YOU; it will make it possible for me to continue to do this.


