Claude caught red-handed
HBR: AI doesn't reduce work
AI news sans hype from the last week
Was this email forwarded to you? Sign up here.

Howdy wizards,
In case you missed it: I published an essay last week called Dream it's instant. It feels quite relevant this week, with both Anthropic and OpenAI launching much faster versions of their best models.
Here's what's brewing in AI.
Industry moves
Anthropic published a sabotage risk report for Claude Opus 4.6 that found the model manipulating other agents, stealing auth tokens, and reporting fake results when the real ones didn't suit it. Anthropic's verdict is that sabotage risk is very low but not negligible. Related to this: the day before the report dropped, Anthropic's safeguards lead Mrinank Sharma resigned, writing that the company "constantly faces pressures to set aside what matters most."
Anthropic raised $30B in Series G funding at a $380B post-money valuation, with revenue run rate hitting $14B. Claude Code alone accounts for $2.5B in revenue and 4% of all GitHub commits. Read together with the sabotage report above: they're raising big money while simultaneously publishing evidence that their own model misbehaves. That tension is kind of Anthropic's whole thing.
xAI co-founders Tony Wu and Jimmy Ba left the company within 48 hours of each other, bringing total co-founder departures to 6 of the original 12. Musk framed the exits as part of a reorg for "speed of execution" that "required parting ways with some people." Wu led Grok's reasoning efforts and reported directly to Musk.
New tools & product features
Anthropic launched Fast Mode for Claude Opus 4.6, with 2.5x higher speed. It can be toggled via /fast in the Claude Code CLI. Be warned though: it's 2.5x the speed at up to 6x the cost! Some suspect a profit motive for slowing down the normal service, with airline-boarding-style incentive structures where premium is only better because standard is artificially degraded.
Ex-GitHub CEO Thomas Dohmke launched Entire, a cool idea for a CLI that records full agent sessions inside Git. It raised a $60M seed at a $300M valuation. Entire goes beyond recording your code at different points in time: it stores things like prompts, model responses, tool calls, and token usage as structured metadata in the commit, so anyone can understand why the code was written the way it was. That gives humans and agents context so they don't start from zero. I'd put money on Claude Code and Codex copycatting this feature very soon.
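To make the idea concrete, here's a minimal sketch of the general technique using plain git notes (a standard git feature) to attach agent-session metadata to a commit. This is an illustration only, not Entire's actual storage format; the ref name and JSON fields are made up.

```shell
# Illustrative only: attach agent-session metadata to a commit via git notes.
# The ref name "agent-session" and the JSON schema are hypothetical.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email "agent@example.com"
git config user.name "Agent"

echo 'print("hi")' > app.py
git add app.py
git commit -q -m "feat: add greeting script"

# Store the session context as JSON in a dedicated notes ref,
# so it travels with the repo without polluting the commit message.
git notes --ref=agent-session add -m '{
  "prompt": "write a greeting script",
  "model": "example-model",
  "tool_calls": ["Write(app.py)"],
  "tokens": {"input": 812, "output": 96}
}' HEAD

# Anyone with this ref can later recover why the code exists:
git notes --ref=agent-session show HEAD
```

Notes refs can be pushed and fetched like branches, which is what makes this kind of metadata shareable across a team rather than trapped on one machine.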
IN PARTNERSHIP WITH GRANOLA
Finally, an AI notetaker I don't have to apologize for.
You know when youāre in a meeting and someoneās random bot joins the call?
With Granola, you get exactly zero meeting bots joining your call.
It uses your device's own audio to transcribe and works with any meeting tool (Zoom, Google Meet, Microsoft Teams, etc). It even works for in-person conversations.
You stay focused and jot down notes like you normally would. Granola quietly transcribes and enhances the important bits in the background.
If you want to be extra thoughtful (or your company requires it), Granola can auto-send a quick consent email beforehand.
Try Granola on your next meeting and see how much easier it is to stay present.
…
Models
OpenAI released GPT-5.3-Codex-Spark, a coding model optimized for speed. It generates 1,000+ tokens/sec, but testers have found some capability trade-offs versus the full 5.3-Codex. OpenAI is also diversifying its hardware: this model runs on Cerebras chips and is OpenAI's first product that doesn't use Nvidia. It's rolling out as a research preview for ChatGPT Pro subscribers first.
Google released big improvements to Gemini 3 Deep Think, posting record-high scores on benchmarks like ARC-AGI-2. Deep Think explores multiple solution paths simultaneously and selects the most consistent result. Google is now gating Deep Think behind its Ultra subscription, positioning its most capable (and slowest) reasoning as a premium product rather than making the fast models cheap and available to everyone. Btw, don't get totally swayed by the benchmark numbers; I've read Google tested on semi-private rather than fully hidden test sets.
Research
DeepMind introduced Aletheia, a math research agent powered by Gemini Deep Think that iteratively generates, verifies, and revises proofs end-to-end. It reaches 90% on IMO-ProofBench Advanced and can solve Olympiad problems, PhD-level exercises, and open Erdős problems. Okay, maybe it's not a stinky model after all.
A Harvard Business Review study tracked 200 employees at a US tech company using AI tools over 8 months. It found the tools didn't lighten workloads; instead they expanded role scope, increased workload, and blurred work-life boundaries. A Hacker News commenter noted something interesting: advantages from individual AI use often seem to disappear once the whole company adopts it, and AI eliminates the kind of routine work that gave people a "cognitive buffer," so they now spend more mental effort throughout the day. I actually don't have an opinion on this yet, but it's worth reflecting on.
IN PARTNERSHIP WITH THESYS
Build AI agents that reason dynamically and respond with charts, cards, forms, slides, and reports without creating any workflows manually. Set up in just 3 easy steps:
1. Connect your data
2. Add instructions
3. Customize style
Publish and share with anyone or embed on your site.

What's your verdict on today's email?
THAT'S ALL FOR THIS WEEK!
Want to get in front of 21,000+ AI builders and enthusiasts? Work with me. This newsletter is written & shipped by Dario Chincha.
Disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes work with sponsors and may earn a commission if you buy something through a link in here. If you choose to click, subscribe, or buy through any of them, THANK YOU; it will make it possible for me to continue to do this.


