🧙🏼♂️ Say hello to GPT-4o
Also: GPT-4 is now free!
nr 39 / OpenAI Spring Update special
…and it’ll say it back, without the lag.
113,000 people joined OpenAI’s Spring Update livestream yesterday. There are some serious upgrades to cover, so I’ll save the other AI news for later this week.
Here’s my breakdown of OpenAI’s spring update:
OpenAI launches a new model, GPT-4o. It’s faster, cheaper and its voice & vision capabilities are like AI from the movies.
Free GPT-4 for everyone. That includes GPT-4o (with limited messages) and using custom GPTs.
The voice and vision capabilities are off the hook. See OpenAI’s demos.
ChatGPT’s new Desktop app. You can talk with it and it can see what you’re doing. Coming first to macOS.
Feeling confused? My summary of what’s already rolled out and what’s coming.
Dario’s Picks - OpenAI’s Spring Updates special
1. The new model: GPT-4o
GPT-4o has GPT-4 level intelligence and it’s omnimodal. That is, it combines audio, text and video in the same model and processes all modalities natively.
This means ChatGPT can now see you, hear you and respond to you almost in real time.
It’s also 2x faster than GPT-4 Turbo, something already very noticeable in ChatGPT.
There’s also API access for developers and it’s 50% cheaper to use.
Why it matters
The best AI model in the world just got twice as fast and it’s cheaper to build apps on. Also, the naming choice is pretty on-brand for OpenAI – very cryptic and hard to pronounce 😄
2. GPT-4 is now free, including using custom GPTs
Biggest actual implication of today's OpenAI announcement is very practical: the top barrier I see when I give talks on using AI is that people don't pay for AI to start, and they use GPT-3.5 (the free model) and are disappointed.
Now everyone around the world gets GPT-4 free.
— Ethan Mollick (@emollick)
7:24 PM • May 13, 2024
Most of the features previously in ChatGPT Plus are now becoming free to all users. That includes GPT-4o (with a message limit), web browsing, advanced data analysis, file uploads and cross-chat memory.
Free users will also have access to use custom GPTs and the GPT store – which is huge for free users and for the GPTs ecosystem alike.
Why it matters
This move is not to be underestimated for the mainstream adoption of AI; free users will have a much more capable model now, with more useful features to boot, which is likely to reduce churn.
Cleverly, OpenAI isn’t allowing free users to create GPTs, only use existing ones. A lot of the potential in custom GPTs lies in creating ones tailored to your specific workflow, so this could result in good conversion from free to paid.
3. GPT-4o voice and vision capabilities are now way more real-time
GPT-4o brings near real-time voice interactions and image/video understanding to ChatGPT. It can do this because it is trained on text, video and audio from scratch. The previous voice and image modes used a few different technologies stitched together, but this model is natively multimodal (hence the big latency improvements).
So, interacting with voice is now much more seamless. For example, you can interrupt ChatGPT while it’s speaking, and it can easily change its speed and tone of voice. It can also understand what’s happening in your camera or desktop view in real time.
Check out some of OpenAI’s best demos of GPT-4o:
Acting as a real-time translator between two people
Solving math problems (voice + screen sharing at the same time)
Two GPT-4o models interacting with each other and singing
Bringing it into a meeting (answering participants’ questions and summarising)
The new omnimodal capabilities are rolling out gradually. OpenAI emphasised the importance of safety, so the staged rollout is probably a way to check that their guardrails are working as expected.
Why it matters
This is huge for the user experience. If you’ve seen the movie Her – that’s basically what the new model is capable of (not just for romance, but for anything). Check out the demo with the translation between two languages to see just how real-time we’re talking.
4. ChatGPT’s Desktop app means you have a copilot for everything you do
New chatGPT desktop app!!!
Can see your laptop screen!
One hotkey to copy text
— Nick Dobos (@NickADobos)
5:21 PM • May 13, 2024
ChatGPT is getting a new desktop app which can interact with what you’re doing on your computer. You launch it with Option + Space, and you can easily pass screenshots or share your screen in real-time with it. That’s right, the desktop app has vision (you need to give it permission, it’s not active all the time)! For example, you can have it look at a graph on your screen with you and analyse it.
The Desktop app will roll out over the next few weeks to Plus users, and gradually become available to everyone. Some tech-savvy folks have hacked the system and are already running the app, though. Funnily, one of those “hackers” is Nick Dobos, creator of one of the most popular custom GPTs (he’s got some nerve lol). But if you’re not technical or transgressive enough to hack your way to early access, don’t fret, you should get it soon enough.
Why it matters
A final goodbye to your privacy! (just kidding. I hope.) This means you’ll have an AI sidekick that can be part of your tasks and communicate through voice and vision, easily available throughout your workday. Saw someone else ask this timely question: will everyone talking to AI be the final nail in the coffin for open-plan offices?
5. What’s already available and what’s yet to roll out
With all the new features launched, the gradual roll-out and the different pricing tiers, it’s easy to get confused.
Here’s my recap of what’s already here and what’s still to come:
The basic GPT-4o model in ChatGPT should already be available, both for free and paid users.
GPT-4o’s full capabilities in ChatGPT will roll out gradually over the coming weeks.
The new version of Voice mode you see in the demos is coming to ChatGPT Plus in the coming weeks.
The GPT-4o API for developers is already available, but the upgraded audio and video capabilities are launching first to a small group of trusted partners.
The ChatGPT desktop app will become available to both free and paid users, rolling out over the next few weeks and starting with Plus users. It’s currently macOS only, with Windows coming later this year.
Using Custom GPTs is still not available on the free plan, but should become so shortly.
The advantages of ChatGPT Plus vs free are still big: 5x higher message limits on GPT-4o (including Advanced Data Analysis, file uploads, vision and web browsing), the ability to create custom GPTs, and access to DALL·E.
The wizard’s favourite AI newsletters
what i’m reading right now
TLDR 💨 - essential tech news in 5 mins a day. Read by 1.2 million people.
simple.ai - 🕵🏻♂️ - deep-dives on AI by Hubspot’s co-founder (highly recommended)
The Neuron 😸 - easy weekday read on AI’s latest developments
*if you sign up for these newsletters I may earn a commission at no cost to you. If you do, THANK YOU, it will make it possible for me to continue to do this.
That’s a wrap for the spring update! Fellow sorcerers – join me on LinkedIn. Until next time, Dario Chincha 🧙🏼♂️