🧙🏼 Meta's Video Gen
Also: Apple Intelligence is around the corner
Happy Monday, wizards.
And welcome to the 100th edition of this newsletter. Really appreciate all of you who read and follow the stuff I share here every day. Wouldn’t be much fun without you!
Without further ado – here are the most interesting things happening in AI today.
DARIO’S PICKS
Meta just released a new AI model series that can generate and edit videos, and also generate soundtracks and sound effects for your video clips. Meta claims it outperforms or matches the current top video models (OpenAI’s Sora, Runway Gen3 and Kling 1.5) on all quality metrics. Apparently, it also beats them all on realness and aesthetics.
The system can generate clips up to 16 seconds long with a synchronised audio track. Mark Zuckerberg announced the new model is coming to Instagram next year – along with a video of himself pushing about 20,000 chicken nuggets on leg day. Check it out.
Why it matters The videos generated with these models look truly impressive, and it’s interesting to see what’s coming. That being said, I think many of us are growing slightly tired of new yet-to-be-released video models claiming to beat other yet-to-be-released video models.
TOGETHER WITH DRAFTBOARD
Why should company employees have all the fun with referral bonuses? Meet Draftboard: A platform where referral bonuses are open to everyone. Share job opportunities from companies like SeatGeek, Via, Formlabs, Bilt, Triple Whale, and OneSignal with your network. Earn when your friends get hired.
DARIO’S PICKS
2. Three recent case studies from OpenAI: agents, role-playing and AI web development
OpenAI has recently shared some fascinating case studies that I haven’t featured in this newsletter yet:
Creative Agents – Media & Entertainment
Altera uses OpenAI's GPT-4o to create “digital humans”: AI agents that play Minecraft alongside users like friends. Altera combines GPT-4o with a brain-inspired multi-module system, letting these agents operate autonomously for up to four hours.
Customer Agents – Education
Speak is an app for practicing new languages that’s making language learning a lot more fun. It uses OpenAI's new Realtime API to power its role-play feature, which lets users practice conversations in a new language with the speed and naturalness of OpenAI's speech-to-speech model (the same model behind Advanced Voice Mode in ChatGPT).
Code Agents – Software & IT
Coframe uses GPT-4o to build an AI engineering assistant that generates new sections of a website based on existing code and images. By fine-tuning GPT-4o with vision and text, they improved the consistency of the generated visual style and layout by 26%.
Why it matters Altera’s use case shows that we’re getting closer to autonomous agents that can do tasks without human intervention. Speak and Coframe are good showcases of what can be done with OpenAI’s latest API improvements: the Realtime API and the vision fine-tuning capability.
PS: I’ve started tracking use cases of companies implementing AI, along with the technology they use for it, from all around the web. I’m building something to make it easier for everyone to see what’s working with AI. Stay tuned for it in the coming days!
FROM OUR PARTNERS
200+ hours of research on AI tools, prompting techniques & hacks packed into a solid 3-hour masterclass.
DARIO’S PICKS
Bloomberg is reporting that Apple Intelligence is going to be launched on October 28th. The launch was initially planned for mid-October, but Apple is apparently taking extra time to fix bugs and ensure its Private Cloud Compute servers can handle all the traffic that will come their way.
The release is expected to include several of the features Apple has been showcasing: the new Siri interface, writing tools, notification summaries, and more. However, the full brand-new Siri experience (the best and most innovative part of it) isn’t set to release until later, possibly slipping into 2025.
The experience isn’t going to be available to everyone though – you’ll need an iPhone 15 Pro or later to install Apple Intelligence. iPads and MacBooks with M1 chips or later will also be able to use it.
Why it matters Many testers have been underwhelmed by Apple Intelligence so far, as its native features run on roughly GPT-3.5-level intelligence so that everything can be powered locally on the phone. That’s great for privacy, but not so much for performance. I still think having an AI on your phone, integrated with your apps, could be a massive upgrade in terms of the user experience though.
THAT’S ALL FOLKS!
Was this email forwarded to you? Sign up here. Want to get in front of 13,000 AI enthusiasts? Work with me. This newsletter is written & curated by Dario Chincha.
Affiliate disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes link to products and other newsletters. Please assume these are affiliate links. If you choose to subscribe to a newsletter or buy a product through any of my links then THANK YOU – it will make it possible for me to continue to do this.