🧙🏼‍♂️ OpenAI's Sora breaks the internet
What's brewing in AI #28
Promote yourself to nearly 10k subscribers by sponsoring this newsletter.
Welcome to the 158 new subscribers who joined in the last week.
Sit back, grab a coconut (or coffee), and check out some pretty mind-blowing advancements in AI.
What's brewing in AI this week:
OpenAI's Sora breaks the internet
Gemini 1.5 dwarfs ChatGPT's context window
GPTs: Editor's choice and new arrivals in the GPT store
The other top stories in AI this week
Looking for visuals and charts, rather than words, to understand the daily news?
Bay Area Times is a visual-based newsletter on business and tech, with 250,000+ subscribers.
Dario's Picks:
I. OpenAI introduces Sora
OpenAI introduced Sora last Thursday: a next-generation text-to-video model which, by the look of it, vastly outperforms anything we've seen so far.
An AI-generated video says more than a thousand words:
Sora is currently available only to early testers, with no public release date set.
In addition to text-to-video, Sora can turn images into video, extend existing videos with AI, alter the style of existing videos and merge two videos.
Why it matters:
Sora understands and simulates the physical world in motion with unprecedented realism, all from simple text instructions. If it delivers what it promises, we're officially at the point where, without a lot of scrutiny, you can no longer tell whether a video has been AI-generated.
Our perception of reality and "what's real" is going to be challenged. The opportunities for productive use and solving real-life problems are massive, as is the potential for abuse. The dynamics and costs of media production, with video now being as easy as typing on a keyboard, will change too.
Paraphrasing the philosopher Marshall McLuhan: technology is a means of extending our senses and, as soon as we've adopted it, societal norms, values, practices, and structures are bound to change. I'd say we've just taken a groundbreaking leap as far as extending and enlarging the visual human imagination goes. But don't take it from me. Take it from Will Smith eating spaghetti.
II. Google launches Gemini 1.5 Pro
Google has announced a new model (again): Gemini 1.5.
The most notable thing is the massive context window of up to 1 million tokens (GPT-4 Turbo has 128k), and Google says it has tested up to 10 million tokens in research. To get a glimpse of the opportunities it brings, I recommend Google's own demos:
Source: Google
Key features:
Gemini 1.5 Pro is the first Gemini 1.5 model Google has released for early testing.
It's mid-sized, but performs at a similar level in benchmark tests to the larger Ultra 1.0.
Right now, the model has a 128k context window for most users, the same as GPT-4 Turbo. However, a limited group of developers has been given access to the model with a massive context window of up to 1 million tokens. This feature seems like it will be rolled out eventually, but first needs to be optimised in terms of latency and computational requirements to enhance the user experience (and, I'm presuming, control Google's costs).
It's based on a new Mixture-of-Experts (MoE) architecture.
It's multimodal: it understands images, video and audio. In other words, it's not just big PDF files you'll be able to feed the model, but also lengthy audio transcripts, image databases, entire movies and more.
Why it matters: The context window jump is huge compared to all other existing models. According to Google, performance stays at a very high level even as the context window increases. Judging by Google's demo videos, we could soon be able to feed AI all our data and get high-quality responses back.
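To put those context window numbers in perspective, here's a quick back-of-the-envelope sketch in Python. The conversion factors (roughly 0.75 English words per token and 500 words per page) are rules of thumb I'm assuming for illustration, not figures from Google, and the 1M and 10M entries are simply comparison points against the 128k baseline.

```python
# Back-of-the-envelope comparison of context window sizes.
# Both conversion factors below are rough assumptions, not official figures.
WORDS_PER_TOKEN = 0.75      # assumed: ~3/4 of an English word per token
WORDS_PER_PAGE = 500        # assumed: a dense page of plain text
BASELINE_TOKENS = 128_000   # the 128k window used as the comparison point


def describe_window(tokens: int) -> dict:
    """Estimate how much text fits in a context window of `tokens` tokens."""
    words = round(tokens * WORDS_PER_TOKEN)
    return {
        "times_baseline": tokens / BASELINE_TOKENS,
        "approx_words": words,
        "approx_pages": words // WORDS_PER_PAGE,
    }


if __name__ == "__main__":
    for label, tokens in [("128k", 128_000), ("1M", 1_000_000), ("10M", 10_000_000)]:
        info = describe_window(tokens)
        print(
            f"{label}: {info['times_baseline']:.1f}x the baseline, "
            f"~{info['approx_words']:,} words, ~{info['approx_pages']:,} pages"
        )
```

Under these assumptions, a 1M-token window is roughly 7.8 times the size of a 128k one, and a 10M-token window roughly 78 times.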
III. AI generated sound effects coming soon to Eleven Labs
Eleven Labs, a leader in text-to-speech, is getting a notable feature soon: AI-generated sound effects. That's right, you will soon (no dates given yet) be able to simply describe a sound with a prompt and generate the audio for it. For their demo, the company cleverly chose to overlay AI-generated sounds on some of OpenAI's Sora clips.
Why it matters: Not as revolutionary as the other news above, but nevertheless a cool new feature that seems inevitable and pairs well with the advances in AI video. Our AI-generated worlds will need sound, won't they?
GPTs
Editor's Choice
Weekly picks from me to you
💬 100+
This GPT will help you develop AI use cases for your company.
This GPT asks the right questions and suggests tailored solutions for how best to leverage AI in your organisation. It's made by AI influencer and entrepreneur Allie K. Miller.
A couple of useful SEO tools
💬 900+ | ⭐ 100
Enhances content relevance based on Search Quality Evaluator Guidelines, utilizing methodologies from recent academic research.
Reviews the relevance of your site's content based on Google's official guidelines. I tested it on a page on my site and found a good optimisation opportunity.
💬 600+ | ⭐ 100
Generates 25 SEO-focused blog titles.
Simple tool to create optimised titles for articles (or get ideas for new articles). Enter your target keyword and it gives you title suggestions categorised into lists, stories, opinions, questions and frameworks.
New arrivals in the GPT store
New GPTs featured in OpenAI's official GPT store in the last 7 days
Diagrams ⚡PRO BUILDER⚡ Rank 7 in Programming
Website Generator Rank 9 in Programming
AI Humanizer Pro Rank 8 in Writing
Physics Oracle Rank 7 in Education
Math Solver Rank 12 in Education
math Rank 7 in Lifestyle
Bytes
Groq (not to be confused with Elon's Grok) is an AI chip company that serves language models with really fast response times. It runs them on LPUs (Language Processing Units) instead of GPUs, unlocking faster speeds. Here's a demo by Matt Shumer. BTW, despite the naming similarity to Elon Musk's chatbot (as if this space isn't already confusing enough when it comes to names), Groq was actually first, founded in 2016.
Fresh off the rumour mill: OpenAI is developing a web search product to challenge Google.
Reddit signs AI content licensing deal ahead of its IPO.
Deutsche Telekom showcased an AI phone at MWC 2024. The phone, developed in collaboration with Qualcomm and Brain.ai, uses an AI assistant in place of apps.
ChatGPT's web traffic is down 11% since its peak in May last year.
Anthropic is testing Prompt Shield to avoid misinformation in the upcoming elections. It redirects questions on politics and voting to "authoritative" sources of voting information.
The wizard's favourite AI newsletters
What I'm reading right now
The Neuron 😸 - easy weekday read on AI's latest developments
Bagel Bots 🥯 - best hands-on tips & tricks
agent.ai 🕵🏻‍♂️ - deep-dives on AI by HubSpot's co-founder
AI Minds 🧠 - semi-technical breakdown of trending AI topics
*if you sign up for these newsletters I may earn a commission at no cost to you. If you do, THANK YOU, it will make it possible for me to continue to do this.
That's a wrap for this week! Fellow sorcerers, join me on LinkedIn. Until next time, Dario Chincha 🧙🏼‍♂️
Want your GPT mentioned in this newsletter? Submit it to whatplugin and get featured in the newsletter here.