🧙🏼♂️ ChatGPT gets eyes and ears
whatplugin weekly #12
Welcome to the 224 new subscribers who joined last week. I’m thrilled that you’re here to learn with me.
Here’s what’s brewing in AI (huge news this week):
This week’s issue is brought to you by: Jot it Down
Jot it Down remembers details from chats by jotting them down, and also supports sharing memories or note collections with friends. It extends plugin capabilities and allows for cross-conversation memory.
Dario’s Picks
I.
ChatGPT can now see, hear and speak. Multimodal ChatGPT is here! OpenAI is rolling out Voice and Image to ChatGPT Plus subscribers over the next 2 weeks. Other users will also get access, but later.
Voice: Forget Siri – you can now have a back-and-forth conversation with ChatGPT. The voices sound impressively realistic and human-like.
It enables new uses like reading a bedtime story, or settling that dinner debate just using your voice. And just imagine the impact this could have for visually impaired individuals.
The new voice feature is coming to ChatGPT on iOS and Android only.
Image: Snap or upload a picture to ChatGPT – it can now read and understand it. You can send a picture of your half-empty fridge and get recipe ideas back, or a snapshot of a page from your child’s homework to get hints on how to solve it.
It can handle multiple images, and you can even focus on specific parts of the images with a dedicated drawing tool.
The image feature will become available on ChatGPT across platforms.
II.
OpenAI’s DALL-E 3 will become available directly inside ChatGPT for Plus and Enterprise subscribers. Launch is expected in October.
The new version of DALL-E, a generative AI visual art platform, will be integrated into the ChatGPT interface. This could be a leap forward compared to existing platforms like Midjourney and Stability AI, as DALL-E 3 reportedly understands context better, and the integration with ChatGPT will let users seamlessly craft prompts and generate images on the same platform.
III.
Microsoft is launching Copilot, a unifying AI companion that uses context from the web, your work data and what you’re currently doing on your computer.
Microsoft has been adding so-called copilots to many of their most used products for some time, and Microsoft Copilot will be the unifying experience, available in Windows 11, Microsoft 365, as well as inside Edge and Bing.
In Focus
Google Bard joins the plugin game with Extensions
Google launched Extensions for their AI chatbot Bard last week, which lets it pull information from other Google apps – including Gmail, Drive, Docs, Flights, Hotels, Maps and YouTube – to personalise responses. Check out the demo video from Google.
The new feature is especially interesting in the context of Google launching its Gemini multimodal LLM soon, which will likely make Bard even more powerful.
Additionally, Bard now has a new [G] button that lets you verify the results with a traditional Google search, as well as an updated AI model for better accuracy.
What I think:
Bard now combines seamless web access with the ability to use your own data. It’s a major step up for Google that could encourage users to switch from ChatGPT to Bard.
Launching Extensions with a very limited number of Google’s own apps is a smart move, allowing the company to ensure a good user experience before potentially including third-party extensions.
Extensions currently have significant shortcomings, though, and the results are only as good as Bard’s ability to pick out the right docs to analyse. For example, it only handles email questions effectively if they can be answered using at most 5 emails. Generally, the fewer emails you have, the easier it will be for Bard to pick the most relevant ones.
I still prefer ChatGPT for two reasons:
Accuracy: ChatGPT (using GPT-4) generally feels more concise and insightful in the answers it gives me. I also seem to get fewer hallucinations (made-up stuff) than with Bard, something which users have already reported as a problem for Extensions as well.
The Advanced Data Analysis feature: Advanced Data Analysis in ChatGPT is currently also miles ahead of any way to process quantitative data with Bard.
That said, I’m looking forward to playing around with Bard alongside ChatGPT and discovering interesting ways to use the new extensions. And of course, to seeing the impact on Bard from Google’s upcoming model Gemini.
ChatGPT plugins
There are now 1,058 ChatGPT plugins, of which 21 were launched in the past 7 days.
Notable launches:
Power Automate - Creates and executes automated workflows. Supports integration with over 1000 apps and services.
New plugin by Microsoft that lets you invoke flows made with Power Automate directly from within ChatGPT, including combining them in workflows with other plugins. Using it requires a Microsoft account with Power Automate.
Categories with most new launches:
Miscellaneous - 114 plugins (5 new)
PDF Conversations - 15 plugins (3 new)
Web Development Tools - 69 plugins (3 new)
AI Bytes & Resources
OpenAI Red Teaming Network. OpenAI is inviting domain experts for Red Teaming (trying to break the system to test its safety), formalising the collaboration with outside experts to ensure the safety of its products.
Anthropic is expanding access to safer AI with Amazon, announcing an investment of up to $4 billion from Amazon.
Spotify’s AI voice translation pilot means your favorite podcasters might be heard in your native language.
Until next time,
ChatGPT Wizard & creator of whatplugin.ai 🧙🏼♂️