OpenAI's performance plateau
Also: Google engineer's prompting playbook
Happy Monday, wizards.
What's a soup, compared to a sauce, or a drink? There are a lot of edge cases.
Fortunately, Claude can help us make sense of it.
Here's what's brewing in AI today.
DARIO'S PICKS
Fresh off the rumour mill. A new article by The Information (paywalled) reports that OpenAI's next model, which is expected to be released later this year, isn't as big an improvement as the jump from GPT-3 to GPT-4. In fact, it might not be reliably better than its predecessors in some areas (coding is mentioned). One of the key challenges in training more intelligent models is the diminishing availability of training data.
To tackle the slowdown in performance gains, OpenAI has reportedly set up a foundations team to focus on new ways to improve its models, such as leveraging synthetic data and improving the post-training process.
Sam Altman previously confirmed that the company has some "very good releases" coming later this year, albeit nothing called GPT-5. He also said that "the thing that will feel like the next giant breakthrough will be agents".
Why it matters: Unless the rumours are totally missing the mark, it's becoming clear that OpenAI's next model release won't be blowing our minds, at least not as much as earlier releases did. Going by Altman's recent statement on Reddit, it seems more plausible that the next "breakthrough" will be agents. In other words, it will be workflow-related rather than performance-related: orchestrating existing models so that they can carry out complex, multi-step tasks or projects on their own.
TOGETHER WITH WRITER
The fastest way to build AI apps
We're excited to introduce Writer AI Studio, the fastest way to build AI apps, products, and features. Writer's unique full-stack design makes it easy to prototype, deploy, and test AI apps, allowing developers to build with APIs, a drag-and-drop open-source Python framework, or a no-code builder, so you have flexibility to build the way you want.
Writer comes with a suite of top-ranking LLMs and has built-in RAG for easy integration with your data. Check it out if you're looking to streamline how you build and integrate AI apps.
DARIO'S PICKS
Excited to share our prompt tuning playbook! (Not an official product. Just authors tips & tricks for better prompting). I'm most excited about first half on mental models for post-training & prompting. Feedback/forks welcome! #LLM#PromptEngineering
github.com/varungodbole/p…
Varun Godbole (@VarunGodbole)
2:41 PM • Nov 9, 2024
Varun Godbole and Ellie Pavlick, software engineers at Google DeepMind, shared their LLM Prompt Tuning Playbook, a practical guide with tips for improving your prompting skills without requiring deep technical knowledge.
Top tips:
Be clear and specific: Define what you want the AI to do, specifying the task, format, tone, and length.
Example: Instead of saying "Summarize this," use "Summarize the following report in 200 words, highlighting key findings and recommendations."
Use positive language: Instruct the AI on what to do rather than what not to do to avoid confusion.
Example: "Use simple language to explain the concept," instead of "Don't use technical jargon."
Keep prompts concise: Avoid overloading the AI with too many instructions at once; break complex tasks into smaller steps.
Example: Break down a request like "Analyze the data and create a report with charts and recommendations" into two steps: 1) "Analyze the data and provide key insights," then 2) "Create a report with charts based on the insights."
Provide context: Offer relevant background information to guide the AI's response effectively.
Example: "As a financial analyst, explain the implications of the new tax law for small businesses."
Iterate and refine: Test your prompts and adjust based on the AI's outputs to improve results.
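To make the tips above concrete, here is a minimal Python sketch of how you might assemble a specific prompt from its parts and split a big request into smaller steps. It is not taken from the playbook: `PromptSpec`, `build_prompt`, and `split_into_steps` are hypothetical names made up for illustration, and the example only builds prompt strings rather than calling any model.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the pieces the tips mention."""
    task: str
    context: str = ""
    output_format: str = ""
    tone: str = ""
    length: str = ""


def build_prompt(spec: PromptSpec) -> str:
    """Join the non-empty pieces into one specific prompt (context first)."""
    parts = [spec.context, spec.task, spec.output_format, spec.tone, spec.length]
    return " ".join(p.strip() for p in parts if p.strip())


def split_into_steps(steps: list[str]) -> list[str]:
    """Number each sub-task so it can be sent as its own, shorter prompt."""
    return [f"Step {i}: {s}" for i, s in enumerate(steps, start=1)]


# Clear, specific, positively phrased, with context up front.
prompt = build_prompt(PromptSpec(
    context="You are a financial analyst.",
    task="Summarize the following report.",
    output_format="Highlight key findings and recommendations.",
    tone="Use simple language.",
    length="Keep it to 200 words.",
))

# One overloaded request broken into two smaller ones.
steps = split_into_steps([
    "Analyze the data and provide key insights.",
    "Create a report with charts based on the insights.",
])

print(prompt)
for step in steps:
    print(step)
```

Keeping each piece in its own field makes the "iterate and refine" tip easier to follow: you can change the tone or length, rebuild the prompt, and compare outputs without rewriting everything else.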
Why it matters: While it's true that some of today's big models are less sensitive to prompt variations than their predecessors, knowing the basics of effective prompting can still save you a lot of time and boost your productivity.
Beyond quick tips, the full playbook dives into mental models for prompting, not just techniques, and distills years of experience into actionable advice for getting more useful outputs from LLMs.
FROM OUR PARTNERS
Looking for your next job? Express can help. Express Employment Professionals has put more than 10 million people to work in the last 40 years. Express has a wide variety of jobs available, in all industries. With flexible schedules, competitive pay, and access to a variety of benefits, what are you waiting for? Let ExpressPros help you land your next great job today. And the best part? They don't charge any fees to help you find one.
DARIO'S PICKS
Remember Elon Musk's AI chatbot Grok? Yes, the one with the image generator that lets you generate almost anything.
It has so far been limited to premium, paid users on X.com. However, several users in certain regions have posted that they now have free access with some limits: 10 queries per two hours on Grok-2 and 20 queries per two hours on Grok-2 mini.
In addition to image generation, Grok got image understanding capabilities last month.
Why it matters: xAI is currently in talks to raise several billion more. Might this be a way to strengthen the pitch deck with a bigger user base for its AI? Or is it simply adjusting to the market? After all, the other leading general assistant chatbots do offer some level of free usage.
THAT'S ALL FOLKS!
Was this email forwarded to you? Sign up here. Want to get in front of 13,000 AI enthusiasts? Work with me. This newsletter is written & curated by Dario Chincha.
Affiliate disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes link to products and other newsletters. Please assume these are affiliate links. If you choose to subscribe to a newsletter or buy a product through any of my links then THANK YOU. It will make it possible for me to continue to do this.