šŸ§™šŸ¼ OpenAI's performance plateau

Also: Google engineers' prompting playbook

Happy Monday, wizards.

What’s a soup, compared to a sauce, or a drink? There are a lot of edge cases 🍜

Fortunately, Claude can help us make sense of it.

Here’s what’s brewing in AI today.

DARIO’S PICKS

Fresh off the rumour mill. A new article by The Information (paywalled) reports that OpenAI’s next model, which is expected to be released later this year, isn’t as big an improvement as the jump from GPT-3 to GPT-4. In fact, it might not be reliably better than its predecessors in some areas (coding is mentioned). One of the key challenges in training more intelligent models is the diminishing availability of training data.

To tackle the slowdown in performance gains, OpenAI has reportedly set up a foundations team to focus on new ways to improve its models, such as leveraging synthetic data and improving the post-training process.

Sam Altman previously confirmed that the company has some “very good releases” coming later this year, albeit nothing called GPT-5. He also said that “the thing that will feel like the next giant breakthrough will be agents”.

Why it matters: Unless the rumours are totally missing the mark, it’s becoming clear that OpenAI’s next model release won’t be blowing our minds, at least not as much as earlier releases did. Going by Altman’s recent statement on Reddit, it seems more plausible that the next “breakthrough” will be agents. In other words, it will be about workflow rather than raw performance: orchestrating existing models so that they can carry out complex, multi-step tasks or projects on their own.

TOGETHER WITH WRITER

The fastest way to build AI apps

We’re excited to introduce Writer AI Studio, the fastest way to build AI apps, products, and features. Writer’s unique full-stack design makes it easy to prototype, deploy, and test AI apps, allowing developers to build with APIs, a drag-and-drop open-source Python framework, or a no-code builder, so you have flexibility to build the way you want.

Writer comes with a suite of top-ranking LLMs and has built-in RAG for easy integration with your data. Check it out if you’re looking to streamline how you build and integrate AI apps.

DARIO’S PICKS

Varun Godbole and Ellie Pavlick, software engineers at Google DeepMind, shared their LLM Prompt Tuning Playbook, a practical guide with tips for improving your prompting skills without requiring deep technical knowledge.

Top tips:

  • Be clear and specific: Clearly define what you want the AI to do, specifying the task, format, tone, and length.

    • Example: Instead of saying "Summarize this," use "Summarize the following report in 200 words, highlighting key findings and recommendations."

  • Use positive language: To avoid confusion, tell the AI what to do rather than what not to do.

    • Example: "Use simple language to explain the concept," instead of "Don't use technical jargon."

  • Keep prompts concise: Avoid overloading the AI with too many instructions at once; break complex tasks into smaller steps.

    • Example: Break down a request like "Analyze the data and create a report with charts and recommendations" into two steps: 1) "Analyze the data and provide key insights," then 2) "Create a report with charts based on the insights."

  • Provide context: Offer relevant background information to guide the AI's response effectively.

    • Example: "As a financial analyst, explain the implications of the new tax law for small businesses."

  • Iterate and refine: Test your prompts and adjust based on the AI's outputs to improve results.
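The tips above can be turned into a small helper for assembling prompts. The following is only a minimal sketch, not something taken from the playbook: `PromptSpec`, `build_prompt`, and `split_into_steps` are hypothetical names invented for this illustration, and the code only builds prompt strings without calling any model.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a clear, specific prompt."""
    task: str                # what the model should do, phrased positively
    context: str = ""        # background or role, e.g. "As a financial analyst"
    output_format: str = ""  # e.g. "a bulleted list"
    length: str = ""         # e.g. "200 words"


def build_prompt(spec: PromptSpec) -> str:
    """Assemble one prompt string, skipping any part that was left empty."""
    parts = []
    if spec.context:
        parts.append(spec.context.strip())
    parts.append(spec.task.strip())
    if spec.output_format:
        parts.append(f"Format: {spec.output_format.strip()}.")
    if spec.length:
        parts.append(f"Length: {spec.length.strip()}.")
    return "\n".join(parts)


def split_into_steps(steps: list[str]) -> list[str]:
    """Turn one overloaded request into numbered, single-purpose prompts."""
    return [f"Step {i}: {step.strip()}" for i, step in enumerate(steps, start=1)]


# Examples reused from the tips above.
summary_prompt = build_prompt(PromptSpec(
    task="Summarize the following report, highlighting key findings and recommendations.",
    length="200 words",
))
report_steps = split_into_steps([
    "Analyze the data and provide key insights.",
    "Create a report with charts based on the insights.",
])

print(summary_prompt)
for step in report_steps:
    print(step)
```

Keeping task, context, format, and length as separate fields makes it easier to iterate: you can change one part of the prompt, compare the outputs, and leave the rest untouched.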

Why it matters: While it’s true that some of today’s big models are less sensitive to prompt variations than their predecessors, knowing the basics of effective prompting can still save you a lot of time and boost your productivity.

Beyond quick tips, the full playbook dives into mental models for prompting, not just techniques, and distills years of experience into actionable advice for getting more useful outputs from LLMs.

FROM OUR PARTNERS

Looking for your next job? Express can help. Express Employment Professionals has put more than 10 million people to work in the last 40 years. Express has a wide variety of jobs available, in all industries. With flexible schedules, competitive pay, and access to a variety of benefits, what are you waiting for? Let ExpressPros help you land your next great job today. And the best part? They don’t charge any fees to help you find one.

DARIO’S PICKS

Remember Elon Musk’s AI chatbot Grok? Yes, the one with the image generator that lets you generate almost anything.

It has so far been limited to premium, paid users on X.com. However, several users in certain regions have posted that they now have free access, with some limits: 10 queries per two hours on Grok-2 and 20 queries per two hours on Grok-2 mini.

In addition to image generation, Grok got image understanding capabilities last month.

Why it matters: xAI is currently in talks to raise several billion more. Might this be a way to strengthen the pitch deck with a bigger user base for its AI? Or is it simply adjusting to the market? After all, the other leading general assistant chatbots do offer some level of free usage.

THAT’S ALL FOLKS!

Was this email forwarded to you? Sign up here.

Want to get in front of 13,000 AI enthusiasts? Work with me.

This newsletter is written & curated by Dario Chincha.

Affiliate disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes link to products and other newsletters. Please assume these are affiliate links. If you choose to subscribe to a newsletter or buy a product through any of my links, then THANK YOU. It makes it possible for me to continue doing this.