What's brewing in AI

🧙🏼 OpenAI's performance plateau

Also: Google engineer's prompting playbook

Dario Chincha
November 11, 2024

Happy Monday, wizards.

What’s a soup, compared to a sauce, or a drink? There are a lot of edge cases 🍜

Fortunately, Claude can help us make sense of it.

Here’s what’s brewing in AI today.

DARIO’S PICKS

1. OpenAI’s next breakthrough model isn’t around the corner

Fresh off the rumour mill: a new article by The Information (paywalled) reports that OpenAI’s next model, expected to be released later this year, is a smaller improvement over its predecessor than GPT-4 was over GPT-3. In fact, it might not be reliably better than its predecessors in some areas, coding among them. One of the key challenges in training more intelligent models is the dwindling supply of training data.

To tackle the slowdown in performance gains, OpenAI has reportedly set up a foundations team to focus on new ways to improve its models – leveraging synthetic data and improving the post-training process.

Sam Altman previously confirmed that the company has some “very good releases” coming later this year, albeit nothing called GPT-5. He also said that “the thing that will feel like the next giant breakthrough will be agents”.

Why it matters: Unless the rumours are totally missing the mark, it’s becoming clear that OpenAI’s next model release won’t be blowing our minds, at least not as much as earlier releases did. Going by Altman’s recent statement on Reddit, it seems more plausible that the next “breakthrough” will be agents: a workflow advance rather than a raw performance one, orchestrating existing models so that they can carry out complex, multi-step tasks or projects on their own.

TOGETHER WITH WRITER

Writer RAG tool: build production-ready RAG apps in minutes

RAG in just a few lines of code? We’ve launched a predefined RAG tool on our developer platform, making it easy to bring your data into a Knowledge Graph and interact with it using AI. With a single API call, Writer LLMs will intelligently call the RAG tool to chat with your data.

Integrated into Writer’s full-stack platform, it eliminates the need for complex vendor RAG setups, making it quick to build scalable, highly accurate AI workflows just by passing a graph ID of your data as a parameter to your RAG tool.

Learn more about our production-ready RAG tooling here.

DARIO’S PICKS

2. Two DeepMind engineers’ prompt tuning playbook

Excited to share our prompt tuning playbook! (Not an official product. Just authors tips & tricks for better prompting). I'm most excited about first half on mental models for post-training & prompting. Feedback/forks welcome! #LLM #PromptEngineering
github.com/varungodbole/p…
— Varun Godbole (@VarunGodbole)
2:41 PM • Nov 9, 2024

Varun Godbole and Ellie Pavlick, software engineers at Google DeepMind, shared their LLM Prompt Tuning Playbook – a practical guide with tips to improve prompting skills without requiring deep technical knowledge.

Top tips:

Be clear and specific: Clearly define what you want the AI to do, specifying the task, format, tone, and length.
- Example: Instead of saying "Summarize this," use "Summarize the following report in 200 words, highlighting key findings and recommendations."
Use positive language: Instruct the AI on what to do rather than what not to do to avoid confusion.
- Example: "Use simple language to explain the concept," instead of "Don't use technical jargon."
Keep prompts concise: Avoid overloading the AI with too many instructions at once; break complex tasks into smaller steps.
- Example: Break down a request like "Analyze the data and create a report with charts and recommendations" into two steps: 1) "Analyze the data and provide key insights," then 2) "Create a report with charts based on the insights."
Provide context: Offer relevant background information to guide the AI's response effectively.
- Example: "As a financial analyst, explain the implications of the new tax law for small businesses."
Iterate and refine: Test your prompts and adjust based on the AI's outputs to improve results.
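
If you write prompts from code, the tips above can be turned into a small helper. Here is a minimal Python sketch, not something taken from the playbook itself: the names `PromptSpec`, `build_prompt` and `split_into_steps` are made up for illustration, and the output layout is just one reasonable choice. It spells out the task, format, tone and length, puts context first, and breaks a big request into smaller step prompts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PromptSpec:
    """Hypothetical container for the pieces the tips mention."""
    task: str                      # what the AI should do, stated positively
    context: str = ""              # background, e.g. the role to take
    output_format: str = ""        # e.g. "bullet points"
    tone: str = ""                 # e.g. "plain, non-technical"
    max_words: Optional[int] = None


def build_prompt(spec: PromptSpec) -> str:
    """Assemble one clear, specific prompt from the spec."""
    parts = []
    if spec.context:
        parts.append(spec.context.strip())   # context goes first
    parts.append(spec.task.strip())
    if spec.output_format:
        parts.append(f"Format: {spec.output_format}.")
    if spec.tone:
        parts.append(f"Tone: {spec.tone}.")
    if spec.max_words is not None:
        parts.append(f"Length: at most {spec.max_words} words.")
    return "\n".join(parts)


def split_into_steps(steps: List[str]) -> List[str]:
    """Turn one overloaded request into numbered single-step prompts."""
    total = len(steps)
    return [f"Step {i} of {total}: {s.strip()}"
            for i, s in enumerate(steps, start=1)]


if __name__ == "__main__":
    spec = PromptSpec(
        task="Summarize the following report, highlighting key findings "
             "and recommendations.",
        context="You are a financial analyst.",
        output_format="bullet points",
        tone="simple language",
        max_words=200,
    )
    print(build_prompt(spec))
    for step in split_into_steps([
        "Analyze the data and provide key insights.",
        "Create a report with charts based on the insights.",
    ]):
        print(step)
```

The last tip, iterate and refine, then comes down to tweaking the fields of the spec and comparing the outputs you get back.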

Why it matters: While it’s true that some of today’s big models are less sensitive to prompt variations than their predecessors, knowing the basics of effective prompting can still save you a lot of time and boost your productivity.

Beyond quick tips, the full playbook dives into mental models for prompting, not just techniques, and distills years of experience into actionable advice for getting more useful outputs from LLMs.

FROM OUR PARTNERS

Looking for your next job? Express can help. Express Employment Professionals has put more than 10 million people to work in the last 40 years. Express has a wide variety of jobs available, in all industries. With flexible schedules, competitive pay, and access to a variety of benefits, what are you waiting for? Let ExpressPros help you land your next great job today. And the best part? They don’t charge any fees to help you find one.

Get started today

DARIO’S PICKS

3. xAI is testing a free version of its AI chatbot Grok

Remember Elon Musk’s AI chatbot Grok? Yes, the one with the image generator that lets you generate almost anything.

It has so far been limited to premium, paid users on X.com. However, several users in certain regions have posted that they now have free access, with some limits: 10 queries per two hours on Grok-2 and 20 queries per two hours on Grok-2 mini.

In addition to image generation, Grok got image understanding capabilities last month.

Why it matters: xAI is currently in talks to raise several billion more. Might this be a way to strengthen the pitch deck with a bigger user base for its AI? Or is it simply adjusting to the market? After all, the other leading general assistant chatbots do offer some level of free usage.

THAT’S ALL FOLKS!

Was this email forwarded to you? Sign up here.

Want to get in front of 13,000 AI enthusiasts? Work with me.

This newsletter is written & curated by Dario Chincha.

Affiliate disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes link to products and other newsletters. Please assume these are affiliate links. If you choose to subscribe to a newsletter or buy a product through any of my links then THANK YOU – it will make it possible for me to continue to do this.