The AI Race Has Moved Beyond the Numbers
The next phase of AI is UX: modalities, form factors, and especially devices
From chat to voice; from screens to wearables; from AI assistants you occasionally summon to AI companions that work alongside you. The real race isn’t to build the largest model, but to design the most human way to use it.
It took becoming a dad for this realization to hit home for me. As some of you know, I am now a dad to a beautiful 4-month-old named Ella (which is why I haven’t written a newsletter recently!). Before I became a dad, I knew that 1) I wanted to capture all the special moments with her, and 2) I didn’t want to have to pull out a phone to take those photos.
My solution was to buy the Meta Ray-Bans, Meta’s smart sunglasses. They can take photos with voice commands AND take phone calls! Plus, they are far more stylish than the ill-fated Google Glass (which was, in my opinion, both ugly and ahead of its time).

Wearing the Meta Ray-Bans with my daughter Ella (who is already full of opinions!)
But what stood out about the Ray-Bans — and what I wasn’t expecting — was how naturally their AI assistant fit into my daily life. It’s a fluid experience to ask the AI glasses any question you have. Since I became a dad, I’ve been asking my Ray-Bans things like: How long should her wake window be? Does this poop look normal? What should I ask the pediatrician? How old until my baby can crawl?
Speaking hands-free, in the moment, has been far more natural and useful than pulling out a phone and having to type into a chat box.
The AI Race Has Moved Beyond LLM Benchmarks
For the last three years, the tech world has obsessed over building the “best” large language model — proving progress through benchmarks (essentially test scores for AI models). These scores fueled a race for bigger, faster, smarter models, with each new release from OpenAI, Anthropic, Google, and others promising extraordinary upgrades.
And while GPT-5 and its peers are undeniably impressive, we’re seeing signs of diminishing returns. It’s the “iPhone” moment for AI, where the form factor has been figured out. From here, improvements will feel more like upgrades than breakthroughs.
Take this chart comparing outputs from GPT-1 through GPT-5. Same exact prompt; wildly different results.

A comparison of GPT-1 through GPT-5 on the same prompt. Source: x.com/gdb
The progress is striking; GPT-1 spit out disjointed fragments, while GPT-5 can produce a polished, coherent limerick. Each step forward represents years of research, billions of dollars in training data and compute, and massive engineering feats. But the gains have clearly become more incremental with each iteration of GPT: at a certain point, the models figured out the basics of limerick structure, and the remaining improvements are subtler things like word choice and cleverness.
But here’s the thing: this chart obscures the real reason AI went mainstream. The biggest leap wasn’t the incremental language improvements from GPT-3 to GPT-5; it was OpenAI making GPT accessible to the public through ChatGPT at the end of 2022.
That breakthrough wasn’t about model size — it was about user experience (UX) and modality.
ChatGPT’s launch changed everything. Transformer models (the AI models behind LLMs) had existed for years, but chat made them accessible to the masses. A simple conversational UX brought AI to hundreds of millions of people — no PhD or code required.
Benchmarks ≠ utility. Meta’s Llama large language model is nowhere near as sophisticated as GPT-5, but it’s dramatically easier to access. I just say “Hey Meta” and I have a capable AI on hand, even if it isn’t the most powerful one out there.
With LLMs quickly sounding more human, yet still far from superintelligence, we’re entering a new phase where UX, not the model, will determine the next chapter of AI’s story.
Why the UX of AI Matters More Than Ever
The Meta Ray-Bans and Google’s Gemini-powered smart glasses are just the beginning. OpenAI’s rumored AI device, expected sometime next year, will likely be something you can wear that answers your questions on the fly, but it will also be proactive: think of a real-life assistant who suggests restaurants in a new neighborhood, or reminds you of critical tasks you forgot to complete (say, that grocery run you never made).
All of this reminds me of the Robin Williams movie Bicentennial Man. I was a huge fan growing up.

Robin Williams as the Bicentennial Man (Source: imdb.com)
Williams plays Andrew, a robot who starts as a housekeeper and manager but eventually shows creativity, asks for independence, falls in love, and becomes more human as the story progresses. The movie isn’t really about hardware; it’s about the changing interaction and interface between human and machine.
These AI-enabled devices — and others like them — will introduce AI to an entirely new group of people who’d never think to open a chat window. That’s what true AI assistants will look like.
But devices are just a small part of the evolution in how we interact with AI; it’s also about how we use AI in our daily lives. Look at vibe coding, for example. Instead of memorizing syntax or debugging errors for hours, you just describe what you want to build and AI writes the code. This UX enables millions of creators who previously couldn’t have built software at all (including my co-founder Matt, who has vibe coded an array of products).
Vibe coding is not just a better way to interact with LLMs; it’s a UX that has opened up coding and product invention to millions of people who previously had no way to turn their ideas into reality.
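To make that concrete, here’s a minimal sketch of the loop a vibe-coding tool runs, assuming the OpenAI Python SDK; the model name is a placeholder, and this illustrates the pattern rather than any particular product’s implementation:

```python
# Illustrative sketch of a vibe-coding loop; not any specific product's code.
# Assumes the OpenAI Python SDK is installed and OPENAI_API_KEY is set.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def vibe_code(description: str, filename: str = "app.py") -> None:
    """Describe what you want in plain English; save the generated code to a file."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": "Write one complete, runnable Python file. "
                                          "Reply with code only: no prose, no markdown fences."},
            {"role": "user", "content": description},
        ],
    )
    Path(filename).write_text(response.choices[0].message.content)

vibe_code("A command-line tool that logs a newborn's feedings to a CSV file.")
# Real vibe-coding tools close the loop: run the file, feed errors back, revise.
```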
At Octane AI, we’re building toward a vision where AI makes software easier to use. Rather than making customers click through endless menus to build a quiz, we’ve created a user experience where you just tell the AI what kind of quiz you want, and it does the rest. UX is the shortcut to making AI actually usable.
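Here’s the same idea as a sketch (my illustration, not Octane AI’s actual code): instead of returning freeform text, the model is constrained to a machine-readable spec that the app renders directly. The model name and JSON schema are assumptions.

```python
# Illustrative sketch: natural language in, structured quiz spec out.
# Not Octane AI's code; assumes the OpenAI Python SDK and its JSON mode.
import json
from openai import OpenAI

client = OpenAI()

def build_quiz(description: str) -> dict:
    """Turn a plain-English request into a renderable quiz spec."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        response_format={"type": "json_object"},  # force machine-readable output
        messages=[
            {"role": "system", "content": (
                "You design quizzes. Reply in JSON: "
                '{"title": str, "questions": [{"prompt": str, '
                '"options": [str], "answer_index": int}]}'
            )},
            {"role": "user", "content": description},
        ],
    )
    return json.loads(response.choices[0].message.content)

quiz = build_quiz("A three-question skincare quiz that recommends a product type.")
print(quiz["title"], "with", len(quiz["questions"]), "questions")
```

The design choice that matters is the constraint: freeform chat is great for conversation, but forcing the model into a schema is what turns a chat box into a product feature.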
At Theory Forge Ventures, we’re investing in companies building new AI modalities, like Decode, which turns your drawings into code, and Gumloop, which lets you build any kind of AI automation flow using a drag-and-drop interface. We firmly believe entrepreneurs can create huge companies by building new ways to interact with AI that make it useful and usable in every industry.
(If you’re building a new AI modality, UX layer, or form factor, hit me up.)
UX Is Where the Next Breakthroughs Will Happen
In the next three to five years, the biggest breakthroughs in AI won’t come from bigger LLMs; they’ll come from new user experiences and modalities, the ways we interact with AI. That’s not to say there won’t be breakthroughs in LLMs or other types of AI models that provide foundational leaps in capability (selective state-space models, world models, and liquid neural networks are all worth tracking). But right now, expecting the same leaps in LLM capabilities as we saw in 2020 is both unrealistic and counterproductive.
The iPhone’s most recent innovations have come not from Apple’s hardware but from entrepreneurs building world-changing apps like Spotify, Uber, and yes, ChatGPT. I’d argue we’re entering a similar phase with GPT-5 and LLMs as a whole: model improvement will slow down, but tinkerers, developers, and entrepreneurs are far from finding the limits of what’s possible with large language models and their APIs.
We’ve hit diminishing returns on raw model gains; the next S-curve is the AI UX layer—new modalities, form factors, and companions that fit into real life. Maybe someone will actually build the Bicentennial Man for real. (I’ll invest.)
What I’m Reading in the World of AI
Hackers, spies, and defenders alike are now using AI in cyber operations, with Russian intelligence caught deploying malicious code built with LLMs and cybersecurity firms racing to counter. It’s quickly becoming an AI arms race — and security AI is set to be one of the biggest investment categories of the next few years.
At the recent AI Film Festival, short films generated with Runway’s Gen-4 software drew both curiosity and backlash — sparking debate over whether AI outputs are art or just glossy imitations of human creativity. AI can make stunning visuals, but storytelling is still a deeply human craft.
FieldAI just raised $405M at a $2B valuation to build “brains” for robots — a reminder that while robotics lags behind LLMs, it’s the next frontier for making robots truly useful in the real world. (Don’t worry, if the robots go rogue, we have Will Smith.)
Elon Musk’s xAI has sued Apple and OpenAI, accusing them of monopolistic practices around ChatGPT’s iOS integration — but the lawsuit is unlikely to go anywhere.
Silicon Valley leaders, including Andreessen Horowitz and OpenAI’s Greg Brockman, are putting $100M into pro-AI super PACs ahead of the 2026 midterms. It’s a pragmatic move on their part, especially with how our current political system works.
Intel struck a deal with the Trump administration that converts $8.9B in CHIPS Act grants into a 10% government stake. I expect more deals like this ahead, where Washington ties funding to ownership in strategic tech firms.
TSMC is the most important company most people have never heard of. By cutting Chinese chipmaking equipment from its 2nm chip plants, it’s making sure it stays in Washington’s good graces.
New Stanford research shows AI is hurting young Americans’ job prospects in fields like software development and customer service — with employment for 22–25-year-old developers down nearly 20% since ChatGPT launched. The job disruption will continue, which is why it’s critical for professionals to gain AI skills.
Personal Notes

Introducing Ella Jo Parr — the newest Parr, and the reason I’m behind on newsletters.
Hi! I’m back! It’s been a few months since my last newsletter, but for a good reason — I’m now a dad! Ella Jo Parr was born on April 17th, and in just four months, she’s gone from sleeping beauty to an active, silly, and very aware baby with super clear opinions — and no problem expressing them! Now that I have started to find my rhythm, you will see more regular newsletters from me.
My co-founder Matt has been vibe coding a new way to analyze, understand, and get an edge on reading AI papers. I’ve been testing the beta, and it’s quite amazing to see the latest AI research the moment it hits arXiv. You can request early access here.
Our venture fund, Theory Forge Ventures, has now made 19 investments since launching last year. If you’re building the future of AI, DM me — even if you’re not raising right now.
I recently spoke at the Akash Networks AI conference about the future of AI. Here's a clip from my panel on the future of AGI and ASI.
Lastly, a special thank you to Audrey Cabrera for helping me get my newsletter back up and running!
Until the next newsletter,
~ Ben