The Truth Can’t Be Improved Upon
Let’s talk about Large Language Models (LLMs) for a minute. Don’t get me wrong, these things are pretty cool, but there’s been a lot of chatter lately about them hitting a wall. Well, I’ve got some thoughts on that.
First off, the idea that we’re hitting a data wall? That’s a load of nonsense. What’s really going on is that LLMs look like they’re plateauing on benchmarks, and there’s a simple reason for that.
Here’s the deal: a lot of the questions we throw at these models have a single correct answer. Take ‘2 + 2 = 4’, for example. Once an LLM gets that right, there’s nowhere else to go. You can’t improve on the truth, folks.
Let that sink in for a second. You can’t improve upon truth.
This is why it might look like we’re not making progress from an intelligence perspective. But in reality, it’s just that many of the questions we’re asking have true answers that most state-of-the-art models are already nailing.
So, when you’re already hitting the bullseye, it’s pretty darn hard to do better than that. It’s not a data wall we’re up against; it’s just tough to squeeze out extra gains when these models are already spot-on so often.
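To make that concrete, here’s a toy sketch of how a fixed-answer eval gets scored. The model call is faked (a real one would hit an actual LLM), but the point stands: once every answer matches the ground truth, the score is pinned at 100% and there’s no headroom left.

```python
# Toy sketch of a fixed-answer eval. `model_answer` is a fake
# stand-in for an LLM call, hard-coded to already nail everything.

def model_answer(question: str) -> str:
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(question, "")

eval_set = [("2 + 2", "4"), ("capital of France", "Paris")]

# Accuracy against ground truth caps at 1.0 -- you can't beat the truth.
correct = sum(model_answer(q) == truth for q, truth in eval_set)
accuracy = correct / len(eval_set)
print(f"accuracy = {accuracy:.0%}")  # 100%, with nowhere left to go
```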
What’s Next for LLMs?
Now, don’t think for a second that we’re done improving these bad boys. Oh no, we’ve got some interesting paths ahead. One big area that’s ripe for improvement is agentic, action-based capabilities: models that can plan and act, not just answer questions.
Here’s what I’m thinking:
- We need better evals (that’s evaluations for you non-techies out there).
- Current leaderboards like LMSYS’s Chatbot Arena are mostly judging based on what humans find good or bad.
- But here’s the kicker: hardly any of these questions are about planning.
So, what do we do? We need to create a planning-based eval: an evaluation where every question is a request to complete a task, and we judge the LLM’s ability to create a solid plan given its tools and resources.
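Here’s a rough sketch of what one item in that eval might look like. Everything in it (`PlanningTask`, `score_plan`, the keyword rubric) is my own invention for illustration, not an existing benchmark; a real version would use human or model judges rather than keyword matching.

```python
# Hypothetical sketch of a planning-based eval item and scorer.
from dataclasses import dataclass

@dataclass
class PlanningTask:
    request: str        # the task the user wants done
    tools: list[str]    # tools the model is allowed to plan around
    rubric: list[str]   # things a solid plan must cover

def score_plan(plan: str, task: PlanningTask) -> float:
    # Crude keyword rubric; a real eval would judge plan quality
    # with humans or a grader model, not substring matching.
    plan_lower = plan.lower()
    hits = sum(item.lower() in plan_lower for item in task.rubric)
    return hits / len(task.rubric)

task = PlanningTask(
    request="Book me a flight to Tokyo next month under $900",
    tools=["flight_search", "calendar", "payments"],
    rubric=["flight_search", "budget", "dates", "confirmation"],
)

plan = ("1. Check calendar for free dates. "
        "2. Use flight_search with those dates and a $900 budget. "
        "3. Hold the best option and ask for confirmation before paying.")

print(f"plan score: {score_plan(plan, task):.0%}")
```

The key difference from today’s leaderboards: the question isn’t “which answer do humans prefer?” but “did the model lay out a workable sequence of steps using the tools it actually has?”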
This is where things get exciting. Imagine an LLM that doesn’t just spit out information, but can actually help you plan and execute tasks. That’s the kind of improvement that could really move the needle.
In the end, while it might look like we’re hitting a wall with LLMs, we’re really just at the beginning of a new chapter. The truth might not need improving, but how we use these models? Now that’s where the real potential lies.
So, keep your eyes peeled, folks. The world of LLMs is about to get a whole lot more interesting.
- Joseph
Sign up for my email list to know when I post more content like this. I also post my thoughts on Twitter/X.