Why small special purpose models are the future.
The idea that a so-called big world model can do and knows everything is a big dream and a well-cherished narrative. The big AI vendors have been selling this thesis for a long time. The idea, and I believe the general perception, is still that Large Language Models (LLMs) will lead us directly to Artificial General Intelligence (AGI), i.e. an all-knowing, all-capable AI. And when this AGI is achieved, we will be able to delegate all unpleasant work to this type of model. That’s not going to happen.
Struggling with adoption
Every day, I see companies create many great proofs of concept (PoCs) but then fail to operationalize them.
There are usually several reasons for this. For one thing, PoCs often involve people who do not understand the business problem well enough. Automation in large enterprises, and this is where the real potential lies, is time-consuming and complicated. If you don’t have a deep understanding of all the details, you will fail.
Secondly, the requirements for operational business are often fundamentally different from the “environment” of the PoC.
People in line functions often have only a weary smile for the PoCs. What’s more, nobody in the line organization supports radical efficiency efforts, because they saw at the branch on which entire departments are sitting. To make matters worse, the “Altmans and Amodeis” of this world were still proclaiming until recently that AI would soon replace all jobs anyway. This has an effect on the adoption curve, but unfortunately not a positive one.
Story for investors
Altman and Amodei of course know very well that this is not true. Why they tell this story anyway (even if they have backtracked in recent days) is simple: the message is aimed at investors and C-level executives.
It is meant to generate a “fear of missing out”. After all, if a technology could quickly replace the majority of human labor at low cost, it would be the greatest business opportunity ever. That this would not be a particularly clever thing in macroeconomic terms goes without saying. The message is catching on, and investors in particular seem to have put all rationality (and their calculators) to one side.
Money is burning like tinder right now
The investors’ money is also desperately needed. All LLM providers burn through vast sums of money, be it in research, i.e. the development of models, or in operation, known as “inference”. Without liquidity, both the further development and the operation of the models will run out of fuel.
For a long time, the theory was that the cost problems would be solved, on the one hand by incredibly high and rapidly growing sales and on the other by radically lower inference costs. “Moore’s Law” served as the analogy until Nvidia CEO Huang qualified it and the buzzword “scaling laws” took hold. But neither research costs nor operating costs are falling to the extent required.
Although sales and growth are gigantic from a conventional point of view, they are inadequate compared to the financial bets that have been made. And on the inference side, demand is for now causing “unit costs” to rise, to such an extent that it is reflected in chip prices.
Rethinking becomes necessary
The drama is on the horizon. New headlines arrive every day: the bubble is going to burst, perhaps everything is exaggerated. Just five months ago, I was quite alone in my belief that the AI hype as such could not end well. Today, I see more and more heads nodding.
Yet it would be just as wrong and naive to think that AI applications are not groundbreaking, or that they will not change the way we work and live more fundamentally and faster than anything before.
So what needs to happen for the benefits of this technology to reach companies “at scale”, so to speak? I think there are three things:
The models need to get better
LLMs have impressive capabilities across the board, but if you want to automate a specific group of tasks really well, the performance is often not good enough for real-world use. The reason is simple: the training data that went into the model for this task is too scarce and/or too poor in quality.
The models have to become cheaper
Operating hyperscaler models is expensive, precisely because the models are so large. To make matters worse, today’s prices are highly “subsidized”. In other words, the actual costs, and we’re not even talking about margins, are orders of magnitude higher. You can imagine what will happen to prices once investors want to see a return on their money. That’s why we’re now seeing rate limits and price increases.
Operations must become more secure
“Sovereign data” is the hot topic of the moment. And rightly so, because it is about much more than “We no longer trust the Americans”. In the era of data and artificial intelligence, data sovereignty is a hard business reality.
For too long, companies have sent their data to hyperscalers in good faith. The LLM providers, “data-hungry by nature”, are happy about the influx of data and have shown few scruples in the past about simply taking what is on the table without asking. Better sorry than safe.
Small Special Purpose Models (SSPM)
The solution to this dilemma is so-called Small Special Purpose Models, often referred to as Small Language Models (SLMs) or Domain-Specific Models.
These are small models that are trained for a single domain, for example document processing, with a huge pool of high-quality data. They can be operated on modest hardware. This opens the door to self-determined, local and “air-gapped” operation without an Internet connection.
We had to learn this lesson ourselves at Parashift and have been working on such SVLMs for quite some time, which has reduced our inference costs. These models handle the handful of typical document automation tasks orders of magnitude better than the LLMs we have been using for some time.
In the next few months, we will launch a model product that can easily handle the workload of a medium-sized German company on, for example, my son’s EUR 1,500 gaming PC, without any connection to the Internet. The unit economics are correspondingly fantastic.
The rationale is simple: in business automation, I streamline tasks into processes and give a model the same tasks every time. It only needs to be able to do a handful of tasks really well. All other skills are just costly ballast.
Adam Smith reloaded
If you think about these things, Adam Smith’s pin factory inevitably comes to mind as an example of the division of labor. The same mechanisms that apply to human labor also apply – what a surprise – to machine labor.
If the AI industry wants sustainable, profitable and, yes, rapid adoption of this technology, and I as part of this industry want that, then we have to think in different setups.
Data, data, data
To build really good SSPMs, two things are necessary: a large amount of extremely good, legally acquired training data and in-depth domain knowledge of the task to be solved. No single company can have both for every kind of work the economy requires.
Outlook
SSPMs are the first chapter of a long-lasting, profitable and sustainable AI boom. Once the hype has cooled and the associated disappointment has set in, I predict a greatly accelerated combination of human and machine work. I am thinking of a swarm of specialized models, a topic that we at Parashift have been working on in various forms for a long time.
What has been missing so far and will be really difficult to solve is the autonomous interpretation of tasks and subsequent coordination of the small models – in order to then bring the work results back together in a structured way.
This is where I believe large “router” models will play an important role, because generalist knowledge and skills are required. And such a model may well cost a little more.
So if you want to use AI to automate work in your company in the future, look out for specialized models that you can operate safely and without strings attached. I’m sure we’ll see more and more offers and options like this in the near future.
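To make the division of labor concrete, here is a minimal Python sketch of the idea, not a description of any existing product. Every name in it (SubTask, classify_document, extract_total, plan, run) is hypothetical, the two “specialists” are trivial stand-ins for small trained models, and the plan function merely imitates what a large router model would do when it interprets a job.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A "specialist" stands in for one small special-purpose model:
# it takes text and returns a structured partial result.
Specialist = Callable[[str], dict]


@dataclass
class SubTask:
    kind: str     # which specialist should handle this sub-task
    payload: str  # the input handed to that specialist


def classify_document(text: str) -> dict:
    # Hypothetical stand-in for a small classification model.
    label = "invoice" if "invoice" in text.lower() else "other"
    return {"document_type": label}


def extract_total(text: str) -> dict:
    # Hypothetical stand-in for a small extraction model:
    # return the token that follows "Total:", if there is one.
    tokens = text.split()
    if "Total:" in tokens and tokens.index("Total:") + 1 < len(tokens):
        return {"total": tokens[tokens.index("Total:") + 1]}
    return {"total": None}


SPECIALISTS: Dict[str, Specialist] = {
    "classify": classify_document,
    "extract_total": extract_total,
}


def plan(job: str) -> List[SubTask]:
    # Assumption: in a real system a large generalist "router" model
    # would interpret the job here. This sketch uses a fixed plan.
    return [SubTask("classify", job), SubTask("extract_total", job)]


def run(job: str) -> dict:
    # Dispatch each sub-task to its specialist and merge the
    # partial results into one structured answer.
    merged: dict = {}
    for task in plan(job):
        merged.update(SPECIALISTS[task.kind](task.payload))
    return merged
```

Calling run("Invoice 4711 Total: 1500 EUR") returns a single dictionary with the document type and the extracted total. The expensive generalist is only needed inside plan, while the recurring work is done by the cheap specialists.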
