A comprehensive new study has revealed that open-source artificial intelligence models consume significantly more computing resources than their closed-source rivals when performing identical tasks, potentially undermining their cost advantages and reshaping how enterprises evaluate AI deployment strategies.
The research, conducted by AI firm Nous Research, found that open-weight models use between 1.5 and 4 times more tokens — the basic units of AI computation — than closed models like those from OpenAI and Anthropic. For simple knowledge questions, the gap widened dramatically, with some open models using up to 10 times more tokens.
Measuring Thinking Efficiency in Reasoning Models: The Missing Benchmark https://t.co/b1e1rJx6vZ
We measured token usage across reasoning models: open models output 1.5-4x more tokens than closed models on identical tasks, but with huge variance depending on task type (up to… pic.twitter.com/LY1083won8
— Nous Research (@NousResearch) August 14, 2025
“Open weight models use 1.5–4× more tokens than closed ones (up to 10× for simple knowledge questions), making them sometimes more expensive per query despite lower per‑token costs,” the researchers wrote in their report published Wednesday.
The findings challenge a prevailing assumption in the AI industry that open-source models offer clear economic advantages over proprietary alternatives. While open-source models generally cost less per token to run, the study suggests this advantage can be “easily offset if they require more tokens to reason about a given problem.”
The real cost of AI: Why ‘cheaper’ models could break your budget
The research examined 19 different AI models across three categories of tasks: basic knowledge questions, mathematical problems, and logic puzzles. The team measured “token efficiency” — how many computational units models use relative to the complexity of their solutions — a metric that has received little systematic study despite its significant cost implications.
“Token efficiency is a critical metric for several practical reasons,” the researchers noted. “While hosting open weight models may be cheaper, this cost advantage could be easily offset if they require more tokens to reason about a given problem.”
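To make that trade-off concrete, here is a back-of-the-envelope sketch. The per-token prices and token counts below are hypothetical placeholders rather than figures from the report; the point is simply that a lower per-token price can be outweighed once a model spends several times as many tokens per answer.

```python
# Hypothetical cost comparison: a cheaper per-token model can still cost
# more per query if it needs several times as many tokens to answer.

def cost_per_query(tokens_used: int, price_per_million_tokens: float) -> float:
    """Inference cost of a single query, in dollars."""
    return tokens_used * price_per_million_tokens / 1_000_000

# Illustrative numbers only, not taken from the Nous Research study.
closed = cost_per_query(tokens_used=500, price_per_million_tokens=10.00)
open_weight = cost_per_query(tokens_used=2_000, price_per_million_tokens=3.00)

print(f"Closed model:      ${closed:.4f} per query")       # $0.0050
print(f"Open-weight model: ${open_weight:.4f} per query")  # $0.0060
```

In this made-up case the open-weight model is priced at less than a third per token, yet its fourfold token usage makes each query slightly more expensive, which is exactly the pattern the researchers describe.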

The inefficiency is particularly pronounced for Large Reasoning Models (LRMs), which use extended “chains of thought” to solve complex problems. These models, designed to think through problems step by step, can consume thousands of tokens pondering simple questions that should require minimal computation.
For basic knowledge questions like “What is the capital of Australia?” the study found that reasoning models spend “hundreds of tokens pondering simple knowledge questions” that could be answered in a single word.
Which AI models actually deliver bang for your buck
The research revealed stark differences between model providers. OpenAI’s models, particularly its o4-mini and newly released open-source gpt-oss variants, demonstrated exceptional token efficiency, especially for mathematical problems. The study found OpenAI models “stand out for high token efficiency in math problems,” using up to three times fewer tokens than other commercial models.
Among open-source options, Nvidia’s llama-3.3-nemotron-super-49b-v1 emerged as “the most token efficient open weight model across all domains,” while newer models like Magistral showed “exceptionally high token usage” as outliers.
The efficiency gap varied significantly by task type. While open models used roughly twice as many tokens for mathematical and logic problems, the difference ballooned for simple knowledge questions where efficient reasoning should be unnecessary.

What enterprise leaders need to know about AI computing costs
The findings have immediate implications for enterprise AI adoption, where computing costs can scale rapidly with usage. Companies evaluating AI models often focus on accuracy benchmarks and per-token pricing, but may overlook the total computational requirements for real-world tasks.
“The better token efficiency of closed weight models often compensates for the higher API pricing of those models,” the researchers found when analyzing total inference costs.
The study also revealed that closed-source model providers appear to be actively optimizing for efficiency. “Closed weight models have been iteratively optimized to use fewer tokens to reduce inference cost,” while open-source models have “increased their token usage for newer versions, possibly reflecting a priority toward better reasoning performance.”

How researchers cracked the code on AI efficiency measurement
The research team faced unique challenges in measuring efficiency across different model architectures. Many closed-source models don’t reveal their raw reasoning processes, instead providing compressed summaries of their internal computations to prevent competitors from copying their techniques.
To address this, researchers used completion tokens — the total computational units billed for each query — as a proxy for reasoning effort. They discovered that “most recent closed source models will not share their raw reasoning traces” and instead “use smaller language models to transcribe the chain of thought into summaries or compressed representations.”
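In practice, that proxy can be read straight from API billing metadata. The sketch below assumes an OpenAI-compatible chat completions endpoint and the official Python client; the model name is a placeholder, and this is not the study’s own evaluation harness.

```python
# Minimal sketch, assuming an OpenAI-compatible API: the completion tokens
# billed for a response stand in for reasoning effort when the raw chain of
# thought is not exposed.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o4-mini",  # placeholder; any reasoning-capable model
    messages=[{"role": "user", "content": "What is the capital of Australia?"}],
)

print("Answer:", response.choices[0].message.content)
print("Completion tokens billed:", response.usage.completion_tokens)
```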
The study’s methodology included testing with modified versions of well-known problems to minimize the influence of memorized solutions, such as altering variables in mathematical competition problems from the American Invitational Mathematics Examination (AIME).
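As a rough illustration of that kind of perturbation (a generic sketch, not the code the team released), one could swap each numeric constant in a problem statement for a nearby random value; in a real benchmark the reference answer would then need to be recomputed for the altered problem.

```python
import random
import re

def perturb_numbers(problem: str, rng: random.Random) -> str:
    """Replace every integer in a problem statement with a nearby random value,
    reducing the chance a model simply recalls a memorized solution."""
    def swap(match: re.Match) -> str:
        return str(int(match.group()) + rng.randint(1, 9))
    return re.sub(r"\d+", swap, problem)

rng = random.Random(0)
original = "Find the remainder when 7^2024 is divided by 1000."  # made-up example
print(perturb_numbers(original, rng))
```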

The future of AI efficiency: What’s coming next
The researchers suggest that token efficiency should become a primary optimization target alongside accuracy for future model development. “A more densified CoT will also allow for more efficient context usage and will counter context degradation during challenging reasoning tasks,” they wrote.
The release of OpenAI’s open-source gpt-oss models, which demonstrate state-of-the-art efficiency with “freely accessible CoT,” could serve as a reference point for optimizing other open-source models.
The complete research dataset and evaluation code are available on GitHub, allowing other researchers to validate and extend the findings. As the AI industry races toward more powerful reasoning capabilities, this study suggests that the real competition may not be about who can build the smartest AI — but who can build the most efficient one.
After all, in a world where every token counts, the most wasteful models may find themselves priced out of the market, regardless of how well they can think.