Open Token Exchange · Futures Market for AI Inference

01 A new commodity

Intelligence is becoming a commodity.

Inference has surpassed training to become AI’s dominant cost, approaching two-thirds of all compute, up from roughly a third in 2023. [1]

[1] Deloitte finds inference is roughly two-thirds of all compute in 2026, up from a third in 2023 and half in 2025. Deloitte, “More compute for AI, not less,” Nov 2025.

This explosion is driven by the extraordinary growth in token consumption volume, which is doubling every year and on track to grow more than 20× by 2030. [2] By revenue, the inference market is already over $100 billion and is set to more than double, reaching $255 billion by the end of the decade. [3]

[2] Goldman Sachs projects token consumption multiplying 24× to 120 quadrillion tokens per month between 2026 and 2030, about 2.2× per year. Goldman Sachs Research, May 2026.

[3] The global AI inference market is projected to grow from $106.15 billion in 2025 to $254.98 billion by 2030, a 19.2% CAGR. MarketsandMarkets, AI Inference Market 2025–2030.

Figure: The token explosion. Series: token volume and market size ($B). Sources: Goldman Sachs Research, May 2026; MarketsandMarkets, AI Inference Market 2025–2030.

This shift turns the output of a model into something an economist would recognize as a commodity. A unit of inference, one million tokens, is now bought and sold much like a barrel of oil or a kilowatt-hour of electricity.

We are standing at the cusp of commoditized intelligence.

02 The problem

The missing instrument.

Here is the problem. Every company that builds on AI now carries a large, growing, and unavoidable inference bill. When the cost gets big enough, predictability becomes a necessity.

In every other commodity market, that exact situation calls for a futures market: a way to agree today on the price you will pay tomorrow. For oil, electricity, and carbon, such markets exist. For intelligence, none does.

03 The product

Two products. One market for intelligence.

An open futures market needs two things that build on each other: a public price for a unit of intelligence, and a place to trade that price forward. We are building both.

The Index

A public, reproducible price for a unit of intelligence.

It rests on two definitions. The Standard Inference Token (SIT) normalizes one unit of model output to a fixed capability bar, so stronger models count for proportionally more. The Token Price Index (TPI) is the volume-weighted average of what qualified providers charge for it, with no single provider allowed to dominate the figure.

The SIT capability bar (qualifying thresholds)

Benchmark	Minimum score
MMLU	≥ 86%
HumanEval	≥ 67%
GSM8K	≥ 92%
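The capability bar can be read as a simple qualification check. The sketch below assumes a model qualifies only if it clears all three thresholds at once; that rule, and the names `SIT_BAR` and `meets_sit_bar`, are illustrative assumptions rather than part of the published definition.

```python
# Thresholds taken from the capability bar above, in percent.
SIT_BAR = {"MMLU": 86.0, "HumanEval": 67.0, "GSM8K": 92.0}


def meets_sit_bar(scores: dict[str, float]) -> bool:
    """Return True if every benchmark score clears its threshold.

    Assumption: a missing benchmark counts as a failure.
    """
    return all(
        name in scores and scores[name] >= floor
        for name, floor in SIT_BAR.items()
    )


# Hypothetical benchmark results for two models.
strong_model = {"MMLU": 88.0, "HumanEval": 70.0, "GSM8K": 95.0}
weak_coder = {"MMLU": 88.0, "HumanEval": 60.0, "GSM8K": 95.0}

qualified = [meets_sit_bar(strong_model), meets_sit_bar(weak_coder)]
```

Only providers whose models pass a check like this would feed prices into the index.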

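TPI_t = Σ_i w_i · P_{i,t}

P_{i,t}: price qualified provider i charges per SIT at time t

w_i: provider i's capped volume weight (the weights sum to 1)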
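A minimal sketch of how a capped, volume-weighted index of this kind could be computed. The cap value (50%), the redistribution rule (excess weight is spread pro rata over uncapped providers), and the function names are assumptions made for illustration; the source only states that no single provider may dominate.

```python
def capped_weights(volumes: list[float], cap: float) -> list[float]:
    """Volume shares with no weight above `cap`; excess is redistributed pro rata."""
    if any(v <= 0 for v in volumes):
        raise ValueError("volumes must be positive")
    n = len(volumes)
    if cap * n < 1 - 1e-12:
        raise ValueError("cap is too small for the weights to sum to 1")

    total = sum(volumes)
    weights = [v / total for v in volumes]
    capped = [False] * n
    while True:
        over = [i for i in range(n) if not capped[i] and weights[i] > cap + 1e-12]
        if not over:
            return weights
        for i in over:
            capped[i] = True
            weights[i] = cap
        # Share what is left among uncapped providers, in proportion to volume.
        free = [i for i in range(n) if not capped[i]]
        remaining = 1.0 - cap * sum(capped)
        free_total = sum(volumes[i] for i in free)
        for i in free:
            weights[i] = remaining * volumes[i] / free_total


def token_price_index(prices: list[float], volumes: list[float], cap: float = 0.5) -> float:
    """TPI_t as the capped volume-weighted average of provider prices ($/SIT)."""
    weights = capped_weights(volumes, cap)
    return sum(w * p for w, p in zip(weights, prices))


# Hypothetical providers: one holds 70% of volume and is capped at 50%.
prices = [2.0, 3.0, 4.0]
volumes = [70.0, 20.0, 10.0]
weights = capped_weights(volumes, cap=0.5)
tpi = token_price_index(prices, volumes, cap=0.5)
```

With these inputs the uncapped average would be $2.40, while the capped index is about $2.67, because the dominant low-price provider counts for less.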

The Token Futures Market

A cash-settled futures market built on the index.

Buyers lock tomorrow’s token price today; suppliers with spare capacity guarantee their revenue. Because the index is public and hard to manipulate, the whole market can settle against it. Since tokens cannot be physically delivered, every contract settles in cash against the TPI, exactly as electricity and compute futures already do.

A token futures contract (cash-settled)

Term	Specification
Underlying	TPI ($/SIT)
Settlement	Cash vs index
Tenors	1 to 12 months
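Cash settlement can be illustrated with a few lines of arithmetic. The sketch below assumes the buyer purchases tokens at a spot price equal to the TPI at settlement (no basis risk) and ignores fees and margin; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenFuture:
    locked_price: float  # $/SIT agreed when the contract is traded
    quantity: float      # number of SIT units covered

    def long_payoff(self, settlement_tpi: float) -> float:
        """Cash paid to the buyer at settlement (negative: the buyer pays)."""
        return (settlement_tpi - self.locked_price) * self.quantity


def buyer_net_cost(contract: TokenFuture, settlement_tpi: float) -> float:
    """Spot purchase cost minus the futures payoff."""
    spot_cost = settlement_tpi * contract.quantity
    return spot_cost - contract.long_payoff(settlement_tpi)


contract = TokenFuture(locked_price=2.0, quantity=1000.0)
cost_if_price_rises = buyer_net_cost(contract, settlement_tpi=3.0)
cost_if_price_falls = buyer_net_cost(contract, settlement_tpi=1.5)
```

In both cases the net cost comes out at $2,000, the locked price times the quantity: the payoff offsets whatever the spot price does, which is what "locking tomorrow's price today" means under these assumptions.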

04 Why prices move

Why the price fell, and why that ends.

Since 2023, the price of a fixed level of intelligence has fallen by roughly tenfold a year, one of the steepest cost declines in the history of computing. Four forces drove it down.

Labs subsidize. The leading labs price tokens below cost to win users and lock in market share.
Chips got faster. Each new generation of inference hardware produces more tokens per dollar, but slowly: performance per dollar improves only about 1.3× a year, doubling roughly every two and a half years.
Models got leaner. Distillation and smaller architectures cut the compute behind a given answer by roughly 3× a year, halving it every eight months. This is the fastest of the four forces, and also a finite one.
Competition intensified. Open-weight challengers undercut the incumbents hard (DeepSeek launched at roughly 90% below prevailing prices), and each new frontier model turns last year’s premium capability into this year’s commodity.
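The annual rates quoted for hardware and model efficiency convert to doubling (or halving) times with a single logarithm. A small sketch; the function name is illustrative.

```python
import math


def doubling_time_years(annual_factor: float) -> float:
    """Years for a quantity that grows `annual_factor`x per year to double."""
    if annual_factor <= 1:
        raise ValueError("annual_factor must exceed 1")
    return math.log(2) / math.log(annual_factor)


# Hardware: performance per dollar improving about 1.3x a year.
hardware_years = doubling_time_years(1.3)

# Models: compute per answer falling about 3x a year, so it halves on this schedule.
model_months = 12 * doubling_time_years(3.0)

print(f"hardware doubles in {hardware_years:.1f} years")   # 2.6 years
print(f"model compute halves in {model_months:.1f} months")  # 7.6 months
```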

Two of these are pure supply mechanics, captured in a simple identity:

Q_token = (η_H · η_A / C_E) · K

C_E: energy cost ($/kWh)

η_H: hardware efficiency (FLOPS/$)

η_A: algorithm efficiency (tokens/FLOP)

K: total capital deployed

Faster chips raise hardware efficiency and leaner models raise algorithmic efficiency, and both multiply the tokens a fixed amount of capital and energy can produce. Subsidy and competition then push the market price below even what those mechanics require.
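As a rough illustration, the identity can be treated as a proportional scaling rule: hold capital and energy cost fixed, and output scales with the product of the two efficiencies. The inputs below are normalized to 1 and are not real measurements; the function name is hypothetical.

```python
def token_output(eta_h: float, eta_a: float, c_e: float, k: float) -> float:
    """Q_token = (eta_H * eta_A / C_E) * K, read as a scaling relationship."""
    if c_e <= 0:
        raise ValueError("energy cost must be positive")
    return eta_h * eta_a / c_e * k


# Baseline year, all inputs normalized to 1.
base = token_output(eta_h=1.0, eta_a=1.0, c_e=1.0, k=1.0)

# One year later, if hardware improved 1.3x and algorithms 3x,
# with the same capital and energy cost.
next_year = token_output(eta_h=1.3, eta_a=3.0, c_e=1.0, k=1.0)

gain = next_year / base  # about 3.9x more tokens from the same capital
```

Because the two efficiencies multiply, a slowdown in either one drags the whole product down, and a rise in energy cost divides it.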

But every one of these forces is near its limit, and a fifth has begun to push the other way.

Subsidy ends. Below-cost pricing cannot outlast the race for share. Headline price declines are already slowing toward roughly 40% a year, with some services nudging back up.
The buildout slows. Chip stock can double every seven months on paper, but it is gated by power, which grows only about 15% a year and takes five to seven years to bring online.
Chips run short. That 1.3× a year is already near the physical floor, and a supply crunch (HBM sold out for 2026, year-long GPU lead times) can push prices the other way.
Good enough arrives. Open models are already good enough for most work, so chasing an ever-higher frontier stops driving prices down. Buyers settle on what works, and the capability they actually use stops getting cheaper.
Demand explodes. Above all, autonomous agents consume tokens at a scale no human ever did: business token use grew more than tenfold in the sixteen months to early 2026, a single agent burns 5 to 30 times the tokens of a chat, and demand is forecast to multiply another 24× by 2030. Their appetite is inelastic: they cannot drop to a weaker model when prices rise.

Line them up and the cross looks inevitable: efficiency buys about 3× a year and is slowing toward its 1.3× hardware floor, while demand has lately grown closer to 8× a year. The downward forces are decelerating; the upward one is not.
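The demand figures are quoted over different horizons, so it helps to put them on a common per-year basis. A short sketch, assuming smooth compound growth over each period; the function name is illustrative.

```python
def annualized_factor(total_factor: float, months: float) -> float:
    """Per-year growth factor implied by `total_factor` growth over `months`."""
    if total_factor <= 0 or months <= 0:
        raise ValueError("inputs must be positive")
    return total_factor ** (12 / months)


# Business token use: at least tenfold in sixteen months.
# Tenfold is a lower bound, so this is a floor on the annual rate.
business_floor = annualized_factor(10, 16)   # about 5.6x a year

# Forecast: 24x between 2026 and 2030, i.e. over four years.
forecast = annualized_factor(24, 48)         # about 2.2x a year
```

Either rate sits well above the 1.3× a year that hardware alone delivers.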

Figure: Why prices swing: demand outruns what can be built. Series: demand and supply; the gap between them is volatility. Illustrative growth paths, indexed to 2023.

05 No inventory

A commodity with no inventory.

And because a token cannot be stored, nothing absorbs the gap. It is produced and consumed in the same instant, with no warehouse to draw down when demand spikes and none to fill when it falls. Supply has to meet demand in real time, so any mismatch shows up at once as a price swing. This is why electricity, which also cannot be stored, is one of the most volatile commodities on Earth. Intelligence is about to join it, and the shift comes in three phases:

Figure: Token prices: a steep one-way fall, then two-way swings. Decline anchored to reported figures; Phase III is illustrative.

I

Supply-driven decline 2023–2025

All four forces push together; prices fall by roughly tenfold a year.

II

Rebalancing 2025–2027

Demand grows faster than data centers, chips, and power can be built. The decline slows, and the first rebounds appear.

III

Demand-driven volatility post-2027

A single popular application can multiply token demand in days, while new supply takes years to arrive: one to three for a data center, and far longer for the power and substations to feed it. The mismatch produces electricity-style swings.

Demand moves at the speed of software. Supply moves at the speed of construction. The distance between those two speeds is where volatility lives.

06 The evidence

Does the hedge actually work?

A hedge is only worth building if it measurably lowers risk. Calibrate a standard price model to the dynamics we expect (a downward trend, reversion toward a moving average, and occasional sharp upward jumps) and the answer is clear. A buyer who hedges cuts the volatility of their procurement costs by roughly 62 to 78 percent in every scenario tested. The same model shows why the need is real: about 15 percent of simulated price paths contain a spike of 100 percent or more within three years. A risk that large, with no instrument to manage it, is the gap this market fills.

62–78% cut in procurement-cost volatility, every scenario tested

~15% of price paths spike 100%+ within three years
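For readers who want to experiment, here is a minimal simulation sketch of the kind of model described: a downward drift, reversion toward a moving average, and occasional upward jumps. Every parameter value is a hypothetical placeholder rather than the calibration behind the figures above, and a "spike" is assumed to mean the price reaching twice its running minimum. In this simplified setup, with no basis risk, the volatility cut equals the hedge ratio by construction, so it does not reproduce the 62 to 78 percent range.

```python
import math
import random
import statistics


def simulate_path(rng, months=36, p0=1.0, drift=-0.05, kappa=0.3,
                  sigma=0.10, jump_prob=0.03, jump_size=0.8, ma_window=6):
    """Monthly prices: log-price reverts toward a drifting moving average, with rare jumps."""
    log_p = math.log(p0)
    path = [p0]
    for _ in range(months):
        window = path[-ma_window:]
        target = math.log(sum(window) / len(window)) + drift  # moving average, drifting down
        log_p += kappa * (target - log_p)                      # mean reversion
        log_p += sigma * rng.gauss(0.0, 1.0)                   # ordinary noise
        if rng.random() < jump_prob:                           # occasional sharp upward jump
            log_p += jump_size
        path.append(math.exp(log_p))
    return path


def has_spike(path, factor=2.0):
    """True if the price ever reaches `factor` times its running minimum."""
    low = path[0]
    for price in path[1:]:
        if price >= factor * low:
            return True
        low = min(low, price)
    return False


def procurement_cost(path, locked_price, hedge_ratio, qty=1.0):
    """Cost of buying `qty` each month, with `hedge_ratio` of it fixed at `locked_price`."""
    return sum(qty * (hedge_ratio * locked_price + (1 - hedge_ratio) * price)
               for price in path[1:])


def hedge_study(n_paths=2000, hedge_ratio=0.7, locked_price=1.0, seed=0):
    """Return (fractional cut in cost volatility, share of paths with a spike)."""
    rng = random.Random(seed)
    paths = [simulate_path(rng) for _ in range(n_paths)]
    unhedged = [procurement_cost(p, locked_price, 0.0) for p in paths]
    hedged = [procurement_cost(p, locked_price, hedge_ratio) for p in paths]
    cut = 1 - statistics.pstdev(hedged) / statistics.pstdev(unhedged)
    spike_share = sum(has_spike(p) for p in paths) / n_paths
    return cut, spike_share


cut, spike_share = hedge_study(n_paths=500, hedge_ratio=0.7, seed=1)
```

A richer study would add basis risk between the index and a buyer's actual provider, which is one reason a realistic hedge removes less than all of the volatility.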

07 Roadmap

How to build it.

A market like this cannot be declared into existence, and the sequence is the whole thesis. Each stage is cheap to run, proves out the next, and leaves behind an asset the eventual exchange is built on.

Match by hand · Building now

Today the market is thin and sell-biased: with prices still falling, more parties want to lock in a sale than a purchase. So we do not open an order book and wait for one to form. We broker it. We find owners of spare compute willing to sell tokens forward at a discount, the natural shorts, and companies that want to fix their token costs, the natural longs, and we match them deal by deal. This takes little capital and carries little balance-sheet risk. More important, every trade we broker quietly builds the two assets that matter most: the price data behind the index (the TPI) and the capability standard that defines a unit of intelligence (the SIT). We leave this stage not as a broker but as the owner of the benchmark everyone else will have to price against.

Productize the deal · Next

Once the matchmaking has steady flow and the index has a track record, we replace bespoke deals with one standard product: a redeemable voucher, a prepaid claim on future inference the holder can redeem for real tokens or resell to exit. Because it can always be redeemed for the real thing, arbitrage keeps its price honest and we never have to trust an outside oracle. This turns a hand-run brokerage into a two-sided market that scales without us standing in the middle of every trade.

Open the exchange · Gated

With a proven index and a liquid voucher market, the full cash-settled futures follow, settling against the TPI and open to the speculators and market-makers who add depth. This is the prize: whoever owns the settlement layer for intelligence owns a toll on every contract written against it.