A few firms set the tempo for AI.
Nvidia manufactures the scarce compute.
OpenAI converts compute into frontier models.
Microsoft, Oracle, CoreWeave, and others rent capacity, bundle software, and finance the cycle.
The diagram works as a procurement map: circle size signals gravity; arrow direction signals dependence; color signals whether the tie is hardware, services, or capital.

Nvidia: center of gravity
Nvidia sells the rarest shovels in a gold rush. Those shovels are GPUs plus the software that makes them useful at scale.
When you hear “CUDA,” think of it as Nvidia’s software toolkit that lets developers squeeze maximum speed from its chips. That toolkit is why switching away isn’t like swapping phones. It’s closer to moving factories.
Clouds and AI labs don’t place small orders.
They sign multi-year commitments because the real risk isn’t price; it’s not getting delivery on time.
Under the hood there are two chokepoints most people never see.
One is “HBM,” a kind of ultra-fast memory stacked right next to the chip.
The other is advanced packaging, which is the precision assembly that binds chips, memory, and networking together.
If either is short, finished GPUs are short. That’s why Nvidia’s margins look unusual for a hardware company and why backlogs matter more than headlines about list prices.
Think about the flywheel.
Strong demand improves Nvidia’s profits.
Those profits fund the prepayments and supply deals that unlock more manufacturing capacity.
More capacity enables bigger models.
Bigger models renew demand.
You don’t need to believe in AI as magic to see why this loop holds while supply is tight and competitors are still closing gaps.
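Here is the loop as a toy model. Every number below is invented; the point is the shape, not the values: margins stay fat while demand outruns capacity, and the loop loosens once supply catches up.

```python
# Toy flywheel with invented numbers; illustrative only.
# demand -> profit -> capacity -> bigger models -> demand
demand = 100.0    # GPU demand, arbitrary units
capacity = 80.0   # deliverable supply, arbitrary units

for year in range(1, 6):
    shipped = min(demand, capacity)        # sales are supply-limited
    scarcity = demand / capacity           # >1 means a backlog
    margin = 0.5 + 0.2 * max(0.0, min(scarcity - 1.0, 1.0))  # scarcity props up margins
    profit = shipped * margin
    capacity += 0.3 * profit               # profits fund new capacity
    demand *= 1.25                         # bigger models renew demand
    print(f"year {year}: scarcity {scarcity:.2f}, margin {margin:.0%}")
```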
OpenAI: algorithmic kingmaker
OpenAI turns datacenter spend into models people want to use. That sounds obvious, but the mechanics matter.
Training a frontier model is a staged industrial process: months of renting massive compute, careful experiments, then a product phase where inference (the act of answering user requests) burns steady GPU cycles.
Because the training phase is so hungry, OpenAI spreads work across multiple clouds and chip suppliers.
Datacenters draw power like small cities. A six-gigawatt plan means power contracts, substations, and cooling become part of the product roadmap.
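To see why, run the arithmetic. Assume roughly one kilowatt per high-end GPU with server overhead included, and a facility overhead factor around 1.3; both figures are assumptions, not vendor specs.

```python
# Back-of-envelope; every input is an assumption, not a spec sheet.
plan_gw = 6.0   # announced power plan
gpu_kw = 1.0    # assumed draw per high-end GPU, server overhead included
pue = 1.3       # assumed facility overhead: cooling, power conversion

usable_gw = plan_gw / pue
gpus = usable_gw * 1e6 / gpu_kw   # 1 GW = 1,000,000 kW
print(f"{plan_gw:.0f} GW supports roughly {gpus / 1e6:.1f} million GPUs")
# -> about 4.6 million GPUs under these assumptions
```

At that scale, the chip order and the power contract are the same decision.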
To reduce dependence on a single supplier, OpenAI is bringing AMD into the mix.
You’ll hear “ROCm,” which is AMD’s answer to Nvidia’s CUDA.
If ROCm keeps improving and the hardware lands on time, OpenAI gets bargaining power and a second lane for growth.
If it lags, the economics tilt back to Nvidia.
Either way, OpenAI’s business rests on three questions you can track without a PhD: how fast models get better, what it costs to serve a query, and how many enterprises wire those models into real workflows.
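The second question yields to a napkin. Here is a hedged sketch of cost per query; every input below is an assumption, and real figures swing widely with model size and batching.

```python
# Illustrative serving economics; all inputs are assumptions.
gpu_hour_cost = 3.00     # assumed GPU rental rate, USD per hour
tokens_per_sec = 400     # assumed per-GPU throughput with realistic batching
tokens_per_query = 700   # assumed prompt plus response length

queries_per_hour = tokens_per_sec * 3600 / tokens_per_query
cost_per_query = gpu_hour_cost / queries_per_hour
print(f"~${cost_per_query:.4f} per query")  # ~$0.0015 with these inputs
```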
Cloud supernodes
Microsoft is the easy one to explain. Azure sells capacity, and Microsoft bundles AI into products you already pay for: Office, security, developer tools. That bundling turns GPU hours into recurring software revenue, which is a nicer business than raw infrastructure.
Oracle is doing something different. It is acting like a capacity broker for AI workloads, signing very large agreements with model providers and buying GPUs aggressively to fulfill them.
The upside is obvious: if you can add regions, power, and networking faster than rivals, you win customers who need to train now, not in twelve months.
The constraint is also obvious: you must secure energy, land, and fiber at a pace that traditional IT planning never required.
Then there are specialists like CoreWeave and Nebius.
Think of them as GPU dispatchers. They don’t try to be everything. They try to be fast.
If a lab needs 20,000 high-end GPUs next quarter, a specialist might deliver sooner than a hyperscaler with a full queue.
The trade-off is that specialists live and die by access to power, chips, and network backbones they don’t fully control.
Second-order suppliers and challengers
AMD is the practical challenger because it already ships competitive silicon in some workloads and its software is catching up.
When you hear that a customer will deploy “6 GW of AMD GPUs,” translate it as a conditional bet: if the software stack stays stable and memory supply is there, AMD’s share can move faster than most people expect.
Intel has real assets in networking and accelerators, but to win big training jobs it must close gaps in performance per watt and developer tooling. That is fixable, but not overnight.
On the demand side, names like xAI, Mistral, Figure, Harvey, Ambience, and Nscale matter for a simple reason.
Every product that users adopt becomes a steady inference load. That steady load, multiplied across companies, justifies the next datacenter build.
You don’t need every app to be a hit. You need enough sticky ones to keep utilization high.
The money loop
Capital funds datacenters.
Datacenters train models.
Models power applications.
Applications produce revenue that justifies more capital.
When utilization is high and energy is available, payback shortens and the loop tightens.
When either slips, purchase commitments and delivery schedules become fragility points.
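A minimal payback sketch makes the point. The prices below are assumptions; what matters is how hard payback swings on utilization alone.

```python
# GPU fleet payback; every number here is an assumption.
capex_per_gpu = 40_000   # assumed all-in cost: GPU, server, network share
rental_per_hour = 3.00   # assumed revenue per GPU-hour
opex_per_hour = 0.60     # assumed power, space, and staff per GPU-hour

for utilization in (0.9, 0.7, 0.5):
    hourly_margin = utilization * rental_per_hour - opex_per_hour
    payback_years = capex_per_gpu / (hourly_margin * 8760)  # 8760 hours/year
    print(f"utilization {utilization:.0%}: payback {payback_years:.1f} years")
```

Under these assumptions, payback runs a bit over two years at 90 percent utilization and roughly five at 50 percent.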
Bottlenecks and chokepoints
Power is the new gating input.
A single customer planning multiple gigawatts signals that substation access, interconnect queues, and cooling will pace deployments.
Supply chains for HBM and advanced packaging remain tight, which sustains pricing power for successful vendors.
Software lock-in matters as much as silicon; portability improves on paper faster than it does in production.
Where margins accrue
In the near term, the scarce thing earns the spread, and today the scarce thing is cutting-edge silicon delivered on schedule.
As software platforms mature and integration deepens, more value shifts to model providers that can sell reliability, security, and governance to enterprises.
Over a longer horizon, the best returns accrue to applications that become part of daily work, because those customers pay for outcomes, not tokens or teraflops, and they churn less when hardware prices normalize.
Fault lines to watch
Three contests will define the slope of outcomes.
The first is AMD versus CUDA gravity; parity in performance and tooling would reprice procurement.
The second is Oracle and specialized clouds versus hyperscalers; speed of region buildouts and energy access will decide wallet share.
A third, slower contest pits closed platforms against open-source and xAI on quality, latency, and price.
Regulatory scrutiny on vertical ties and exclusive deals runs in parallel and can alter terms with little notice.