Nvidia Vendor Financing Trap: Is the AI Infrastructure Bubble About to Burst?
- Tony Grayson
- Nov 20
By Tony Grayson, Tech Executive (ex-SVP Oracle, AWS, Meta) & Former Nuclear Submarine Commander

In my years managing physical infrastructure at Oracle, building AWS's global design and engineering practice, and running operations at Meta, I've seen technology cycles come and go. But what's happening right now in AI infrastructure financing isn't just another cycle; it's a boom built on the vendor's own money (Nvidia vendor financing), aggressive assumptions about useful life, and physics that don't care about your spreadsheet. This is not a problem we can afford to ignore.
We are witnessing a market in which the vendor finances its own customers to buy its own products, deployed into AI data centers that, on current roadmaps, will likely become functionally obsolete within 2–3 years.
Let me be clear: I'm not bearish on AI. I run a company building AI-ready data centers. But I am deeply concerned about how companies are financing this infrastructure boom, and more importantly, how few people understand the fundamental physics constraints we're hitting. We need honest discussions about asset lifecycles and financing sustainability in the AI industry.
The problem is fourfold:
Data Center Obsolescence: Physics vs. 10–15 year depreciation.
Circular Financing: Vendor-funded demand creating a potential AI bubble.
Hardware Depreciation Mismatch: 1–2 year economics vs. 3–6 year books.
Commoditization: TPUs/ASICs undercutting Nvidia’s margin stack.
Let's break down each one.
1. Infrastructure Obsolescence: The GB200 & Liquid Cooling Crisis
This is where my background in nuclear engineering and data center design becomes relevant. I've spent my career managing thermal loads, electrical distribution, and structural capacity at scale. And what I'm seeing now violates every principle of sustainable infrastructure design.
The Power Density Problem
Today's Nvidia GB200 NVL72 systems require approximately 120–132kW per rack, versus the traditional 6–10kW. But here's what keeps me up at night: Nvidia's own roadmap presented at GTC shows systems targeting approximately 600kW per cabinet by 2027 (the Rubin Ultra/Kyber generation), and if that trajectory continues, follow-on systems will effectively reach megawatt-class footprints per compute block.
That's not an upgrade path. That's a complete infrastructure replacement.
To put this in perspective: a typical enterprise data center built five years ago was designed for 6–10kW per rack. Even our "high-density" zones at AWS and Meta typically operated in the 20–30kW/rack range. Now we're talking about 120kW+ liquid-cooled racks today and 600kW by 2027. That's not 10% more cooling capacity; it's a different thermal engineering domain entirely.
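To make that jump concrete, here's a back-of-envelope sketch of what those densities mean for a fixed power envelope; the 10 MW hall is a round number I've assumed purely for illustration, and the per-rack figures are the ones above:

```python
# Back-of-envelope: how many racks fit in a fixed critical-power envelope?
# The 10 MW hall is an illustrative assumption; rack densities are those cited above.

HALL_CRITICAL_POWER_KW = 10_000  # assumed 10 MW of critical IT load

densities_kw_per_rack = {
    "legacy enterprise (8 kW)": 8,
    "hyperscale high-density zone (25 kW)": 25,
    "GB200 NVL72 (132 kW)": 132,
    "2027 roadmap cabinet (600 kW)": 600,
}

for label, kw in densities_kw_per_rack.items():
    racks = HALL_CRITICAL_POWER_KW // kw
    print(f"{label:40s} -> ~{racks:5d} racks per 10 MW hall")
```

The same building that once held over a thousand racks now holds a few dozen, which is why the binding constraint stops being floor space and becomes electrical distribution and heat rejection.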
The Weight Problem & Structural Limits
A GB200 NVL72 configuration weighs approximately 3,000 lbs. Compare that to standard server racks at 1,000–1,500 lbs. Most legacy raised floors can support only 1,500–2,000 lbs. per tile.
You cannot roll a 3,000 lbs. rack onto a standard raised floor without significant structural reinforcement or switching to a slab-on-grade design. And based on early Kyber chassis designs, next-generation systems could weigh twice as much as today's GB200 racks, or more.
Let me translate that: even if you build a "state of the art" facility for today's GB200 hardware, you're facing structural obsolescence in 2–3 years. The concrete you're pouring today already falls short of what the hardware arriving in 2027 will require.
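A quick, heavily simplified sketch of the floor-loading math makes the point; the rack weights and tile ratings are the figures above, while the two-tile footprint and the rolling worst case are my own illustrative assumptions:

```python
# Illustrative floor-load check using the figures cited above.
# Assumption (mine): a parked rack footprint spreads across roughly two 24" x 24"
# tiles, but while it is being rolled into place the casters can concentrate
# most of the weight onto a single tile.

GB200_LBS = 3_000
NEXT_GEN_LBS = 6_000           # "double or more", per early Kyber chassis estimates
TILE_RATINGS = (1_500, 2_000)  # typical legacy raised-floor tile ratings

for name, lbs in (("GB200 NVL72", GB200_LBS), ("next-gen estimate", NEXT_GEN_LBS)):
    parked = lbs / 2           # weight spread across ~2 tiles when static
    rolling = lbs              # worst case while moving: ~all weight over one tile
    print(f"{name}: parked ~{parked:,.0f} lbs/tile, rolling worst case ~{rolling:,.0f} lbs, "
          f"vs. tile ratings of {TILE_RATINGS[0]:,}-{TILE_RATINGS[1]:,} lbs")
```

Parked, a GB200 rack is already at the edge of a legacy tile rating; rolling it into place, or doubling the weight, puts you firmly into structural-reinforcement territory.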
The Architecture Problem: NVLink Domains
The NVL72 configuration isn't a traditional "server rack"; it's essentially a backplane with a copper spine (the NVLink switch tray) that connects all 72 GPUs into a single coherent domain. In practice, operators deploy it as a single 72-GPU NVLink domain spanning one rack. You don't get to scatter half the system across another row like legacy server racks; you must provision power and cooling for the whole 132kW block in one spot.
The copper cabling length limits are absolute. The physics of signal loss dictates the room's geometry. This defeats every standard density management strategy we've developed over 20 years. Hot-aisle/cold-aisle containment? It doesn't work at this scale. Distributed load balancing? Can't do it. Traditional PDU (Power Distribution Unit) sizing? Insufficient.
The Cooling Problem: Direct-to-Chip vs. Air
Here's where physics becomes brutal: you cannot practically cool 120–600kW racks with air at scale. These densities require liquid cooling, specifically direct-to-chip liquid cooling (DLC).
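To see why air can't keep up, here's a minimal heat-removal sketch using Q = ṁ·cp·ΔT; the fluid properties are textbook values, and the supply/return temperature deltas are assumptions of mine for illustration:

```python
# Minimal heat-removal sketch: flow required to carry away a rack's heat load,
# using Q = m_dot * cp * dT. Fluid properties are textbook values; the delta-T
# assumptions (10 C water, 15 C air) are mine and purely illustrative.

CP_WATER = 4186        # J/(kg*K)
CP_AIR = 1005          # J/(kg*K)
RHO_AIR = 1.2          # kg/m^3
DT_WATER = 10          # K, typical direct-to-chip supply/return delta
DT_AIR = 15            # K, generous hot/cold aisle delta
M3S_TO_CFM = 2118.9
LPS_TO_GPM = 15.85

for rack_kw in (132, 600):
    q = rack_kw * 1_000                       # heat load in watts
    water_lps = q / (CP_WATER * DT_WATER)     # kg/s of water ~ L/s
    air_m3s = q / (RHO_AIR * CP_AIR * DT_AIR) # volumetric airflow
    print(f"{rack_kw} kW rack: ~{water_lps * LPS_TO_GPM:.0f} GPM of water "
          f"or ~{air_m3s * M3S_TO_CFM:,.0f} CFM of air")
```

Fifty gallons a minute of treated water per rack is a solvable plumbing problem; fifteen-thousand-plus CFM through a single rack, let alone seventy thousand, is not something a raised-floor air system was ever built to do.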
"New cooling" means installing CDUs (Coolant Distribution Units), manifolds, rack-level water loops, and potentially rear-door heat exchangers. Marketing says "one rack," but physics demands massive adjacent CDUs and hydraulic support infrastructure. You can't just swap a tile.
Most of the existing colocation footprint lacks this infrastructure entirely. Retrofitting data centers means trenching concrete slabs to run supply and return lines, installing overhead piping with appropriate seismic bracing, adding leak detection, and ensuring your building's chiller plant can handle the thermal load. We're talking major construction, not a simple upgrade.
The Timeline Trap
Here's the killer insight from my infrastructure background: if I start designing a conventional GB200 data center today, I'm looking at 18–24 months for design, permitting, and construction. That puts commissioning in late 2026 or 2027.
By that timeline, I'm obsolete before I cut the ribbon. By then, 600kW-class systems will be shipping. My facility, purpose-built for 120–132kW racks, cannot accommodate 600kW loads without gutting the electrical distribution and cooling systems. The data center itself becomes a stranded asset.
2. Circular Financing Risks: Nvidia Vendor Financing, CoreWeave, and the Debt Trap
Here's the uncomfortable truth: Nvidia owns an estimated multi-billion-dollar stake in CoreWeave (approximately 7%, valued around $3B) and has backstopped up to $6.3B in take-or-pay capacity agreements through 2032. They've structured up to $100B in vendor financing for OpenAI in ten $10B tranches tied to deployment milestones. OpenAI's CFO, Sarah Friar, has publicly acknowledged that "most of the money will go back to Nvidia."
Let that sink in. The vendor finances its own customers to buy its own products.
Jay Goldberg at Seaport Global, who holds the only "sell" rating on Nvidia on Wall Street, states flatly: "Nvidia is buying demand here." Morgan Stanley's Lisa Shalett observed: "Nvidia is in a position to prop up customers so that it's able to grow."
And now we're beginning to see the lag. Nvidia's revenue growth has stepped down from roughly 110% to the mid-50s–60% range year over year. When growth rates are cut in half, it's a signal: either demand is normalizing, or the vendor financing strategy is hitting its natural limits.
The Lucent & Nortel Parallel
Historically, this hasn't ended well. In the late 1990s, Lucent and Nortel extended billions in vendor financing so telecom carriers could buy their equipment, then booked those vendor-financed sales as revenue. Nortel's market cap went from $398B to zero. The pattern eerily mirrors the present: accelerating vendor financing as organic growth slows. The gap between reported revenue and actual cash-generating demand widens. Eventually, the customers can't service the debt, the vendor's balance sheet deteriorates, and the whole structure collapses.
The question isn't whether Nvidia makes great chips—they do. The question is whether revenue backed by vendor financing to customers represents sustainable demand.
3. Accelerated Hardware Obsolescence: The "Big Short 2.0"
Nvidia has effectively moved to an annual product cycle: Hopper shipped in 2022, Blackwell in 2024, Vera Rubin arrives in 2026, and Rubin Ultra in 2027. Nvidia itself guides to an order-of-magnitude reduction in cost per token from Hopper to Blackwell on specific workloads.
Think about that: a 10x improvement in economics in two years.
Jonathan Ross, CEO of Groq, has argued that companies should amortize AI accelerators on 1–2-year schedules. Yet CoreWeave has reportedly adopted 6-year depreciation schedules for GPU infrastructure, whereas hyperscalers typically use 3–5-year schedules. Notably, AWS already reversed course from 6 years back to 5 years in 2025—a signal they recognize the mismatch.
Short sellers now bet that the actual useful life for AI accelerators runs 2–3 years, not the 5–6 years companies report. The "Big Short 2.0" thesis centers on companies overstating asset life to inflate earnings today while building massive write-offs tomorrow.
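Here's a simple sketch of what that accounting gap looks like on a single fleet. The dollar figure is a made-up round number; only the schedule lengths come from the discussion above:

```python
# Illustrative only: how depreciation schedule length changes reported earnings
# and the write-off left behind if the economic life is really ~2 years.
# The $100M fleet size is a hypothetical round number; the schedule lengths
# (2-year economic vs. 6-year book) come from the discussion above.

FLEET_COST = 100_000_000          # hypothetical GPU fleet purchase
ECONOMIC_LIFE_YEARS = 2           # what short sellers argue is the real useful life
BOOK_LIFE_YEARS = 6               # CoreWeave-style straight-line schedule

annual_book_expense = FLEET_COST / BOOK_LIFE_YEARS       # ~$16.7M/yr hits the P&L
annual_economic_loss = FLEET_COST / ECONOMIC_LIFE_YEARS  # ~$50M/yr of real value decay

understated_per_year = annual_economic_loss - annual_book_expense
residual_book_value = FLEET_COST - annual_book_expense * ECONOMIC_LIFE_YEARS

print(f"Book depreciation:      ${annual_book_expense/1e6:.1f}M per year")
print(f"Economic value decline: ${annual_economic_loss/1e6:.1f}M per year")
print(f"Earnings flattered by:  ${understated_per_year/1e6:.1f}M per year")
print(f"Book value remaining when the fleet is obsolete: ${residual_book_value/1e6:.1f}M")
```

On those made-up numbers, two-thirds of the fleet's cost is still sitting on the balance sheet at the moment the hardware stops earning its keep.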
Here's the operational reality: when performance improves by an order of magnitude per generation, last-generation hardware becomes uneconomical. Even if the chips still work, the OpEx on power consumption alone kills you. If a competitor can deliver the same workload for one-tenth the power cost on Blackwell or Rubin, your old Hopper fleet is instantly obsolete.
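The power math compounds the problem. Here's a rough sketch of the operating-cost gap; the 1 MW slice, the $0.08/kWh rate, and the 1.3 PUE are my own illustrative assumptions, while the ~10x figure is the Hopper-to-Blackwell improvement cited above:

```python
# Rough OpEx comparison: same workload on last-gen vs. current-gen silicon.
# Assumptions (mine, illustrative): a 1 MW slice of Hopper capacity, $0.08/kWh
# all-in power cost, PUE of 1.3, and the ~10x cost-per-token improvement cited
# above treated as ~10x less energy for the same token volume.

HOPPER_IT_LOAD_MW = 1.0
POWER_PRICE_PER_KWH = 0.08
PUE = 1.3
EFFICIENCY_GAIN = 10  # Hopper -> Blackwell, per Nvidia's guidance on some workloads
HOURS_PER_YEAR = 8760

hopper_kwh = HOPPER_IT_LOAD_MW * 1_000 * PUE * HOURS_PER_YEAR
blackwell_kwh = hopper_kwh / EFFICIENCY_GAIN

hopper_cost = hopper_kwh * POWER_PRICE_PER_KWH
blackwell_cost = blackwell_kwh * POWER_PRICE_PER_KWH

print(f"Hopper fleet power bill:    ${hopper_cost:,.0f}/yr")
print(f"Same workload on Blackwell: ${blackwell_cost:,.0f}/yr")
print(f"Annual penalty for running the old fleet: ${hopper_cost - blackwell_cost:,.0f}")
```

Scale that per-megawatt penalty across tens or hundreds of megawatts and the economic case for keeping the old fleet running evaporates quickly.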
4. Commoditization: Google TPUs vs. Nvidia Margins
This might be the most underestimated risk in the market. Google's Ironwood TPU pods perform on par with Nvidia's latest racks on many inference workloads, with Google claiming significantly better performance per watt. But here's the key: because Google doesn't pay Nvidia's 70–80% margin stack, it structurally keeps both the infrastructure margin and the model margin.
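A simple way to see why that matters; the $3M system price is a made-up round number, and only the 70–80% margin range comes from the text:

```python
# Illustrative: what a 70-80% vendor gross margin does to the buyer's capex per
# unit of compute. The $3M system price is a hypothetical round number; the
# margin range is the one cited above. This ignores R&D, packaging partners,
# and volume effects -- it is only meant to show the order of magnitude.

SYSTEM_PRICE = 3_000_000  # hypothetical price a merchant cloud pays per rack-scale system

for vendor_margin in (0.70, 0.80):
    vendor_cogs = SYSTEM_PRICE * (1 - vendor_margin)   # the underlying build cost
    multiple = SYSTEM_PRICE / vendor_cogs
    print(f"At {vendor_margin:.0%} gross margin: ~${vendor_cogs/1e6:.2f}M of underlying cost, "
          f"so the buyer pays ~{multiple:.1f}x the build cost")
```

A hyperscaler designing its own silicon doesn't capture all of that spread (foundry, HBM, and design costs are real), but even a fraction of a 3–5x gap at hyperscale volumes is billions, which is exactly why every one of them is building alternatives.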
Today, Gemini 2.5 Pro and other frontier models already undercut GPT-4.1-class pricing per token in many configurations. Anthropic has announced plans to access up to 1 million TPUs and well over a gigawatt of capacity. That's not a small pilot—that's tens of billions in committed infrastructure and a strategic shift away from Nvidia dependence.
Amazon has Trainium and Inferentia. Microsoft develops Maia. Meta has MTIA. Every hyperscaler builds alternatives because at their scale, Nvidia's margins represent billions in potential savings. The merchant AI compute market (CoreWeave, Lambda Labs) finds itself trapped: paying Nvidia's full margin stack while competing against hyperscalers who don't.
The Systemic Risk: Putting It All Together
The supplier model (Nvidia): Assumes continuous selling as products become obsolete faster than depreciation schedules can recognize the loss.
The merchant compute model (CoreWeave): Assumes they can generate sufficient cash flow to repay vendor financing while competing against hyperscalers.
The infrastructure model: Assumes facilities built today will remain viable for 10–15 years, despite physics showing obsolescence in 2–3 years.
The data starts to tell the story: revenue growth has stepped down from ~110% to the mid-50s–60% range year over year, even as vendor financing has accelerated. That's not a coincidence; it's a symptom. When organic demand can't sustain triple-digit growth, vendor financing fills the gap. But vendor financing doesn't create demand; it borrows demand from the future.
Conclusion: Physics, Math, and Accountability
In my nuclear submarine training, we had a saying: "You can't argue with physics." The reactor doesn't care about your mission timeline. The laws of thermodynamics don't negotiate.
The same holds true here. You can't cool 600kW racks with infrastructure engineered for 20–30kW. You can't load 6,000+ lbs. onto a floor rated for 2,000 lbs. You can't generate returns on assets carried on 6-year depreciation schedules when they become economically obsolete in 2 years. And you can't sustain triple-digit growth indefinitely through vendor financing.
You can support AI and still recognize systemic risk in how companies finance this infrastructure boom. The companies that survive this cycle will understand the physics, respect the math, and build sustainable business models. The ones that don't? Well, history suggests they'll join the long list of technology infrastructure companies that confused vendor-financed growth with real demand.
The music is still playing. The question is who is left standing when it stops.
What do you think? Are we seeing sustainable growth or a vendor-financed bubble? Drop your thoughts in the comments.
Tony Grayson
____________________________
Tony Grayson is a recognized Top 10 Data Center Influencer, a successful entrepreneur, and the President & General Manager of Northstar Enterprise + Defense. A former U.S. Navy Submarine Commander and recipient of the prestigious VADM Stockdale Award, Tony is a leading authority on the convergence of nuclear energy, AI infrastructure, and national defense.
His career is defined by building at scale: he founded and sold a top 10 modular data center company before leading global infrastructure strategy for AWS, Meta, and Oracle as a Senior Vice President. Today, he advises organizations on designing the AI factories and cloud regions of the future, creating operational resilience that survives contact with reality.


