NVIDIA Vera Rubin Price: NVL72 $5M-$7M, Specs & Release Date

The NVIDIA Vera Rubin NVL72 rack is currently quoted at $5 million to $7 million per unit, with roughly $1 million of that tied to networking gear, according to supply chain estimates reported by Tom’s Hardware. The platform bundles 72 Rubin GPUs, 36 custom 88-core Vera CPUs, NVLink 6 switches, ConnectX-9 SuperNICs, and BlueField-4 DPUs into a liquid-cooled Oberon rack. Full production shipments are scheduled for the second half of the year, with cloud availability through AWS, Google Cloud, Microsoft Azure, Oracle, CoreWeave, and Lambda. NVIDIA's performance targets are 3.6 exaFLOPS of dense FP4 and 1.2 exaFLOPS of FP8 per rack, roughly 3.3 times the inference throughput of GB300 NVL72.

NVIDIA Vera Rubin Price, Specs, Release Date, and Performance by original pricing

Vera Rubin Pricing at a Glance

Rack-scale AI systems are sold as fully integrated systems, not as single GPUs, which is why quoted prices land in the millions rather than the thousands.

Vera Rubin NVL72 (VR200): $5M to $7M per rack
Networking included in rack quote: approximately $1M
Vera Rubin Ultra NVL144 (VR300, late 2028): $7M to $8.8M forecasted
Blackwell GB200 NVL72 (reference): $2.8M to $3.4M
Blackwell GB300 NVL72 (reference): $6M to $6.5M
Cooling system per NVL144 rack: approximately $55,710 (Morgan Stanley)

Prices come from supply chain sources cited by Tom’s Hardware and DigiTimes. NVIDIA has never published an official list price for any NVL72 or NVL144 product.
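
To put the rack quote in per-package terms, the sketch below divides the quoted range by the 72 GPU packages and works out the share attributed to networking. It is back-of-envelope arithmetic on the estimates above, not a vendor price list. The function and constant names are illustrative, and the amortized figure folds in CPUs, networking, and cooling, so it should not be read as the price of a standalone GPU.

```python
# Back-of-envelope arithmetic on the supply chain estimates quoted above.
# All names are illustrative; none of these figures are official NVIDIA prices.

RACK_PRICE_LOW_USD = 5_000_000
RACK_PRICE_HIGH_USD = 7_000_000
NETWORKING_USD = 1_000_000      # approximate networking portion of the rack quote
GPU_PACKAGES_PER_RACK = 72


def amortized_cost_per_package(rack_price_usd: float,
                               packages: int = GPU_PACKAGES_PER_RACK) -> float:
    """Rack price spread evenly over GPU packages (includes CPUs, network, cooling)."""
    return rack_price_usd / packages


def networking_share(rack_price_usd: float,
                     networking_usd: float = NETWORKING_USD) -> float:
    """Fraction of the rack quote attributed to networking gear."""
    return networking_usd / rack_price_usd


if __name__ == "__main__":
    for price in (RACK_PRICE_LOW_USD, RACK_PRICE_HIGH_USD):
        per_package = amortized_cost_per_package(price)
        share = networking_share(price)
        print(f"${price:,} rack: ${per_package:,.0f} per package, "
              f"{share:.1%} networking")
```

Across the quoted range this comes to roughly $69,000 to $97,000 per package, with networking making up about 14 to 20 percent of the rack price.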

Enterprise deals are negotiated directly, and hyperscalers secure favorable volume terms that are not reflected in these quotes.

Gamers looking at consumer NVIDIA hardware should instead check the RTX 5070 Ti original price or explore compact workstations covered in our roundup of NVIDIA DGX Spark alternatives, which sit in an entirely different product tier than rack-scale AI infrastructure.

Vera Rubin NVL72 Specs and Architecture

The platform combines three core chip types: the Rubin GPU, the Vera CPU, and the networking silicon that ties them together into a single rack-scale unit. Each Rubin GPU is a dual-die package using TSMC’s 3nm node with HBM4 memory, while Vera is NVIDIA’s second-generation Arm server CPU succeeding Grace.

Key hardware specs:

Rubin GPU: 50 PFLOPS dense FP4 per package, 288 GB HBM4, 13 TB/s memory bandwidth
Vera CPU: 88 custom Arm cores, 176 threads via spatial multithreading
NVLink-C2C interconnect: 1.8 TB/s coherent CPU-to-GPU link
NVLink 6 switch: 3.6 TB/s all-to-all scale-up bandwidth per GPU
ConnectX-9 SuperNIC: 1.6 Tb/s per-GPU scale-out bandwidth
Process node: TSMC 3nm
Rack: third-generation MGX Oberon, liquid cooled

Vera CPUs share a unified address space with Rubin GPUs, which removes much of the CPU-to-GPU data-copy overhead that used to bottleneck agentic and multi-step inference workloads. Memory bandwidth and interconnect performance matter more than raw FLOPS once a model gets large enough, and the platform is engineered around that reality.
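
The per-package specs listed above can be rolled up to rack level as a quick consistency check. The sketch below is plain arithmetic on the figures in this article. The names are illustrative, and it assumes decimal units (1 TB = 1,000 GB) and that all 72 packages and 36 CPUs contribute equally.

```python
# Rack-level totals derived from the per-package specs listed above.
# Illustrative names; decimal units assumed (1 TB = 1000 GB, 1 EF = 1000 PFLOPS).

GPU_PACKAGES = 72
FP4_PFLOPS_PER_PACKAGE = 50
HBM4_GB_PER_PACKAGE = 288
HBM4_TBPS_PER_PACKAGE = 13
VERA_CPUS = 36
CORES_PER_VERA = 88
THREADS_PER_VERA = 176


def rack_totals() -> dict:
    """Aggregate compute, memory, and CPU resources for one NVL72 rack."""
    return {
        "fp4_exaflops": GPU_PACKAGES * FP4_PFLOPS_PER_PACKAGE / 1000,
        "hbm4_tb": GPU_PACKAGES * HBM4_GB_PER_PACKAGE / 1000,
        "hbm4_bandwidth_tbps": GPU_PACKAGES * HBM4_TBPS_PER_PACKAGE,
        "cpu_cores": VERA_CPUS * CORES_PER_VERA,
        "cpu_threads": VERA_CPUS * THREADS_PER_VERA,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value}")
```

The 72 packages at 50 PFLOPS each reproduce the 3.6 exaFLOPS rack figure, alongside roughly 20.7 TB of HBM4 and 3,168 Vera cores per rack.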

Full Comparison Table

Feature	GB300 NVL72	Vera Rubin NVL72	Vera Rubin NVL144 CPX	Rubin Ultra NVL144
Measurement	Per rack	Per rack	Per rack	Per rack
GPU packages	72	72	72 + 144 CPX	144
CPUs	36 Grace	36 Vera (88 cores)	36 Vera	36+ Vera
Dense FP4 compute	~1.1 EF	3.6 EF	8 EF	15 EF target
Memory per GPU	288 GB HBM3e	288 GB HBM4	288 GB HBM4 + 128 GB GDDR7	1 TB HBM4e
NVLink bandwidth	1.8 TB/s	3.6 TB/s	3.6 TB/s	3.6 TB/s
Rack power (approx.)	~130 kW	~120-130 kW	~370 kW	~600 kW
Rack price (approx.)	$6M-$6.5M	$5M-$7M	TBD	$7M-$8.8M
Availability	Shipping	H2 2026	End of 2026	Late 2028

Release Date and Cloud Availability

NVIDIA kicked off the Rubin platform at CES in Las Vegas, with Jensen Huang confirming full production and samples already in customer hands. Rack-scale shipments to partners land in the second half of the year, and the specialized Rubin CPX variant arrives at year-end.

First cloud deployments are rolling out through AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure, plus CoreWeave, Lambda, Nebius, and Nscale. Microsoft confirmed that future Fairwater AI superfactory sites will deploy Vera Rubin NVL72 racks at scale.

Enterprise buyers should expect six to twelve month lead times for on-premise deployments, given the power and cooling upgrades required to handle 120+ kW racks.

Vera Rubin vs Blackwell Performance

NVIDIA claims the Rubin generation trains equivalent models with a quarter of the GPUs, and runs inference at roughly one-tenth the cost per million tokens of Blackwell. These numbers come from NVIDIA benchmarks and represent ideal-case scenarios.

Throughput gains at the rack level:

3.3x FP4 compute over GB300 NVL72
Up to 5x better inference performance per watt
10x lower cost per million tokens (NVIDIA figure)
7.5x performance with NVL144 CPX variant over GB300 NVL72
$5 billion in token revenue per $100 million invested (NVIDIA ROI claim)

Real-world gains depend heavily on the model, batch size, and whether workloads take advantage of CPX prefill acceleration. Mixture-of-experts and long-context reasoning systems benefit most, while smaller standard inference workloads see smaller improvements.
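
The rack-level multiples can be sanity-checked by dividing the FP4 figures quoted in this article (3.6 EF for NVL72, 8 EF for NVL144 CPX) by the claimed speedups. The sketch below does only that. The function name is illustrative, and the result is an implied baseline under NVIDIA's own framing, not a measured number.

```python
# Implied GB300 NVL72 baseline from the speedup multiples quoted above.
# Illustrative helper; inputs are the article's rack-level FP4 figures.

def implied_baseline_ef(new_rack_ef: float, claimed_multiple: float) -> float:
    """Baseline exaFLOPS that would make the claimed multiple hold exactly."""
    return new_rack_ef / claimed_multiple


if __name__ == "__main__":
    nvl72_baseline = implied_baseline_ef(3.6, 3.3)   # Vera Rubin NVL72 claim
    cpx_baseline = implied_baseline_ef(8.0, 7.5)     # NVL144 CPX claim
    print(f"Implied baseline from 3.3x claim: {nvl72_baseline:.2f} EF")
    print(f"Implied baseline from 7.5x claim: {cpx_baseline:.2f} EF")
```

Both multiples point to a GB300 NVL72 baseline of about 1.1 EF. A baseline quoted on a different basis, for example with sparsity, would not reproduce these multiples.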

Power Consumption and Cooling

Performance this generation comes with a real energy cost. Each Rubin GPU ships with two power profiles: Max-Q at 1,800W for efficiency and Max-P at 2,300W for peak throughput.

Rubin GPU TDP (Max-Q): 1,800W
Rubin GPU TDP (Max-P): 2,300W
NVL72 rack power: approximately 120 to 130 kW
NVL144 CPX rack power: approximately 370 kW
Rubin Ultra NVL144 (future Kyber rack, earlier announced as NVL576): approximately 600 kW per rack
Cooling system cost per NVL144 rack: approximately $55,710

Max-P draws roughly 28 percent more power than Max-Q (2,300W versus 1,800W) for throughput gains that fall short of that figure, which is why NVIDIA markets Max-Q as the sweet spot for large fleets.
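
The trade-off between the two power profiles can be made concrete with a small calculation. The sketch below computes the extra power Max-P draws relative to Max-Q and the resulting change in performance per watt for a given throughput gain. The function names are illustrative, and the 15 percent gain in the usage example is a hypothetical input, not a published figure.

```python
# Max-P versus Max-Q trade-off, using the two TDP figures quoted above.
# Illustrative names; the throughput gain is a caller-supplied assumption.

MAX_Q_WATTS = 1800
MAX_P_WATTS = 2300


def extra_power_fraction() -> float:
    """Additional power Max-P draws relative to Max-Q, as a fraction."""
    return MAX_P_WATTS / MAX_Q_WATTS - 1


def perf_per_watt_ratio(throughput_gain: float) -> float:
    """Max-P performance per watt relative to Max-Q.

    throughput_gain is the fractional speedup of Max-P over Max-Q
    (0.15 means 15 percent faster). A result below 1.0 means Max-P
    is less efficient than Max-Q.
    """
    return (1 + throughput_gain) * MAX_Q_WATTS / MAX_P_WATTS


if __name__ == "__main__":
    print(f"Extra power for Max-P: {extra_power_fraction():.1%}")
    hypothetical_gain = 0.15  # assumed for illustration only
    print(f"Perf/W at +15% throughput: {perf_per_watt_ratio(hypothetical_gain):.2f}x")
```

Max-P only matches Max-Q on efficiency if its throughput gain equals the extra power draw of about 27.8 percent. Any smaller gain lowers performance per watt, which matters most for fleets limited by power rather than rack count.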

HBM4 memory supply remains tight, which feeds back into pricing pressure. Spot-market trends such as CXMT DDR5 RAM prices show how upstream shortages ripple across every part of the AI stack.

NVDA Stock Price Context

NVIDIA trades under the ticker NVDA on Nasdaq. Shares sit near $201 in recent trading, up roughly 90 percent over the prior twelve months and more than 10x over five years. The 52-week range spans $95.04 to $212.19.

Market cap hovers near $4.82 trillion, placing NVIDIA as the world’s most valuable publicly traded company by a wide margin. Analyst consensus price targets average around $268, with highs near $380 and lows near $140.

Vera Rubin revenue is expected to show up in results from Q3 onward, assuming production stays on track.

Conversational Queries Users Actually Type

How people search for this hardware says a lot about what they actually need to know:

“How much does one Vera Rubin rack cost?”

Answer: $5M to $7M per rack

“Is Vera Rubin available yet?”

Answer: In full production, shipments planned for the second half of the year

“What is the difference between NVL72 and NVL144 CPX?”

Answer: CPX adds 144 prefill GPUs for long-context inference

“How much power does a Rubin GPU use?”

Answer: 1,800W (Max-Q) to 2,300W (Max-P)

“Is Rubin better than Blackwell?”

Answer: Yes, with 3.3x the compute and roughly 10x lower token costs, according to NVIDIA

“Can I buy a Vera Rubin GPU for gaming?”

Answer: No, this is a datacenter-only platform with no consumer SKU

Users searching for consumer hardware should look elsewhere, since Rubin has no retail SKU. Competing silicon from Axera and other edge AI chip vendors targets a different tier of the market entirely.

Common Mistakes Buyers and Readers Make

Confusing the NVL72 name with GPU count: an NVL72 rack has 72 Rubin packages but 144 reticle-sized GPU dies
Assuming “Vera Rubin” is a consumer GPU: this is a datacenter AI platform, not an RTX replacement
Comparing per-GPU prices: rack-scale systems include networking, cooling trays, and software that a standalone GPU does not
Ignoring power costs: a 130 kW rack running continuously costs roughly $125,000 per year in electricity, which works out to about $0.11 per kWh, a typical U.S. datacenter rate
Mixing up Rubin, Rubin Ultra, and Rubin CPX: three distinct product lines with different release windows and use cases
Treating NVIDIA’s 10x inference claim as universal: gains depend on workload type, especially long-context versus short-batch
Expecting immediate cloud pricing from AWS or Google: spot rates follow deployment capacity, not the announcement cycle
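
The electricity figure in the list above follows from simple arithmetic, sketched here so readers can plug in their own rates. The $0.11 per kWh rate is an assumption chosen to match the roughly $125,000 figure. The function and parameter names are illustrative, and facility overhead such as cooling is not included.

```python
# Annual electricity cost for a rack running at constant power.
# Illustrative helper; the rate and utilization are caller-supplied assumptions.

HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_electricity_cost_usd(rack_kw: float,
                                usd_per_kwh: float,
                                utilization: float = 1.0) -> float:
    """Energy cost for one year at the given average power and price."""
    annual_kwh = rack_kw * HOURS_PER_YEAR * utilization
    return annual_kwh * usd_per_kwh


if __name__ == "__main__":
    cost = annual_electricity_cost_usd(rack_kw=130, usd_per_kwh=0.11)
    print(f"130 kW rack at $0.11/kWh: ${cost:,.0f} per year")
```

At 130 kW the rack consumes about 1.14 million kWh per year, so each additional cent per kWh adds roughly $11,400 to the annual bill.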
