Dell PowerEdge XE9712 with NVIDIA GB300 NVL72, Explained

News | Uniqcli Team | June 3, 2026 | 4 min read

The NVIDIA GB300 NVL72 has become the reference design for frontier AI, and Dell Technologies brings it to market in the Dell PowerEdge XE9712. If you are evaluating rack-scale AI for training large language models or serving high-volume reasoning inference, this is the platform to understand. Below is a plain-language breakdown of what the system is, what changed versus the prior generation, and how federal and enterprise buyers can procure it.

What the PowerEdge XE9712 actually is

The Dell PowerEdge XE9712 is a single, integrated, liquid-cooled rack rather than a server you buy by the node. It packages the NVIDIA GB300 NVL72 design: 72 NVIDIA Blackwell Ultra GPUs and 36 Arm-based NVIDIA Grace CPUs, wired together as one tightly coupled accelerator domain. The defining feature is NVLink. Through NVLink 5 and the NVLink switch trays, all 72 GPUs communicate as if they were a single, very large GPU. That shared-memory behavior is what makes trillion-parameter models practical to train and serve without the bottlenecks you hit when stitching together separate boxes over a conventional network.

This is a meaningful departure from how most data centers were built. A traditional general-purpose rack might hold a dozen or more 1U and 2U servers, such as Dell PowerEdge R660 and R760 systems, each doing independent work. The XE9712 instead treats the entire rack as one compute unit purpose-built for AI.

The numbers that matter

NVIDIA positions the GB300 NVL72 around a few headline figures. The platform delivers roughly 1.1 exaFLOPS of dense FP4 compute in a single rack. Each Blackwell Ultra GPU carries 288 GB of HBM3e memory, a 50 percent increase over the prior Blackwell generation, achieved with taller 12-high memory stacks. Across the rack that adds up to roughly 20.7 TB of HBM3e (72 GPUs at 288 GB each), and tens of terabytes of fast memory overall once the LPDDR5X attached to the Grace CPUs is counted. Aggregate NVLink bandwidth across the rack is in the range of 130 TB/s.
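These rack-level figures are easy to sanity-check from the per-GPU numbers. The short Python sketch below recomputes them. The 192 GB capacity for the prior Blackwell generation is an assumption implied by the 50 percent uplift, not a figure quoted in this article, and the variable and function names are purely illustrative.

```python
# Back-of-the-envelope check of the rack-level memory and NVLink figures.
GPUS_PER_RACK = 72
HBM_PER_GPU_GB = 288           # Blackwell Ultra, as quoted in the article
PRIOR_HBM_PER_GPU_GB = 192     # ASSUMPTION: prior Blackwell per-GPU capacity
NVLINK_RACK_TB_PER_S = 130     # aggregate NVLink bandwidth, as quoted


def rack_hbm_tb(gpus: int, gb_per_gpu: float) -> float:
    """Total HBM in decimal terabytes (1 TB = 1000 GB)."""
    return gpus * gb_per_gpu / 1000


def percent_increase(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100


total_hbm_tb = rack_hbm_tb(GPUS_PER_RACK, HBM_PER_GPU_GB)
uplift_pct = percent_increase(HBM_PER_GPU_GB, PRIOR_HBM_PER_GPU_GB)
nvlink_per_gpu_tb_per_s = NVLINK_RACK_TB_PER_S / GPUS_PER_RACK

print(f"Rack HBM:            {total_hbm_tb:.1f} TB")
print(f"Per-GPU HBM uplift:  {uplift_pct:.0f} %")
print(f"NVLink share per GPU: {nvlink_per_gpu_tb_per_s:.1f} TB/s")
```

The per-GPU NVLink share works out to about 1.8 TB/s, which is consistent with the rack-level 130 TB/s being the sum of 72 GPU links rather than a separate, larger number.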

On networking, the integrated NVIDIA ConnectX-8 SuperNIC provides up to 800 Gb/s of connectivity per GPU, with a choice of NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet for scaling across multiple racks. Dell's own materials emphasize large gains in AI reasoning inference output and throughput versus earlier Blackwell systems, driven by the Ultra GPU's higher FP4 tensor performance and improved attention performance. These are vendor figures for relative improvement, not guarantees for any specific workload, so treat them as directional and validate against your own models.
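To put the scale-out figure in context, it helps to compare the aggregate NIC bandwidth leaving the rack with the NVLink bandwidth inside it. The sketch below is simple unit conversion using only the numbers quoted in this article. It assumes every GPU drives its full 800 Gb/s at once, which is a best case, and the names are illustrative.

```python
# Compare aggregate scale-out (NIC) bandwidth with in-rack NVLink bandwidth.
GPUS_PER_RACK = 72
NIC_GBIT_PER_GPU = 800         # ConnectX-8, gigabits per second per GPU
NVLINK_RACK_TBYTE_PER_S = 130  # aggregate NVLink, terabytes per second


def scale_out_tbyte_per_s(gpus: int, gbit_per_gpu: float) -> float:
    """Aggregate NIC bandwidth in terabytes per second (8 bits per byte)."""
    total_gbit = gpus * gbit_per_gpu
    return total_gbit / 8 / 1000


nic_total = scale_out_tbyte_per_s(GPUS_PER_RACK, NIC_GBIT_PER_GPU)
nvlink_to_nic_ratio = NVLINK_RACK_TBYTE_PER_S / nic_total

print(f"Scale-out bandwidth per rack: {nic_total:.1f} TB/s")
print(f"NVLink vs scale-out ratio:    {nvlink_to_nic_ratio:.0f}x")
```

Under these assumptions the in-rack fabric is well over an order of magnitude faster than the links between racks, which is why keeping a model's most communication-heavy work inside a single NVLink domain matters.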

Why liquid cooling is not optional here

A GB300 NVL72 rack draws on the order of 120 kW, far beyond what air cooling can remove. The XE9712 uses direct-to-chip liquid cooling paired with Dell PowerCool, Dell's thermal-management approach for high-density AI infrastructure. The practical implication for buyers is that deployment is a facilities project, not just a hardware purchase. You will need to plan for coolant distribution, supply and return temperatures, leak detection, and the electrical capacity to feed a rack of this density. Dell's reference designs and services exist precisely to de-risk that planning, and engaging early on power and cooling readiness is the single biggest factor in a smooth install.
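For a rough sense of what that means for the coolant loop, the standard heat-balance relation Q = m x cp x dT gives the flow needed to carry a given heat load at a given temperature rise. The sketch below is a first-pass estimate only. It assumes all 120 kW goes into the liquid loop, uses the properties of plain water, and picks a few illustrative temperature rises; real designs use the coolant, flow rates, and limits specified by Dell and the facility engineer.

```python
# First-pass coolant flow estimate from the heat balance Q = m_dot * cp * dT.
# ASSUMPTIONS: all rack heat is captured by liquid; coolant behaves like water.

def coolant_flow_lpm(
    heat_kw: float,
    delta_t_k: float,
    cp_j_per_kg_k: float = 4186.0,   # specific heat of water
    density_kg_per_l: float = 1.0,   # density of water
) -> float:
    """Required coolant flow in liters per minute."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    mass_flow_kg_per_s = heat_kw * 1000 / (cp_j_per_kg_k * delta_t_k)
    return mass_flow_kg_per_s / density_kg_per_l * 60


RACK_HEAT_KW = 120  # order-of-magnitude rack load quoted in the article

for delta_t in (8, 10, 12):  # illustrative supply-to-return temperature rises
    flow = coolant_flow_lpm(RACK_HEAT_KW, delta_t)
    print(f"dT = {delta_t:>2} K -> about {flow:.0f} L/min")

flow_at_10k = coolant_flow_lpm(RACK_HEAT_KW, 10)
```

The useful takeaway is the trade-off: a smaller allowed temperature rise demands proportionally more flow, so supply and return temperatures and pump capacity have to be planned together.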

Where it fits in the Dell AI Factory

The XE9712 is the high end of the Dell AI Factory with NVIDIA, an architecture that spans compute, networking, storage, and software so that organizations can move from pilot to production without assembling parts from scratch. In practice an XE9712 deployment is paired with high-throughput storage such as Dell PowerStore or PowerMax to keep the GPUs fed, and with a data-protection layer like Dell PowerProtect for the datasets and checkpoints that represent enormous training investment. For teams that want consumption-based or fully managed infrastructure rather than a capital purchase, Dell APEX offers flexible commercial models around the same underlying technology. Operational visibility across the fleet comes from Dell CloudIQ, which provides AIOps-style monitoring, health, and capacity insight.

Who should consider it, and who should not

The XE9712 is built for organizations training foundation models, running large-scale reasoning inference, or standing up an AI factory for many concurrent users. National labs, defense research programs, large health systems building clinical AI, and enterprises operationalizing generative AI are the natural fit. If your needs are more modest, such as fine-tuning, retrieval-augmented generation, or inference at smaller scale, a Dell PowerEdge server with a handful of NVIDIA GPUs will deliver better value, and Uniqcli can help you right-size rather than over-buy.

Procuring the XE9712 through Uniqcli

Uniqcli is an authorized Dell Technologies reseller serving US federal, DoD, SLED, healthcare, and enterprise buyers. We support TAA-compliant configurations and procurement through GSA Schedule and NASA SEWP, the vehicles most public-sector AI programs rely on. Because rack-scale AI involves long lead times, facilities coordination, and supply allocation, the earlier you engage, the better your delivery and readiness outcomes. Our team can scope an XE9712 deployment end to end, including the supporting PowerStore or PowerMax storage, PowerProtect data protection, networking, and the power and cooling assessment, and align it to the right contract vehicle. Contact Uniqcli to start a Dell AI Factory conversation tailored to your mission and timeline.