Ship Racks with Performance Confidence.
Automated AI rack and cluster validation. Ready to generate revenue at delivery on Day 1.
“Data centers receive misassembled, misconfigured, dead-on-arrival components.” — Together AI
Components tested in isolation, not as a system under AI workloads.
Silent failures such as degraded RCCL bandwidth and NVLink errors go undetected.
Operators discover underperformance only after the rack is live.
at stake in a single 32-GPU B200 rack.
Every additional week in L12 validation or post-delivery debug is a direct hit to OEM gross margin and operator time-to-revenue.
ClusterReady closes this gap — before the rack ships.
Validate with AI workloads and XPerf’s proprietary test suite, 10× faster. Get the remedy and the optimization, not just the failure.
AI workloads + XPerf proprietary test suite take validation from weeks to hours.
Silent failures (degraded RCCL bandwidth, NVLink errors) caught before shipment.
Not just a failure flag — the ONLY tool providing diagnosis, remedy, and optimization.
Four automated stages take every rack from arrival to a verified, optimized, revenue-ready state.
30+ automated checks. GPU · CPU · memory · network · storage.
MLPerf benchmarks + XPerf database. Catches what standard testing misses.
XPerf’s patented stress testing algorithm. Every server and every link, at peak. Surfaces marginal hardware before customers do.
Suggested remedy on every finding. Suggested optimization guidance.
ClusterReady tests the rack as a system — hardware, software, and AI workloads — not isolated components.
ClusterReady complements L11/L12 testing — what L11/L12 ships, ClusterReady proves at the system level.
| Capability | L11/L12 Testing | NVIDIA / AMD Tools | OEM Built-in (Dell, SMC) | Custom Scripts | Aptly Technology | XPerf ClusterReady |
|---|---|---|---|---|---|---|
| Test full rack? | No | No | Yes | Varies by team | SMC + NVIDIA only | Yes |
| Use AI workloads? | No | Synthetic only | Dell yes / SMC partial | Some custom | Yes | Yes |
| Automated diagnosis + suggested remedy? | No | No | No | No | No | Yes |
| Serve any OEM / ODM? | No | Yes | No | No | Yes | Yes |
| Time to validate | 2–6 weeks | Hours | Days–weeks | Weeks | Weeks | 2–8 hours for healthy racks |
Coverage across both major accelerator vendors and current network fabrics, with new SKUs added continuously.


4 nodes · 32× AMD MI325X GPUs · 256 GB HBM3e per GPU · RoCE / RDMA
vs AMD’s previous performance results.
scaling efficiency from 1 node → 4 nodes.
tokens/sec inference, single node.
bandwidth gain from RCCL config optimization.
The first published Llama 3.1 pretraining performance result on MI325X submitted to MLPerf Training v6.0.
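For readers less familiar with the scaling-efficiency metric cited above, it is commonly defined as measured multi-node throughput divided by ideal linear scaling of single-node throughput. The sketch below assumes that common definition; the function name and throughput values are hypothetical and are not XPerf's measurements or method.

```python
def scaling_efficiency(single_node_throughput: float,
                       multi_node_throughput: float,
                       num_nodes: int) -> float:
    """Return the fraction of ideal linear scaling achieved (1.0 = perfect)."""
    if single_node_throughput <= 0 or num_nodes < 1:
        raise ValueError("throughput must be positive and num_nodes >= 1")
    # Ideal scaling: N nodes deliver N times the single-node throughput.
    ideal = single_node_throughput * num_nodes
    return multi_node_throughput / ideal


# Hypothetical numbers for illustration only (e.g. samples/sec).
eff = scaling_efficiency(1000.0, 3800.0, 4)
print(f"{eff:.1%}")  # 95.0%
```

A value well below 1.0 when going from 1 node to 4 nodes usually points at the interconnect or the collective-communication configuration rather than at the GPUs themselves.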
Available on macOS, Windows, and Linux. Pick your platform to grab the latest build.
Windows 10 or later (x64)
macOS 11.0 or later (Apple Silicon & Intel)
Ubuntu 22.04 or compatible (x64)

Founded by ex-Intel engineers with extensive experience deploying clusters with thousands of accelerators, XPerf Inc. provides the control plane for data centers: AI infrastructure software for GPU cluster performance validation and optimization that takes an AI-native approach to cluster operation problems. Based in Austin / Round Rock, Texas.