GigaIO Achieves Breakthrough MLPerf Inference Performance with SuperNODE

Hits highest tokens per second ever achieved for a single node in MLPerf Inference: Datacenter benchmark database.

Carlsbad, California, November 14, 2024 – GigaIO, an award-winning provider of open workload-defined infrastructure for AI and accelerated computing, has unveiled record-setting results in the MLPerf Inference: Datacenter benchmark using its SuperNODE™ server, powered by GigaIO’s AI memory fabric, FabreX™. This unprecedented achievement highlights GigaIO’s ability to deliver near-linear scaling and optimal Total Cost of Ownership (TCO) for demanding AI workloads.

Key Highlights:

Demonstrated record performance for Llama 2 70B-99 AI inference workloads on a single node
Achieved near-linear scaling across 16 GPUs to 46,755.00 tokens per second
Generated 12% more tokens per second compared to other 16-GPU clusters

GigaIO’s SuperNODE server, a single-node system capable of supporting up to 32 accelerators, represents a new era in AI inference. By leveraging the FabreX interconnect, SuperNODE enables accelerators to function as if they were part of a unified, rack-scale server. This breakthrough configuration dramatically improves data throughput and latency, achieving performance levels previously unattainable on competing multi-node solutions.

*Scaling Llama2-70b Inference on 16-GPU SuperNODE.*
*Unverified benchmark results not officially submitted to or verified by MLCommons Association.*

Technical Specifications

System: SuperNODE with 16 GPUs (AMD MI300X)
Benchmark: MLPerf 4.1 Inference: Datacenter Llama 2 70B-99 Offline
Interconnect: FabreX low-latency, high-throughput AI memory fabric

The MLPerf benchmark employed the Llama 2 70B model to simulate real-world AI inference workloads. In a SuperNODE setup with 16 AMD MI300X GPUs, GigaIO demonstrated near-linear performance, scaling to 46,755.00 tokens per second, the highest number achieved for a single node in the MLPerf Inference: Datacenter benchmark database. The results showed that for 16 GPUs, the FabreX-powered setup generates 12% more tokens per second than the next closest competitor using other interconnect technologies (MLPerf ID 4.1-0035), further establishing GigaIO’s solution as the superior choice for scale-up AI deployments.
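For readers who want to check the arithmetic behind these figures, the short Python sketch below works through it. It uses only the two numbers stated above (46,755.00 tokens per second on 16 GPUs, and a 12% lead). The function names are illustrative. The comparison system's throughput is not stated in this release, so the value computed here is only what a rounded 12% figure implies. The release also does not give a single-GPU rate, so `scaling_efficiency` takes it as a parameter the reader must supply.

```python
# Figures quoted in the release.
TOTAL_TOKENS_PER_SEC = 46_755.00  # reported 16-GPU SuperNODE throughput
GPU_COUNT = 16
REPORTED_UPLIFT = 0.12            # "12% more tokens per second" (rounded)


def per_gpu_throughput(total: float, gpus: int) -> float:
    """Average tokens per second contributed by each GPU."""
    return total / gpus


def implied_baseline(total: float, uplift: float) -> float:
    """Throughput of the comparison system implied by a fractional uplift.

    This is an inference from the rounded 12% figure, not a published number.
    """
    return total / (1.0 + uplift)


def scaling_efficiency(total: float, gpus: int, single_gpu_rate: float) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear).

    single_gpu_rate is NOT given in the release; the caller must supply it.
    """
    return total / (gpus * single_gpu_rate)


if __name__ == "__main__":
    per_gpu = per_gpu_throughput(TOTAL_TOKENS_PER_SEC, GPU_COUNT)
    baseline = implied_baseline(TOTAL_TOKENS_PER_SEC, REPORTED_UPLIFT)
    print(f"Average per GPU: {per_gpu:.2f} tokens/s")
    print(f"Implied comparison throughput: {baseline:.0f} tokens/s")
```

Under these assumptions, the average works out to roughly 2,922 tokens per second per GPU, and the 12% lead implies a comparison result of roughly 41,700 tokens per second. A near-linear claim corresponds to `scaling_efficiency` returning a value close to 1.0 for a measured single-GPU rate.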

“Our performance in the MLPerf benchmark underscores the power of SuperNODE and FabreX to drive simple-to-use, cost-efficient, scale-up AI deployments,” said Alan Benjamin, CEO at GigaIO. “We’re proud to offer our customers unmatched performance, simplified management, and lower operational costs to support their next-generation AI workloads.”

Industry Context
With inference being a critical challenge in AI deployment, GigaIO’s results underscore the importance of simple, efficient, high-performance scale-up computing solutions. This demonstration shows how advanced interconnect technologies can optimize both TCO and performance for AI workloads. These MLPerf results validate GigaIO’s approach to hardware composition, showing how it offers organizations the flexibility to scale AI infrastructure efficiently. As AI-driven applications expand across industries, GigaIO’s SuperNODE provides a compelling, high-performance solution that empowers businesses to accelerate time-to-results and lower their cost structures.

Stop by the GigaIO booth (#2945) at SC24 in Atlanta to see more amazing SuperNODE AI benchmarks or learn more here. Note: This press release is based on an unverified benchmark result and has not been officially submitted to or verified by MLCommons Association.

About GigaIO
GigaIO is an award-winning provider of open workload-defined infrastructure for AI and accelerated computing, delivering innovative solutions for AI and HPC challenges. GigaIO’s portfolio features two edge-to-core accelerated AI systems: SuperNODE, the world’s only 32 GPU single-node AI supercomputer, and Gryf, a highly portable and scalable AI supercomputer solution that provides meaningful computing and storage at the edge. Both utilize GigaIO’s AI memory fabric, FabreX, which seamlessly composes rack-scale resources and integrates natively into industry-standard tools. Visit www.gigaio.com, or follow on Twitter (X) and LinkedIn.

Contact: Danica Yatko | 442-385-3630 | danica@xandmarketing.com

