GigaIO and d-Matrix to Build Inference Platform for Enterprise AI
CARLSBAD, Calif. – Edge-to-core AI platform company GigaIO today announced the next phase of its partnership with d-Matrix to deliver an inference solution for enterprises deploying AI at scale. Integrating d-Matrix’s Corsair inference platform into GigaIO’s SuperNODE architecture creates a solution designed to eliminate “the complexity and performance bottlenecks traditionally associated with large-scale AI inference deployment.”
The offering addresses the growing demand from enterprises for high-performance, energy-efficient AI inference capabilities that can scale without the typical limitations of multi-node configurations, the companies said. Combining GigaIO’s scale-up AI architecture with d-Matrix’s inference acceleration technology produces a solution that delivers unprecedented token generation speeds and memory bandwidth, while significantly reducing power consumption and total cost of ownership, according to the companies.
The new GigaIO SuperNODE platform, capable of supporting dozens of d-Matrix Corsair accelerators in a single node, is what the companies describe as the industry’s most scalable AI inference platform. This integration enables enterprises to deploy ultra-low-latency batched inference workloads at scale without the complexity of traditional distributed computing approaches.
“By combining d-Matrix’s Corsair PCIe cards with the industry-leading scale-up architecture of GigaIO’s SuperNODE, we’ve created a transformative solution for enterprises deploying next-generation AI inference at scale,” said Alan Benjamin, CEO of GigaIO. “Our single-node server eliminates complex multi-node configurations and simplifies deployment, enabling enterprises to quickly adapt to evolving AI workloads while significantly improving their TCO and operational efficiency.”
The combined solution delivers exceptional performance metrics that redefine what’s possible for enterprise AI inference:
- Processing capability of 30,000 tokens per second at just 2 milliseconds per token for models like Llama 3 70B (a short sketch after this list shows how these two figures relate)
- Up to 10x faster interactive speed compared with GPU-based solutions
- 3x better performance at a similar total cost of ownership
- 3x greater energy efficiency for more sustainable AI deployments
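As a back-of-the-envelope reading of the headline numbers (the release does not state a batch size; treating 2 ms/token as per-stream inter-token latency and 30,000 tokens/s as aggregate node throughput is our assumption), the figures imply roughly 60 concurrent streams:

```python
# Rough consistency check on the quoted figures (our reading, not
# vendor-published math): 2 ms/token is taken as per-stream inter-token
# latency, 30,000 tokens/s as aggregate node throughput.

ms_per_token = 2.0       # quoted per-token latency
aggregate_tps = 30_000   # quoted node-level tokens per second

per_stream_tps = 1000 / ms_per_token               # 500 tokens/s per stream
implied_streams = aggregate_tps / per_stream_tps   # ~60 concurrent streams

print(f"Per-stream throughput: {per_stream_tps:.0f} tokens/s")
print(f"Implied concurrent streams (batch size): {implied_streams:.0f}")
```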
“When we started d-Matrix in 2019, we looked at the landscape of AI compute and made a bet that inference would be the largest computing opportunity of our lifetime,” said Sid Sheth, founder and CEO of d-Matrix. “Our collaboration with GigaIO brings together our ultra-efficient in-memory compute architecture with the industry’s most powerful scale-up platform, delivering a solution that makes enterprise-scale generative AI commercially viable and accessible.”
This integration leverages GigaIO’s PCIe Gen 5-based AI fabric, which delivers near-zero-latency communication among multiple d-Matrix Corsair accelerators. This architectural approach eliminates the traditional bottlenecks associated with distributed inference workloads while maximizing the efficiency of d-Matrix’s Digital In-Memory Compute (DIMC) architecture, which delivers an industry-leading 150 TB/s memory bandwidth.
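For context, a common way to sanity-check bandwidth claims against token rates is a memory roofline for autoregressive decoding: if each generated token requires streaming the model weights from memory once, bandwidth divided by weight bytes bounds the per-stream token rate. The sketch below applies that to the quoted 150 TB/s figure for a 70B-parameter model; the 8-bit weight assumption and the one-pass-per-token model are illustrative assumptions on our part, not figures from the release.

```python
# Memory-bandwidth roofline estimate (illustrative assumptions, not
# vendor figures): per-stream decode rate is bounded by how fast the
# model weights can be streamed from memory for each token step.

memory_bw_tb_s = 150    # quoted DIMC memory bandwidth, TB/s
params_billion = 70     # Llama 3 70B-class model
bytes_per_param = 1     # assume 8-bit quantized weights

weight_bytes = params_billion * 1e9 * bytes_per_param
max_tokens_per_s = (memory_bw_tb_s * 1e12) / weight_bytes

print(f"Bandwidth-bound ceiling: ~{max_tokens_per_s:,.0f} tokens/s per stream")
# ~2,143 tokens/s ceiling; the quoted 500 tokens/s per stream (2 ms/token)
# sits well under this bound, leaving headroom for compute and batching.
```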
This partnership builds on GigaIO’s recent achievement of recording the highest tokens per second for a single node in the MLPerf Inference: Datacenter benchmark database, further validating the company’s leadership in scale-up AI infrastructure.
“The market has been demanding more efficient, scalable solutions for AI inference workloads that don’t compromise performance,” added Benjamin. “Our partnership with d-Matrix brings together the tremendous engineering innovation of both companies, resulting in a solution that redefines what’s possible for enterprise AI deployment.”
Those interested in early access to SuperNODEs running Corsair accelerators can register their interest with GigaIO.
View source version on insidehpc.com: https://insidehpc.com/2025/05/gigaio-and-d-matrix-to-build-inference-platform-for-enterprise-ai/