Rack-scale Computing Made Simple
Think Outside The Box
The FabreX universal dynamic fabric enables true rack-scale computing, breaking the constraints of the server box to make the entire rack the unit of compute. That is only feasible with FabreX, because every endpoint and server within the rack can finally be connected over native PCIe (and CXL in the future), just as if they were still “inside the box”. There is no need for another network inside the rack. Now your “server” is the entire rack. But how would you go about deploying a FabreX system to gain the performance, flexibility, and agility of true rack-scale computing?
The Basic Building Block: A GigaCell™
The GigaCell is the basic element from which you can build a variety of configurations to suit your HPC/AI workloads. Because FabreX connects any PCIe endpoint to any server, you keep your freedom of choice and can select your preferred server model rather than being limited to proprietary hardware or a short approved list. While some servers limit the number of resources their BIOS will recognize, FabreX overcomes this constraint through its unique ability to network servers together over the same PCIe/CXL fabric, so resources can easily be shared across several nodes. FabreX connects any server with any other rack resource into a fabric computing environment where all resources can be dynamically allocated as if they were inside a single server enclosure.
This example of a GigaCell includes a switch and an Accelerator Pooling Appliance into which you can insert a variety of GPU or FPGA types: the accelerators could all be identical, or, for example, half compute and half visualization (or inference) GPUs. Using FabreX with one of our off-the-shelf orchestration software options, or with our Command Line Interface (CLI) if you prefer writing your own scripts, the pooled resources can be composed to match your specific workload, then recombined in a different configuration for the next workflow. To keep your computing cell humming at full speed, this example also includes a storage server connected to your data center backbone and network storage.
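To make the compose-and-recombine idea concrete, here is a minimal sketch of dynamic resource composition as an in-memory model. The `ResourcePool` class, its method names, and the device names are all hypothetical illustrations of the concept; this is not the actual FabreX CLI or orchestration API.

```python
# Hypothetical sketch of composable-infrastructure semantics.
# The ResourcePool API and device names are illustrative only,
# not the real FabreX CLI or orchestration software.

class ResourcePool:
    """In-memory model of a pooled accelerator appliance."""

    def __init__(self, devices):
        self.free = set(devices)      # devices available for composition
        self.assigned = {}            # server name -> set of attached devices

    def compose(self, server, wanted):
        """Attach `wanted` free devices to `server`."""
        if wanted > len(self.free):
            raise RuntimeError("not enough free devices in the pool")
        grabbed = set(list(self.free)[:wanted])
        self.free -= grabbed
        self.assigned.setdefault(server, set()).update(grabbed)
        return grabbed

    def release(self, server):
        """Return a server's devices to the pool for the next workflow."""
        self.free |= self.assigned.pop(server, set())

# Compose four GPUs to a training node, then recombine the pool
# for an inference node once the first workflow completes.
pool = ResourcePool([f"gpu{i}" for i in range(8)])
pool.compose("train-node", 4)
pool.release("train-node")
pool.compose("infer-node", 2)
```

The point of the sketch is the lifecycle: resources live in a shared pool, are attached to a server for one workload, and are released and re-composed for the next, without anything physically moving between boxes.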
The basic GigaCell can be configured in a number of ways depending on your needs, whether that means more storage, more accelerators, or a greater variety of components. Outfit your JBOGs with the latest GPUs, a mixture of GPUs, FPGAs, ASICs, and DPUs, or combine them all together. You get the flexibility and agility of the cloud, but with the security and cost control of your own on-prem infrastructure.
Or create an incredibly powerful compute cell by combining the power of several HGX-based servers.
Or build a storage cell with up to 3 PB of capacity, creating a very large NVMe-oF storage server.
From the GigaCell to the GigaPod™
Simply combine up to six GigaCells via FabreX (PCIe/CXL) to create a GigaPod™. All the resources inside the GigaPod are connected by the FabreX universal fabric, transforming the entire GigaPod into a single unit of compute, where all components communicate over native PCIe (or CXL in future versions) for the lowest possible latency and highest performance.
With the complexity of AI models doubling every six months, you need the ability to easily expand your training capability as your models grow. When your AI workflow demands more resources than can be orchestrated in a GigaCell or a GigaPod, simply combine up to six GigaPods to create a GigaCluster™. Your system now extends up to twelve 42U racks, spanning up to 100 meters over fiber-optic cable.
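The fan-out described above makes for simple back-of-the-envelope sizing. The fan-out figures (up to six cells per pod, up to six pods per cluster) come from the text; the per-cell accelerator count below is a hypothetical example, since actual counts depend on the chosen appliances.

```python
# Back-of-the-envelope scaling math for the GigaCell -> GigaPod -> GigaCluster
# hierarchy. Fan-out figures come from the text; the per-cell GPU count is a
# hypothetical example, not a product specification.

GPUS_PER_CELL = 16          # hypothetical: one fully loaded pooling appliance
CELLS_PER_POD = 6           # up to six GigaCells per GigaPod
PODS_PER_CLUSTER = 6        # up to six GigaPods per GigaCluster

gpus_per_pod = GPUS_PER_CELL * CELLS_PER_POD
gpus_per_cluster = gpus_per_pod * PODS_PER_CLUSTER

print(f"GPUs per GigaPod:     {gpus_per_pod}")      # 96
print(f"GPUs per GigaCluster: {gpus_per_cluster}")  # 576
```

Under these assumed counts, a single fabric would span 96 accelerators per pod and 576 per cluster, all composable from one orchestration point.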
Example of Scale-Out Topology
For maximum scalability of data center resources within a single PCIe/CXL fabric, use typical scale-out topologies such as leaf-and-spine or dragonfly. The example below shows a novel way for universities to teach advanced artificial intelligence engineering within their budget, while still giving students access to the latest and greatest infrastructure components.
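For readers unfamiliar with the topology, a leaf-and-spine fabric can be sketched in a few lines: every leaf switch links to every spine switch, so any two endpoints are at most two switch hops apart. The switch counts below are illustrative only, not a FabreX configuration.

```python
# Minimal sketch of a two-tier leaf-and-spine topology. Each leaf switch
# has one uplink to every spine switch, giving a full mesh between tiers.
# Switch counts are illustrative, not a FabreX specification.

def leaf_spine(num_leaves, num_spines):
    """Return the leaf-to-spine link list for a two-tier full-mesh fabric."""
    return [(f"leaf{l}", f"spine{s}")
            for l in range(num_leaves)
            for s in range(num_spines)]

links = leaf_spine(num_leaves=4, num_spines=2)
# 4 leaves x 2 spines = 8 inter-tier links.
print(len(links))
```

Adding capacity then means adding leaves (more endpoint ports) or spines (more cross-sectional bandwidth) without rewiring existing switches, which is what makes the pattern attractive for growing a fabric pod by pod.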