The U.2 NoLoad™: Computational Storage without the Storage?
- Written by: Stephen Bates, CTO
In this blog we are very pleased to announce the U.2 version of our NoLoad™ NVM Express (NVMe) computational storage and offload engine. Working with our friends at Nallatech, we have developed a ground-up solution for NVMe-based offload for storage and analytics in a form-factor that is ideal for next-generation NVMe-based storage and compute systems. We are very happy to have Allan Cantle (CTO and Founder at Nallatech, a Molex company) act as a co-author on today’s blog. Nallatech is our hardware partner in the development of the U.2 NoLoad™.
Why U.2? Aligning with the NVMe Ecosystem
We should explain what the U.2 form-factor is and why it’s important in NVMe-based systems. To help us do this, consider the illustrations in Figure 1.
Figure 1: Consider these three NVMe devices. The first (left) is an Add-In Card (AIC) form-factor. The second (middle) is a U.2 NVMe SSD from Seagate. The third (right) is the new Eideticom NoLoad™ U.2 device based on hardware from Nallatech.
In the early days of NVMe, the AIC form-factor was the most popular way to deploy NVMe. However, it suffers from some drawbacks:
- An AIC does not support hotplug.
- An AIC cannot be easily removed from the server or chassis in which it resides.
- An AIC typically consumes 8 or more PCIe lanes and takes up a whole PCIe slot. In many systems these are valuable resources in high demand by other devices in the system.
For all the above reasons and more, several companies and standards entities got together to specify a new form-factor for PCIe-attached devices. The SSD Form Factor Working Group developed this standard, which can be found here. Because U.2 addresses the concerns noted above, many analysts predict that it will, along with its smaller cousin M.2, dominate growth in the NVMe device space.
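To make the lane argument concrete, here is a small back-of-the-envelope sketch. The eight-lane AIC figure comes from the list above, and a U.2 (SFF-8639) device carries up to four PCIe lanes. The 48-lane host budget and the function name are purely illustrative assumptions, not a description of any particular server.

```python
def devices_per_budget(total_lanes: int, lanes_per_device: int) -> int:
    """Return how many devices fit in a given PCIe lane budget."""
    if lanes_per_device <= 0:
        raise ValueError("lanes_per_device must be positive")
    return total_lanes // lanes_per_device


HOST_LANES = 48  # hypothetical lane budget, chosen only for illustration
AIC_LANES = 8    # "8 or more" lanes per AIC, per the list above
U2_LANES = 4     # a U.2 (SFF-8639) device carries up to four PCIe lanes

if __name__ == "__main__":
    print("AICs that fit:", devices_per_budget(HOST_LANES, AIC_LANES))
    print("U.2 devices that fit:", devices_per_budget(HOST_LANES, U2_LANES))
```

Under these assumptions the same lane budget hosts twice as many U.2 devices as AICs, before even counting the physical slots the AICs would occupy.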
There are many server and storage enclosure systems coming to market with support for U.2 NVMe devices. One example of a storage enclosure is the Open Compute Project Lightning platform developed by Facebook and others. An example of an NVMe U.2-enabled server is the Barreleye G2 from Rackspace, which we discussed in a previous blog post.
A Peek Inside the U.2 NoLoad™
The U.2 NoLoad™ Hardware Evaluation Kit (HEK) consists of the following:
- A Xilinx FPGA.
- 8GB of DRAM.
- An MCU for control, bootstrapping and debug.
- Some NVM for storing bitfiles for the FPGA.
A block diagram of the components is given in Figure 2 and a photo of the PCB inside the U.2 enclosure is in Figure 3.
Figure 2: A block diagram of the U.2 NoLoad™ Hardware Eval Kit. The FPGA is connected to the host via a standard SFF-8639 connector and connects to 8GB of DRAM via a DDR4 interface. Auxiliary components provide management features and the ability to update the FPGA image via the PCIe interface.
Note there is no NAND in our HEK (we’ll discuss that below), but the FPGA implements our NoLoad™ NVMe Computational Storage engine, which has the following features:
- A fully featured NVMe 1.3 interface to the host.
- A large NVMe Controller Memory Buffer (CMB) that supports all data modes.
- PCIe Gen4 ready.
- Validated on Intel, ARM and IBM Power CPU architectures.
- NVMe Scatter-Gather List (SGL) support.
- A range of NVMe namespaces that implement computational services like erasure coding, deduplication, compression and pattern-matching.
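Because NoLoad™ presents a standard NVMe interface, a host should be able to discover its namespaces the same way it discovers those of any NVMe controller. The sketch below is one way to list controllers and their namespaces on a Linux host by walking sysfs. It assumes the usual non-multipath layout under /sys/class/nvme, and the function name is hypothetical. It does not try to guess which namespace maps to which computational service, since that mapping is not described here.

```python
from pathlib import Path


def list_nvme_namespaces(sysfs_root="/sys/class/nvme"):
    """Map each NVMe controller to its model string and namespace names.

    Assumes the non-multipath Linux sysfs layout, where controller nvme0
    has namespace subdirectories named nvme0n1, nvme0n2, and so on.
    """
    root = Path(sysfs_root)
    result = {}
    if not root.is_dir():
        return result  # no NVMe controllers visible (or not a Linux host)
    for ctrl in sorted(root.iterdir()):
        model_file = ctrl / "model"
        model = model_file.read_text().strip() if model_file.is_file() else "unknown"
        # Namespace directories are prefixed with "<controller>n".
        namespaces = sorted(
            p.name
            for p in ctrl.iterdir()
            if p.is_dir() and p.name.startswith(ctrl.name + "n")
        )
        result[ctrl.name] = {"model": model, "namespaces": namespaces}
    return result


if __name__ == "__main__":
    for name, info in list_nvme_namespaces().items():
        print(name, info["model"], info["namespaces"])
```

On a system with no NVMe devices the function simply returns an empty dictionary.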
Figure 3: The top layer of the PCB for the Eideticom-Nallatech U.2 NoLoad™ device. Notice the placements for the FPGA and the DRAM. Note there are no NAND placements on this PCB!
Hang On! Where’s the NAND?
The more observant of you might have noticed that the U.2 NoLoad™ has no placements for NAND or any other type of Non-Volatile Memory (NVM). This is intentional and makes the NoLoad™ a Computational Storage device without any storage! That may sound odd but we have a few reasons for designing NoLoad™ this way.
- #ComputationalNearStorage. Rather than being an SSD we feel it’s better if we focus on computation and deploy next to the SSDs. We leverage the PCIe Peer-2-Peer technologies we discussed in a previous blog post to get data from the SSDs to NoLoad™ without CPU involvement.
- Don’t compete with the big boys and girls! At the end of the day getting the most out of NAND is a signal processing problem. Claude Shannon’s information theory tells us that the people with the most a priori information will always win and for NAND that’s the NAND vendors.
- Keep It Simple Stupid! Trying to be an SSD and a computation engine at the same time is like trying to solve two very hard problems at once. We’d prefer to focus on the computation problems and let the SSD vendors focus on the NAND management!
- Give your customers choice. Our customers have developed deep and meaningful relationships with their NAND and SSD partners. For us to come in and insist that customers migrate to using NoLoad™ as their NVM controller is a challenge. We prefer to allow our customers to deploy whatever NVMe SSDs they like while we focus on accelerating their applications!
- The ability to scale storage and compute independently. Because it has no NVM, NoLoad™ can be scaled independently of the SSDs. This means some customers may deploy more NoLoad™ devices as their computational needs increase, or they may deploy more SSDs as their storage needs increase.
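The independent-scaling point can be illustrated with a small sizing sketch. Every number below (SSD capacity, per-accelerator throughput, workload targets) is a made-up placeholder rather than a measured NoLoad™ or SSD figure, and the function name is hypothetical. The point is only that the two device counts are computed from separate requirements.

```python
import math


def size_deployment(capacity_tb, compute_gbps, ssd_capacity_tb, accel_gbps):
    """Return (ssd_count, accelerator_count) for the given requirements.

    Storage and compute are sized independently: SSD count depends only on
    capacity, accelerator count only on the required processing throughput.
    """
    ssds = math.ceil(capacity_tb / ssd_capacity_tb)
    accelerators = math.ceil(compute_gbps / accel_gbps)
    return ssds, accelerators


# Hypothetical per-device figures, for illustration only.
SSD_CAPACITY_TB = 4
ACCEL_GBPS = 2

if __name__ == "__main__":
    print(size_deployment(40, 6, SSD_CAPACITY_TB, ACCEL_GBPS))   # baseline
    print(size_deployment(40, 12, SSD_CAPACITY_TB, ACCEL_GBPS))  # more compute
    print(size_deployment(80, 6, SSD_CAPACITY_TB, ACCEL_GBPS))   # more storage
```

With these placeholder numbers, doubling the compute requirement changes only the accelerator count, and doubling the capacity requirement changes only the SSD count.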
Where Next?
The U.2 version of our NoLoad™ NVMe based Computational Storage engine complements our AIC HEKs. If you are interested in any of these contact us at Eideticom to learn more.
In a future blog we will discuss some of the storage server, JBOF and fabric-attached (FBOF) enclosure systems being developed with U.2 NVMe devices in mind and how our U.2 NoLoad™ HEK can fit into those systems!