
Dell Previews Fluid Cache Acceleration for SANs 

At its DellWorld extravaganza in its hometown of Austin, Texas this week, Dell was showing off a new implementation of its Fluid Cache caching software, this one tuned up as a front-end for storage area networks.

The Fluid Cache software, which was launched earlier this year to goose the performance of direct-attached storage in PowerEdge servers, is based on clustering software that Dell got its hands on when it acquired Portland, Oregon-based RNA Networks in June 2011. RNA Networks was among a handful of companies that had created systems software for gluing together multiple X86 systems into a shared memory cluster. (Virtual Iron, now part of Oracle, was another, and ScaleMP is probably the most famous supplier of such software.)

RNA was founded in 2006 by Ranjit Panda, who worked on the Pentium 4 chip and the InfiniBand interconnect while at Intel, and Jason Gross, who worked with Panda on database clustering tools at SilverStorm Technologies. The company had engineers from Cray, Intel, QLogic (which bought SilverStorm), and Akamai Technologies. Its software created a shared global memory space across a cluster of machines, keeping that memory coherent, though not as tightly as symmetric multiprocessing (SMP) or non-uniform memory access (NUMA) clustering, which lets one copy of an operating system see the cluster as a single machine. Rather than that tight coupling, the RNA software used standard InfiniBand or Ethernet links between the machines to turn the individual main memory in each server node into a global memory space that all of the nodes could see and access as if it were local main memory. In effect, each node thinks it has all of the memory in the cluster, and the messaging engine and pointer-updating algorithms at the heart of the software allow workloads that use messaging protocols – think databases, financial services, and scientific simulation applications – to scale horizontally across multiple server nodes for compute performance while using their main memory as a single pool.
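To make that idea concrete, here is a toy Python sketch – purely illustrative, with invented class names, and emphatically not RNA's actual code – of how a handful of nodes can present their separate memories as a single addressable pool:

```python
# Toy illustration of a pooled-memory abstraction (hypothetical; not RNA's
# actual software). Each "node" owns a slice of a global address space; a
# request for an address it does not own is forwarded to the owning node,
# so every node sees one large pool even though the bytes live elsewhere.

class Node:
    def __init__(self, node_id, local_bytes):
        self.node_id = node_id
        self.local = bytearray(local_bytes)   # this node's contribution to the pool

class GlobalMemoryPool:
    def __init__(self, nodes):
        self.nodes = nodes
        self.slice_size = len(nodes[0].local)  # assume equal contributions

    def _locate(self, address):
        """Map a global address to (owning node, local offset)."""
        owner = self.nodes[address // self.slice_size]
        return owner, address % self.slice_size

    def read(self, requesting_node, address, length=1):
        owner, offset = self._locate(address)
        if owner is requesting_node:
            # Local hit: served straight from this node's own memory.
            return bytes(owner.local[offset:offset + length])
        # In the real product this would be a message over InfiniBand or
        # Ethernet; here we simply touch the other node's buffer directly.
        return bytes(owner.local[offset:offset + length])

    def write(self, requesting_node, address, data):
        owner, offset = self._locate(address)
        owner.local[offset:offset + len(data)] = data

# Four nodes with 1 KB each look like a single 4 KB pool to every node.
nodes = [Node(i, 1024) for i in range(4)]
pool = GlobalMemoryPool(nodes)
pool.write(nodes[0], 3000, b"hello")     # lands in nodes[2]'s slice of the pool
print(pool.read(nodes[1], 3000, 5))      # any node reads it back: b'hello'
```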


With the Fluid Cache software, what Dell did was create a global shared pool of flash memory instead of main memory. With Fluid Cache for SAN, the idea is to run the software on a server cluster and use flash storage to accelerate applications that store data on external SANs, while keeping the flash right inside the servers, where it is close to the processors and can radically boost application performance.

It is what Brian Payne, executive director of PowerEdge marketing at Dell, half-jokingly called a cache area network for both reads and writes.

"You can get lots of IOPS with flash-based arrays," says Payne, "but you could never get the lower latency that we are showing."

To ensure integrity, data is written to one Express Flash drive and then to a second, and the write is not unlocked until that second copy has been written and verified. Then, at some convenient time for the system, this hot data is pushed out to the back-end Compellent SAN for permanent storage. Data that starts to warm up on the read side is similarly pulled off the Compellent SAN and stored in duplicate on the Express Flash modules.
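A rough sketch of that read/write path – with invented names and plain Python dictionaries standing in for the flash devices and the Compellent array, since Dell has not published the internals – looks something like this:

```python
# Minimal sketch of the write path described above (invented class names;
# the real Fluid Cache software works down at the block layer, not in Python).
# A write is acknowledged only after two flash copies are in place, and the
# block is later destaged to the back-end Compellent array at leisure.

class FluidCacheSketch:
    def __init__(self, flash_primary, flash_mirror, san):
        self.flash_primary = flash_primary   # dict: block id -> bytes
        self.flash_mirror = flash_mirror
        self.san = san
        self.dirty = set()                   # blocks not yet pushed to the SAN

    def write(self, block_id, data):
        self.flash_primary[block_id] = data            # first flash copy
        self.flash_mirror[block_id] = data             # second flash copy
        assert self.flash_mirror[block_id] == data     # "verify" before unlocking
        self.dirty.add(block_id)                       # safe to acknowledge now
        return "ack"

    def read(self, block_id):
        if block_id in self.flash_primary:             # warm data: served from flash
            return self.flash_primary[block_id]
        data = self.san[block_id]                      # cold data: fetch from the SAN...
        self.flash_primary[block_id] = data            # ...and cache it in duplicate
        self.flash_mirror[block_id] = data
        return data

    def destage(self):
        """Push dirty blocks out to the back-end SAN 'at some convenient time'."""
        for block_id in list(self.dirty):
            self.san[block_id] = self.flash_primary[block_id]
            self.dirty.discard(block_id)

cache = FluidCacheSketch({}, {}, san={7: b"cold block"})
cache.write(3, b"hot block")     # acknowledged only once both flash copies exist
print(cache.read(7))             # pulled off the SAN and cached on first touch
cache.destage()                  # block 3 now also lives on the back-end array
```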

In the demonstration that Dell showed off at DellWorld, a cluster of eight PowerEdge R720 servers using Xeon E5 processors and running Linux were set up with two disk drives for the local operating system and database software and two Express Flash PCI-Express flash modules, created in partnership with Micron Technology. These Express Flash modules hook into PCI-Express slots and therefore offer better performance than SATA or SAS flash drives. In this case, the test used 350 GB Express Flash modules, and according to Brian Martin, the product planner for the Fluid Cache software, those two Express Flash modules (which are hot plug units that look like disk drives and slide into slots in the front of the servers) have PCI-Express 2.0 connectivity even though the servers support faster PCI-Express 3.0 links. Still, at around 450,000 I/O operations per second (IOPS) apiece, those two Express Flash units can saturate the PowerEdge R720. Customers might add up to four of these Express Flash units to a server node for capacity, but it will not increase the IOPS throughput for the node.

Each node in a Fluid Cache cluster runs a cache manager and a metadata manager program. At any given time, only one copy of each of these two programs is active and in charge; the standby copies on the other nodes are there to take over if the primaries fail. The Fluid Cache stack also has a cache client program, which runs on each node and lets it access the data stored in the flash pool. And, importantly, this client can run on any Linux-based server (it does not have to be a PowerEdge 12G box at all) and allow that machine to access the data in the pool. These non-Dell servers will not, of course, get the full benefit of the low latency because they do not have the Express Flash modules right next to their CPUs and the Fluid Cache software keeping the data they need right in their L3 caches (the equivalent of a fingertip for a CPU).
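In outline – and this is an illustrative sketch with invented names and election logic, not Dell's implementation – that primary/standby arrangement works roughly like this:

```python
# Illustrative-only sketch of the primary/standby arrangement described above
# (class names and promotion logic are invented, not Dell's). Every node runs
# a cache-manager and a metadata-manager instance, but only one copy of each
# is active; a standby copy is promoted if the active one disappears.

class ManagerGroup:
    """One logical service (e.g. 'metadata manager') with a single active copy."""

    def __init__(self, name, node_ids):
        self.name = name
        self.instances = {n: "standby" for n in node_ids}
        self.active = node_ids[0]
        self.instances[self.active] = "active"

    def node_failed(self, node_id):
        self.instances.pop(node_id, None)
        if node_id == self.active:               # promote a surviving standby
            self.active = next(iter(self.instances))
            self.instances[self.active] = "active"

nodes = ["node1", "node2", "node3"]
cache_mgr = ManagerGroup("cache manager", nodes)
meta_mgr = ManagerGroup("metadata manager", nodes)

cache_mgr.node_failed("node1")                   # the primary dies...
print(cache_mgr.active)                          # ...and a standby takes over: node2
```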

On an eight-node PowerEdge R720 cluster that was front-ending a Compellent SAN, Dell loaded up an Oracle database (no one at DellWorld was sure if it was the 11g or 12c version) on the server nodes. On a read-only benchmark, those eight nodes were able to field 5.17 million IOPS with 6 millisecond response times, and that translated into chewing through around 13,000 transactions per second while supporting around 14,000 end users. With the Fluid Cache for SAN software turned off, this same Oracle database cluster could handle around 2,000 users to process around 3,600 transactions per second with a response time of just under one second.
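Working just from the figures quoted above, the back-of-the-envelope math on the speedup looks like this:

```python
# Back-of-the-envelope comparison using only the figures quoted above.
with_cache_tps, with_cache_users = 13_000, 14_000
without_tps, without_users = 3_600, 2_000

print(f"Throughput gain:  {with_cache_tps / without_tps:.1f}x")      # ~3.6x transactions/sec
print(f"Concurrency gain: {with_cache_users / without_users:.0f}x")  # ~7x supported users
```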

This is obviously a very big boost in performance. But it is not limited to database workloads. Virtualized servers, virtual desktop infrastructure, and any workload with lots of random reads and writes would benefit from Fluid Cache for SAN, Payne tells EnterpriseTech.

The expectation is that Dell will move to flash storage modules developed in conjunction with Samsung that will do a number of things to push performance even further. First, the PCI-Express connectors coming out of the back end of the modules will support the faster 3.0 links as well as the NVM Express host interface for solid state drives, which is being pushed hard by Intel, Dell, Oracle, Cisco Systems, EMC, Seagate Technology, Micron, NetApp, SanDisk, and others. NVM Express cuts out about 33 percent of the latency. Second, the new Samsung units will be available in 800 GB and 1.6 TB capacities, which will allow a lot more data to be stored in the Fluid Cache.

Dell can push this even further if it wants to. For one thing, the RNA pooling software can scale to 128 nodes, although it has only been certified to run across eight nodes at the moment. This, says Payne, is sufficient for the initial workloads that Dell is targeting. It would not be surprising to see Dell create a Fluid Cache accelerator based on main memory, either, especially considering that this was the original use of the RNA Networks software, which created a RAMdrive in main memory to be used as a cache for server workloads spanning multiple nodes.

Fluid Cache for SAN is in beta testing now and will be available in the first half of 2014.

The word on the street is that Dell is trying to time the formal announcement for when those Samsung units will be available, and that could happen as early as April next year. Pricing has not been set yet, but Payne did say that Fluid Cache for SAN "would be priced based on the value it creates."

With the Fluid Cache for DAS, which only ran on a single server node with multiple Express Flash modules, Dell charged $3,500 per server plus a $700 annual maintenance fee. The 350 GB Express Flash module costs $5,147.
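For a rough sense of scale, here is the list-price math for a hypothetical single DAS node outfitted with two of those 350 GB modules, using only the figures above:

```python
# Rough list-price math for the DAS flavor, using only the figures above
# (a hypothetical single node with two 350 GB Express Flash modules).
license_per_server = 3_500
annual_maintenance = 700
module_price = 5_147
modules = 2

first_year = license_per_server + annual_maintenance + modules * module_price
print(first_year)   # 14,494 for year one, plus 700 per year thereafter
```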
