Advanced Computing in the Age of AI | Friday, March 29, 2024

As Data Analytics Workload Complexity Explodes, FPGAs May Find a Niche in Enterprise Clusters 

It’s the holiday season and out of the milling hordes of shoppers at a brick-and-mortar store, an individual consumer approaches the jewelry department. His cell phone is on, releasing data that’s picked up by the retailer’s consumer data analytics system. An in-store camera also spots the shopper and, utilizing advanced image and video processing techniques, recognizes him and observes his activity. Combining insights based on his buying history and the jewelry he’s looking at, a special offer pops up on the shopper’s cell phone, which has an app for that store, in the form of a text message: a special sale on ladies’ earrings. The perfect gift!

Question: Is this the kind of use case – processing high volumes of data from multiple sources for immediate insight – well-suited for a standard x86-based cluster? Only a hardened x86 advocate would say that it is. That’s why market opportunities exist outside of the x86 world for advanced scale computing innovators. One of them is Ryft Systems, Rockville, MD, which is migrating its FPGA-based technology to the commercial high performance data analytics server arena from the federal market (military and intelligence), where its ruggedized, hand-held devices are used by soldiers on the front lines of Afghanistan.

Ryft may have timed its move well. According to industry watcher Steve Conway, research vice president of International Data Corp.’s High Performance Computing Group, the data analytics marketplace can be divided into two camps: workloads that are high volume/low complexity and those that are high volume/high complexity. The latter category is growing fast.

“It’s not just the ability to process large volumes of data,” said Conway. “That’s not hard, that’s easy. The hard thing is to solve analytics problems that are complex in nature, involving lots of different factors and are time-critical. These problems are more strategic, with a higher economic value, which is why more organizations are moving toward them. Problems of higher complexity are growing as a proportion of the market, while less complex problems are on the wane.”

Aspects of these advanced applications can lie beyond capabilities of traditional x86 chips, Conway said, “and that’s why you’re seeing more than one type of processor in more and more computers today,” such as NVIDIA GPUs, Intel Xeon Phi co-processors, ARM processors and other “accelerated computing” chips installed within clusters.

“X86 processors are dominant today because they have very good price-performance,” said Conway. “They’ve come a long way, but they came up from the desktop, they were never really meant to be HPC processors. They’re very economical, but kind of a loose fit for a lot of HPC, loose enough that they’ve left room for other kinds of processors to kind of fill in the gaps where x86 doesn’t really excel.”

Ryft’s strategy is to combine the extreme throughput of FPGAs (field programmable gate arrays) with features designed to overcome the main source of resistance to FPGAs in the commercial realm: implementation complexity. These include an open API and software tools called “primitives,” pre-built algorithm components that eliminate the need to understand FPGA programming and that cover business use cases such as exact search, fuzzy search and term frequency.
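To make two of those primitives concrete, here is a minimal software sketch of exact search and term frequency in plain Python. It is not Ryft’s API and does not run on an FPGA; the function names `exact_search` and `term_frequency`, and the sample text, are assumptions made purely for illustration.

```python
import re
from collections import Counter


def exact_search(text: str, term: str) -> list[int]:
    """Return the start offset of every occurrence of `term` in `text`."""
    offsets = []
    start = text.find(term)
    while start != -1:
        offsets.append(start)
        # Resume one character later so overlapping matches are also found.
        start = text.find(term, start + 1)
    return offsets


def term_frequency(text: str) -> Counter:
    """Count how often each lowercase word appears in `text`."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


# Hypothetical sample data, not taken from any real system.
sample = "gold earrings on sale; silver earrings on sale too"
print(exact_search(sample, "earrings"))
print(term_frequency(sample).most_common(3))
```

The point of a pre-built primitive is that the caller only supplies the data and the term; how the scan is carried out underneath is hidden from the user.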

The Ryft ONE is a compact processing powerhouse that the company says can ingest, search and analyze up to 48 TB of any type of data – structured and unstructured, historical and streaming – at 10 GB per second. It can be equipped with 10Gbps, 40Gbps or InfiniBand network inputs, and because everything is housed within the server’s 1U footprint, latency is low. It utilizes an x86-based front-end service and log-in node that allows users to run standard Linux tools.

“They have something very small, very dense, but it fits easily into usable data center environments, it plays nice with the other equipment in the data center,” Conway said. “Something with a big memory space, theirs can scale up to 48 TB of memory and super-fast processors – much faster than CPUs. Since it’s such a small device the data can move around very quickly, which has been the Achilles heel of clusters.”

John von Neumann

As Ryft describes it, FPGAs open up a new world for data analytics by addressing challenges posed by x86-based processing, challenges whose roots go back 70 years. Ryft’s Vice President of Engineering, Pat McGarry, argues that x86 and GPU processors are based on a sequential processing design originally proposed by John von Neumann in 1945, the limitations of which are becoming increasingly exposed as workloads grow in complexity.

“It’s pretty crazy,” McGarry said. “You can change the programs that run on von Neumann architectures, but it’s always the same underlying hardware architecture. An FPGA, on the other hand, can implement any hardware architecture, so you aren’t constrained in the same way you are with sequential processors. Being able to implement arbitrary architectures in FPGA fabric without being reliant on static instruction sets is a primary reason that FPGAs outperform CPUs and GPUs for most operations.”

Put another way, with sequential processing, an algorithmic problem is broken into a series of operations that are then executed one after another. FPGAs, by contrast, offer a flexible architecture that supports parallel processing: the problem can be distributed so that multiple operations execute simultaneously rather than in sequence.

Conway agrees that FPGAs are better equipped for certain high-performance roles within complex data analytics environments. “Technically speaking, x86 processors are designed for instruction-level parallelism, but the kinds of big data problems we’re talking about are different, they’re data-level parallelism, it’s a different category of problems that’s very rich in problem types.”
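The contrast between one-at-a-time execution and data-level parallelism can be sketched in software. The example below is an illustration only, assuming a simple “count the records that mention a term” workload; the function names and sample records are hypothetical. In CPython the threads will not actually run this work faster because of the global interpreter lock, so the sketch shows the structure of the approach (same operation, many slices of data), not the speedup an FPGA might deliver.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential_count(records: list[str], term: str) -> int:
    """Visit every record one after another, in a single stream of operations."""
    total = 0
    for record in records:
        if term in record:
            total += 1
    return total


def count_matches(chunk: list[str], term: str) -> int:
    """The operation that is applied identically to every slice of the data."""
    return sum(1 for record in chunk if term in record)


def data_parallel_count(records: list[str], term: str, workers: int = 4) -> int:
    """Split the data into slices and apply the same operation to each slice."""
    size = max(1, -(-len(records) // workers))  # ceiling division
    chunks = [records[i:i + size] for i in range(0, len(records), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda chunk: count_matches(chunk, term), chunks))


# Hypothetical records for illustration.
records = ["buy ACME", "sell ACME", "hold XYZ", "buy XYZ", "buy ACME now"]
print(sequential_count(records, "ACME"))
print(data_parallel_count(records, "ACME", workers=2))
```

Both functions return the same answer; what changes is that the second has no dependency between slices, which is the property that lets hardware process them at the same time.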

The Ryft ONE is positioned to complement, not replace, x86-based nodes, to take on jobs that would otherwise impede clusters from completing time-critical workloads.

Pat McGarry of Ryft

McGarry said Ryft customers won’t allow him to cite their implementations, joking that they are as secretive as the company’s customers in the defense and intelligence communities. But he discussed the hypothetical use case of a large trading company needing advanced servers for insider trading monitoring. This requires simultaneous analysis of historical trading data while monitoring current trades, emails and social media communications, such as Twitter and Facebook, along with phone calls, using voice-to-text processing. All of this data is thrown together and monitored for suspicious behavior.

Let’s say a trader makes a phone call to an unknown phone number the trader has not previously called. The system monitors the call to be sure the trader isn’t, for example, giving a stock tip to a family member or friend.

McGarry said this kind of application in a conventional computing scheme would call for a massive 32-core, x86-based cluster with hundreds of nodes. Yet it would still be inadequate for the complexity involved, he said, because voice-to-text results in frequent misspellings of words and names, requiring fuzzy search techniques. Also, because of the large amount of unstructured data involved, traditional database indexing techniques would prevent the system from providing near real-time analysis.
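One common way to tolerate the misspellings that voice-to-text produces is to match on edit distance rather than exact equality. The sketch below uses the standard Levenshtein distance; it is an assumed stand-in for whatever matching Ryft actually implements, and the transcript, the helper names and the distance thresholds are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_find(words: list[str], target: str, max_distance: int = 2) -> list[str]:
    """Return the words within max_distance edits of target, ignoring case."""
    target = target.lower()
    return [w for w in words if edit_distance(w.lower(), target) <= max_distance]


# Hypothetical voice-to-text output containing two transcription errors.
transcript = "please tell jon smyth to buy before the anouncement".split()
print(fuzzy_find(transcript, "announcement"))
print(fuzzy_find(transcript, "smith", max_distance=1))
```

Each comparison costs time proportional to the product of the two word lengths, which hints at why fuzzy matching over large volumes of text is far more expensive than exact lookup.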

“You can’t wait hours or days to index the data, load it into a database, transform it,” McGarry said, “you have to understand it immediately in time. Literally, there are scenarios where you need to shut down the phone conversation in milliseconds. There is no way traditional clusters can do that.”

Indexing raises another x86 computing challenge that McGarry said FPGAs overcome: the ETL bottleneck. The Ryft ONE’s FPGA fabric searches data so fast that there is no need to extract, transform and load, or index, data. “It can be structured, like XML or JSON, or pick your favorite comma separated value, or key value,” he said. “Or it can be unstructured, a bunch of bits, text or binary. To us it doesn’t matter, we search it all the same way, at 10GB per second or faster.”
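The idea of searching raw data directly, with no load or index step, can be illustrated with a simple streaming scan. This is a software sketch only, not a description of the Ryft ONE’s FPGA fabric: `scan_stream`, the chunk size and the sample bytes are assumptions for illustration. It reads the input in fixed-size chunks and carries a short tail between reads so that matches straddling a chunk boundary are not missed.

```python
import io


def scan_stream(stream, pattern: bytes, chunk_size: int = 1 << 20):
    """Yield the absolute byte offset of every occurrence of pattern in stream."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1  # bytes to carry over between chunks
    tail = b""
    base = 0                    # stream offset of the first byte in `buffer`
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer = tail + chunk
        pos = buffer.find(pattern)
        while pos != -1:
            yield base + pos
            pos = buffer.find(pattern, pos + 1)
        # The tail is shorter than the pattern, so no match is reported twice.
        tail = buffer[-overlap:] if overlap else b""
        base += len(buffer) - len(tail)


# Hypothetical mixed input: a JSON fragment, a CSV row and raw binary bytes.
data = b'{"sym":"ACME"}\nACME,100\n\x00\x01ACME'
print(list(scan_stream(io.BytesIO(data), b"ACME", chunk_size=5)))
```

The scan treats JSON, CSV and binary content identically because it only ever looks at bytes, which is the sense in which the format of the data “doesn’t matter” to this style of search.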

Avoiding ETL not only saves time, it restrains data expansion.

“Users typically have a relatively small amount of initial data, but then they transform it, they index it and then that data explodes,” McGarry said. “But if you don’t have to do ETL your data grows more linearly, not exponentially, and that’s what you want. That’s one of the big hurdles the community is finally starting to realize. The problem isn’t too much data, it’s making the data bigger than it has to be.”

This is Ryft’s second foray, begun in early 2014, into the commercial market. Formerly called Data Design Corporation, the company made its first attempt several years earlier under the assumption that its success in the federal market would transfer to commercial adoption. But McGarry said the company failed to account for unfamiliarity with FPGAs.

“Then we had an ‘ah-ha moment,’” McGarry said. “There are a lot of people who understand FPGAs in the military and intelligence communities, how to use them, how to program them, but in the commercial space there are very smart people doing data analytics who can’t spell FPGA.”

With this realization, the Ryft ONE began to take shape, designed to be an FPGA platform for people who don’t understand FPGAs.

“We abstract all that away,” McGarry said. “It happens to have a massive up to 48 TB file system in it, but to a user it looks like a standard file system with massive-size files in it, and you can even edit the files like any Linux file system files, but behind the scenes all the high performance FPGAs are managing all this stuff the user doesn’t know about. It interfaces with any Linux tools they want to use, it can be incorporated into existing cluster architectures. We just call it a Linux box on steroids.”

Conway said Ryft shows promise.

“The world will need to see how it actually performs on these problems,” he said, “but it’s certainly designed to be one of the densest big data machines available. They’re one of the very interesting companies, from our standpoint, coming into the big data space.”

EnterpriseAI