Advanced Computing in the Age of AI | Saturday, April 20, 2024

Answered Prayers for High Frequency Traders? Latency Cut to 20 Nanoseconds 


“You can buy your way out of bandwidth problems. But latency is divine.”

This sentiment, expressed by Intel Technical Computing Group CTO Mark Seager, seems as old as the Bible, a truth universally acknowledged. Latency will always be with us. It is the devilish delay, the stubborn time gap imposed by processor and memory operations, that has bewitched computer architects, IT managers and end users since the genesis of the computer age.

Solarflare Communications is an unheralded soldier in the eternal war on latency. With its founding in 2001, Solarflare took on the daunting raison d'être of grinding down latency from one product generation to the next for the most latency-sensitive use cases, such as high frequency trading. Today, the company has more than 1,400 customers using its networking I/O software and hardware to cut the time between decision and action.

In high frequency trading, the latency gold standard is 200 nanoseconds. If you’re an equity trader using a Bloomberg Terminal or Thomson Reuters Eikon, trading at latencies of 200 nanoseconds or more is considered shockingly pedestrian, putting you at risk of buying or selling a stock at a worse price than the one you saw quoted. Now, with its announcement of TCPDirect, Solarflare said it has cut latency by 10X, to 20-30 nanoseconds.

“To drop that to 20 nanoseconds, that’s pretty amazing,” Will Jan, VP and lead analyst at IT consultancy Outsell, told EnterpriseTech.

He said most traders use Solarflare technology without knowing it, much as we drive cars whose parts are made not by Toyota or Ford but by suppliers such as Bosch or Denso.

“They’re the backbone of a lot of server providers,” he said. “I always thought HPE, IBM and Dell…made this particular network IO component in software, but it turns out these guys are the providers. In this particular niche, when it comes to components that lower latency, these (server makers) farm it out to Solarflare. They’re happy making a lot of money in the background.”

The CTO of an equity trading firm, who agreed to talk with EnterpriseTech anonymously, said his company has been a Solarflare customer for four years and that its IT department has validated Solarflare’s claims for TCPDirect of 20-30 nanoseconds latency.

He regards Solarflare as a partner that allows his firm to focus on core competencies, rather than devoting in-house time and resources to lowering latency.

“It used to be the case that there weren’t a lot of commercial, off-the-shelf products applicable to this space,” he said. “If one of our competitors wanted to do something like this for competitive advantage, Solarflare can do it better, faster, cheaper, so they’re basically disincentivized from doing so. In a sense this is leveling the playing field in our industry, and we like that because we want to do what we’re good at, rather than spending our time working on hardware. We’re pleased when external vendors provide state-of-the-art technology that we can leverage.”

TCPDirect is a user-space, kernel bypass application library that implements Transmission Control Protocol (TCP) and User Datagram Protocol (UDP), the industry standards for network data exchange, over Internet Protocol (IP). It’s delivered as part of Onload, Solarflare’s application acceleration middleware designed to reduce CPU utilization and increase message rates.


The latency through any TCP/IP stack, even one written to be low-latency, is a function of the number of processor and memory operations that must be performed between the application sending or receiving and the network adapter serving it. According to Ahmet Houssein, Solarflare VP of marketing and strategic development, TCP/IP’s feature-richness and complexity mean implementation trade-offs must be made among scalability, feature support and latency. Independent of the stack implementation, going via the kernel imposes system calls, context switches and, in most cases, interrupts that increase latency.

Houssein said TCPDirect attacks this network stack overhead problem with a “slimmer” kernel bypass architecture in which the TCP/IP stack resides in the address space of the user-mode application. This approach works for high frequency trading, he said, because it’s a use-case that requires only a limited number of connections, rather than the full TCP/IP feature-set included in Onload. Designed to minimize the number of processor and memory operations between the user-space application and a Solarflare Flareon Ultra server IO adapter, TCPDirect employs a smaller data structure internally that can be cache-resident.

TCPDirect’s proprietary zero-copy API removes the need to copy packet payloads as part of sending or receiving. Each stack is self-contained, eliminating processor operations other than those required to move the payload between the application and the adapter.

“We run in a standard x86 server, we are an Ethernet company and we are compliant with standard infrastructure, but for those applications that require this level of performance we give them this special piece of software,” Houssein said. “(TCPDirect) runs on top of our network interface controllers within that standard equipment.”

Solarflare concedes that TCPDirect is not a perfect fit for all low-latency use-cases, because its hyperfocus on latency sacrifices some of the features found in Onload, of which TCPDirect is delivered as a part. Adopting TCPDirect also requires modifying applications to take advantage of the TCPDirect API – an acceptable trade-off in fields such as high frequency trading, where the quest for lower latency knows no end.

“If your competitors are getting faster, then you have to get faster too,” said the anonymous Solarflare customer quoted above. “Honestly, we’d prefer to say, ‘We’re good, let’s stop here and focus on other things.’ But no one’s going to do that.”

EnterpriseAI