Pure Storage 250 TB All-Flash Array Takes On Disks
All-flash array upstart Pure Storage is revving up its machines to take on bigger storage jobs as well as pushing down into smaller sites with its fourth generation of products. The new FlashArray 400 series have been refreshed with zippier X86 controllers, and like their predecessors, the machines use consumer-grade multi-level cell (MLC) NAND flash memory, which is less expensive than the enterprise-grade stuff, and sophisticated software on the controllers to make flash bear up under the enterprise strain.
There are myriad ways to make use of flash in server and storage systems, and the approach that Pure Storage decided to take when it was founded back in 2009 was to employ the consumer-grade flash and try to bring the cost of an all-flash array down to the same level as disk arrays so it would be widely deployed. Others in the market, says Matt Kixmoeller, vice president of products at Pure Storage and the sixth employee of the company, focused on performance and resiliency but not so much on cost. Pure Storage, Kixmoeller says, can deliver very good performance where it matters – with large block sizes – and has been able to get the cost of an all-flash array down to parity this year with a plain-vanilla disk array on a cost per gigabyte basis. (More on that in a moment.)
Two new arrays have been added to the FlashArray lineup. The top-end FA-450 is by definition the one that will be more interesting to EnterpriseTech customers looking to scale out. Like the existing FA-420 from last year, the FA-450 has a pair of flash controllers that are each 2U high. These flash controllers are tightly coupled and run in an active-active state, spreading work between the two nodes, and the interesting bit is that Pure Storage actually throttles back the performance on each node by half so that in the event of a controller crash the array can still drive the full throughput. With the new FA-405 and FA-450 units, these controllers have been upgraded to the latest "Ivy Bridge" Xeon E5-2600 v2 processors from Intel. The FA-420, which uses an earlier "Sandy Bridge" Xeon E5-2600 v1 controller, will continue to be available.
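The headroom idea behind that throttling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the 200,000 IOPS per-controller ceiling are hypothetical, and nothing here reflects how the Purity software actually enforces the limit.

```python
# Illustrative sketch of active-active headroom; not Pure Storage's implementation.
# Assumption: each controller could drive `per_controller_max` on its own, and
# both are capped at half of that during normal operation.

def rated_throughput(per_controller_max: float) -> float:
    """Array throughput with both controllers running at a 50 percent cap."""
    cap_fraction = 0.5
    return 2 * per_controller_max * cap_fraction


def throughput_after_failure(per_controller_max: float) -> float:
    """Throughput when one controller has failed and the survivor runs uncapped."""
    return per_controller_max


if __name__ == "__main__":
    ceiling = 200_000  # hypothetical per-controller ceiling, in IOPS
    print("rated:", rated_throughput(ceiling))
    print("one controller down:", throughput_after_failure(ceiling))
```

Because the two figures are equal, the array's rated number is one it can still hit with a controller down, which is the point of giving up half of each node's peak.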
The FlashArray machines are designed so the controllers and flash enclosures that store the data can be upgraded independently of each other, and with flash technology changing every six to nine months, this is a necessity, not just a nicety. In one of the interesting innovations from Pure Storage, if you keep the machines on maintenance, every three years you get a controller upgrade for free. The Purity flash storage operating system that runs on the controller is kept current for all customers on maintenance. Every time you upgrade the controllers or shelves, Pure Storage allows customers to reset the maintenance to the then-current first year rates, which is another twist that the company has come up with to try to attract new customers and keep existing ones. The Purity software also includes replication between arrays, a feature that is usually an extra on disk arrays and that is also usually the most costly piece of software you can buy, according to Kixmoeller.
With the FA-450, the controllers come in dual 2U enclosures and are using the top-end twelve-core Xeon E5 processors and have a total of 1 TB of memory (that is 512 GB each). This machine can have up to four flash enclosures for between 34 TB and 70 TB of raw flash, depending on the capacity chosen for the SAS drives that slide into the enclosures. The flash drives link to the controllers through 6 Gb/sec SAS 2.0 links, and it is natural enough, says Kixmoeller, to expect a controller upgrade with the "Haswell" generation of Xeon E5 v3 chips later this year and a jump to the 12 Gb/sec SAS 3.0 controllers expected with these processors. Aside from managing all of the wear-leveling on the flash drives, the controller also runs in-line de-duplication and data compression algorithms on the data stored on the flash. Assuming a 6 to 1 compression ratio on average as well as the use of RAID data protection and reserve flash capacity to account for wear, the FA-450 delivers 125 TB to 250 TB of usable capacity (depending on the size of the flash media) and can push around 200,000 I/O operations per second using 32 KB file sizes and deliver around 7 GB/sec of sustained bandwidth with an average latency of under 1 millisecond.
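As a rough sanity check on those figures, the arithmetic can be worked through in Python. This is a back-of-the-envelope sketch, not a Pure Storage sizing tool: it assumes decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes), treats the 6 to 1 ratio as applying uniformly, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the quoted FA-450 numbers.
# Assumptions: decimal units and a uniform 6:1 data reduction ratio.

def implied_overhead_fraction(raw_tb: float, usable_tb: float,
                              reduction_ratio: float) -> float:
    """Share of raw flash implied to go to RAID protection and reserve capacity."""
    physical_tb_needed = usable_tb / reduction_ratio
    return 1 - physical_tb_needed / raw_tb


def bandwidth_gb_per_sec(iops: float, io_size_kb: float) -> float:
    """Throughput implied by an IOPS figure at a given I/O size."""
    return iops * io_size_kb / 1_000_000  # KB/sec to GB/sec, decimal units


if __name__ == "__main__":
    for raw_tb, usable_tb in [(34, 125), (70, 250)]:
        overhead = implied_overhead_fraction(raw_tb, usable_tb, 6)
        print(f"{raw_tb} TB raw -> {usable_tb} TB usable: overhead ~{overhead:.0%}")
    print(f"200,000 IOPS at 32 KB: {bandwidth_gb_per_sec(200_000, 32):.1f} GB/sec")
```

Under these assumptions, roughly 40 percent of the raw flash is implied to go to RAID and reserve, and 200,000 IOPS at 32 KB works out to about 6.4 GB/sec, which is in line with the "around 7 GB/sec" quoted.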
Pure Storage uses a multi-source strategy for its SAS MLC drives, and if you look under the covers, you will find units from Samsung, Toshiba, and STEC in the current lineup.
A lot of disk and flash (and hybrid) arrays do their tests with 4 KB or 8 KB file sizes, and this is nonsense, says Kixmoeller. Pure Storage gathers data from its more than 1,000 customers using a call-home feature that gives the company anonymized metrics about how the machines are performing, and from that data the average I/O size is actually 38 KB. Here is how Pure Storage performs in the various block sizes compared to rival arrays:
In the "vanity benchmark zone," as Kixmoeller puts it, other flash arrays do show better performance. (He would not name names, as is often the case with such comparisons.) But in the hot zone where file sizes are 32 KB and higher, the Pure Storage arrays deliver two to three times the IOPs as competitors.
While performance is important, Pure Storage is always thinking about the price of the resulting flash array, and here is how the company sees the evolution of its all-flash arrays compared to plain vanilla disk arrays:
The disk arrays in the comparison above are not using de-duplication or compression software to try to squeeze more data onto their spinning platters, and this is not precisely a fair comparison. But it at least gives some sense of how an all-flash array has compared to disk arrays based on SAS or SATA drives over the past five years and where it might end up in the sixth year. As you can see from this chart, the in-line de-dupe and compression software is what gives Pure Storage a significant cost advantage over its rivals. So does using SAS flash and X86 controllers instead of expensive custom ASICs.
As for suggested pricing, Pure Storage has to be careful because it peddles its arrays through its channel partners, who actually set the prices. But an entry FA-405 array, which is aimed at remote offices of large enterprises as well as SMB shops with more modest flash storage needs, would cost under $100,000 for around 10 TB of usable capacity. (The FA-405 puts two controllers using more modestly powered eight-core Ivy Bridge Xeon E5s and 128 GB of memory into a single 2U chassis, and has up to 40 TB of usable capacity.) At the low end of the market, customers are more worried about acquisition cost than cost per gigabyte, but at the high end, cost per gigabyte is the ruling consideration, according to Kixmoeller. A heavily configured FA-450 could run up to $500,000, he says, but a lot depends on the features and the discounts the channel partners give.
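To put those list-price ceilings on a cost per gigabyte footing, here is a quick calculation in Python. It is a sketch under assumptions the article does not state outright: that a "heavily configured" FA-450 means the full 250 TB of usable capacity, that 1 TB equals 1,000 GB, and that the quoted prices are upper bounds before channel discounts. The function name is hypothetical.

```python
# Rough cost-per-usable-gigabyte ceilings from the quoted prices.
# Assumptions: 1 TB = 1,000 GB; a "heavily configured" FA-450 has 250 TB usable.

def cost_per_usable_gb(price_usd: float, usable_tb: float) -> float:
    """Dollars per gigabyte of usable (post-reduction) capacity."""
    return price_usd / (usable_tb * 1_000)


if __name__ == "__main__":
    fa405 = cost_per_usable_gb(100_000, 10)   # entry FA-405, price is "under" this
    fa450 = cost_per_usable_gb(500_000, 250)  # assumed top-end FA-450 configuration
    print(f"FA-405 entry: at most ${fa405:.2f} per usable GB")
    print(f"FA-450 heavy: at most ${fa450:.2f} per usable GB")
```

On these assumptions the entry machine comes in at no more than $10 per usable gigabyte and the big one at no more than $2, which squares with Kixmoeller's point that cost per gigabyte matters most at the high end.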
With over 1,000 arrays sold to date, Pure Storage is definitely a player in the all-flash race in enterprise datacenters. Kixmoeller says that no one customer represents more than 5 percent of its annual revenues, not even marquee customers like LinkedIn, Workday, ServiceNow, and Shutterfly. All of the storage incumbents – NetApp, EMC, IBM, and Cisco Systems – are coming after all of the all-flash upstarts, which include Pure Storage, Violin Memory, SolidFire, Kaminario, Nimbus Data, and a bunch of others.
Tim Stammers, senior storage analyst at 451 Research, estimates that Pure Storage had somewhere between $75 million and $100 million in sales in 2013 and was growing at 50 percent sequentially from quarter to quarter. Pure Storage has raised a whopping $474.9 million in six rounds of venture funding and is backed by Greylock Partners, Sutter Hill Ventures, Redpoint Ventures, Index Ventures, Wellington Management, Tiger Global Management, T. Rowe Price, Samsung Ventures, and Glynn Capital Management. The company currently has a valuation of $3 billion, has more than 550 employees, and more than 200 channel partners. It has a strong belief that over time all of the $15 billion that is currently being spent on disk arrays will move over to flash arrays.
What Pure Storage doesn't have is a server or storage incumbent with deep pockets that wants to acquire it – all the big players have already done their acquisitions – and no one would likely pay the price that Pure Storage's investors would want to get their bait back with some extra fish on the line for their trouble. The company said last month that it was not pressing to do an initial public offering any time soon. The company seemed more eager to do one back in 2012 when it closed its Series E round of funding. Having raised so much money in so many rounds, it will be tough to do a Series G round. The ups and downs of Violin Memory, which went public last September, have made both Wall Street and privately held flash array suppliers a little wary. But the thing to remember is that just because some venture capitalists can't get rich quick does not mean a company or its product is not good. Pure Storage will stand or fall on the merits of its engineering and marketing and the sales of its channel. It has made a compelling case thus far, or it would never have raised as much money as it has. With $15 billion in storage array spending at stake, it will get its slice so long as it keeps moving.