Advanced Computing in the Age of AI | Wednesday, April 24, 2024

Oracle Stresses System Performance and Scalability 

As Oracle Team USA's dramatic come-from-behind win over Emirates Team New Zealand in the America's Cup yacht race aptly demonstrates, Oracle's co-founder and CEO, Larry Ellison, believes in extreme performance. Ellison pursued the acquisition of Sun Microsystems three years ago to be able to build systems that made databases, analytics, and other enterprise applications scream. While the systems business has struggled to grow, Ellison is showing patience and cannot fault the design in its so-called "engineered systems."

That is Oracle's term for machines that are optimized from the farthest-flung disk drive in the system up through the application software residing far above the iron to deliver the best performance possible on Oracle's own system and application software. At the OpenWorld customer and partner conference hosted this week by Oracle in San Francisco (not coincidentally when and where the America's Cup race was held), John Fowler, executive vice president of systems at the company, talked about Oracle's philosophy with systems and its success with large customers.

Fowler also showed off the company's new M6-32 shared memory monster system. Details on this machine were not available when EnterpriseTech went to press on Sunday night, when OpenWorld kicked off.

What Oracle has focused on since taking over a hardware business is performance. "This has been one of Larry's mantras right from the beginning," Fowler said. "How do I build the fastest microprocessor? How do I build the fastest storage? How do I build the fastest servers?"

Performance is not just about tackling large scale problems, but also about making systems more efficient so, ironically, you can use fewer of them to do a certain amount of work. But even this is enabled by increases in raw performance. "We have no engineering activities to generate products that are 30, or 40, or 50 percent faster than what you could do yourself," said Fowler. "We have many discussions on whether we should do this or that, and it always comes back to: What is the performance gain? We believe for you to adopt an engineered system and incorporate it into your infrastructure, you have to see a very significant capability gap because then it renders into the economics of being able to reduce your infrastructure."

Because the thousands of engineered systems that Oracle has installed in customer sites are identical to the ones that Oracle patches and tests in its labs every night, when it learns a trick about tuning performance for one customer, it is then available to others. This is a virtuous feedback loop that you cannot get so easily from diverse hardware. And this message, said Fowler, is why Oracle has sold over 2,000 of its engineered systems, which are based on both Xeon and Sparc server platforms, in the past two quarters.

"We actually deliver on the promise of extreme performance and extreme efficiency," he said. "We now have customers with tens and fifties of engineered systems, and they have become a standard part of their core infrastructure."

One such customer is the PayPal online payment unit of eBay. PayPal had a series of queries it needed to run as part of an online transaction, and it wanted to be able to do those queries in under 100 milliseconds, which is about the limit of patience for a human being these days. And it went with the Exadata database cluster because the combination of hybrid columnar data store and compression on the storage arrays, the flash storage that is in the server nodes, the fast InfiniBand networking that links the storage arrays and the servers, and the tuned-up Oracle database running on those servers, allowed it to hit that performance target. Compared to the prior platform in use by PayPal (Fowler did not identify it), PayPal was able to boost transaction volumes by a factor of two and queries across petabytes of data by a factor of ten to meet that 100 millisecond goal.

This was before the M6-32 "big memory" server came to market, which might have done better. This big memory machine, as Oracle calls it, combined with in-memory features of the new Oracle 12c database, is set to shake up the high end of the server racket and maybe even sales of clusters for databases and application servers that underpin online transaction processing, analytics, and other workloads.

The Relentless Pursuit of Performance

Fowler introduced the Sparc M6 processor and the new M6-32 system, with 32 sockets and 32 TB of main memory, that makes use of it. And he made fun of such an old idea as a big shared memory system, which like many in the IT industry he called SMP, short for symmetric multiprocessing, even though it is really a NUMA box, with non-uniform access to memory across processors in the machine. Fowler ad-libbed the reaction of a hypothetical potential customer for the M6-32:

"Large scale SMP? Isn't that sort of counter-culture? What are you guys doing? I am building all of my infrastructure out of smaller systems."

What Oracle is doing, explained Fowler, is scaling up systems and flat-lining the cost per unit of performance, instead of giving customers a "bad deal" as has been the case with shared memory systems in the past.

The chart below takes aim at IBM's Power Systems machines because they are the leader in Unix-based shared memory systems (as Sun Microsystems used to be a decade ago), but obviously Silicon Graphics sells very large Xeon E5-based UV 2 systems that can have up to 128 sockets and 64 TB of memory, besting both IBM's Power 795 and Oracle's M6-32. SGI's UV 2 machines support Windows Server or Linux; its own Irix version of Unix was mothballed back in the mid-2000s.

oracle-vs-ibm-numa

"We have a near-linear pricing model, and this means we don't have a prohibitive rent for doing large scale systems in terms of the oomph per dollar. We have a near linear scale, which means that you can simplify your infrastructure by using larger scale systems like Exadata, M6, M6 SuperCluster, and others by using a much smaller number of systems without any prohibitive price premium. We want to give you the choice of having a wide horizontal scale or choosing large scale systems."
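To make the "near-linear" claim concrete, here is a minimal sketch of what it means for the price per socket to stay flat as a machine scales up. The pricing formula, the function names, and the exponent values are all hypothetical and chosen only for illustration; they are not Oracle or IBM list prices.

```python
# Hypothetical pricing model, for illustration only (not vendor data):
#   system_price = base_price * sockets ** exponent
# exponent == 1.0 models linear pricing; exponent > 1.0 models a premium
# that grows as the system gets larger.

def system_price(sockets: int, base_price: float, exponent: float) -> float:
    """Price of a system with the given socket count under the toy model."""
    return base_price * sockets ** exponent


def price_per_socket(sockets: int, base_price: float, exponent: float) -> float:
    """Cost per socket, a rough stand-in for cost per unit of performance."""
    return system_price(sockets, base_price, exponent) / sockets


if __name__ == "__main__":
    for sockets in (4, 8, 16, 32):
        linear = price_per_socket(sockets, base_price=1.0, exponent=1.0)
        premium = price_per_socket(sockets, base_price=1.0, exponent=1.2)
        print(f"{sockets:2d} sockets: linear={linear:.2f} premium={premium:.2f}")
```

Under the linear case the per-socket figure does not change with system size, which is the "flat-lining" Fowler describes; under the premium case it rises with every step up in socket count.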

Oracle also wants to have the fastest processors it can make, and here's a funny glimpse into what it is like to work for Larry Ellison. "Larry said let's at least double performance every generation," Fowler said in storyteller mode. "And I said, 'Larry, that's kinda hard.' And he looked at me and said, 'Let's double it every generation.' So he only listens to what he wants to listen to and it didn't really work. I tried that three times and then went back to my office. But we are focused on doubling every generation because we believe customers don't get value out of small, incremental improvements. What you want to see is a huge opportunity to tackle a new problem or do infrastructure reduction."

oracle-sparc-roadmap

Fowler said that the Sparc M7 chip, which will offer a "huge" boost in performance over the M6 chip, is already in the labs today being tested, and there will be a Sparc T7 kicker as well. While this picture doesn't show it, the Sparc M8 and T8 are on the roadmap. (They only flashed up on the presentation for a second before we could get the screen capture.) "We are very aggressive about the performance targets we set for our systems."

The M6 launched this week marks the third Sparc chip that Oracle has rolled out this year. (The sixteen-core Sparc T5 for entry and midrange machines and the six-core Sparc M5 for high-end servers were launched in March.) The three chips, as well as the Sparc T4 that preceded them, are all based on Oracle's "S3" core, and each chip has a different mix of cores and L3 caches to meet different performance goals.

The M6 chip has twelve cores and 48 MB of L3 cache and runs at 3.6 GHz, just like the T5 and M5 chips. The S3 cores each have 16KB of L1 data cache, 16KB of L1 instruction cache, and 128KB of L2 cache. The T4, T5, M5, and M6 chips all also have on-chip NUMA clustering, with the Sparc M series having a much more sophisticated interconnect code-named "Bixby" that can, in its first iteration, scale up to 96 sockets and 96 TB of main memory. So presumably, at some point, there will be an M6-64 and an M6-96 system put into the field by Oracle.

oracle-sparc-m6-bixby-interconnect

To round out the feeds and speeds of the M6-32 that debuted earlier this week, here are the details that were missing at the time. The M6 chip has one cryptographic unit and one floating point unit per core, plus on-chip accelerators for popular hashing algorithms and to generate random numbers. The system uses 16 GB or 32 GB DDR3 memory sticks running at 1.07 GHz, and in theory could support 64 GB sticks but even Larry Ellison doesn't have that kind of money to throw around. The system has 32 10Gb/sec Ethernet ports and 64 PCI-Express 3.0 x8 peripheral slots, and has room for 32 SAS disks in a 2.5-inch form factor that come in a 600 GB capacity. Customers can also use 2.5-inch SAS solid state drives with 300 GB capacity.
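The arithmetic behind those figures is easy to check. The sketch below uses only numbers quoted in this article (32 sockets, twelve cores per chip, 32 TB of main memory, 32 GB sticks); the constant and function names are ours, and it assumes binary units (1 TB = 1,024 GB) and that the 32 TB maximum is reached with 32 GB sticks, which Oracle has not spelled out here.

```python
# Figures quoted in the article for the M6-32.
SOCKETS = 32
CORES_PER_SOCKET = 12
MAX_MEMORY_TB = 32

# Assumption: binary units, 1 TB = 1,024 GB.
GB_PER_TB = 1024


def total_cores(sockets: int = SOCKETS, cores_per_socket: int = CORES_PER_SOCKET) -> int:
    """Total cores in a fully populated system."""
    return sockets * cores_per_socket


def memory_per_socket_gb(memory_tb: int = MAX_MEMORY_TB, sockets: int = SOCKETS) -> float:
    """Main memory attached to each socket, in GB."""
    return memory_tb * GB_PER_TB / sockets


def sticks_needed(memory_tb: int, stick_gb: int) -> int:
    """Memory sticks of a given size needed to reach memory_tb (rounded up)."""
    total_gb = memory_tb * GB_PER_TB
    return -(-total_gb // stick_gb)  # ceiling division on integers


if __name__ == "__main__":
    sticks = sticks_needed(MAX_MEMORY_TB, 32)
    print("cores:", total_cores())
    print("GB per socket:", memory_per_socket_gb())
    print("32 GB sticks for 32 TB:", sticks)
    print("sticks per socket:", sticks // SOCKETS)
```

Under those assumptions, a full machine works out to 384 cores, 1 TB of memory per socket, and 1,024 sticks of 32 GB each, or 32 sticks per socket.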

The system supports Solaris 11.1, can run the earlier Solaris 10 1/13 in a guest domain, and can support older Solaris 8 and Solaris 9 applications inside Solaris containers inside of the Solaris 10 guests. Oracle's Solaris middleware, database, and application software stack is certified to run on top of the M6-32, and so is IBM's WebSphere middleware and DB2 database, as is SAP's Sybase IQ. Pricing has not been listed on the Oracle web site, but Ellison suggested in his keynote on Sunday that a configured system costs $3 million.
