
Next Generation SGI ICE X Scale-Out Bladed HPC Cluster Introduced at SC11 

SGI has gotten a lot of mileage out of its ICE platform since first introducing it in 2007.

With today’s announcement at SC11, the company is upping the ante by introducing its fifth generation of these popular HPC clusters – the SGI ICE X, code named “Carlsbad 3.0.”

The new system features a number of improvements over previous generations, including enhanced cooling options at the node level and a higher-density form factor through a unique blade design that supports up to 2,304 processor cores per rack. The ICE X system also supports independently scalable power at the enclosure level, along with Fourteen Data Rate (FDR) 56Gb/second InfiniBand, including options for single-port, dual-port, and dual single-port Mellanox ConnectX-3 InfiniBand HCA mezzanine cards. According to SGI, it is the industry's first InfiniBand-connected, pure-compute CPU cluster to exceed one petaflop.

The platform leverages the performance of the future Intel Xeon processor E5 family.

Because it is built on industry standard hardware and software components, ICE X enables access to the entire Linux ecosystem, including the completely unmodified SUSE Linux Enterprise Server or Red Hat Enterprise Linux operating systems.

Scalability and flexibility are key. The ICE X platform scales from half a rack to hundreds of racks and from tens of teraflops to tens of petaflops without breaking a sweat. It supports a variety of topologies, such as hypercube, enhanced hypercube, all-to-all and fat-tree, so each deployment can be tailored to the customer's specific needs.

Here’s what SGI CEO Mark J. Barrenechea had to say in a press release issued today: “This is our first fully choreographed engineering cycle for the new SGI with Intel, bringing Intel’s next generation Romley architecture to market. We expect to extend our share in the large-scale cluster market significantly with the new SGI ICE X, as it is designed for scale, speed and density.”

No Interruptions, Please

ICE X continues the tradition of live integration into an existing HPC system without requiring that the host system go offline. For example, last year SGI integrated a 512-core ICE rack into NASA Ames' Pleiades HPC system while Pleiades was running a full production workload. This feat was accomplished by connecting the new rack's dual-port InfiniBand fabric via 44 fibre cables.

According to NASA, the live integration saved two million hours of productivity that had previously been lost each time a planned outage occurred. This rather staggering number is the result of a cascade of events. When outages are planned, users get one week's notice that the system is going down. Inevitably, system utilization plummets about three days before the actual shutdown. After all, why start a batch job, many of which run five days or more, if it can't be finished by the time of the outage?
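The cascade described above can be put into rough arithmetic. The following is a minimal sketch, not NASA's actual accounting: the function name, the utilization figures, the core count and the outage length are all hypothetical and chosen only to show the shape of the calculation. Only the three-day (72-hour) ramp-down comes from the article.

```python
def lost_core_hours(cores, normal_util, rampdown_util, rampdown_hours, outage_hours):
    """Estimate core-hours lost to one planned outage.

    Two terms are added together:
      * capacity left idle during the ramp-down, when users stop
        submitting long jobs (the drop from normal_util to rampdown_util)
      * capacity that would normally be used during the outage itself
    """
    rampdown_loss = cores * (normal_util - rampdown_util) * rampdown_hours
    outage_loss = cores * normal_util * outage_hours
    return rampdown_loss + outage_loss


# Hypothetical inputs (assumptions, not figures from NASA or SGI):
# a 10,000-core system that normally runs at 90% utilization, drops to
# 40% for the 72 hours before shutdown, and is then down for 24 hours.
example = lost_core_hours(
    cores=10_000,
    normal_util=0.9,
    rampdown_util=0.4,
    rampdown_hours=72,
    outage_hours=24,
)
print(f"{example:,.0f} core-hours lost")
```

With these made-up inputs the ramp-down costs more than the outage itself (about 360,000 of the roughly 576,000 core-hours), which is the point of the cascade: much of the loss happens before the system is ever switched off.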

Here's where that flexible network topology comes in handy. SGI's hypercube-based InfiniBand network topologies allow customers to add not only nodes and switches, but also whole racks of nodes and switches, without disrupting the existing production load. ICE X continues this important tradition.
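One way to see why a hypercube lends itself to this kind of growth is to look at the textbook binary hypercube, in which two vertices are linked when their IDs differ in exactly one bit. The sketch below is illustrative only: it is not SGI's wiring scheme, it does not model the "enhanced hypercube" variant, and the function names are made up for this example. The vertices could stand for switches rather than individual nodes.

```python
def hypercube_neighbors(node, dim):
    """Return the IDs directly linked to `node` in a dim-dimensional hypercube.

    Flipping each of the `dim` bits in turn gives each neighbor.
    """
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_links(dim):
    """Return every link in a dim-dimensional hypercube as (low_id, high_id) pairs."""
    links = set()
    for node in range(1 << dim):
        for peer in hypercube_neighbors(node, dim):
            links.add((min(node, peer), max(node, peer)))
    return links


# Grow an 8-vertex cube (dimension 3) into a 16-vertex cube (dimension 4).
old_links = hypercube_links(3)
new_links = hypercube_links(4)

# Every existing link survives the expansion untouched.
assert old_links <= new_links

# The expansion only adds links: those among the new vertices, plus one
# new link from each existing vertex to its counterpart in the new half.
added_links = new_links - old_links
print(len(old_links), len(new_links), len(added_links))
```

In this idealized model, doubling the cube never rewires an existing connection; each existing vertex simply gains one new link. That property is what makes it plausible to cable in additional hardware while the original fabric keeps carrying traffic.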

Global Customers

In addition to NASA Ames, SGI has a long list of customers using its ICE systems. For example, here's some feedback from Scandinavia: “Norwegian University of Science and Technology (NTNU) and the Norwegian Meteorological Institute (met.no) have worked together for over 20 years on high performance computing systems for research and numerical weather prediction,” said Roar Skalin, Director of Information Technology at met.no. “With the newest SGI ICE X, the compute resources available for our numerical weather prediction will increase by a factor of 20 without increasing space in the data center.”

There are many other large system customers around the world using the platform – organizations like Tokyo University, Skoda Auto, Idaho National Laboratory, Sikorsky, the Korean Air Force, McLaren Motor Racing, Oak Ridge National Lab, and the U.S. Army, to name just a few. 

For many of these customers, SGI characterizes ICE as a computational fluid dynamics (CFD) workhorse.

In fact, according to an SGI slide presentation, the platform is the world's fastest and most scalable computational fluid dynamics system, a claim backed up by these stats: the SGI ICE 8400 demonstrated unmatched parallel scaling up to 3,072 cores with a rating of 1,333.3 standard benchmark jobs per day, and it also proved able to run ANSYS FLUENT on all 4,092 cores. To date, no other cluster has reported ANSYS FLUENT benchmark results above 2,048 cores.

All this has implications for the upper end of the so-called “missing middle,” the medium-sized manufacturers that may already have some CFD capability in-house but lack the computing infrastructure to take full advantage of this modeling and simulation technology. A big user, NASA, is setting an example for the smaller companies by making computational capabilities available to its engineers through the cloud. However, the pricing and scalability of the ICE platform allow users who want their own in-house system to start with a small footprint and then, if business booms, grow it to the largest petascale system on the planet if need be, fostering what one SGI spokesperson labeled “the democratization of petaflops.”

SGI is exhibiting at SC11 in Seattle, November 14-17, 2011, in Booth 1841 at the Seattle Convention Center.
