Advanced Computing in the Age of AI | Thursday, March 28, 2024

Some Like IT Cold: Intelligence Agencies Push for Low-Power Exascale 

Power and cooling challenges extend across the IT spectrum, but they are especially prominent at large-scale computing installations. As the HPC community gets closer to breaking the exascale barrier, which means building a system roughly 30 times faster than today's fastest documented machine, power presents a major roadblock.

The current fastest documented system is located in China, but not all government agencies wish to make their systems public. Take the NSA, for example. For many years, its very existence was denied, with the acronym winkingly referred to as "No Such Agency."

Like the rest of the research world, intelligence agencies are facing some difficult challenges when it comes to fielding the next generation of supercomputers. To help usher in a more efficient mode of computing, the US intelligence community is investing in superconductive computing research.

Earlier this year, the Intelligence Advanced Research Projects Activity (IARPA) launched its Cryogenic Computing Complexity (C3) Program "to establish superconducting computing as a long-term solution to the power-space-cooling problem and as a successor to end-of-roadmap complementary metal-oxide-semiconductor (CMOS) for high performance computing."

The agency points out that today's best-in-class systems are about to hit a power wall. The largest US supercomputer, Titan, installed at Oak Ridge National Laboratory, requires 8.2 MW to reach 17.59 petaflops. The world's fastest system, China's 33.86-petaflop Tianhe-2, has a peak power load of 17.8 MW and uses 24 MW when cooling is added.

An exascale computer created with today's best technology would require several hundred megawatts of power, a far cry from the 20 MW goal set by DARPA, the government's military R&D body. Such a high demand would require the system to be built with its own utility-scale power plant, and it would cost more than a billion dollars a year to power.
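The "several hundred megawatts" figure can be sanity-checked with a back-of-the-envelope extrapolation. The sketch below is illustrative only: it assumes power scales linearly with performance, which real machines would not follow exactly, and it uses the Titan and Tianhe-2 figures quoted above (compute load only, excluding cooling). The function and constant names are hypothetical.

```python
# Figures quoted in the article (compute power only, cooling excluded).
SYSTEMS = {
    "Titan": {"petaflops": 17.59, "megawatts": 8.2},
    "Tianhe-2": {"petaflops": 33.86, "megawatts": 17.8},
}

EXAFLOP_IN_PETAFLOPS = 1000.0  # 1 exaflop = 1,000 petaflops
DARPA_TARGET_MW = 20.0         # exascale power goal cited in the article


def exascale_power_mw(petaflops, megawatts):
    """Scale a system's power draw linearly up to one exaflop (an assumption)."""
    return megawatts * EXAFLOP_IN_PETAFLOPS / petaflops


def exascale_estimates():
    """Return the extrapolated exascale power in MW for each quoted system."""
    return {
        name: exascale_power_mw(spec["petaflops"], spec["megawatts"])
        for name, spec in SYSTEMS.items()
    }


if __name__ == "__main__":
    for name, mw in exascale_estimates().items():
        ratio = mw / DARPA_TARGET_MW
        print(f"{name}: about {mw:.0f} MW at exascale, {ratio:.0f}x the 20 MW goal")
```

Under that linear assumption, both systems land in the range of roughly 450 to 550 MW, which is consistent with the article's "several hundred megawatts."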

Just as the largest end of computing is hitting limits of scale, so is the tiniest element in computing, the transistor. The IARPA document notes that "conventional computing systems, which are based on complementary metal-oxide-semiconductor (CMOS) switching devices and normal metal interconnects, appear to have no path to be able to increase energy efficiency fast enough to keep up with increasing demands for computation."

Thanks to semiconductor advances described by Moore's Law, performance has been improving roughly 1,000-fold every 10 years. But transistors can only shrink so far before encountering a fundamental limit. As feature sizes come up against the atomic scale, performance improvements slow down. As systems approach exascale, the energy demand is not sustainable. Stacking silicon will buy some time, but it's not the ultimate solution.
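To put the 1,000-fold-per-decade figure in more familiar terms, the following sketch converts it into an equivalent doubling time and yearly growth rate. It assumes smooth exponential growth over the whole period, which is a simplification, and the function names are hypothetical.

```python
import math


def doubling_time_years(growth_factor, period_years):
    """Years needed to double, assuming steady exponential growth."""
    return period_years * math.log(2) / math.log(growth_factor)


def annual_growth_factor(growth_factor, period_years):
    """Equivalent year-over-year multiplier for the same growth."""
    return growth_factor ** (1.0 / period_years)


if __name__ == "__main__":
    doubling = doubling_time_years(1000, 10)
    yearly = annual_growth_factor(1000, 10)
    print(f"Doubling time: {doubling:.2f} years")
    print(f"Yearly growth: {yearly:.2f}x")
```

A 1,000-fold gain in 10 years works out to performance roughly doubling every year.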

Researchers are exploring a path beyond CMOS, and superconductive switches could hold the key. Superconductivity refers to the phenomenon of a material having zero electrical resistance below a certain temperature, called the "critical temperature." This provides the setting for information to be transmitted with very little energy loss.

The government believes that "superconducting computing offers an attractive low-power alternative to CMOS with many potential advantages." One look at the potential FLOPS-per-watt profile and it's easy to understand the draw. According to published studies, superconducting technology sets the stage for one-petaflop systems that use just 25 kW and 100-petaflop systems that operate at 200 kW, including the necessary cryogenic cooler.

The most energy-efficient system today, Italy's Eurora supercomputer, operates at 3.20 gigaflops/watt. Compare that to the projected cryogenic one-petaflop system at 40 gigaflops/watt, or the projected 100-petaflop system, more efficient still, at 500 gigaflops/watt. Those are the kinds of FLOPS-per-watt metrics that will sustain computing advances through exascale and beyond.
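The efficiency figures follow directly from the performance and power numbers quoted earlier. The short sketch below reproduces that arithmetic and compares each result with Eurora's 3.20 gigaflops/watt. The system labels and function names are hypothetical, and the Titan and Tianhe-2 rows use the compute-only power figures from the article.

```python
EURORA_GFLOPS_PER_WATT = 3.20  # efficiency figure quoted in the article


def gigaflops_per_watt(petaflops, kilowatts):
    """Convert a performance/power pair into gigaflops per watt."""
    gigaflops = petaflops * 1e6  # 1 petaflop = 1,000,000 gigaflops
    watts = kilowatts * 1e3      # 1 kW = 1,000 W
    return gigaflops / watts


# (label, petaflops, kilowatts) using numbers quoted in the article
SYSTEMS = [
    ("Titan", 17.59, 8200.0),
    ("Tianhe-2", 33.86, 17800.0),
    ("Projected cryogenic 1 PF", 1.0, 25.0),
    ("Projected cryogenic 100 PF", 100.0, 200.0),
]

if __name__ == "__main__":
    for label, pflops, kw in SYSTEMS:
        eff = gigaflops_per_watt(pflops, kw)
        vs_eurora = eff / EURORA_GFLOPS_PER_WATT
        print(f"{label}: {eff:.2f} GF/W ({vs_eurora:.1f}x Eurora)")
```

By this arithmetic, the projected one-petaflop and 100-petaflop cryogenic systems would be about 12.5 and 156 times as efficient as Eurora, respectively.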

IARPA envisions a two-phase project spanning five years. Phase one, lasting three years, will focus on developing the technologies required to demonstrate the value of superconducting computing. Phase two will integrate those technologies into a "small-scale computer based on superconducting logic and cryogenic memory that is energy-efficient, scalable, and able to solve interesting problems."

The government sees a lack of current research into superconductive computing. Despite the promise of the technology, there are some significant challenges, which the C3 program will address, including "insufficient memory, insufficient integration density, and no realization of complete computing systems."

If all goes as planned, "the success of C3 will pave the way to a new generation of superconducting computers that are far more energy efficient than end-of-roadmap CMOS and scalable to practical application."

EnterpriseAI