Advanced Computing in the Age of AI | Thursday, March 28, 2024

Examining the PUE Metric 

When the sabermetric revolution hit baseball, predictive statistics like WHIP (walks and hits per inning pitched) and OBP (on-base percentage) emerged. Later, more in-depth metrics like OPS (on-base plus slugging percentage) developed to describe a batter’s efficiency and power at the plate, discounting things like game situation, and sacrifice and base-running ability.

Green computing has similar metrics, some general and some specific. The Green 500 list is ranked on Megaflops/Watt, a statistic that considers the entire supercomputing operation. Like OPS, PUE (Power Usage Effectiveness) measures power and efficiency, with the datacenter as its plate, so to speak.

In its recently released whitepaper examining the benefits and difficulties of measuring PUE, The Green Grid (TGG) noted that one of the larger challenges is defining with precision the energy used to power the IT functions and the energy used by the total facility. For example, an office building housing a data center may have only a single all-encompassing meter that covers the data center as well as offices that are non-essential to it. “In this case,” TGG noted, “the data center administrator could measure and subtract the amount of energy being used by the non-data center offices in order to calculate an accurate PUE.”

Such calculations require a slightly more intricate reporting system. Further, determining which loads are IT-related and which are not is somewhat subjective. The baseball/OPS analogy falls apart here, as the parameters for on-base percentage and slugging percentage are rigidly defined.

However, the purpose of the report is precisely to define those parameters rigidly. The calculation on its surface is relatively simple: divide the total datacenter facility energy consumption by the energy consumed by the IT equipment. The closer the number is to one, the more of the datacenter’s power goes to IT, which is the purpose of the datacenter in the first place.
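As a rough illustration of the ratio described above, together with the shared-meter subtraction TGG mentions, here is a minimal Python sketch. The function names and every reading are hypothetical, chosen only to make the arithmetic visible; none of them come from the TGG whitepaper.

```python
def data_center_energy(building_meter_kwh: float, office_kwh: float) -> float:
    """Subtract non-data-center office load from a shared building meter."""
    if office_kwh > building_meter_kwh:
        raise ValueError("office energy cannot exceed the building total")
    return building_meter_kwh - office_kwh


def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Total data center facility energy divided by IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical monthly readings in kWh (illustrative only).
building_meter_kwh = 200_000.0  # single meter covering the whole building
office_kwh = 40_000.0           # sub-metered non-data-center offices
it_equipment_kwh = 100_000.0    # measured at the PDU outputs

facility_kwh = data_center_energy(building_meter_kwh, office_kwh)
print(f"PUE = {pue(facility_kwh, it_equipment_kwh):.2f}")  # PUE = 1.60
```

Leaving the office load in the total would give 2.00 instead of 1.60, which is the kind of distortion the subtraction is meant to remove.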

According to TGG, there may be a disconnect between reported and actual PUE values. “Although PUE is a simple concept that has gained broad acceptance, correctly and accurately measuring PUE can be challenging,” the group claims. “Many PUE claims seem too good to be true and cause others to wonder if they are incorrectly obtaining their PUE values.”

Distinguishing between the energy that goes to IT and the energy that goes elsewhere can be accomplished with thoughtfully-placed output meters. “IT equipment energy should be measured after all facility power conversion, switching, and conditioning is completed and before the energy is used by the IT equipment itself,” TGG noted. “The most likely measurement point is at the output of the computer room PDUs. This measurement should represent the total energy delivered to the compute equipment racks in the data center.”

That description constitutes what TGG would call a Level Two measurement. To distinguish between a one-time monthly measurement in just a few places and a more sophisticated model built on more continuous measurement at different levels of the datacenter and facility, TGG developed a three-level system.

Again, monthly measurements of this ilk are sufficient to determine PUE. However, TGG believes further energy efficiency insight can be garnered from continuous measurement. For example, large variances between similar data centers can be caught and studied. “PUE is valuable for monitoring changes in a single data center at an aggregated level. It also can help identify large differences in Power Usage Effectiveness among similar data centers, although further investigation is required to understand why such variations exist.”

TGG developed PUE along with its inverse metric, Data Center infrastructure Efficiency (DCiE), in 2007. According to TGG, PUE gained traction in the industry, presumably because it was simpler to compare numbers on a scale from one to infinity than on a scale from zero to one. Since then, the statistic has been used by both the US-based Data Center Metrics Coordination Task Force and the worldwide Global Harmonization of Data Center Efficiency Metrics Task Force.
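Because DCiE is the inverse of PUE, converting between the two is a one-line calculation. The sketch below assumes DCiE is reported as a percentage; the function name and sample PUE values are hypothetical.

```python
def dcie_from_pue(pue_value: float) -> float:
    """Return DCiE as a fraction (IT energy / total facility energy)."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 / pue_value


# Hypothetical PUE values, shown with their DCiE equivalents.
for pue_value in (1.25, 2.0, 3.0):
    print(f"PUE {pue_value:.2f} -> DCiE {dcie_from_pue(pue_value):.1%}")
```

A PUE of 2.0 corresponds to a DCiE of 50 percent, meaning half of the facility’s energy reaches the IT equipment.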

To be eligible to be graded on the PUE scale, a facility must measure its power monthly over the course of a year. However, to obtain more accurate and legitimate results, TGG suggests measuring daily or even multiple times a day to account for changing conditions, including time-dependent data demand as well as variable seasonal and daily cooling requirements.
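One way to turn twelve monthly readings into a single yearly figure is sketched below. It assumes the annual value is the ratio of summed energies rather than an average of monthly ratios; that choice, the function name, and all readings are assumptions for illustration, not TGG’s published procedure.

```python
from typing import Sequence, Tuple


def annual_pue(monthly_readings: Sequence[Tuple[float, float]]) -> float:
    """Annual PUE from 12 (total_facility_kwh, it_equipment_kwh) pairs."""
    if len(monthly_readings) != 12:
        raise ValueError("expected one reading per month for a full year")
    total_kwh = sum(facility for facility, _ in monthly_readings)
    it_kwh = sum(it for _, it in monthly_readings)
    return total_kwh / it_kwh


# Hypothetical year: constant IT load, with cooling demand peaking in summer.
facility_kwh = [150, 150, 155, 160, 170, 180, 190, 190, 175, 160, 150, 150]
readings = [(f * 1000.0, 100_000.0) for f in facility_kwh]

print(f"Annual PUE = {annual_pue(readings):.2f}")  # Annual PUE = 1.65
```

In this made-up year the monthly ratio ranges from 1.50 to 1.90, which shows why a single reading taken in a mild month could flatter the result.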

With that said, the point of PUE, as TGG put it, is to increase awareness of one’s own datacenter efficiency rather than to win on a statistical measure. “The common goal is to reduce energy usage, not manipulate a metric,” TGG argued. Just as a baseball agent may manipulate his player’s statistics to gain an advantage in the negotiating room, some organizations manipulate their supercomputers to gain recognition and press from the Green 500 list.

Since PUE has yet to become a tool for ranking machines and centers with the acclaim that Megaflops/Watt has, manipulation of this metric is less likely.

Calculating PUE

Indeed, for the most part, TGG is wary of using PUE to compare and contrast data facilities. However, the natural proclivity to do so has prompted this effort to spell out the differences that must be taken into account. For example, data centers come in many flavors, differing in the type of processing they do, the amount of time they have been operational, and even their location and surrounding climate.

“In some real-world situations, the PUE metric may go up if the total energy provided to a data center is not adjusted accordingly to match a drop in IT energy. It is important to remember to reduce the infrastructure subcomponent energy consumption.” Those subcomponents can be properly evaluated if a facility implements Level Two or Three measurement, as denoted by the figure below.
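The effect TGG describes can be seen with a small numerical sketch. The split into IT and infrastructure energy, the function name, and the figures are all hypothetical.

```python
def pue_from_components(it_kwh: float, infrastructure_kwh: float) -> float:
    """PUE when total facility energy is IT plus infrastructure energy."""
    return (it_kwh + infrastructure_kwh) / it_kwh


# Hypothetical baseline: 100 MWh of IT load, 60 MWh of cooling and power losses.
baseline = pue_from_components(100_000.0, 60_000.0)

# IT energy drops 20 percent, but the infrastructure keeps running as before.
it_only_cut = pue_from_components(80_000.0, 60_000.0)

# Infrastructure energy is reduced by the same 20 percent.
both_cut = pue_from_components(80_000.0, 48_000.0)

print(f"baseline:     {baseline:.2f}")     # 1.60
print(f"IT cut only:  {it_only_cut:.2f}")  # 1.75
print(f"both reduced: {both_cut:.2f}")     # 1.60
```

Total energy falls in the second case, yet the PUE gets worse, which is why the metric alone does not indicate whether a facility is using less energy.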

While implementing these monitoring procedures would cost more upfront, they could save datacenter operators money by showing which components are not performing up to efficiency standards. Varying power levels detected by Level Two or Three sensors could also help detect runtime and processing problems in the IT infrastructure. “In cases where continuous real-time monitoring is not practical or economically justifiable, some form of repeatable, defined process should be in place to capture PUE as often as possible for comparison purposes.”

According to TGG, a comprehensive list of datacenter PUE values has yet to be compiled. However, the group believes that while many centers have ratings of 3 or higher, 1.6 should be possible. Indeed, research done by Lawrence Berkeley National Laboratory found that 22 datacenters had values from 1.3 to 3.0.

Below is a representation of a calculation done by TGG to help put those values of 1.3 and 3.0 in context.

First of all, not all energy is treated equally. Purchased electricity is counted at face value, while electricity generated internally, whether it’s bought or manufactured, carries a lower weight. Natural gas holds a coefficient of 0.35 in this calculation. When combined with the electricity bought, the IT energy coefficient comes to 0.90.

The total facility calculation is pretty simple once the weighting factors are applied. That total is divided by the adjusted IT energy to get a Power Usage Effectiveness of 1.57.
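The general shape of such a weighted calculation can be sketched in Python. Only the 0.35 natural gas coefficient and the 0.90 IT coefficient come from the description above. The weight of 1.0 for purchased electricity, the way the IT coefficient is applied, and all energy quantities are assumptions, so the sketch does not reproduce TGG’s 1.57 result.

```python
from typing import Dict

# 0.35 for natural gas is quoted above; 1.0 for purchased electricity
# is an assumption standing in for "counted at face value".
SOURCE_WEIGHTS: Dict[str, float] = {
    "purchased_electricity": 1.0,
    "natural_gas": 0.35,
}


def weighted_energy(energy_by_source: Dict[str, float]) -> float:
    """Sum facility energy after applying each source's weighting factor."""
    return sum(SOURCE_WEIGHTS[src] * kwh for src, kwh in energy_by_source.items())


def weighted_pue(energy_by_source: Dict[str, float],
                 it_kwh: float, it_coefficient: float) -> float:
    """Weighted facility energy divided by IT energy scaled by its coefficient."""
    return weighted_energy(energy_by_source) / (it_kwh * it_coefficient)


# Hypothetical inputs in kWh (illustrative only, not TGG's figures).
facility = {"purchased_electricity": 120_000.0, "natural_gas": 100_000.0}
result = weighted_pue(facility, it_kwh=100_000.0, it_coefficient=0.90)
print(f"Weighted PUE = {result:.2f}")  # Weighted PUE = 1.72
```

Keeping the weights in one table makes it easy to swap in a different set of coefficients when a different region’s factors apply.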

From that, PUE seems like the same simple addition and division problem that calculating OPS is. However, those energy type modifiers, which differ globally from those used in the United States, complicate the situation. So too do methods of cooling the datacenter. “To further complicate PUE calculation, some cooling technologies integrate cooling elements such as pumps, refrigeration, blowers, and heat exchangers within the IT equipment itself. These technologies blur what has traditionally been a clear delineation between facility equipment and IT equipment. However, equipment used to provide power and cooling to the data center must be accounted.”

Despite these challenges, calculating datacenter efficiency is important. Collecting the appropriate measurements at the correct intervals can give facilities the ability to crack down on unnecessary energy loss, and PUE just may be the metric to help do that.
