Data Centers and HPC: the Energy Challenge

If you were asked to list industries whose carbon footprint contributes dramatically to global warming and represents a threat to our planet’s delicate climate balance, you would probably not think of IT as one of the top “energy hogs”.

You would be wrong. IT is rapidly climbing in this not-so-virtuous ranking, with a double-digit growth rate in energy consumption that dwarfs the transportation industry’s 1% per year. According to the industry consortium GreenTouch, the IT industry today accounts for roughly 2% of the world’s total energy consumption, comparable to the airline industry.

The fantastic computing power that technology is making available to businesses and to strategic fields such as science, applied research and industrial R&D is enabling progress, ultimately contributing to the greater good of humankind in ways rarely seen before. Yet the desire for even faster progress is boosting the need for ever-increasing computing power and for access to huge amounts of data, further accelerating this segment’s demand for energy.

Data centers are responsible for a large share of the energy consumed by IT. According to the U.S. Environmental Protection Agency (EPA), the amount of energy consumed by data centers doubled between 2000 and 2006 alone. The trend slowed in 2007 amid the economic crisis and improved data center efficiency, then began accelerating again in the last two years. A recent report from the Natural Resources Defense Council (NRDC), which highlights waste and inefficiency in U.S. data centers, estimates that their electricity consumption, a massive 91 bn kWh in 2013, will rise to 140 bn kWh per year by 2020, the equivalent of the annual output of 50 large (500 megawatt) power plants.
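The NRDC figures can be sanity-checked with a few lines of arithmetic. The sketch below is an illustration only: the function names are hypothetical, and it assumes the plants would run continuously for 8,760 hours a year, which is an assumption made here rather than something the report states.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year, continuous operation


def annual_growth_rate(start_kwh: float, end_kwh: float, years: int) -> float:
    """Compound annual growth rate implied by two consumption figures."""
    return (end_kwh / start_kwh) ** (1 / years) - 1


def implied_capacity_factor(energy_kwh: float, plants: int, plant_mw: float) -> float:
    """Fraction of full-load output the plants must deliver to supply energy_kwh."""
    full_load_kwh = plants * plant_mw * 1000 * HOURS_PER_YEAR  # MW -> kW, then kWh
    return energy_kwh / full_load_kwh


# Figures quoted in the text: 91 bn kWh (2013) and 140 bn kWh (2020).
growth = annual_growth_rate(91e9, 140e9, 2020 - 2013)
load = implied_capacity_factor(140e9, plants=50, plant_mw=500)

print(f"Implied growth: {growth:.1%} per year")
print(f"Implied average plant load: {load:.0%}")
```

Under these assumptions the projection corresponds to growth of a little over 6% per year, and to the 50 plants running at roughly two thirds of their rated output on average.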

This has become largely unsustainable.

In an ideal situation, IT equipment would use 100% of the energy consumed by a data center. Unfortunately, reality is different: IT equipment typically accounts for between 30% and 60% of the total energy consumed by the whole data center. The parameter most commonly used to measure data center energy efficiency is PUE (power usage effectiveness), the ratio of the total energy consumed by the data center to the energy consumed by its IT equipment. The closer PUE is to 1 the better: a PUE of 2.5 means that for every 2.5 Watts consumed by the data center, 1 Watt is used by the IT equipment and 1.5 Watts go to cooling or other non-essential activities. A PUE of 1 means that 100% of the energy used by the data center goes into IT equipment.
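PUE is defined as total facility energy divided by IT equipment energy, so the share of energy that actually reaches the IT equipment is simply 1 / PUE. A minimal sketch of that relationship follows; the function names are hypothetical and chosen only for illustration.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


def it_share(pue_value: float) -> float:
    """Fraction of total facility energy that reaches the IT equipment."""
    if pue_value < 1:
        raise ValueError("PUE cannot be lower than 1")
    return 1 / pue_value


def overhead_per_it_watt(pue_value: float) -> float:
    """Watts spent on cooling, power conversion, etc. per Watt of IT load."""
    if pue_value < 1:
        raise ValueError("PUE cannot be lower than 1")
    return pue_value - 1


example = pue(total_facility_kwh=2.5, it_equipment_kwh=1.0)
print(f"PUE {example}: IT share {it_share(example):.0%}, "
      f"overhead {overhead_per_it_watt(example)} W per IT Watt")
```

Read the other way round, an IT share of 30% to 60% corresponds to a PUE between roughly 3.3 and 1.7.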

The Uptime Institute’s 2014 annual data center survey reveals that power usage effectiveness (PUE) has plateaued at around 1.7 after several years of steady improvement.

The main reasons for this low efficiency are the energy wasted in the electrical conversion needed to power equipment (transformers, rectifiers, UPS) and the energy used to cool IT equipment through chillers and computer room air conditioning (CRAC) units.

Many technologies are being used to improve energy efficiency in data centers: virtualization, hot and cold aisle containment, a wider thermal envelope, air flow optimization and DCIM (Data Center Infrastructure Management). These are all low-hanging fruit that have increased efficiency, but they are not enough to reach sustainable levels.

HPC (High Performance Computing) has long seen energy cost and availability as the biggest challenges for future developments. It is no surprise that the HPC segment is currently adopting the most advanced solutions for energy efficiency, aiming to reduce the consumption of both IT equipment and data center infrastructure, as well as to reuse the thermal energy that servers produce.

Designing more energy-efficient systems means taking an approach where efficiency comes first. This implies making HW and SW design choices that maximize performance within a target power budget, leveraging heterogeneous architectures, accelerators, solid state disks and fanless liquid-cooled systems, and in general always choosing the components that deliver the best efficiency.

Wherever possible, the goal must be to achieve “free cooling”, meaning the IT equipment should be cooled without using additional energy, for instance by eliminating chillers, thus pushing the data center PUE down to levels around 1.05, very close to the ideal value of 1. Free cooling is only feasible when the coolant has a temperature higher than the external air temperature. If the outside temperature is very low, for instance at high latitudes or elevations, the coolant may be air. In all other cases it has to be a liquid, typically water, warm enough to be cooled with outdoor air even in hot seasons. The only way to use warm water to cool IT equipment is to bring it as close as possible to where the heat is generated, in direct contact with the components (“direct liquid cooling”).
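The feasibility condition and the potential gain can be sketched in a few lines. This is an illustration under stated assumptions, not a design tool: the function names, the 5 °C margin between coolant and outdoor air, and the example temperatures are all hypothetical values chosen here, while the PUE figures of 1.7 and 1.05 are the ones quoted in the text.

```python
def free_cooling_feasible(coolant_c: float, outdoor_c: float,
                          margin_c: float = 5.0) -> bool:
    """True if the coolant is warmer than the outdoor air by at least margin_c.

    margin_c is an assumed allowance for heat exchanger losses (hypothetical).
    """
    return coolant_c - outdoor_c >= margin_c


def energy_saving(pue_before: float, pue_after: float) -> float:
    """Fraction of total facility energy saved at constant IT load."""
    if pue_before < 1 or pue_after < 1:
        raise ValueError("PUE cannot be lower than 1")
    return 1 - pue_after / pue_before


# Illustrative temperatures only: warm water vs. chilled water on a hot day.
print(free_cooling_feasible(coolant_c=45.0, outdoor_c=35.0))  # warm water loop
print(free_cooling_feasible(coolant_c=18.0, outdoor_c=35.0))  # chilled water loop

saving = energy_saving(pue_before=1.7, pue_after=1.05)
print(f"Total energy saved at constant IT load: {saving:.0%}")
```

For a fixed IT load, moving from a PUE of 1.7 to 1.05 would cut the total energy drawn by the facility by about 38%.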

Developing and optimizing the technologies that lead to better energy efficiency in IT, and in HPC, requires non-trivial R&D investments by the manufacturers of the large-scale computers used in data centers and HPC centers. While better energy efficiency contributes significantly to lowering the Total Cost of Ownership of IT equipment, the R&D costs that manufacturers sustain may lead to higher prices for equipment built according to energy efficiency criteria.

As in many other areas of carbon footprint reduction, “doing the right thing” may end up being economically less attractive than doing the “wrong” one.

While in other industrial and domestic segments (transportation, heating, renewable energy generation) policy, recommendations, stricter regulations and incentives are starting to yield tangible results, the IT industry and HPC have been only marginally touched by such initiatives.

As long as setting up an energy-inefficient data center is an economically viable option for IT equipment owners, it is unlikely that substantial progress will be made towards reversing a dangerous trend.

While the issue is a planetary one, now is a good time for Europe to take the matter into its own hands and show the planet the way towards a more responsible and energy-conscious future for the IT industry and High Performance Computing.
