One of the great things about our job at CDCDG is that we not only get to design data centers but also get to test them once they are up and running. As a result, we get to see how they actually perform and compare that to the assumptions we made during the design process. The other day, one of my partners, Richard Greco, and I were discussing the requirement for continuous cooling. Both of us were once proponents of it. I even wrote an article entitled "High density data centers require continuous cooling." I'm not so sure anymore.
Providing continuous cooling for a data center is fairly expensive and fairly complex, given the need to provide thermal storage and put pumps and air handlers on UPS power. Most designs set up a separate UPS system just for the continuous cooling loads. All of this comes at a pretty hefty price. The question is whether it is really necessary.
As densities have increased, we have seen more and more articles about the requirement for continuous cooling in data centers. If you go back 5 years, people were saying that it was a requirement once you got above 5kW per rack. That proved not to be true, but the general consensus came to be that it was a requirement once you got above 10kW. The thought was that if the cooling was out for 30 to 60 seconds, the temperature would skyrocket and you could never catch up in time to prevent equipment damage. That has proven not to be true either.
We have tested numerous data centers with rack loads of between 8 and 12kW and can tell you that not only did the temperatures not skyrocket during short outages, but the only noticeable change was a lack of air movement. While the server fans continued to move air through the servers (since they were on a UPS), the major air handlers or CRAC units being offline for 30 to 60 seconds did not seem to have much of an impact. Even with a 2 to 3 minute interruption in cooling, the temperature rises were between 2 and 3 degrees. While the temperatures do indeed go up, they are nowhere near the thermal runaway conditions that had been predicted and are certainly within the allowable rate of change.
Now, these are recent designs, so they are all hot aisle/cold aisle arrangements, and most used the ceiling as a return plenum. Some of them had hot aisle containment, which really helps to minimize the issue. If you are isolating the hot air from the data center and routing it into a plenum, the temperature in the data center remains fairly constant, especially when we are only talking about the 15 to 30 seconds it takes for a backup generator system to come on line.
I've heard some theories about why the continuous cooling assumption has proven to be invalid in all but a few instances (super-computer centers being one of them). One is that the thermal mass in a data center is pretty significant: raised flooring, concrete slabs, walls, and the racks themselves provide a lot of thermal mass, and it's all at 72 to 75 degrees. It takes a while for those surfaces to heat up. Another is that the fans in the servers themselves provide enough air movement to keep them cool for the short period of time it takes for the generators to start and the cooling to come back on line.
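A crude lumped-capacity estimate shows why thermal mass matters so much here. The numbers below (room size, IT load, how much slab and rack mass is thermally coupled during a one-minute outage) are purely illustrative assumptions on my part, not measurements from any of the sites we tested:

```python
# Back-of-envelope model of room temperature rise during a short cooling
# outage. Assumes the IT load heats the room air plus whatever building
# and rack mass is thermally coupled on this timescale (a crude lumped
# model with uniform mixing). All inputs are illustrative assumptions.

AIR_DENSITY = 1.2      # kg/m^3, air at roughly room conditions
CP_AIR = 1005.0        # J/(kg*K), specific heat of air
CP_CONCRETE = 880.0    # J/(kg*K), specific heat of concrete

def rise_deg_c(load_kw, seconds, air_volume_m3,
               effective_mass_kg=0.0, effective_cp=CP_CONCRETE):
    """Estimate temperature rise (deg C) with no heat removal."""
    air_capacity = air_volume_m3 * AIR_DENSITY * CP_AIR   # J/K
    mass_capacity = effective_mass_kg * effective_cp      # J/K
    energy_j = load_kw * 1000.0 * seconds                 # heat dumped in
    return energy_j / (air_capacity + mass_capacity)

# Hypothetical 500 kW room with ~2800 m^3 of air (roughly 10,000 sq ft
# under a 10 ft ceiling).
air_only = rise_deg_c(500, 60, 2800)
with_mass = rise_deg_c(500, 60, 2800, effective_mass_kg=50_000)

print(f"Air only:          {air_only:.1f} C rise in 60 s")
print(f"With thermal mass: {with_mass:.1f} C rise in 60 s")
```

With air alone the model predicts a rise of nearly 9 degrees C in a minute, which is where the skyrocketing-temperature predictions come from. Couple in even a modest 50 tonnes of effective slab and rack mass and the same minute produces well under 1 degree of rise, which is much closer to what we actually measured.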
The new energy efficient designs should help as well. Although there are a number of different designs, some of them provide very low velocity air and rely on the server fans to distribute it. Most of them also dump the hot air to the exterior and use 100% outside air for cooling a large percentage of the time. Certainly these designs should be even less prone to thermal runaway scenarios.
So at what point does continuous cooling become a requirement? I'm not sure we have an answer. There are a lot of variables that are site and design specific. One thing we can say is that a blanket statement that all data centers with rack loads of 5 - 10 kW and above require continuous cooling simply does not match our experience. I would love to get some feedback on what others have seen at their data centers.