Purdue Software Throttles Servers With Rising Temps, Saves System Failure - HotHardware

What if your future GPU or CPU could throttle itself back when it got too hot, rather than overheating and causing errors or a shutdown? That may be possible someday, but for now it is a reality in the server realm. Overheating is far less common in consumer computing these days, but once you add overclocking, serial jobs, or multiple linked computers, things can get messy when temperatures rise above a certain level. Many servers in particular are designed to shut off completely once temperatures pass a set point, in order to save the machine from irreparable damage.

But shutting down a machine to save it from melting, so to speak, isn't exactly the best option. Long-running tasks that take months to finish can be ruined and have to be started over, which can cost companies and universities weeks, if not months, of research time. New software developed by Patrick Finnegan, a systems administrator at Purdue University, lets servers sense when temperatures are about to rise above a set point. Instead of shutting down and waving a white flag, they throttle back heavily until the cooling systems catch up. This slows work by about 70% to 80%, but it also saves power and, more importantly, no work is lost. It's better to have a task slowed than to lose it forever, or at least that's the prevailing logic.


The software is now being sold for $250. Finnegan designed the software using a "clock frequency scaling driver available for the Linux kernel, which can control both Intel and AMD chipsets with frequency scaling capabilities." It "also relies on Altair job scheduling software as well as a set of cluster management tools from the U.S. Department of Energy's Oak Ridge National Laboratory." Purdue itself has already used the software twice to avert system failures, and it worked great both times. Here's hoping for even more fail-safe options for our own computers in the future.
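To make the idea concrete, here is a minimal Python sketch of temperature-driven frequency capping on Linux. It is not Finnegan's code, and nothing here reflects how his software is actually built. It only assumes the kernel's standard sysfs files for thermal zones and cpufreq. The thresholds, the choice of thermal_zone0 and cpu0, and every function name are hypothetical.

```python
from pathlib import Path

# Standard Linux sysfs locations. Which thermal zone and which CPU
# matter on a given server is an assumption made for illustration.
THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")

HOT_C = 85.0   # hypothetical: throttle at or above this temperature
COOL_C = 70.0  # hypothetical: restore full speed at or below this


def choose_cap_khz(temp_c, current_cap_khz, min_khz, max_khz,
                   hot_c=HOT_C, cool_c=COOL_C):
    """Pick the frequency cap (kHz) to apply for a temperature reading."""
    if temp_c >= hot_c:
        return min_khz          # too hot: drop to the lowest frequency
    if temp_c <= cool_c:
        return max_khz          # cooled off: allow full speed again
    return current_cap_khz      # in between: hold steady to avoid flapping


def read_temp_c(path=THERMAL_ZONE):
    # The kernel reports thermal zone temperature in millidegrees Celsius.
    return int(path.read_text().strip()) / 1000.0


def read_khz(name, cpufreq_dir=CPUFREQ_DIR):
    # e.g. name = "cpuinfo_min_freq", "cpuinfo_max_freq", "scaling_max_freq"
    return int((cpufreq_dir / name).read_text().strip())


def apply_cap_khz(cap_khz, cpufreq_dir=CPUFREQ_DIR):
    # Writing scaling_max_freq requires root privileges.
    (cpufreq_dir / "scaling_max_freq").write_text(str(cap_khz))
```

A real daemon would call these in a loop every few seconds and apply the cap to every CPU, not just cpu0. The gap between the two thresholds keeps the machine from bouncing between full speed and throttled while the cooling catches up.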

I wonder if we'll ever be able to underclock our air conditioning in the data center. If you are lucky enough to have an external air source, there will be days you can just use the outside air and fans to cool the room, but what really helps is being able to have systems turn on and off as needed, automagically.

So what? The eco-nuts are telling people not to use the AC :P

So let's make electricity prices necessarily skyrocket? Lose the politics and we can lower the prices, because we won't have asses who charge more for the necessities in life!

Crank up the AC, baby! Or they could build the computer room deeper underground, where it is cooler. Or in a colder place like Alaska :P Or they could just take a page from sci-fi and build these systems within a pool of nitrogen?

Uhhhh...

I don't get what the big deal is. Someone wrote a daemon that watches lm-sensors and uses cpufreq to clock the system down when temps get high. That would probably take an afternoon to write... I'm not sure why it's worthy of news or why you would pay $250 for it.
