Monitoring the Operational Status of Computer Room Air Conditioners (CRAC)
A breakdown of the status factors to measure and record for each Computer Room Air Conditioner (CRAC) unit in your computer room, data centre or server room. The paper also gives a justification for why you should record each status factor and what you should do with the information once you've recorded it.
Unit Operating Status
The most important status to monitor for a Computer Room Air Conditioner (CRAC) is whether the unit is switched on and actually running.
The operating status of most air conditioning units can be ascertained by monitoring the operational status dry contact point. An air conditioning unit without an operational dry contact point is unlikely to be meant for use within the computer room environment. Many air conditioning units have several dry contact points, so you may need to consult your CRAC unit's manual to find out which dry contact point you need to monitor.
A number of manufacturers have added SNMP support to some of their air conditioners. Whilst SNMP is a boon for monitoring many factors, it is a poor choice for monitoring the operational status of an air conditioner, because the absence of a reply does not prove that the air conditioner is switched off. The request or the reply may simply have been lost on the network. SNMP normally runs over UDP, which does not guarantee delivery or retransmit lost packets, so a network fault looks the same to the monitoring system as a unit that has stopped responding. Reading a dry contact point provides an unambiguous status that does not rely on a third party system.
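The difference between the two approaches can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names, the idea that a closed contact means "running" (check your unit's manual for the real polarity), and the SNMP value 1 meaning "running" (the real value depends on the manufacturer's MIB). The point it shows is that a dry contact always yields a definite state, whereas an SNMP poll has a third outcome, no reply, that cannot be interpreted.

```python
from enum import Enum
from typing import Optional


class UnitState(Enum):
    RUNNING = "running"
    STOPPED = "stopped"
    UNKNOWN = "unknown"


def state_from_dry_contact(contact_closed: bool,
                           closed_means_running: bool = True) -> UnitState:
    """Interpret a dry contact reading.

    A contact is always either open or closed, so the result is never UNKNOWN.
    The polarity (closed == running) is an assumption; consult the unit's manual.
    """
    if contact_closed == closed_means_running:
        return UnitState.RUNNING
    return UnitState.STOPPED


def state_from_snmp(reply: Optional[int], running_value: int = 1) -> UnitState:
    """Interpret an SNMP poll result.

    `reply` is None when the poll timed out. `running_value` is a hypothetical
    placeholder; the real value comes from the manufacturer's MIB.
    """
    if reply is None:
        # Unit off, network fault, or lost packet: impossible to tell apart.
        return UnitState.UNKNOWN
    return UnitState.RUNNING if reply == running_value else UnitState.STOPPED
```

A monitoring system built on the second function has to decide what to do with UNKNOWN, which is exactly the ambiguity the dry contact avoids.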
Supply Fan Status
In order to understand the load being placed onto your air conditioner it is useful to record supply fan status. If you run a number of CRAC units in a primary, back-up and tertiary configuration it would be useful to know when the back-up and tertiary air conditioning units are operational. Ideally you should monitor the fan speed as well if it is variable.
If you are unable to record the status of the supply fan directly because your CRAC unit doesn't support fan status, a useful proxy would be the air speed next to the supply fan. Plainly, fans are there to move air so a high air speed next to the fan would indicate that the fan is in operation. Air speed probes are readily available for use with environment monitors.
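As a rough sketch of the air speed proxy, the Python function below infers fan status from a short series of air speed readings. The function name, the 1.0 m/s threshold and the 80% fraction are illustrative assumptions, not recommended values; a sensible threshold depends on your probe and where it is mounted. Using several samples rather than one avoids flagging a single stray reading.

```python
from typing import Sequence


def fan_appears_running(samples_m_s: Sequence[float],
                        threshold_m_s: float = 1.0,
                        min_fraction: float = 0.8) -> bool:
    """Guess whether the supply fan is running from nearby air speed samples.

    The fan is treated as running when at least `min_fraction` of the samples
    are at or above `threshold_m_s`. Both defaults are placeholders.
    """
    if not samples_m_s:
        raise ValueError("need at least one air speed sample")
    above = sum(1 for speed in samples_m_s if speed >= threshold_m_s)
    return above / len(samples_m_s) >= min_fraction
```

For example, `fan_appears_running([2.1, 2.3, 1.9, 2.0, 2.2])` treats the fan as running, while a series of readings close to zero does not.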
Temperature & Humidity
Both the supply air and return air temperature and humidity should be measured to ensure the values at both locations are within expected ranges. Return air should be significantly warmer than supply air. If the return air is too cold, it suggests that air may not be passing through the equipment in your computer room. Air that short circuits your cold aisle/hot aisle configuration wastes energy: there is no point in cooling air that does not flow through your equipment. In addition, cooling capacity spent on short-circuited air is not available to your equipment, so your back-up and tertiary air conditioners are likely to be called on to carry extra load.
If you find that supply air is short circuiting, you need to identify precisely where the air is able to escape directly back to the air conditioner. You can then block those direct air flow routes. Once the escape routes are blocked, the return air temperature should rise towards the ambient temperature in your hot aisles.
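The comparison of supply and return air temperature described above can be sketched in a few lines of Python. The function names and the 5.0 °C minimum difference are assumptions chosen for illustration only; the difference you should expect depends on your equipment load and air flow, so treat the default as a placeholder to be replaced with a figure from your own measurements.

```python
def return_supply_delta_c(supply_c: float, return_c: float) -> float:
    """Temperature rise across the room: return air minus supply air, in °C."""
    return return_c - supply_c


def suspect_short_circuit(supply_c: float, return_c: float,
                          min_delta_c: float = 5.0) -> bool:
    """Flag a possible short circuit when return air is barely warmer than supply.

    `min_delta_c` is a hypothetical placeholder, not a recommended value.
    """
    return return_supply_delta_c(supply_c, return_c) < min_delta_c
```

Running the same check before and after blocking an escape route gives a simple way to confirm that the return air temperature has risen as expected.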
Recording Status Points
The simplest way to measure and store the status point data is to use an environment monitor. Many environment monitors can be mounted in your racks. Not only will an environment monitor record your critical status points, it will also alert you when a reading goes outside its operating range. Of course, it is up to you to specify the operating range.
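To make the idea of user-specified operating ranges concrete, here is a minimal Python sketch of the kind of check an environment monitor performs. It is not any particular product's logic. The status point names and the limits used in the example are invented placeholders; you would substitute the ranges appropriate to your own site.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Range:
    """An inclusive operating range for one status point."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def out_of_range(readings: Dict[str, float],
                 limits: Dict[str, Range]) -> List[str]:
    """Return one alert message per reading that falls outside its range.

    Readings with no configured range are recorded elsewhere but never alert.
    """
    alerts = []
    for name, value in readings.items():
        limit = limits.get(name)
        if limit is not None and not limit.contains(value):
            alerts.append(f"{name}={value} outside {limit.low}..{limit.high}")
    return alerts


# Hypothetical limits and readings, for illustration only.
limits = {"supply_temp_c": Range(16.0, 24.0), "supply_rh_pct": Range(30.0, 60.0)}
readings = {"supply_temp_c": 27.5, "supply_rh_pct": 45.0, "return_temp_c": 33.0}
alerts = out_of_range(readings, limits)
```

In this example only the supply temperature raises an alert; the return temperature has no configured range, so it is ignored by the check.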