Thermal problems in a circuit can be identified by measuring the temperature of a PCB, the die temperature of a CPU or other IC with a thermal sensing transistor, and the temperature of the chassis air. Monitoring the temperature of the PCB helps identify the overheating of chips in the vicinity of the sensor. Monitoring the die temperature of a CPU, FPGA, or other high-power chip that has an on-chip thermal monitoring diode can very quickly detect dangerous thermal conditions before an expensive device is damaged by heat. Monitoring the air temperature can indicate such conditions as a failed or blocked cooling fan.
Readily available temperature monitoring ICs allow accurate, automated measurement of board and remote thermal diode temperatures. However, they do a poor job of measuring air temperature. Because they are in direct thermal contact with the board through their leads, they track board temperature closely, so they read air temperature correctly only when the air and board temperatures happen to be the same.
One way to sense air temperature is to use an NTC (negative temperature coefficient) thermistor with long leads. The long leads help isolate the NTC element thermally from the board. Specialized air temperature probes with longer leads are available from thermistor manufacturers. To measure air temperature, connect the NTC and a fixed resistor in series to form a voltage divider, then measure the voltage across the series resistor.
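The divider behavior described above can be sketched numerically. This is a minimal illustration, not the circuit's actual design values: the 10kΩ thermistor, the Beta constant of 3950, the 10kΩ series resistor, and the 3.3V supply are all hypothetical, and the Beta model itself is only an approximation of a real NTC curve. The NTC is assumed to sit on the supply side, so the voltage across the series resistor rises with temperature.

```python
import math

def ntc_resistance(temp_c, r25=10_000.0, beta=3950.0):
    """NTC resistance in ohms from the Beta model.

    r25 and beta are hypothetical part values chosen for illustration.
    """
    t_kelvin = temp_c + 273.15
    return r25 * math.exp(beta * (1.0 / t_kelvin - 1.0 / 298.15))

def divider_voltage(temp_c, v_supply=3.3, r_series=10_000.0):
    """Voltage across the series resistor, with the NTC on the supply side.

    v_supply and r_series are assumptions, not values from the circuit.
    """
    r_ntc = ntc_resistance(temp_c)
    return v_supply * r_series / (r_series + r_ntc)

if __name__ == "__main__":
    # Print the divider output over a typical chassis air temperature range.
    for t in (20, 30, 40, 50, 60, 70):
        print(f"{t:3d} °C -> {divider_voltage(t):.3f} V")
```

As the NTC resistance falls with rising temperature, a larger share of the supply voltage appears across the series resistor, which is why the measured voltage increases with air temperature.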
A Better Solution
A simpler way is to consolidate all monitoring into a single integrated circuit. The circuit in Figure 1 measures and monitors the CPU, circuit board, and ambient temperatures. The MAX6656 is a temperature and voltage monitor that continuously measures the temperature of two external thermal sense transistors, its own temperature, its supply voltage, and three external voltages. All measured quantities are compared against programmable temperature and voltage limits. If a value falls outside its limits, the active-low ALERT pin asserts.
Figure 1. This circuit monitors the temperatures of the CPU (or other IC with a thermal sensing diode), the circuit board, and the air. When any temperature exceeds a programmable limit, an ALERT assertion notifies the system of an overtemperature condition.
The MAX6656 measures its own die temperature, and therefore board temperature, with an accuracy of 1.5°C from 60°C to 100°C. Over the same temperature range, external ICs with thermal sense transistors are monitored with a 1.0°C accuracy. The external ICs might be two CPUs, a CPU and an FPGA, or some other combination of remote devices. One of the remote sense transistors can even be a discrete transistor measuring board temperature some distance from the MAX6656. Use a 2.2nF capacitor across the MAX6656's DXP_ and DXN_ pins to filter external noise that might disrupt the temperature measurement or conversion process.
A thermistor's resistance is a strongly nonlinear function of temperature, but with the right series resistor the divider output can be made nearly linear over a limited temperature range. The circuit in Figure 1 has been optimized for good linearity with thermistor temperatures in the range of approximately 20°C to 70°C, resulting in a linearity error of less than 0.8°C over this range. The average slope in this temperature range is 29.35mV/°C, and the voltage monitor input VIN3 has an LSB value of 11.9mV. This results in an LSB weighting of 0.405°C/LSB. At 20°C, the voltage on R1 is nominally 693mV, which corresponds to a measured code of 46 (decimal).
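The linear approximation above can be turned into a small conversion routine. The sketch below uses only the slope, LSB size, and 20°C voltage quoted in the text. It assumes the VIN3 reading is a plain binary code equal to the voltage divided by the LSB size with no offset, and the function names are illustrative. The result is meaningful only within the roughly 20°C to 70°C range where the divider is close to linear.

```python
VIN3_LSB_MV = 11.9       # VIN3 LSB size in mV, from the text
SLOPE_MV_PER_C = 29.35   # average slope over 20°C to 70°C, from the text
V_AT_20C_MV = 693.0      # nominal voltage on R1 at 20°C, from the text

# Temperature weight of one LSB (about 0.405 °C per LSB).
DEG_C_PER_LSB = VIN3_LSB_MV / SLOPE_MV_PER_C

def r1_mv_to_temp_c(v_mv):
    """Estimate air temperature from the voltage on R1 (linear fit)."""
    return 20.0 + (v_mv - V_AT_20C_MV) / SLOPE_MV_PER_C

def vin3_code_to_temp_c(code):
    """Estimate air temperature from a raw VIN3 code.

    Assumes voltage = code * LSB with no offset (an assumption, not a
    statement about the device's actual data format).
    """
    return r1_mv_to_temp_c(code * VIN3_LSB_MV)

if __name__ == "__main__":
    print(f"LSB weight: {DEG_C_PER_LSB:.3f} °C/LSB")
    print(f"2030 mV on R1 -> {r1_mv_to_temp_c(2030.0):.1f} °C")
```

A lookup table or a higher-order fit could reduce the residual linearity error, at the cost of more firmware.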
Note that the accuracy of the ambient temperature measurement would normally depend on the thermistor/resistor combination being connected to an accurate reference voltage. To minimize cost in this circuit, however, the thermistor and R1 are simply connected to the MAX6656's supply voltage. This could cause an error of a few degrees, but fortunately the MAX6656 monitors its own supply voltage, so supply voltage errors can be corrected in software. Because the divider output is proportional to the supply, a supply voltage that is 3% high makes the measured voltage at any given temperature 3% high. At 65°C, for example, the output voltage is ideally 2.03V. If the supply voltage is 3% high, the measured voltage will be about 61mV high, which at 29.35mV/°C is approximately a 2°C error.
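A software correction of this kind can be sketched as a simple ratiometric scaling. The example below is illustrative only: the function names are hypothetical, the 3.3V nominal supply is an assumption passed as a parameter, and the supply reading is assumed to be available already converted to volts.

```python
SLOPE_MV_PER_C = 29.35  # average divider slope in mV/°C, from the text

def correct_for_supply(v_measured_mv, vcc_measured, vcc_nominal=3.3):
    """Scale the divider reading back to what it would be at nominal supply.

    The divider output is proportional to the supply, so multiplying by
    vcc_nominal / vcc_measured removes the supply error. The 3.3 V nominal
    value is an assumption.
    """
    return v_measured_mv * vcc_nominal / vcc_measured

def uncorrected_error_c(v_ideal_mv, supply_error_fraction):
    """Temperature error (in °C) caused by an uncorrected supply error."""
    return v_ideal_mv * supply_error_fraction / SLOPE_MV_PER_C

if __name__ == "__main__":
    # Reproduce the 65°C example: 2.03 V ideal output, supply 3% high.
    raw_mv = 2030.0 * 1.03
    print(f"raw reading:       {raw_mv:.1f} mV")
    print(f"corrected reading: {correct_for_supply(raw_mv, 3.3 * 1.03):.1f} mV")
    print(f"error if ignored:  {uncorrected_error_c(2030.0, 0.03):.2f} °C")
```

The corrected voltage can then be converted to temperature with whatever voltage-to-temperature mapping the firmware already uses.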
Thermal and voltage fault limits can be set via the SMBus. When any temperature (or voltage) is outside of its programmed range, the ALERT output is asserted. An additional limit can be set for each of the local and remote transistor temperatures to generate an active-low OVERT output that may be used to activate a cooling fan or to shut the system down.
Although temperature monitoring ICs and NTC thermistors with long leads can together monitor the temperature of the board, the CPU, and the air, a more efficient way to protect a system from overheating is to consolidate all monitoring into a single integrated circuit.