An Effective Gray-Box Identification Procedure for Multicore Thermal Modeling

Aggressive thermal management is a critical feature for high-end computing platforms, as worst-case thermal budgeting is becoming unaffordable. Reactive thermal management, which sets temperature thresholds to trigger thermal capping actions, is too “nearsighted” and may lead to severe performance degradation and thermal overshoots. More aggressive proactive thermal management techniques minimize the performance penalty through smooth optimal control. These techniques require thermal models that are accurate enough to make the control effective, yet simple enough to keep its complexity limited. In practice, these models are not provided by manufacturers and, in most cases, they strongly depend on the deployment environment. Hence, procedures to automatically derive thermal models in the field are needed. In this paper, we propose a gray-box procedure to learn a compact and physically consistent model for multicore chips. We leverage the physical consistency of the proposed model to tame the model complexity and to cope with large quantization noise in measurements. We exploit Output Error structures along with Levenberg–Marquardt and Least Squares optimization algorithms. We tackle the problem in a real-life context: we developed a complete infrastructure for model building and thermal data collection in the Linux environment, and we tested it on an Intel Nehalem-based server CPU.

Beneventi F.; Bartolini A.; Tilli A.; Benini L., An Effective Gray-Box Identification Procedure for Multicore Thermal Modeling, in: IEEE Transactions on Computers, Volume: 63, Issue: 5, Page(s): 1097–1110, ISSN: 0018-9340, DOI: 10.1109/TC.2012.293, 2014, IEEE

2014_TC_Beneventi
2014_TC_supp_files_Beneventi

Categories: Journal