Journal of System Simulation ›› 2019, Vol. 31 ›› Issue (12): 2685-2695. doi: 10.16182/j.issn1004731x.joss.19-FZ0346

• Simulation Systems and Technology •

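  • About the author: Yu Shigan (b. 1982), male, from Dingyuan, Anhui; master's degree, associate professor; research interests: computer architecture and fault-tolerant computing.
  • Funding:
    National Key R&D Program of China (2018YFB1003500), National Natural Science Foundation of China (61732018, 61872335, 61802367), Anhui Provincial University Talent Support and Scientific Research Projects (gxyq2019175, KJ2018A0669)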

Fault-tolerant Method and Simulation of Heterogeneous Multi-core Processor Based on Speculative Mechanism

Yu Shigan1,2,3,4, Tang Zhimin1,2,3, Ye Xiaochun1,2, Fan Dongrui1,2   

  1. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;
  2. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049, China;
  3. National Engineering Laboratory for Advanced Processor Technology, Chengdu 610218, China;
  4. Information Engineering College, Fuyang Normal University, Fuyang 236041, China
  • Received: 2019-04-25 Revised: 2019-07-19 Published: 2019-12-13


Abstract: Heterogeneous multi-core design is an important direction for processors, but such processors suffer from frequent transient faults. Traditional triple modular redundancy (TMR) is the main remedy, yet it has low efficiency and high power consumption. This paper proposes FTSAS, a high-performance Fault-Tolerant Scheduling Algorithm with a Speculative mechanism. Each heterogeneous core executes the task independently; the state values of the core that finishes first are recorded, and that core continues with the next task by forward speculation. The lagging cores complete the result comparison under the majority-consensus principle, which ensures system reliability. Simulation results show that FTSAS improves average performance by 12.9% over current fault-tolerant methods. When 200 errors are injected, FTSAS achieves a similar fault-tolerance effect while improving average execution performance by 11.4% and reducing average power consumption by 15.8%.

Key words: heterogeneous multicore, processor, speculative mechanism, fault tolerance, scheduling
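
The scheme summarized in the abstract (independent execution on every core, forward speculation by the first finisher, majority comparison once the lagging cores catch up) can be illustrated with a small sequential model. This is only a sketch under stated assumptions, not the authors' FTSAS implementation: `make_core`, `majority`, and `run_speculative` are hypothetical names, each core is reduced to a fixed latency plus optional bit-flip faults, and the policy of discarding the speculation when the leader's value loses the vote is an assumption, since the abstract does not describe the recovery step.

```python
from collections import Counter


def make_core(latency, faulty_steps=()):
    """Hypothetical core model: fixed latency, bit-flip fault on listed steps."""
    def core(step, task, state):
        result = task(state)
        if step in faulty_steps:
            result ^= 1  # injected transient fault (assumption: single bit flip)
        return latency, result
    return core


def majority(values):
    """Return the value held by a strict majority, or None if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if count * 2 > len(values) else None


def run_speculative(tasks, cores, state):
    """Run tasks in order; return (final_state, number_of_discarded_speculations)."""
    rollbacks = 0
    for step, task in enumerate(tasks):
        # Every core executes the same task from the last committed state.
        outcomes = [core(step, task, state) for core in cores]
        # The first finisher's value is recorded; in a real system that core
        # would already be running the next task from this value.
        _, speculative = min(outcomes, key=lambda o: o[0])
        # The lagging cores finish and the results are compared by majority.
        voted = majority([result for _, result in outcomes])
        if voted is None:
            raise RuntimeError("no majority: fault cannot be masked")
        if voted != speculative:
            rollbacks += 1  # assumed policy: drop the speculative work
        state = voted  # commit the majority value
    return state, rollbacks


tasks = [lambda s: s + 2, lambda s: s * 3, lambda s: s - 1]
cores = [make_core(1, faulty_steps={1}), make_core(2), make_core(3)]
print(run_speculative(tasks, cores, 1))
```

In this model a fault on a lagging core is masked by the vote at no cost, while a fault on the fastest core costs one discarded speculation; the latencies and fault positions above are arbitrary illustration values.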

CLC number: