Research on the Key Technologies of High Performance Components Design in Microprocessor
Original title: 微处理器高性能部件设计关键技术研究
肖偌舟
2015-05-27
Degree type: Doctor of Engineering
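Chinese abstract (translated): The state is currently pushing hard to localize core components in order to achieve independent, controllable technology, and the XX processor was developed in response. The XX processor is a new high-performance processor for fields such as wireless communication and radar signal processing, with an independently owned instruction set architecture and microarchitecture; its performance is at an internationally leading level. In the XX processor, the arithmetic units and the clock network account for more than 50% of total chip area and more than 60% of total power, so optimizing their design is critically important. Transistor-level sizing optimization is also necessary to further raise performance and reduce power. This thesis therefore carries out research in the following three areas.

High-performance arithmetic unit design: in modern microprocessors, arithmetic units often lie on the chip's critical path and consume considerable power. This thesis studies the microarchitecture of common arithmetic units at the algorithm level, examines in depth the structure and characteristics of various adders, multipliers and multiply-accumulators, and proposes several high-performance, low-power design algorithms. The proposed high-performance adder structure is 28.6% faster than a general-purpose adder. In the low-power multiplier, the toggle rate of the carry-save adders in the compression array is reduced by 40% and multiplier power is reduced by 16.78%. The reusable fixed-point multiply-accumulator supports multiple operations and sub-word parallelism, with data widths of 8, 16 and 32 bits; the supported operations are multiplication, accumulation and multiply-accumulate; the supported data formats include real and complex, integer and fractional; and the supported data types are signed and unsigned numbers. The complexity of arithmetic unit design also poses a serious challenge for datapath verification. The complexity of the circuit state space grows exponentially with design size, and verifying the XX processor completely and efficiently is key to a successful tape-out. This thesis proposes a method for designing a golden reference model and an automated verification platform. The golden reference model's code size is only 8% of that of the design under verification, so its correctness is easy to ensure; the platform is highly automated and uses resources efficiently; and experiments show an average verification coverage above 98%. On this basis, an adaptive template search-and-match algorithm is designed and implemented. Starting from the logical relationships in the datapath, it derives combinations of logic function clusters, then defines and designs the function and performance of each cluster, yielding detailed data for each cluster such as its number of occurrences and the driving environment of its preceding and following stages. Taking a 16-bit ripple-carry adder as an example, which contains 80 cells in total, the algorithm organizes the circuit by logic depth, obtaining 33 logic levels and extracting 26 classes of templates. Next, exploiting the regularity within the datapath, the thesis studies an intelligent placement algorithm that performs correlation-based placement of the arithmetic units' datapath, effectively reducing area, wire length, routing congestion, interconnect-induced parasitic resistance and capacitance (RC), and the skew of clock and data signals. Compared with traditional manual placement, it also improves design efficiency and the depth of logic optimization. Taking the third pipeline stage of the floating-point multiplier in the XX processor as an example, with the high-speed low-power multiplier structure, correlation-based placement reduces the number of buffers by 23.2%, the number of wires by 3.9% and wire length by 15.7%.

Transistor-level circuit optimization platform design: the traditional fully synthesized design flow is built on a cell library. Because the library uses discrete sizes, the matching between transistors inside a circuit is often not optimal, which wastes area and costs performance. In the traditional EDA tool chain, transistor-level circuit analysis is usually applied to analog circuits and simulated with SPICE (Simulation Program with Integrated Circuit Emphasis, a general-purpose analog circuit simulator), which is time-consuming. This thesis studies transistor-level circuit design optimization in depth: it partitions the circuit to be optimized into multiple subsystems to shrink the optimization space, selects optimization schemes according to the weighted timing slack of each path, works out a fast and efficient transistor-level optimization flow, and in practice...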
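The 16-bit ripple-carry adder figures quoted in the abstract (80 cells, 33 logic levels) can be checked with a short levelization sketch. The Python below is illustrative only. It assumes each full adder is built from five two-input gates (two XOR, two AND, one OR), which the abstract does not state, and the net names and helper functions are hypothetical. It covers only the ordering of cells by logic depth, not the thesis's template extraction, so the 26 template classes are not reproduced.

```python
from collections import defaultdict


def build_ripple_carry_adder(width):
    """Return a netlist {output_net: (gate_type, input_nets)}.

    Assumption: each full adder uses five two-input gates.
    """
    cells = {}
    carry = "cin"  # primary carry input
    for i in range(width):
        a, b = f"a{i}", f"b{i}"
        cells[f"p{i}"] = ("XOR", (a, b))             # propagate
        cells[f"g{i}"] = ("AND", (a, b))             # generate
        cells[f"s{i}"] = ("XOR", (f"p{i}", carry))   # sum bit
        cells[f"t{i}"] = ("AND", (f"p{i}", carry))   # propagated carry
        cells[f"c{i + 1}"] = ("OR", (f"g{i}", f"t{i}"))  # carry out
        carry = f"c{i + 1}"
    return cells


def logic_levels(cells):
    """Map each cell output to its logic depth (primary inputs are depth 0)."""
    level = {}

    def depth(net):
        if net not in cells:  # primary input
            return 0
        if net not in level:
            level[net] = 1 + max(depth(n) for n in cells[net][1])
        return level[net]

    for net in cells:
        depth(net)
    return level


def group_by_level(level):
    """Group cell outputs by logic depth."""
    groups = defaultdict(list)
    for net, lvl in level.items():
        groups[lvl].append(net)
    return dict(groups)


cells = build_ripple_carry_adder(16)
levels = logic_levels(cells)
print(len(cells), max(levels.values()))
```

Under this gate-level assumption the netlist has 80 cells and a maximum logic depth of 33, which agrees with the figures in the abstract. A different cell decomposition would give different counts.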
English abstract: The state is vigorously promoting the localization of core components in order to achieve independent, controllable technology, and the XX processor has emerged in response. The XX processor is a high-performance microprocessor with a proprietary instruction set and independent intellectual property rights, suited to military wireless communication, radar signal processing and similar fields. Its performance is at an internationally leading level. In the XX processor, the arithmetic units and clock network occupy more than 50% of chip area and consume more than 60% of chip power, so optimizing them is very important. Transistor-level optimization is also necessary to further improve performance and reduce power. This thesis therefore carries out research in the following three areas. High-performance arithmetic unit design: arithmetic units often lie on the critical path of the chip and consume a considerable amount of power. This thesis studies the algorithms of common arithmetic units, examining various adders, multipliers and multiply-accumulate units in depth, and proposes several high-performance, low-power algorithms. The proposed high-performance adder structure is 28.6% faster than a common adder. In the low-power multiplier, the toggle rate of the carry-save adders in the compression array is reduced by 40%, which saves 16.78% of multiplier power. The fixed-point multiply-accumulate unit supports many kinds of operations. It supports 8-bit, 16-bit and 32-bit operands and sub-word parallel computation. The unit can perform multiplication, accumulation and multiply-accumulate on real or complex, integer or fractional data, which may be signed or unsigned. The complexity of arithmetic unit design poses a serious challenge for verification. The number of circuit states grows exponentially with design size, and making verification complete and efficient is key to a successful tape-out of the XX processor. This thesis presents a method for designing golden reference models and an automatic verification platform, in which the code size of the golden reference model is 8% of that of the design under test, so the correctness of the golden reference model is easy to guarantee. The whole verification platform has a high degree of automation and a high resource utilization rate. Experiments show that the average verification coverage is above 98%. It ...
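To illustrate the golden-reference idea for the sub-word parallel multiply-accumulate unit, here is a minimal Python sketch. It is not the thesis's model. The 32-bit word width, the unbounded per-lane accumulators (no wrap-around or saturation), the low-lane-first ordering and all function names are assumptions, and `mac_dut` is a hypothetical stand-in for the design under test, implemented with a shift-and-add multiplier so that the two paths differ.

```python
import random

WORD_BITS = 32  # assumed register width


def split_lanes(word, lane_bits, signed):
    """Split a word into lanes, lowest lane first, as Python ints."""
    mask = (1 << lane_bits) - 1
    lanes = []
    for i in range(WORD_BITS // lane_bits):
        value = (word >> (i * lane_bits)) & mask
        if signed and value >> (lane_bits - 1):
            value -= 1 << lane_bits  # two's-complement sign extension
        lanes.append(value)
    return lanes


def mac_reference(acc, x, y, lane_bits, signed):
    """Golden model: per-lane acc + x * y using plain integer arithmetic."""
    xs = split_lanes(x, lane_bits, signed)
    ys = split_lanes(y, lane_bits, signed)
    return [a + p * q for a, p, q in zip(acc, xs, ys)]


def shift_add_multiply(p, q):
    """Bit-serial multiply, used here only as a stand-in implementation."""
    negative = (p < 0) != (q < 0)
    p, q = abs(p), abs(q)
    product = 0
    while q:
        if q & 1:
            product += p
        p <<= 1
        q >>= 1
    return -product if negative else product


def mac_dut(acc, x, y, lane_bits, signed):
    """Hypothetical design under test with a different multiply path."""
    xs = split_lanes(x, lane_bits, signed)
    ys = split_lanes(y, lane_bits, signed)
    return [a + shift_add_multiply(p, q) for a, p, q in zip(acc, xs, ys)]


def count_mismatches(trials=1000, seed=0):
    """Drive both models with random stimulus and count disagreements."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(trials):
        lane_bits = rng.choice((8, 16, 32))
        signed = rng.choice((True, False))
        x = rng.getrandbits(WORD_BITS)
        y = rng.getrandbits(WORD_BITS)
        acc = [rng.randint(-1000, 1000) for _ in range(WORD_BITS // lane_bits)]
        ref = mac_reference(acc, x, y, lane_bits, signed)
        if mac_dut(acc, x, y, lane_bits, signed) != ref:
            mismatches += 1
    return mismatches
```

Because the reference model relies on built-in integer arithmetic, it stays much shorter than any bit-level implementation. That is the property the abstract credits for making the reference model's correctness easy to guarantee.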
Keywords: High Performance Components; Algorithms of Arithmetical Cells; Template Extraction; Transistor Level Optimization; High Performance Clock Network
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/6714
Collection: Graduates_Doctoral Dissertations
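Recommended citation (GB/T 7714):
肖偌舟. Research on the Key Technologies of High Performance Components Design in Microprocessor (微处理器高性能部件设计关键技术研究) [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2015.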
Files in this item
File name / size: CASIA_20121801462910 (4134KB)
Access type: not currently open
License: CC BY-NC-SA