Research on Key Technologies of the MaPU Programming Language and Compiler (MaPU编程语言及编译器关键技术研究)
申俊志
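Subtype: Master's
Thesis Advisor: 王东琳
2019-05-24
Degree Grantor: Institute of Automation, Chinese Academy of Sciences
Place of Conferral: Institute of Automation, 95 Zhongguancun East Road, Haidian District, Beijing
Degree Name: Master of Engineering
Degree Discipline: Computer Technology
Keyword: programming language design; compiler design and implementation; VLIW; instruction scheduling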
Abstract
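    With the rise of domain-specific hardware and heterogeneous multi-core processors, software development cost has increasingly become the bottleneck that keeps such systems from reaching their full performance and from being commercialized. This thesis is based on MaPU, a heterogeneous multi-core high-performance processor for the communications domain. MaPU has one ARM master core and two accelerator cores: the scalar accelerator SPU and the vector accelerator MPU. The SPU is a 32-bit scalar core with a four-issue VLIW architecture; the MPU is a 512-bit vector core with a seventeen-issue VLIW architecture, no instruction interlocks, and an exposed hardware pipeline. In addition, the complex interconnections among the MPU's seventeen functional units make MPU development considerably more difficult. Together, these hardware characteristics make software development for MaPU hard. To address this problem, this thesis makes the following contributions: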
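    A programming language, Maple (MaPU Assembly Program Language Extension), is designed with two levels of abstraction. Maple is a domain-specific language proposed in this thesis specifically for the MPU, the vector accelerator core of MaPU. It extends the MPU microcode assembly language and offers two levels of abstraction, both higher than microcode. At the lower level, Maple hides the MPU's exposed hardware pipeline, lack of instruction interlocks, seventeen-issue VLIW structure, and complex hardware interconnect ports, while still leaving the programmer in control of functional units and physical registers. At the higher level, the programmer no longer needs to worry about functional units and physical registers at all and can devote more effort to designing and implementing the algorithm. Maple also provides a C-style preprocessor and high-level syntactic sugar, which make development much more convenient. According to the reported statistics, Maple reduces the amount of code a programmer writes by a factor of 4.02 on average, and for typical MaPU application algorithms the Maple compiler reaches on average 70.44% of the performance of hand-written microcode.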
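    A compiler for the MaPU processor is designed and implemented. Because MaPU is a heterogeneous multi-core processor, this task is essentially the design and implementation of two nearly independent compilers: the SPU compiler for the scalar core and the Maple language compiler. The SPU compiler takes C as its front-end language and targets the SPU scalar accelerator core; the Maple compiler takes Maple as its front-end language and targets the MPU vector accelerator core. Both are built on the preprocessor provided by Clang and the compiler infrastructure provided by LLVM, including machine-independent analyses and optimizations on the intermediate representation, the pass management mechanism, and TableGen, the tool for describing back-end information. In addition, many machine-dependent optimizations are implemented in the MaPU compiler, such as hardware loops, VLIW packing, instruction scheduling, and port allocation, which ensure the efficiency of the MaPU compiler. The SPU compiler achieves a 1.66x improvement in execution time and a 1.11x improvement in code size.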
Other Abstract
With the rise of domain-specific hardware and heterogeneous multi-core processors, software development cost is increasingly the bottleneck for the performance and commercialization of such systems. This thesis is based on MaPU, a heterogeneous multi-core high-performance processor used in the communications field. It has one ARM master core and two accelerator cores: the scalar accelerator core SPU and the vector accelerator core MPU. The SPU is a 32-bit scalar accelerator core with a four-issue VLIW architecture. The MPU is a 512-bit vector accelerator core with a seventeen-issue VLIW architecture, no instruction interlocks, and an exposed hardware pipeline. Moreover, the complex interconnections among the seventeen functional units of the MPU add considerable complexity to MPU development. Taken together, these hardware features constitute the software development problem of the MaPU processor. To solve it, this thesis does the following work:

The Maple (MaPU Assembly Program Language Extension) programming language is designed with two levels of abstraction. Maple is a domain-specific language designed specifically for the MPU, the vector accelerator core of the MaPU processor. It is an extension of the MPU microcode assembly language and provides two levels of abstraction, both higher than microcode. When MPU applications are developed at the lower level, Maple hides the exposed hardware pipeline, the absence of instruction interlocks, the seventeen-issue VLIW structure, and the complex hardware interconnect ports, while still leaving control of functional units and physical registers to the programmer. At the higher level, the programmer does not even have to keep track of functional unit and physical register usage and can put more effort into algorithm design and implementation. In addition, Maple has a C-like preprocessor and high-level syntactic sugar, which make development much more convenient. According to the reported statistics, Maple reduces the amount of code a programmer writes by a factor of 4.02 on average, and for typical MaPU application algorithms the Maple compiler reaches on average 70.44% of the performance of hand-written microcode.

A compiler for the MaPU processor is designed and implemented. Because of the heterogeneous multi-core nature of MaPU, this is essentially the design and implementation of two almost independent compilers: the SPU compiler for the scalar core and the Maple language compiler. The front end of the SPU compiler accepts the high-level language C and its back end targets the SPU scalar accelerator core, while the front end of the Maple compiler accepts Maple and its back end targets the MPU vector accelerator core. Both implementations use the preprocessor provided by Clang and the compiler infrastructure provided by LLVM, including machine-independent analysis and optimization of the intermediate representation, the pass management mechanism, and the TableGen back-end description tool. In addition, many machine-dependent optimizations are implemented in the MaPU compiler, such as hardware loops, VLIW packing, instruction scheduling, and port allocation, to ensure the efficiency of the MaPU compiler. The SPU compiler achieves a 1.66x improvement in execution time and a 1.11x improvement in code size.
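
To make the scheduling problem on an interlock-free VLIW core more concrete, the following is a minimal sketch of greedy list scheduling into VLIW bundles. It is not the algorithm used in the thesis: the `Op` class, the unit names (`mem`, `mac`, `alu`), and the latencies are hypothetical and chosen only for illustration. Because the hardware is assumed not to stall, the scheduler itself must delay a consumer until its producer's latency has elapsed, emitting empty bundles (explicit NOP cycles) where nothing can issue.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    unit: str       # functional unit the op issues on (hypothetical names)
    latency: int    # cycles until the result becomes visible
    deps: list = field(default_factory=list)  # names of producer ops


def schedule(ops, issue_width):
    """Greedily pack ops into bundles, one bundle per cycle.

    Ops are considered in program order. An op may issue only when every
    producer's result is already visible and its functional unit is free
    in the current bundle. An empty bundle stands for a NOP cycle.
    """
    if issue_width < 1:
        raise ValueError("issue_width must be at least 1")
    # Require producers to appear before consumers, which rules out cycles.
    seen = set()
    for op in ops:
        for dep in op.deps:
            if dep not in seen:
                raise ValueError(f"{op.name} depends on undefined op {dep}")
        seen.add(op.name)

    by_name = {op.name: op for op in ops}
    issue_cycle = {}            # op name -> cycle in which it issued
    pending = list(ops)
    bundles = []
    cycle = 0
    while pending:
        bundle, used_units = [], set()
        for op in list(pending):
            if len(bundle) == issue_width:
                break
            ready = all(
                dep in issue_cycle
                and issue_cycle[dep] + by_name[dep].latency <= cycle
                for dep in op.deps
            )
            if ready and op.unit not in used_units:
                bundle.append(op.name)
                used_units.add(op.unit)
                pending.remove(op)
        for name in bundle:
            issue_cycle[name] = cycle
        bundles.append(bundle)
        cycle += 1
    return bundles


if __name__ == "__main__":
    program = [
        Op("ld_a", "mem", 3),
        Op("ld_b", "mem", 3),
        Op("mul", "mac", 2, ["ld_a", "ld_b"]),
        Op("add", "alu", 1, ["mul"]),
        Op("st", "mem", 1, ["add"]),
    ]
    for cycle, bundle in enumerate(schedule(program, issue_width=4)):
        print(cycle, bundle or ["nop"])
```

In this toy program the two loads compete for the single `mem` unit, so they issue in cycles 0 and 1, and `mul` cannot issue until cycle 4, when the second load's result is visible. The empty bundles in between are the cycles a programmer writing microcode by hand would otherwise have to count manually.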
Pages: 114
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/23918
Collection: National Engineering Technology Research Center for ASIC Design (国家专用集成电路设计工程技术研究中心)
Recommended Citation
GB/T 7714
申俊志. MaPU编程语言及编译器关键技术研究[D]. 北京市海淀区中关村东路95号自动化研究所. 中国科学院自动化研究所,2019.
Files in This Item:
File Name/Size | DocType | Version | Access | License
MaPU编程语言及编译器关键技术研究.p (4521KB) | Thesis | | Open Access | CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.