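A Self-Learning Control Method for Unmanned Vehicles Based on Rule Iteration (一种基于规则迭代的无人车自学习控制方法)
Zhang Lifu (张力夫)
Subtype: Master's
Thesis Advisor: Tang Shuming (汤淑明)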
2021-05-27
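Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Institute of Automation, Chinese Academy of Sciences
Degree Name: Master of Engineering
Degree Discipline: Control Engineering
Keyword: unmanned vehicle control; autonomous learning; rule extraction; rule iteration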
Abstract

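In recent years, autonomous driving has attracted broad public attention. Autonomous driving technology matters for improving urban traffic and for environmentally friendly, sustainable development; it can also improve driving safety and effectively reduce accident rates. Motion control of unmanned vehicles is one of the indispensable key technologies in this field. In contrast to the traditional autonomous driving pipeline, which is divided into perception, cognition, decision-making, and control, the emergence of AlphaGo Zero set a new paradigm for research on machine self-learning. Drawing on the gradual process by which humans learn to drive, this thesis studies a rule-iteration-based self-learning control method for unmanned vehicles, so that a vehicle controller can progress through self-learning from having no driving ability at all to driving safely, stably, and smoothly.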

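Based on deep reinforcement learning, this thesis uses rule extraction and the iteration of reward rules to imitate the gradual process by which humans learn to drive, and achieves safe and stable driving of an unmanned vehicle in a simulation environment. The specific research content includes: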

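1) A Carla autonomous driving simulation platform is built, and simulation experiments on the unmanned vehicle motion control task in urban road environments are carried out to verify the performance of two typical deep reinforcement learning algorithms, DQN and DDPG.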

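2) Vehicle driving tasks in typical urban road environments are analyzed comprehensively. By analogy with how humans learn to drive, and based on traffic rules, road boundary conditions, and similar constraints, a set of graded driving task rules is given and converted into the corresponding reward and penalty rules of the learning algorithm.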

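3) Based on the resulting reward and penalty rules, the reward function of the DDPG algorithm is improved and its state space is optimized. Experimental results show that the proposed method effectively improves the smoothness of the unmanned vehicle controller and the training efficiency of the algorithm: the average completion of the driving task approaches 90% and training time is noticeably shorter, a significant improvement over the training results of the original DQN and DDPG deep reinforcement learning algorithms.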

Other Abstract

In recent years, autonomous driving has attracted widespread attention. Autonomous driving technology is of great significance for improving urban traffic and for environmentally friendly, sustainable development. It can also enhance driving safety and effectively reduce the accident rate. Motion control of unmanned vehicles is an indispensable technology in the field of autonomous driving. In contrast to traditional autonomous driving solutions, which are divided into perception, cognition, decision, and control, the emergence of AlphaGo Zero set a new paradigm for research on machine self-learning. To enable an autonomous vehicle controller to acquire safe and smooth driving ability through self-learning, this thesis draws on the process by which humans learn to drive and studies a rule-iteration-based self-learning control method for unmanned vehicles.

Based on deep reinforcement learning, and through rule extraction and reward rule iteration, this thesis imitates the gradual process by which humans learn to drive a vehicle and achieves safe and stable driving of an unmanned vehicle in a simulation environment. The main contributions are as follows:

(1) A Carla simulation platform is built, and simulation experiments on the motion control task of unmanned vehicles in urban road environments are carried out to verify the performance of two typical deep reinforcement learning algorithms, DQN and DDPG.

(2) Vehicle driving tasks in typical urban road environments are analyzed. Based on traffic rules and road boundary conditions, a set of hierarchical driving task rules is given and transformed into the corresponding reward and punishment rules of the learning algorithm.

(3) The reward function of the DDPG algorithm is improved based on the obtained reward and punishment rules, and the state space of the DDPG algorithm is optimized. Experimental results show that the proposed method effectively improves the stability of the unmanned vehicle controller and the training efficiency of the algorithm. The average completion of the vehicle's driving task is close to 90%, and the training time is shortened. Compared with the training results of the original DQN and DDPG algorithms, the results are significantly improved.
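
The idea of turning graded driving rules into reward and punishment terms that are enabled step by step can be illustrated with a minimal Python sketch. This is not the thesis's actual reward function: the rule names, levels, weights, state keys, and the promotion threshold below are all hypothetical, chosen only to show one plausible shape for a staged reward.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence

State = Mapping[str, float]


@dataclass(frozen=True)
class RewardRule:
    """One driving rule expressed as a weighted reward term (hypothetical)."""
    name: str
    level: int                      # stage at which the rule becomes active
    weight: float
    term: Callable[[State], float]  # returns a value <= 0 for violations


# Hypothetical graded rules: basic safety first, comfort last.
RULES: Sequence[RewardRule] = (
    RewardRule("collision", 1, 100.0,
               lambda s: -1.0 if s["collided"] > 0.5 else 0.0),
    RewardRule("lane_keeping", 1, 1.0,
               lambda s: -abs(s["lateral_offset_m"])),
    RewardRule("speed_limit", 2, 0.5,
               lambda s: -max(0.0, s["speed_mps"] - s["speed_limit_mps"])),
    RewardRule("smooth_steering", 3, 0.2,
               lambda s: -abs(s["steer_delta"])),
)


def reward(state: State, level: int,
           rules: Sequence[RewardRule] = RULES) -> float:
    """Sum the weighted terms of every rule active at the given level."""
    return sum(r.weight * r.term(state) for r in rules if r.level <= level)


def next_level(level: int, completion_rate: float,
               threshold: float = 0.8, max_level: int = 3) -> int:
    """Enable the next rule level once the current one is mastered.

    The threshold is an assumed value, not one reported in the thesis.
    """
    if completion_rate >= threshold and level < max_level:
        return level + 1
    return level


example_state = {
    "collided": 0.0,
    "lateral_offset_m": 0.5,
    "speed_mps": 12.0,
    "speed_limit_mps": 10.0,
    "steer_delta": 0.1,
}
rewards_by_level = [reward(example_state, lvl) for lvl in (1, 2, 3)]
```

In a training loop, `reward(state, level)` would stand in for the environment reward passed to a DDPG agent, and `next_level` would be called after each evaluation round; because higher levels only add penalty terms, the same state scores lower as more rules are switched on.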

Pages: 84
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/44990
Collection: Research Center for Intelligent Manufacturing Technology and Systems (智能制造技术与系统研究中心)
Recommended Citation
GB/T 7714
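Zhang Lifu (张力夫). A Self-Learning Control Method for Unmanned Vehicles Based on Rule Iteration (一种基于规则迭代的无人车自学习控制方法)[D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2021.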
Files in This Item:
File Name/Size: 学位论文_final-6.24.pdf (3356KB)
DocType: Thesis
Access: Open Access
License: CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.