Vehicle Intelligent Cruise Control Based on Supervised Adaptive Dynamic Programming
Original Title: 基于监督式自适应动态规划的车辆智能巡航控制
Wang Bin (王滨)
Subtype: Doctor of Engineering
Thesis Advisor: Zhao Dongbin (赵冬斌)
2015-05-26
Degree Grantor: University of Chinese Academy of Sciences (中国科学院大学)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Control Theory and Control Engineering
Keyword: Adaptive Cruise Control; Adaptive Dynamic Programming; Supervised Reinforcement Learning; Intelligent Control; dSPACE
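Abstract: A vehicle's intelligent cruise control system uses sensors to measure, in real time, the distance and relative speed between the host vehicle and the preceding vehicle. It computes an appropriate throttle or brake command and applies it automatically to regulate the host vehicle's speed or following distance, which matters for driving safety, comfort, and energy efficiency. The intelligent cruise control problem can be viewed as an optimal tracking control problem for a class of nonlinear systems with uncertain internal dynamics and external disturbances, and its complexity has attracted wide attention from researchers. Current intelligent cruise control systems still have open problems, however, such as adaptability to the environment and to drivers' habits. On the other hand, adaptive dynamic programming (ADP), with its adaptive learning ability and near-optimal control performance, has been widely studied and applied in robotics, intelligent transportation, and other fields, yet the learning efficiency of traditional ADP methods remains to be addressed. This thesis therefore studies the intelligent cruise control problem based on ADP. It proposes the theory and methods of supervised adaptive dynamic programming, which effectively improve the efficiency of ADP algorithms, and applies them to the design of intelligent cruise control systems. The main contents and contributions are as follows:
1. A hierarchical ACC system is designed using the dSPACE vehicle driving simulation system. The upper-level controller learns the desired acceleration with the supervised ADP method proposed in this thesis. The lower-level controller uses the vehicle dynamics model provided by the dSPACE vehicle driving simulation system to convert the desired acceleration into the required throttle opening or brake pressure. Several typical driving scenarios are designed in the dSPACE system to simulate real driving environments and to verify the effectiveness and superiority of the proposed algorithms.
2. Based on ADP, the optimal control problem for a class of nonlinear discrete-time systems with parameter uncertainty and state time delay is studied. An iterative DHP (Dual Heuristic Programming) algorithm is proposed, and it is proved theoretically that the method converges to the optimal cost function and the optimal control policy. The iterative DHP algorithm uses three feed-forward neural networks to build the model network, the critic network, and the action network, which approximate the controlled plant, the partial derivatives of the cost function, and the control policy respectively, and it obtains a near-optimal control policy through offline training. Finally, the iterative DHP method is applied to an adaptive cruise control (ACC) model to implement an ACC acceleration controller, and simulation experiments verify the effectiveness of the iterative DHP algorithm.
3. Building on ADP, a supervisory controller is introduced and a supervised ADP (Supervised Adaptive Dynamic Programming, SADP) learning method is proposed. During learning, the Actor is pre-trained with a nominal controller provided by a supervisor to obtain a basically feasible control policy, and exploration noise is then added to the control signal applied to the system. The Critic adjusts its own parameters by evaluating the system states and control actions, and thereby improves the control policy. To implement the SADP algorithm, feed-forward neural networks with a single hidden layer are used to approximate the Actor and the Critic. The Lyapunov method is used to prove the stability of the proposed SADP algorithm, that is, the weight estimation errors of the Actor and Critic neural networks are uniformly ultimately bounded. Finally, the proposed SADP method is applied to the ACC problem, and an SADP-based hierarchical ACC system is designed and implemented using the dSPACE vehicle driving simulation system. To verify the effectiveness of the designed ACC system, the dSPACE system is used to simulate real driving environments, and the controller obtained through learning is tested for control performance in various typical driving scenarios. In addition...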
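The supervised pre-training step described in contribution 3 can be pictured with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the linear `nominal_controller` and its gains, the two normalized inputs (distance error and relative speed), the hidden-layer size, the learning rate, and the noise level. Only the overall shape follows the abstract: a single-hidden-layer Actor is first fitted to a nominal controller, and exploration noise is then added to its output. The Critic and its update are omitted.

```python
import math
import random

def nominal_controller(dist_err, rel_speed):
    """Hypothetical supervisor: a linear gain law with illustrative gains."""
    return 0.5 * dist_err + 0.8 * rel_speed

class Actor:
    """Feed-forward network with one tanh hidden layer and a linear output."""

    def __init__(self, n_hidden=6, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return the control output and the hidden activations."""
        h = [math.tanh(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return y, h

    def train_step(self, x, target, lr=0.05):
        """One gradient-descent step on the squared imitation error."""
        y, h = self.forward(x)
        err = y - target
        for j, hj in enumerate(h):
            # Back-propagate through tanh using the pre-update output weight.
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.w1[j][0] -= lr * grad_h * x[0]
            self.w1[j][1] -= lr * grad_h * x[1]
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err

def pretrain(actor, n_steps=20000, seed=1):
    """Fit the Actor to the nominal controller on random normalized states."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        actor.train_step(x, nominal_controller(*x))

def explore(actor, x, sigma=0.05, rng=random):
    """Actor output plus Gaussian exploration noise, as applied to the plant."""
    y, _ = actor.forward(x)
    return y + rng.gauss(0.0, sigma)

def mean_abs_error(actor, n=11):
    """Mean absolute imitation error on an n-by-n grid over [-1, 1]^2."""
    pts = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    errs = [abs(actor.forward((a, b))[0] - nominal_controller(a, b))
            for a in pts for b in pts]
    return sum(errs) / len(errs)
```

Calling `pretrain(actor)` on a fresh `Actor()` should reduce `mean_abs_error(actor)`, after which `explore(actor, x)` gives the noisy control signal that a Critic would evaluate during learning.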
Other Abstract: Intelligent cruise control systems use sensors to detect the relative distance and speed between the preceding vehicle and the host vehicle. These measurements are used to calculate the required throttle angle or brake pressure and to regulate the speed or maintain the desired distance automatically. Intelligent cruise control systems are critical to improving driving safety, comfort, and energy efficiency. The intelligent cruise control problem can be viewed as an optimal tracking control problem for a class of nonlinear systems with uncertain system dynamics and environmental disturbances. Although extensive research has been done, much work remains, such as adaptability to drivers and to complex driving environments. Meanwhile, owing to its adaptive learning ability and near-optimal control performance, adaptive dynamic programming has been widely investigated and applied in robotics and intelligent transportation systems. However, low learning efficiency is the main limitation of traditional adaptive dynamic programming approaches. For these reasons, this thesis investigates the intelligent cruise control problem with adaptive dynamic programming methods. Supervised adaptive dynamic programming theory and algorithms are proposed, which greatly improve learning efficiency, and the presented algorithms are used to design the intelligent cruise control system. The main contributions are as follows.
1. A hierarchical ACC system is designed in the dSPACE simulation system. The upper-level controller learns the desired acceleration using the proposed supervised adaptive dynamic programming methods. The lower-level controller provides the required throttle angle or braking pressure through the vehicle dynamics. Typical driving scenarios are designed in the dSPACE simulator to verify the effectiveness and superiority of the proposed algorithms.
2. Based on adaptive dynamic programming, an iterative dual heuristic programming (DHP) algorithm is proposed for the optimal control problem of a class of nonlinear discrete-time systems with parameter uncertainty and state time delay. A detailed convergence proof shows that this algorithm converges to the optimal cost function and the optimal control policy. Three feed-forward neural networks are used to build the model network, critic network, and action network, whic...
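The hierarchical structure in contribution 1 can be illustrated with a small sketch. In the thesis the upper level is a learned controller and the lower level relies on the dSPACE vehicle dynamics model; neither is available here, so both are replaced by hypothetical stand-ins: a constant-time-headway gain law in `upper_level` and a linear acceleration-to-actuator map in `lower_level`. All gains, limits, and names are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Actuation:
    throttle: float  # fraction of full throttle opening, 0 to 1
    brake: float     # fraction of full brake pressure, 0 to 1

def upper_level(gap, rel_speed, host_speed,
                time_headway=1.5, standstill_gap=5.0):
    """Stand-in for the learned upper-level controller.

    gap: distance to the preceding vehicle in m
    rel_speed: preceding-vehicle speed minus host speed in m/s
    host_speed: host-vehicle speed in m/s
    Returns a desired acceleration in m/s^2, clipped to [-3, 2].
    """
    desired_gap = standstill_gap + time_headway * host_speed
    accel = 0.2 * (gap - desired_gap) + 0.6 * rel_speed
    return max(-3.0, min(2.0, accel))

def lower_level(desired_accel, max_accel=2.0, max_decel=3.0):
    """Stand-in for the vehicle-dynamics inversion: linear map to actuators."""
    if desired_accel >= 0:
        return Actuation(throttle=min(1.0, desired_accel / max_accel), brake=0.0)
    return Actuation(throttle=0.0, brake=min(1.0, -desired_accel / max_decel))

def acc_step(gap, rel_speed, host_speed):
    """One control cycle: upper level picks acceleration, lower level actuates."""
    return lower_level(upper_level(gap, rel_speed, host_speed))
```

Keeping the two levels separate means the learning problem only involves the desired acceleration, while the mapping to throttle or brake stays in a vehicle-specific layer.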
Other Identifier: 201218014628016
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/6689
Collection: Graduates_Doctoral Dissertations
Recommended Citation
GB/T 7714
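Wang Bin. Vehicle Intelligent Cruise Control Based on Supervised Adaptive Dynamic Programming [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2015.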
Files in This Item:
File Name/Size: CASIA_20121801462801 (2069KB)
Access: Not yet open
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.