Vehicle Intelligent Cruise Control Based on Supervised Adaptive Dynamic Programming


Title	基于监督式自适应动态规划的车辆智能巡航控制
Alternative title	Vehicle Intelligent Cruise Control Based on Supervised Adaptive Dynamic Programming
Author	Wang Bin (王滨)
Date	2015-05-26
Degree type	Doctor of Engineering
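Chinese abstract	A vehicle's intelligent cruise control system uses sensors to measure, in real time, the distance and relative speed between the host vehicle and the preceding vehicle, computes a suitable throttle or brake command, and adjusts it automatically to control the host vehicle's speed or spacing. Such systems are important for improving driving safety, comfort, and energy efficiency. The intelligent cruise control problem can be viewed as an optimal tracking control problem for a class of nonlinear systems with uncertainty in the internal dynamics model and disturbances from the external environment, and its complexity has attracted wide attention from researchers. Current intelligent cruise control systems, however, still have open problems such as adaptability to the environment and to driver habits. On the other hand, adaptive dynamic programming (ADP) has been widely studied and applied in robotics, intelligent transportation, and other fields because of its adaptive learning ability and near-optimal control performance, yet the learning efficiency of traditional ADP methods remains to be improved. This thesis therefore studies the intelligent cruise control problem using ADP, proposes the theory and methods of supervised adaptive dynamic programming, which effectively improve the efficiency of ADP algorithms, and applies them to the design of intelligent cruise control systems. The main content and contributions are as follows: 1. A hierarchical adaptive cruise control (ACC) system is designed and implemented with the dSPACE vehicle driving simulation system. The upper-level controller uses the supervised ADP method proposed in this thesis to learn the desired acceleration; the lower-level controller uses the vehicle dynamics model provided by the dSPACE vehicle driving simulation system to convert the desired acceleration into the required throttle opening or brake pressure. Several typical driving scenarios are designed in the dSPACE vehicle driving simulation system to emulate real driving environments and to verify the effectiveness and superiority of the algorithms proposed in the thesis. 2. Based on ADP, the optimal control problem for a class of nonlinear discrete-time systems with parameter uncertainty and state time delay is studied, and an iterative DHP (Dual Heuristic Programming) algorithm is proposed; the method is proven theoretically to converge to the optimal cost function and the optimal control policy. The iterative DHP algorithm uses three feed-forward neural networks to build the model network, the critic network, and the action network, which approximate the controlled plant, the partial derivative of the cost function, and the control policy, respectively, and it obtains a near-optimal control policy through offline training. Finally, the iterative DHP method is applied to an ACC model to realize an ACC acceleration controller, and simulation experiments verify the effectiveness of the iterative DHP algorithm. 3. Building on ADP, a supervisory controller is introduced and a supervised ADP (Supervised Adaptive Dynamic Programming, SADP) learning method is proposed. During learning, the Actor is first pre-trained with a nominal controller provided by the supervisor to obtain a basically feasible control policy, and exploration noise is then added to its output to form the control input to the system. The Critic adjusts its own parameters by evaluating the system state and control action, and thereby improves the control policy. To implement the SADP algorithm, feed-forward neural networks with a single hidden layer are used to approximate the Actor and the Critic, and the Lyapunov method is used to prove the stability of the proposed SADP algorithm, that is, the weight estimation errors of the Actor and Critic neural networks are uniformly ultimately bounded. Finally, the proposed SADP method is applied to the ACC problem, and a hierarchical SADP-based ACC system is designed and implemented with the dSPACE vehicle driving simulation system. To verify the effectiveness of the designed ACC system, the dSPACE vehicle driving simulation system is used to emulate real driving environments, and the controller obtained through learning is tested for control performance in various typical driving scenarios. In addition...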
English abstract	Intelligent cruise control systems use sensors to detect the relative distance and speed between the preceding vehicle and the host vehicle, calculate the required throttle angle or brake pressure from these measurements, and automatically regulate the speed or maintain the desired distance. Such systems are critical to improving driving safety, comfort, and energy efficiency. The intelligent cruise control problem can be viewed as an optimal tracking control problem for a class of nonlinear systems subject to uncertainty in the system dynamics and disturbances from the environment. Although extensive research has been carried out and many results have been obtained, much work remains to be done, for example on adaptability to drivers and to complex driving environments. On the other hand, owing to its adaptive learning ability and near-optimal control performance, adaptive dynamic programming has been widely investigated and applied in robotics and intelligent transportation systems. However, low learning efficiency is the main limitation of traditional adaptive dynamic programming approaches. For these reasons, this thesis investigates the intelligent cruise control problem with adaptive dynamic programming methods. Supervised adaptive dynamic programming theory and algorithms are proposed, which greatly improve the learning efficiency, and the presented algorithms are used to design the intelligent cruise control system. The main contributions are as follows. 1. A hierarchical ACC system is designed in the dSPACE simulation system. The upper-level controller learns the desired acceleration and is implemented with the proposed supervised adaptive dynamic programming methods. The lower-level controller provides the required throttle angle or braking pressure through the vehicle dynamics. Typical driving scenarios are designed in the dSPACE simulator to verify the effectiveness and superiority of the algorithms proposed in the thesis. 2. Based on adaptive dynamic programming, an iterative dual heuristic programming (DHP) algorithm is proposed to deal with the optimal control problem for a class of nonlinear discrete-time systems with parameter uncertainty and state time delay. A detailed convergence proof is given to demonstrate that this algorithm converges to the optimal cost function and the optimal control policy. Three feed-forward neural networks are used to build the Model network, the Critic network, and the Action network, whic...
Keywords	Adaptive Cruise Control; Adaptive Dynamic Programming; Supervised Reinforcement Learning; Intelligent Control; dSPACE
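Language	Chinese
Document type	Thesis
Identifier	http://ir.ia.ac.cn/handle/173211/6689
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Wang Bin. Vehicle Intelligent Cruise Control Based on Supervised Adaptive Dynamic Programming [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2015.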

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20121801462801 (2069KB)			Not yet open	CC BY-NC-SA
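
The abstract describes the first stage of SADP only in words: the Actor is pre-trained on a nominal controller supplied by the supervisor, and exploration noise is then added to its output. The following is a minimal Python sketch of that stage alone, not the thesis's implementation. Everything specific in it is an assumption made for illustration: the two normalized inputs (spacing error and relative speed), the linear `nominal_controller` and its gains, the hidden-layer size, the learning rate, and the noise level. The Critic update and the Lyapunov analysis are not shown.

```python
import math
import random


def nominal_controller(dist_err, rel_speed, k_d=0.2, k_v=0.5):
    """Hypothetical supervisor: linear feedback on spacing error and relative speed."""
    return k_d * dist_err + k_v * rel_speed


class Actor:
    """Feed-forward network with one tanh hidden layer and a linear output."""

    def __init__(self, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return the control output and the hidden activations for state x."""
        h = [math.tanh(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(self.w1, self.b1)]
        y = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return y, h

    def train_step(self, x, target, lr=0.05):
        """One stochastic gradient step on the loss 0.5 * (output - target)^2."""
        y, h = self.forward(x)
        err = y - target
        for j, hj in enumerate(h):
            # Back-propagate through tanh using the output weight before its update.
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.w1[j][0] -= lr * grad_h * x[0]
            self.w1[j][1] -= lr * grad_h * x[1]
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err
        return 0.5 * err * err


def pretrain(actor, n_samples=200, epochs=200, seed=1):
    """Fit the actor to the supervisor's output on randomly sampled normalized states."""
    rng = random.Random(seed)
    data = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n_samples)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            total += actor.train_step(x, nominal_controller(*x))
        losses.append(total / n_samples)  # mean loss over this epoch
    return losses


def explore(actor, x, sigma=0.05, rng=random):
    """Control applied to the plant: actor output plus Gaussian exploration noise."""
    u, _ = actor.forward(x)
    return u + rng.gauss(0.0, sigma)


actor = Actor()
losses = pretrain(actor)
```

After pre-training, `explore(actor, (dist_err, rel_speed))` returns the noisy acceleration command that would be fed to the plant while a Critic evaluates the resulting state and action. A linear supervisor is used here only because it keeps the example short; any controller that returns a feasible command could play the same role.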