Human-like Vehicle Adaptive Cruise Control Based on SADP

	Human-like Vehicle Adaptive Cruise Control Based on SADP
Alternative title	Human-like Adaptive Cruise Control Based on Supervised Adaptive Dynamic Programming
	胡朝辉
	2011-05-24
Degree type	Master of Engineering
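Chinese abstract	Vehicle safety and driver-assistance systems are attracting growing attention. Among them, adaptive cruise control (ACC), a driver-assistance system, has been widely studied. ACC controls longitudinal speed and distance according to the driving environment detected by sensors, freeing the driver from frequent acceleration and deceleration and greatly reducing the driving workload. However, traditional controllers can hardly achieve human-like control and perform poorly in complex driving environments, so a simple, practical, and effective controller is urgently needed. Meanwhile, reinforcement learning (RL), an emerging machine learning approach, has attracted great interest from researchers because of its self-learning property. Supervised reinforcement learning (SRL) combines the teacher guidance of supervised learning (SL) with the self-learning of RL, which makes it well suited to the human-like ACC problem. Adaptive dynamic programming (ADP) is an advanced RL method that uses neural networks to approximate the performance index function of dynamic programming, fundamentally addressing the curse of dimensionality in RL. This thesis analyzes the driving modes of the ACC problem and builds an upper-level control model for vehicle driving. Building on the RL, SRL, and ADP algorithms, it proposes the supervised adaptive dynamic programming (SADP) algorithm. SRL, SADP, and hybrid PID controllers are designed for the various driving modes and evaluated in simulation. The results show that the SADP controller outperforms the SRL controller and is slightly better than the hybrid PID controller. In speed and distance control, the SADP and hybrid PID controllers perform roughly equally, but the SADP control curves are smoother; in acceleration, the SADP controller yields smaller and smoother values than the hybrid PID controller; in time to collision, the two controllers perform comparably. It can therefore be concluded that SADP offers good control accuracy and generalization, and performs well in human-like ACC systems.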
English abstract	Driver-assistance systems have been extensively researched and implemented to increase driving safety. Many applications derive from this concept, such as the Adaptive Cruise Control (ACC) system, which is now used in some automobiles for safer driving. In an ACC system, radars or other sensors detect the distance and relative speed to the target vehicle, and the control system helps the driver keep a safe distance and relative speed. ACC not only frees the driver from frequent speeding up and slowing down but also reduces the driver's mental stress. However, traditional control strategies can hardly fulfill the human-like control requirement and may perform unsatisfactorily in complex environments. As a result, developing a simple, practical, and effective controller is an urgent need. Meanwhile, reinforcement learning (RL) has attracted broad attention in the research community because of its self-learning property. Supervised reinforcement learning (SRL) has the merits of both RL and supervised learning (SL), which makes it well suited to the ACC problem with a human-like control requirement. Adaptive dynamic programming (ADP) can be regarded as a higher-level form of RL. It uses a neural network to approximate the state value that dynamic programming would obtain through iteration, which addresses the curse of dimensionality. In this thesis, we analyze the driving modes, construct the upper-level control model for ACC, and introduce RL, SRL, and ADP. We then propose supervised adaptive dynamic programming (SADP), based on ADP and SL. We design SRL, SADP, and hybrid PID controllers to realize human-like control and apply them to different simulation scenarios. The results show that the SADP controller outperforms the SRL controller. The SADP controller also slightly outperforms the hybrid PID controller, because its speed and distance control are smoother. The acceleration under the SADP controller is smaller and smoother than under the hybrid PID controller, and the time to collision (TTC) under the SADP controller is as good as under the hybrid PID controller. It can be concluded that the SADP controller is not only robust but also sufficiently accurate in its control performance. The SADP controller provides a feasible and effective approach for human-like ACC systems.
Keywords	Adaptive Cruise Control; Reinforcement Learning; Supervised Learning; Supervised Reinforcement Learning; Adaptive Dynamic Programming; Supervised Adaptive Dynamic Programming
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/7585
Collection	Graduates_Master's Theses
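Recommended citation (GB/T 7714)	胡朝辉. Human-like Vehicle Adaptive Cruise Control Based on SADP[D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2011.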

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20082801462800 (979KB)			Not yet open	CC BY-NC-SA