Adaptive Dynamic Programming for Optimal Control of Nonlinear Systems with Applications

Title	非线性系统最优控制的自适应动态规划方法及应用
Other title	Adaptive Dynamic Programming for Optimal Control of Nonlinear Systems with Applications
Author	Huang Yuzhu (黄玉柱)
Date	2013-05-22
Degree type	Doctor of Engineering
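Abstract (translated from Chinese)	Control engineering practice involves a large number of optimal control problems for nonlinear systems. Adaptive dynamic programming (ADP), a new method for approximately solving nonlinear optimal control problems, combines ideas from neural networks, dynamic programming, and reinforcement learning. It effectively overcomes the “curse of dimensionality” and yields an approximately optimal closed-loop feedback control law, so it can be used for optimal control design of complex production processes. However, the ADP framework is not yet complete, and many theoretical and technical problems in applying ADP to the optimal control of nonlinear systems remain open. Based on ADP, this dissertation studies the optimal control of nonlinear systems with unknown models and external disturbances, and successfully applies ADP to process control in coal chemical production, achieving good control performance. The main work and contributions fall into four parts. 1. For discrete-time nonlinear systems with completely unknown models, a neural network optimal controller is designed using the greedy iterative heuristic dynamic programming (HDP) method, achieving optimal tracking control. First, a neural network identifier is constructed to learn the dynamics of the unknown system, and a feedforward controller is built to generate the steady-state tracking control. Then, via a system transformation, the optimal tracking control problem is converted into an optimal error regulator design problem; the greedy iterative HDP algorithm is introduced to obtain the optimal feedback control law, and the convergence of the iterative algorithm is proved. Finally, simulation results show that the neural network controller based on greedy iterative HDP is effective. 2. For discrete-time nonlinear systems subject to external disturbances, an H-infinity tracking controller based on the generalized Hamilton-Jacobi-Isaacs (GHJI) equation and neural networks is designed, solving the tracking control problem under external disturbances. First, via a system transformation, the tracking control problem is converted into an error regulator design problem. Then, the GHJI equation of the system is established, an approximately optimal iterative algorithm is designed to solve for the optimal control policy, and both the convergence of the iterative algorithm and the stability of the control system are proved. Finally, the H-infinity optimal tracking controller is implemented with neural networks, and simulation experiments show that the method is feasible and effective. 3. For unknown continuous-time nonlinear systems, a neural network controller is designed using ADP on the basis of a neural network observer, solving the optimal control problem for continuous-time nonlinear systems. First, a neural network observer is constructed to obtain the system state information, and a proof of the observer's stability is given. Then, the optimal controller is built using ADP, and the stability of the whole closed-loop system, including the neural network observer and the controller, is analyzed and proved. Finally, simulation results show that the method effectively solves the optimal control problem for unknown continuous-time nonlinear systems. 4. The application of ADP to temperature control of the shift unit in coal-to-methanol production is studied. First, mechanistic modeling and neural network modeling are each used to model the system. Then, by defining evaluation functions for key process parameters, the mechanistic model and the neural network model are combined, so that the neural network model can be adjusted in real time according to operating conditions and more closely matches the dynamics of the actual system. After modeling, a neural network temperature controller is built using the dual heuristic dynamic programming method. Finally, simulation results show that the temperature controller based on dual heuristic dynamic programming is feasible and achieves optimal control of the catalyst bed temperature in the shift converter.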
Abstract (English)	As is well known, many real-world applications call for nonlinear optimal controller designs. Adaptive dynamic programming (ADP), a new paradigm for approximately solving the optimal control problem of nonlinear systems, has therefore attracted considerable research attention. Combined with neural networks, ADP can effectively avoid the “curse of dimensionality” while obtaining an approximately optimal closed-loop feedback control law. However, the ADP framework is far from complete, and many theoretical and technical issues in ADP-based optimal control of nonlinear systems have yet to be addressed. In this dissertation, ADP is used to investigate optimal tracking control problems for nonlinear systems with external disturbances and unknown mathematical models. Furthermore, a neural network (NN) based ADP approach is applied to the optimal temperature control problem of the water-gas shift (WGS) process. The main contributions of the dissertation are as follows: 1. A novel optimal tracking controller design scheme is proposed for a class of unknown discrete-time (DT) nonlinear systems using the greedy iterative heuristic dynamic programming (HDP) algorithm. First, to obtain the dynamics of the nonlinear system, an identifier is constructed with a three-layer feedforward NN. Second, a feedforward controller is designed to obtain the steady-state control input of the system. Third, via system transformation, the original tracking problem is transformed into an optimal regulation problem with respect to the state tracking error. Then, the greedy iterative HDP algorithm is introduced to solve the regulation problem, with convergence analysis. Finally, simulation results are presented to demonstrate the effectiveness of the proposed scheme. 2. In the presence of external disturbance, a nearly H∞ optimal tracking control scheme based on the generalized Hamilton-Jacobi-Isaacs (GHJI) equation is developed for affine DT nonlinear systems. First, via system transformation, the original tracking problem is transformed into an optimal regulation problem with respect to the state tracking error. Second, for the converted regulation problem, the corresponding GHJI equation is formulated, and then the L2-gain analysis of the closed-loop nonlinear system is employed. Third, a nearly optimal iterative algorithm based on the game theoretic interpretation of ...
Keywords	Adaptive Dynamic Programming; Optimal Control; H-infinity Control; Neural Network Observer; Nonlinear System
Language	Chinese
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/6509
Collection	Graduates_Doctoral Dissertations
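Recommended citation (GB/T 7714)	Huang Yuzhu. Adaptive Dynamic Programming for Optimal Control of Nonlinear Systems with Applications[D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2013.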
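
As a rough illustration of the greedy iterative HDP idea named in the abstract, the sketch below runs the value update V_{i+1}(x) = min_u { U(x, u) + V_i(x_{k+1}) } from V_0 = 0 on a scalar linear-quadratic problem, where the update collapses to a Riccati recursion. The system parameters (a, b), the cost weights (q, r), and the function name hdp_iterate are assumptions made for illustration only; the dissertation targets unknown nonlinear systems with neural network identifiers and approximators, which this sketch does not attempt to reproduce.

```python
# Greedy iterative HDP (value iteration) on a scalar linear-quadratic problem.
# Hypothetical setup, not taken from the dissertation:
#   dynamics   x_{k+1} = a*x_k + b*u_k
#   stage cost U(x, u) = q*x^2 + r*u^2
# With V_i(x) = p_i * x^2, the HDP update
#   V_{i+1}(x) = min_u { U(x, u) + V_i(a*x + b*u) },  V_0 = 0
# becomes a scalar recursion in p_i.

def hdp_iterate(a, b, q, r, iterations=200):
    """Return (p, k, history): value coefficient, feedback gain, and all p_i."""
    p = 0.0                      # V_0(x) = 0
    history = [p]
    for _ in range(iterations):
        # Greedy policy for the current value function: u = -k*x minimizes
        # q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u.
        k = a * b * p / (r + b * b * p)
        # Value update: cost of one greedy step plus the old cost-to-go.
        p = q + r * k * k + p * (a - b * k) ** 2
        history.append(p)
    # Feedback gain that is greedy with respect to the final value function.
    k = a * b * p / (r + b * b * p)
    return p, k, history


if __name__ == "__main__":
    # Open-loop unstable example (|a| > 1), chosen arbitrarily.
    p, k, history = hdp_iterate(a=1.2, b=1.0, q=1.0, r=1.0)
    print("value coefficient p:", p)
    print("feedback gain k:", k)
    print("closed-loop pole a - b*k:", 1.2 - 1.0 * k)
```

In this special case the sequence p_i is nondecreasing and settles at the solution of the scalar discrete-time algebraic Riccati equation, which mirrors the kind of convergence property the abstract says is proved for the iterative algorithm in the general nonlinear setting.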

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20101801462800 (2297KB)			Not yet open	CC BY-NC-SA