With the rapid development of science and technology, optimization problems for dynamic systems have attracted increasing attention, and optimal control theory has gained widespread popularity owing to its effectiveness. Adaptive dynamic programming (ADP) is regarded as one of the most effective techniques for solving optimal control problems, since it effectively overcomes the ``curse of dimensionality'' incurred by traditional dynamic programming methods. The core idea of ADP is to approximate the iterative value functions and iterative control laws with function approximation structures, such as neural networks. However, many theoretical and technical difficulties remain to be overcome when ADP is applied to distributed iterative control problems. Therefore, this thesis studies distributed control problems based on ADP. The main contributions of this thesis comprise the following five parts.
1. A distributed policy iteration method is presented for infinite-horizon optimal control problems of multicontroller nonlinear systems. In each iteration of the presented method, only one iterative control law is updated instead of all of them, which effectively reduces the computational burden. The properties of the distributed policy iteration method, such as monotonicity, convergence, and optimality, are analyzed, showing that the iterative value function converges non-increasingly to the solution of the Hamilton-Jacobi-Bellman (HJB) equation; one update round is sketched below.
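To make the update pattern concrete, consider as a minimal sketch an affine multicontroller system $\dot{x}=f(x)+\sum_{j=1}^{m}g_j(x)u_j$ with utility $U(x,u_1,\dots,u_m)=x^{\top}Qx+\sum_{j=1}^{m}u_j^{\top}R_ju_j$; this system form and utility are illustrative assumptions rather than the exact formulation of the thesis. In the $k$th iteration, with $i$ the index of the controller selected for update, the method evaluates the current joint policy and improves only the $i$th control law:
\begin{align}
0 &= U\bigl(x,u_1^{(k)},\dots,u_m^{(k)}\bigr)
   + \bigl(\nabla V_k(x)\bigr)^{\top}\Bigl(f(x)+\sum_{j=1}^{m}g_j(x)u_j^{(k)}\Bigr),\\
u_i^{(k+1)}(x) &= -\tfrac{1}{2}R_i^{-1}g_i^{\top}(x)\nabla V_k(x),
\qquad u_j^{(k+1)}=u_j^{(k)}\ \ (j\neq i).
\end{align}
Because only one control law is improved per iteration while the other $m-1$ laws are held fixed, the per-iteration computation is lighter than updating all $m$ laws simultaneously.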
2. A distributed fault-tolerant control method is presented for fault-tolerant control problems of multicontroller linear systems. Based on the distributed policy iteration method and a fault compensation method, optimal control is achieved with only partial system information, and the effects of actuator faults are eliminated. The properties of the distributed fault-tolerant control method, such as stability and optimality, are analyzed, showing that the presented method guarantees the stability of the controlled systems while reducing the computational burden.
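For intuition, one common additive actuator-fault model for such systems (the specific structure below is an assumption for illustration, not necessarily the fault model adopted in the thesis) is
\begin{equation}
\dot{x}=Ax+\sum_{j=1}^{m}B_j\bigl(u_j+f_j\bigr),
\end{equation}
where $f_j$ is the fault acting on the $j$th actuator. Given a fault estimate $\hat{f}_j$ produced by the compensation mechanism, the $j$th controller applies the compensated law $u_j=u_j^{\ast}-\hat{f}_j$, with $u_j^{\ast}$ the optimal law obtained by distributed policy iteration, so that the fault effect is canceled and the nominal optimal behavior is recovered.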
3. A distributed value iteration method is presented for optimal control problems of control systems with finite states. By decomposing the control system into several subsystems, a modi-matrix method is presented to calculate the iterative value function of each subsystem, which converts the Bellman equation into a linear recursive matrix equation in the Bellman semiring. Then, a novel distributed value iteration method is established that iteratively updates the iterative value function of one subsystem at a time, instead of all the iterative value functions, which effectively reduces the computational burden. The properties of the distributed value iteration method, such as monotonicity, convergence, and optimality, are analyzed, showing that the iterative value function converges non-decreasingly to the optimal performance index function in a finite number of iterations, as the sketch below illustrates.
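As a minimal sketch of value iteration in the Bellman (min-plus) semiring, where ``addition'' is $\min$ and ``multiplication'' is $+$, consider a single finite-state subsystem with a transition-cost matrix $C$; the concrete matrix, the NumPy implementation, and the single-subsystem setting are illustrative assumptions, not the thesis's modi-matrix construction.
\begin{verbatim}
import numpy as np

INF = np.inf

# One-step transition costs C[s, s']; INF marks a forbidden transition.
# The 4-state chain below is purely illustrative.
C = np.array([
    [INF, 1.0, 4.0, INF],
    [INF, INF, 2.0, 6.0],
    [INF, INF, INF, 3.0],
    [INF, INF, INF, 0.0],  # state 3: absorbing goal, zero-cost self-loop
])

def min_plus(C, V):
    """Min-plus matrix-vector product: result[s] = min_s' (C[s, s'] + V[s'])."""
    return np.min(C + V[None, :], axis=1)

V = np.zeros(C.shape[0])           # V_0 = 0, so the iterates are non-decreasing
for k in range(C.shape[0] + 1):    # finitely many states: fixed point in finite steps
    V_next = min_plus(C, V)
    if np.array_equal(V_next, V):  # fixed point reached: V solves Bellman's equation
        break
    V = V_next

print(V)                           # optimal cost-to-go, here [6. 5. 3. 0.]
\end{verbatim}
Each sweep is one min-plus matrix-vector product, i.e., the linear recursive matrix equation in the Bellman semiring, and starting from $V_0=0$ the iterates increase monotonically until they reach the optimal cost-to-go after finitely many sweeps.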
4. A data-driven distributed control method is presented for optimal output cluster synchronization control problems of heterogeneous multi-agent systems with external disturbances. A novel distributed adaptive observer is introduced to estimate the state and system matrices of each leader. To achieve output tracking control and disturbance rejection, the output cluster synchronization control problem is transformed into an output regulation problem, and a reinforcement learning method is introduced to obtain the optimal control laws and optimal performance index functions. The stability, convergence, and optimality are analyzed, demonstrating the effectiveness of the presented data-driven distributed control method.
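For intuition, with a representative leader of unknown dynamics $\dot{v}=Sv$, a distributed adaptive observer of the following typical form can serve as a sketch (the gains $\mu_1,\mu_2$ and the coupling structure are illustrative assumptions, not necessarily the observer designed in the thesis):
\begin{align}
\dot{\hat{S}}_i &= \mu_1\sum_{j\in\mathcal{N}_i}a_{ij}\bigl(\hat{S}_j-\hat{S}_i\bigr),\\
\dot{\hat{v}}_i &= \hat{S}_i\hat{v}_i+\mu_2\sum_{j\in\mathcal{N}_i}a_{ij}\bigl(\hat{v}_j-\hat{v}_i\bigr),
\end{align}
where $\mathcal{N}_i$ is the neighbor set of agent $i$ (including the leader for the informed agents), so that $\hat{S}_i\to S$ and $\hat{v}_i\to v$. The converged estimates then enter the output regulation equations, and reinforcement learning computes the optimal control laws from measured data.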
5. A data-driven distributed control method is presented for multi-agent systems with input saturation. Multi-agent game theory is introduced to transform the optimal synchronization control problem into a multi-agent nonzero-sum game. A novel off-policy reinforcement learning method is presented to obtain the Nash equilibrium solution, and neural networks are introduced to implement the presented method. The convergence and optimality are analyzed, showing that the iterative control laws converge to the Nash equilibrium.
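As one standard way to handle an input bound $\lambda$ (this nonquadratic performance term is an illustrative assumption, not necessarily the exact index used in the thesis), agent $i$'s control effort can be penalized by
\begin{equation}
W_i(u_i)=2\int_{0}^{u_i}\bigl(\lambda\tanh^{-1}(v/\lambda)\bigr)^{\top}R_i\,\mathrm{d}v,
\end{equation}
which yields bounded control laws of the form $u_i=-\lambda\tanh\bigl(\tfrac{1}{2\lambda}R_i^{-1}g_i^{\top}\nabla V_i\bigr)$ that never violate the saturation limits. At the Nash equilibrium $(u_1^{\ast},\dots,u_N^{\ast})$, no agent can improve its own performance index by unilaterally deviating, i.e., $J_i(u_i^{\ast},u_{-i}^{\ast})\leq J_i(u_i,u_{-i}^{\ast})$ for every admissible $u_i$.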