In recent years, robots have been widely used in military and civil fields and have shown great application value. As operating environments become more complex and mission scenarios more diverse, cooperative multi-robot formation has become a research hotspot in robotics because of its advantages in range, safety, and efficiency. As one of the essential technologies of multi-robot formation, cooperative formation planning and control has attracted continuous attention from researchers in many fields. However, most existing research focuses on cooperative control and formation maintenance. Although some results have been achieved in cooperative path planning for formations, several typical scenarios and problems remain unsolved. Moreover, deep reinforcement learning (DRL) methods, which have emerged in recent years, offer an alternative approach to cooperative path planning for multi-robot formations, yet little related work exists so far. This dissertation therefore studies the path planning problem of multi-robot cooperative formation. To achieve stable cooperative formation in two typical scenarios, the problems of formation generation and transformation and of formation maintenance and cooperative collision avoidance are modeled and solved. The main work and innovations of this dissertation are summarized as follows:
(1) To address formation generation and transformation for multi-robot systems, a novel optimal transformation strategy is proposed based on the particle swarm optimization (PSO) algorithm and the Hungarian algorithm. The strategy uses an inner loop to solve the matching between individual robots and positions in the formation, and an outer loop to optimize the offset between formations. Acting jointly, the two loops generate the expected shortest global collision-free paths. On this basis, a formation path planning algorithm based on the limited artificial potential field method is designed to achieve safe, collision-free path planning for non-particle robot models.
(2) To address multi-robot formation maintenance and cooperative collision avoidance, DRL-based path planning methods are studied. For image input data, a parallel double-Q network structure is implemented and a cooperative reward mechanism is designed, enabling cooperative path planning among multiple agents and completing the task of constrained formation maintenance. For entity-state input data, a DRL-based formation maintenance and cooperative collision avoidance method is proposed, in which the problem is modeled as a Markov decision process (MDP) with a comprehensive reward function. The behavior policy of each robot is trained through a deep value network to achieve formation maintenance and cooperative collision avoidance in a dynamic environment. Compared with existing methods, the proposed method shows significant improvement in success rate and safety.
(3) To address the slow convergence and low exploration efficiency of model-free reinforcement learning (RL) methods during training, formation maintenance and cooperative collision avoidance methods that fuse model knowledge with data-driven training are studied. A model-guided formation maintenance and cooperative collision avoidance method is proposed, in which a switching system based on consensus theory and a multi-agent cooperative collision avoidance method is designed. This system serves as the demonstrator for imitation learning before RL, providing an effective initial policy and improving the efficiency of subsequent training. In addition, an action space filter based on the concept of the velocity obstacle (VO) is designed, which reduces the exploration of useless actions in reinforcement learning and improves the safety and training efficiency of the original method. Finally, the effectiveness of the proposed methods is verified by comparative experiments.
(4) To address practical difficulties of large-scale multi-robot formation, such as the difficulty of testing and the demanding experimental conditions, a software-in-the-loop (SITL) simulation platform and a ground unmanned vehicle swarm platform are built for typical multi-robot formation scenarios, enabling rapid demonstration and verification of the algorithms. In the unmanned vehicle swarm platform, a cross-platform planning and control system is designed to support arbitrary communication networks within the swarm, and a virtual Global Positioning System (GPS) method based on ultra-wideband (UWB) indoor positioning is designed. Finally, the effectiveness of the algorithms proposed in this dissertation is verified on these platforms.
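The two-loop strategy in item (1) can be illustrated with a minimal sketch. It is not the dissertation's implementation. It assumes a 2-D workspace and uses total straight-line travel distance as the cost, ignoring collisions. A brute-force search over permutations stands in for the Hungarian algorithm, which is practical only for small teams. The function names `match_cost` and `pso_offset`, the PSO weights, and the search range are hypothetical choices made for illustration.

```python
import itertools
import math
import random


def match_cost(starts, shape, offset):
    """Inner loop: best robot-to-slot matching for a fixed formation offset.

    Brute force over permutations stands in for the Hungarian algorithm;
    both return the minimum-cost matching, but this one only scales to
    small teams.
    """
    goals = [(x + offset[0], y + offset[1]) for x, y in shape]
    best_cost, best_perm = math.inf, None
    for perm in itertools.permutations(range(len(goals))):
        cost = sum(math.dist(starts[i], goals[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


def pso_offset(starts, shape, n_particles=20, n_iters=60, seed=0):
    """Outer loop: a basic PSO over the 2-D offset of the target formation."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed)
    # Particle 0 starts at zero offset, so the result is never worse than that.
    pos = [(0.0, 0.0)] + [
        (rng.uniform(-10, 10), rng.uniform(-10, 10))
        for _ in range(n_particles - 1)
    ]
    vel = [(0.0, 0.0)] * n_particles
    pbest = list(pos)
    pbest_cost = [match_cost(starts, shape, p)[0] for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g], pbest_cost[g]
    for _ in range(n_iters):
        for k in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Standard velocity update: inertia + personal pull + global pull.
            vel[k] = tuple(
                w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
                for v, x, pb, gb in zip(vel[k], pos[k], pbest[k], gbest)
            )
            pos[k] = tuple(x + v for x, v in zip(pos[k], vel[k]))
            cost = match_cost(starts, shape, pos[k])[0]
            if cost < pbest_cost[k]:
                pbest[k], pbest_cost[k] = pos[k], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[k], cost
    # Return the best offset, its cost, and the matching found at that offset.
    return gbest, gbest_cost, match_cost(starts, shape, gbest)[1]
```

Calling `pso_offset(starts, shape)` with a list of robot positions and a list of formation slots returns the best offset found, its total travel distance, and the robot-to-slot matching as a tuple of slot indices.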
In summary, starting from the two typical scenarios of formation generation and transformation and of formation maintenance and cooperative collision avoidance, this dissertation studies in depth the optimal formation transformation strategy and the DRL-based formation maintenance and cooperative collision avoidance methods, together with their optimization. On this basis, a SITL simulation system and a hardware platform for these scenarios are built. The research results are of both theoretical and practical value.
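As an illustration of the VO-based action space filter mentioned in item (3), the following minimal sketch discards candidate velocities that would bring a robot within a safety radius of a neighbor inside a finite time horizon. It is not the dissertation's implementation. It assumes disc-shaped robots, neighbors moving at constant velocity, and a finite-horizon variant of the velocity obstacle. The function names and default parameters are hypothetical.

```python
import math


def in_velocity_obstacle(p_rel, v_rel, radius, horizon):
    """Return True if relative velocity v_rel leads to a collision.

    p_rel is the neighbor's position relative to the robot, v_rel is the
    robot's velocity minus the neighbor's velocity, and radius is the sum
    of both safety radii. A collision means the relative distance drops
    below radius at some time in [0, horizon].
    """
    vv = v_rel[0] ** 2 + v_rel[1] ** 2
    if vv == 0.0:
        t = 0.0  # no relative motion: distance stays constant
    else:
        # Time of closest approach, clamped to the planning horizon.
        t = (p_rel[0] * v_rel[0] + p_rel[1] * v_rel[1]) / vv
        t = max(0.0, min(horizon, t))
    dx = p_rel[0] - v_rel[0] * t
    dy = p_rel[1] - v_rel[1] * t
    return math.hypot(dx, dy) < radius


def filter_actions(actions, pos, neighbors, radius=1.0, horizon=5.0):
    """Keep only candidate velocities outside every neighbor's VO.

    neighbors is a list of (position, velocity) pairs. The result may be
    empty; handling that case is left to the caller.
    """
    safe = []
    for a in actions:
        blocked = any(
            in_velocity_obstacle(
                (q[0] - pos[0], q[1] - pos[1]),
                (a[0] - u[0], a[1] - u[1]),
                radius,
                horizon,
            )
            for q, u in neighbors
        )
        if not blocked:
            safe.append(a)
    return safe
```

In an RL training loop, a filter of this kind would be applied to the discrete action set before action selection, so that the agent explores only velocities that are not predicted to cause a collision.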