Research on the Application of Deep Reinforcement Learning to Tactical Decision-Making in Multi-Aircraft Combat


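	Research on the Application of Deep Reinforcement Learning to Tactical Decision-Making in Multi-Aircraft Combat
	Zhang Yesheng (张业胜) 1,2
	2018-05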
Degree type	Master of Engineering
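Chinese abstract (translated)	With the rapid development of unmanned aerial vehicles (UAVs) in countries around the world, UAVs have become an important component of military power that cannot be ignored. A UAV's level of intelligence is one of the key factors determining its overall performance. Deep learning has been one of the fastest-developing disciplines of the past 10 years, with successful applications in text, speech, images, and other fields. The introduction of deep reinforcement learning in particular has had a profound impact on artificial intelligence. This thesis studies the application of deep reinforcement learning to UAV combat, aiming to raise the level of intelligence of autonomous UAV air combat tactical decision-making. The main work comprises the following parts: 1. Deep reinforcement learning is proposed for tactical maneuver selection in 1-versus-1 air combat. The UAV air combat environment is defined, including the system state, the available maneuvers, and the air combat situation assessment; the reward function required for learning is designed according to the classic energy tactics and angle tactics; a deep learning model for applying deep reinforcement learning to air combat training is designed, and multi-level experiments are carried out with good results. 2. A feature-rich air combat simulation system is designed. Diverse human-machine interfaces are defined, enabling many kinds of air combat simulation; the large volume of data generated during simulation and training is organized, filtered, and stored persistently; the stored data is fully exploited for imitation learning, further optimizing the deep neural network and improving the effectiveness of autonomous UAV air combat tactical decision-making. 3. A tactical decision-making method for multi-aircraft formation combat is proposed. Four regions of differing threat are delineated according to the differing attack effectiveness of the UAV; a target assignment algorithm for multi-aircraft cooperation is designed; a typical 4-versus-2 formation air combat is simulated, verifying the method's effectiveness and raising the level of intelligence of multi-aircraft tactical decision-making.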
English abstract	With countries around the world developing UAVs and UCAVs, the UAV has become an important part of military forces. The degree of intelligence of a UAV is one of the key factors determining its overall performance. Deep learning is one of the fastest-developing fields of the past 10 years and has been applied successfully to text, speech, and images. The introduction of deep reinforcement learning in particular has had a far-reaching influence on the field of artificial intelligence. This thesis studies how deep reinforcement learning can be applied to UAV air combat in order to raise the level of intelligence of tactical decision-making in UAV air combat. The main work includes the following parts: 1. Deep reinforcement learning is applied to tactical maneuver selection in one-to-one air combat. The UAV air combat environment is defined, including the system state, the optional maneuvers, and the air combat situation assessment; the reward function needed for learning is designed according to the classic energy tactics and angle tactics; a deep learning model for applying deep reinforcement learning to air combat training is designed, and multi-level experiments are carried out with good results. 2. An air combat simulation system with rich features is designed. Versatile human-machine interfaces are defined, enabling a variety of air combat simulations; the large amount of data generated in air combat simulation and training is collected, screened, and persistently stored; the stored data is fully used for imitation learning, which further optimizes the deep neural network and improves autonomous air combat tactical decision-making. 3. A tactical decision-making method for multi-UCAV formation combat is proposed. According to the different UAV attack effects, the space is divided into 4 regions of different threat; a target assignment algorithm for multi-UCAV cooperation is designed; a typical 4-to-2 formation air combat simulation is carried out, verifying the effectiveness of the method and raising the level of intelligence of multi-aircraft tactical decision-making.
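Keywords	deep reinforcement learning; maneuver decision-making; tactical decision-making; air combat simulation; multi-aircraft cooperation
Subject area	Intelligent information systems
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/21193
Collection	Graduates_Master's theses
Author affiliations	1. Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Zhang Yesheng. Research on the Application of Deep Reinforcement Learning to Tactical Decision-Making in Multi-Aircraft Combat [D]. Beijing: University of Chinese Academy of Sciences, 2018.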

Files in this item
File name/size	Document type	Version type	Access type	License
深度强化学习在多机对战战术决策中的应用研（6414KB）			Restricted access	--