Research on Intelligent Quantitative Evaluation Methods for Game Confrontation


	牟佳
	2021-06
Pages	69
Degree type	Master's
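Abstract (Chinese)	Recent advances in artificial intelligence have opened new approaches to many problems, and game confrontation in the form of human-machine confrontation has drawn great attention. Some researchers take the perspective of the AI agent and study how to perceive the situation of the game environment effectively, improve the efficiency of search algorithms, and choose actions in response to the situation. Others take the perspective of the competition organizer and study how to mine the situation in the game environment and explore more diverse ways of organizing and managing competitions, making them more watchable and more professional. Cognition and evaluation of game confrontation environments has become a research direction with great potential in artificial intelligence; in military game confrontation in particular, cognition and evaluation are key to improving an agent's command and decision-making ability. To address efficient learning and the qualitative and quantitative evaluation of agents' tactical ability, this study carries out two lines of work: qualitative evaluation of combat style based on tactical-description data mining, and intelligent quantitative evaluation based on deep learning. Experiments on real wargame data and public datasets show that the proposed methods are effective. In the qualitative evaluation of combat style, single-operator frequent-item feature mining and multi-operator cooperative-action feature mining are used together to achieve two goals: mining important terrain features on the battlefield and mining agents' behavioral characteristics. For single-operator frequent-item feature mining, the study designs an action feature extraction scheme based on replay data, encodes the extracted action features in WTH (who-type-how) form, and uses the sequential frequent-item mining algorithm AprioriAll to extract the action habits and characteristics that agents exhibit in the replay data. For multi-operator cooperative-action feature mining, the study designs four types of cooperative actions based on land-warfare combined-arms tactics and OODA loop theory and labels tactical behavior under multi-operator cooperation, which supports training agents that do not rely on military knowledge using high-level tactical semantic features. For the mined behavior features, a linear model and a spectral clustering model are used for ability attribution and qualitative combat-style evaluation, respectively. In the deep-learning-based intelligent quantitative evaluation, the study proposes an evaluation framework based on the agent's decision-making mode, namely the environment-strategy interaction mode in which the strategy is adjusted dynamically according to feedback from the environment. The framework extracts features from the situation of the game environment, extracts several kinds of utility information to build dynamically reconfigurable multi-label data, uses deep learning to build an end-to-end intelligent quantitative evaluation model, and finally aggregates and analyzes the model's predictions to quantitatively evaluate an agent's overall performance during the game. The study further proposes SideInfo-STNet, a convolutional long short-term memory network with side-information fusion, based on multi-channel spatio-temporal encoding of environment information. The framework is applied to a tactical-level naval wargame system; the analysis indicates that the results are highly credible and yield heuristic conclusions. The evaluation carried out in this study is a comprehensive qualitative and quantitative assessment of agent performance based on the game confrontation process. Experiments on tactical-level wargames verify that the proposed schemes and framework work very well: they capture important information from large amounts of replay data and yield conclusions that are highly interpretable and quantitatively comparable, which effectively helps agents improve their decision-making and action capabilities. This is especially important for military game confrontation.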
Abstract (English)	Recent advances in artificial intelligence have brought new solutions to many problems, and game confrontation in the form of human-computer confrontation has received great attention. Some researchers take the perspective of the AI agent and study how to recognize the situation of the environment effectively, improve the efficiency of search algorithms, and take actions in response to the situation. Others take the perspective of the competition organizer and study how to mine the situation of the environment and explore diversified ways of organizing and managing competitions in order to make them more watchable and more professional. The cognition and evaluation of the game confrontation environment has become a direction with great potential in the field of artificial intelligence. In military game confrontation in particular, cognition and evaluation are key to improving the command and decision-making ability of the agent. To address efficient learning and the qualitative and quantitative evaluation of agents' tactical capabilities, this paper carries out qualitative evaluation of combat style based on tactical descriptive data mining and intelligent quantitative evaluation based on deep learning. Experiments on actual wargame data and public datasets show that the proposed methods are effective. In the research on qualitative evaluation of combat style based on tactical descriptive data mining, this paper uses single-operator frequent-item feature mining and multi-operator cooperative-action feature mining jointly to achieve two goals: mining important terrain features on the battlefield and mining agent behavior characteristics.
For single-operator frequent-item feature mining, this paper designs an action feature extraction scheme based on replay data, encodes the extracted action features in WTH (who-type-how) form, and selects the sequential frequent-item mining algorithm AprioriAll to extract the behavioral habits and characteristics that the agent shows in the replay data. For multi-operator cooperative-action feature mining, this paper designs four types of cooperative actions based on land-warfare combined-arms tactics and OODA loop theory, and labels the tactical behaviors under multi-operator coordination, which supports training agents that do not rely on military knowledge using high-level tactical semantic features. For the extracted behavior features, a linear model and a spectral clustering model are used to complete the agent's ability attribution and the qualitative evaluation of combat style, respectively. In the research on intelligent quantitative evaluation, this paper proposes an intelligent quantitative evaluation framework based on the decision-making mode of the agent, that is, the environment-strategy interaction mode in which the agent dynamically adjusts its strategy based on feedback from the environment. The framework extracts features of the environmental situation and extracts a variety of reward information to construct dynamic, reconfigurable multi-label data. It then uses deep learning to build an end-to-end intelligent quantitative evaluation model, and finally aggregates the model's predictions to quantitatively evaluate the comprehensive performance of the agent during the game confrontation. Furthermore, this paper proposes a novel Side Information fused Spatial-Temporal Network (SideInfo-STNet) based on multi-channel spatio-temporal encoding of environment information. The intelligent quantitative evaluation framework is applied to a tactical-level naval wargame system.
The analysis shows that the results are highly credible and that heuristic conclusions can be obtained. The intelligent quantitative evaluation for game confrontation carried out in this paper is a comprehensive qualitative and quantitative evaluation of agent performance based on the game confrontation process. Experiments on tactical-level wargames verify that the schemes and framework proposed in this paper are highly effective: they capture important information from a large amount of replay data and yield highly interpretable and quantitatively comparable conclusions, which can effectively help the agent improve its decision-making and action capabilities. This is particularly important for military game confrontation.
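The single-operator mining step described in the abstract (WTH encoding followed by sequential frequent-item mining) could be sketched roughly as below. This is a simplified Apriori-style level-wise search over ordered token sequences, not the thesis's AprioriAll implementation; the token layout, the function names `encode_wth`, `contains`, and `frequent_sequences`, and the sample actions are all assumptions made for illustration.

```python
def encode_wth(who, action_type, how):
    """Encode one action as a who-type-how token (field layout is assumed)."""
    return f"{who}-{action_type}-{how}"


def contains(sequence, pattern):
    """True if pattern occurs in sequence in order (gaps between tokens allowed)."""
    it = iter(sequence)
    # `token in it` advances the iterator, so order is enforced.
    return all(token in it for token in pattern)


def frequent_sequences(replays, min_support, max_len=3):
    """Level-wise search for ordered patterns whose support >= min_support.

    Support is the fraction of replays that contain the pattern.
    """
    n = len(replays)
    items = sorted({tok for seq in replays for tok in seq})
    frequent = {}
    level = [(tok,) for tok in items]
    for _ in range(max_len):
        counts = {p: sum(contains(seq, p) for seq in replays) for p in level}
        kept = {p: c / n for p, c in counts.items() if c / n >= min_support}
        if not kept:
            break
        frequent.update(kept)
        # Extend each surviving pattern by one frequent single token.
        singles = [p[0] for p in frequent if len(p) == 1]
        level = [p + (tok,) for p in kept for tok in singles]
    return frequent


# Hypothetical replay data: each replay is an ordered list of WTH tokens.
move = encode_wth("tank", "move", "road")
fire = encode_wth("tank", "fire", "direct")
hide = encode_wth("infantry", "move", "forest")
replays = [[move, fire, hide], [move, hide, fire], [move, fire]]
patterns = frequent_sequences(replays, min_support=0.6)
```

Here `patterns` maps each frequent ordered pattern to its support, so habits such as "move along the road, then fire" can be read off directly. A full AprioriAll implementation would additionally handle itemset-valued elements and maximal-sequence pruning, which this sketch omits.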
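Keywords	game confrontation; spatio-temporal characteristics; intelligent quantitative evaluation; deep learning; sequential frequent-item mining
Language	Chinese
Seven major directions: sub-direction classification	Complex system simulation and decision-making
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/44795
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems_Brain-Machine Fusion and Cognitive Evaluation
Recommended citation (GB/T 7714)	牟佳. 面向博弈对抗的智能量化评估方法研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2021.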

Files in this item
File name/size	Document type	Version type	Access type	License
毕业论文.pdf (4701 KB)	Thesis		Open access	CC BY-NC-SA