Empirical Policy Optimization for n-Player Markov Games
Yuanheng Zhu; Weifan Li; Mengchen Zhao; Jianye Hao; Dongbin Zhao
Journal: IEEE Transactions on Cybernetics
2022
DOI: 10.1109/TCYB.2022.3179775
Abstract

In single-agent Markov decision processes, an agent can optimize its policy based on its interaction with the environment. In multiplayer Markov games (MGs), however, the interaction is nonstationary because of the behaviors of other players, so the agent has no fixed optimization objective. The challenge becomes finding equilibrium policies for all players. In this research, we treat the evolution of player policies as a dynamical process and propose a novel learning scheme for finding a Nash equilibrium. The core idea is to evolve each player's policy according to not just its current in-game performance but an aggregation of its performance over history. We show that for a variety of MGs, players in our learning scheme provably converge to a point that approximates a Nash equilibrium. Combining the scheme with neural networks, we develop an empirical policy optimization algorithm, which is implemented in a reinforcement-learning framework and runs in a distributed way, with each player optimizing its policy based on its own observations. We use two numerical examples to validate the convergence property on small-scale MGs, and a Pong example to show the potential on large games.
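
The idea of evolving a policy from aggregated historical performance, rather than from current performance alone, can be illustrated on a small matrix game. The sketch below is not the authors' algorithm: it is a smoothed-fictitious-play-style toy in which each player keeps a running average of its per-action payoffs and plays a softmax over that average. The function names, the temperature `tau`, the step count, and the rock-paper-scissors payoff matrix are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, tau):
    """Numerically stable softmax with temperature tau (hypothetical helper)."""
    z = (x - x.max()) / tau
    e = np.exp(z)
    return e / e.sum()


def run_history_averaged_play(A, steps=2000, tau=0.5):
    """Toy two-player zero-sum matrix game: row player receives A, column player -A.

    Each player's policy is a softmax over the running average of its
    per-action payoffs across all past iterations (an assumed stand-in for
    "aggregation of performance over history").
    """
    n, m = A.shape
    # Deliberately lopsided initial policies, so the run is not trivially at rest.
    pi_row = softmax(np.arange(n, dtype=float), 1.0)
    pi_col = softmax(-np.arange(m, dtype=float), 1.0)
    avg_row = np.zeros(n)
    avg_col = np.zeros(m)
    for t in range(1, steps + 1):
        # Current payoff of every action against the opponent's current policy.
        u_row = A @ pi_col
        u_col = -A.T @ pi_row
        # Incremental running mean over the whole history.
        avg_row += (u_row - avg_row) / t
        avg_col += (u_col - avg_col) / t
        # Policies respond to the aggregated payoffs, not just the latest ones.
        pi_row = softmax(avg_row, tau)
        pi_col = softmax(avg_col, tau)
    return pi_row, pi_col


if __name__ == "__main__":
    # Rock-paper-scissors payoffs for the row player (assumed example game).
    rps = np.array([[0.0, -1.0, 1.0],
                    [1.0, 0.0, -1.0],
                    [-1.0, 1.0, 0.0]])
    row, col = run_history_averaged_play(rps)
    print("row policy:", row)
    print("col policy:", col)
```

In this toy, the softmax temperature means the resting point is in general a smoothed approximation of a Nash equilibrium rather than an exact one. For rock-paper-scissors the two coincide at the uniform policy, which makes the game a convenient check.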

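Sub-direction classification (seven major directions): Machine game playing
State Key Laboratory planned direction classification: Fundamental theory of open games
Paper-associated dataset requiring deposit:
Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/51532
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Deep Reinforcement Learning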
Recommended citation
GB/T 7714
Yuanheng Zhu, Weifan Li, Mengchen Zhao, et al. Empirical Policy Optimization for n-Player Markov Games[J]. IEEE Transactions on Cybernetics, 2022. doi: 10.1109/TCYB.2022.3179775.
APA Yuanheng Zhu, Weifan Li, Mengchen Zhao, Jianye Hao, & Dongbin Zhao. (2022). Empirical Policy Optimization for n-Player Markov Games. IEEE Transactions on Cybernetics. doi: 10.1109/TCYB.2022.3179775.
MLA Yuanheng Zhu, et al. "Empirical Policy Optimization for n-Player Markov Games." IEEE Transactions on Cybernetics (2022). doi: 10.1109/TCYB.2022.3179775.
Files in this item
File name/size | Document type | Version type | Access type | License
Empirical_Policy_Opt (1739KB) | Journal article | Author-accepted manuscript | Open access | CC BY-NC-SA
File name: Empirical_Policy_Optimization_for__n_-Player_Markov_Games.pdf
Format: Adobe PDF