| MMD-MIX: Value Function Factorisation with Maximum Mean Discrepancy for Cooperative Multi-Agent Reinforcement Learning. Conference paper, Shenzhen, China, 18-22 July 2021. Authors: Zhiwei Xu; Dapeng Li; Yunpeng Bai; Guoliang Fan. Adobe PDF(3892Kb)  |  Views/Downloads: 6/2  |  Submitted: 2024/05/28 |
| AlphaHoldem: High-Performance Artificial Intelligence for Heads-Up No-Limit Poker via End-to-End Reinforcement Learning. Conference paper, Online, 2022-02-22. Authors: Zhao EM(赵恩民); Yan RY(闫仁业); Li JQ(李金秋); Li K(李凯); Xing JL(兴军亮). Adobe PDF(2593Kb)  |  Views/Downloads: 145/55  |  Submitted: 2023/06/29 |
| Learning to Play Hard Exploration Games Using Graph-guided Self-navigation. Conference paper, Online, 2021-02. Authors: Zhao EM(赵恩民); Yan RY(闫仁业); Li K(李凯); Li LJ(李丽娟); Xing JL(兴军亮). Adobe PDF(413Kb)  |  Views/Downloads: 144/54  |  Submitted: 2023/06/28 |
| Hierarchical Cooperative Swarm Policy Learning with Role Emergence. Conference paper, Online, 05-07 December 2021. Authors: Zhang TL(张天乐); Liu Z(刘振); Pu ZQ(蒲志强); Qiu TH(丘腾海); Yi JQ(易建强). Adobe PDF(327Kb)  |  Views/Downloads: 127/57  |  Submitted: 2023/06/12 |
| Semantic Perception Swarm Policy with Deep Reinforcement Learning. Conference paper, Online, 05 December 2021. Authors: Zhang TL(张天乐); Liu Z(刘振); Pu ZQ(蒲志强); Yi JQ(易建强). Adobe PDF(523Kb)  |  Views/Downloads: 105/43  |  Submitted: 2023/06/12 |
| Multi-agent Collaborative Learning with Relational Graph Reasoning in Adversarial Environments. Conference paper, Online conference, 2021-9. Authors: Wu Shiguang; Qiu Tenghai; Pu Zhiqiang; Yi Jianqiang. Adobe PDF(1396Kb)  |  Views/Downloads: 236/69  |  Submitted: 2022/06/16 |
| Multi-target Coverage with Connectivity Maintenance using Knowledge-incorporated Policy Framework. Conference paper, Xi'an, China, May 31-Jun. 4. Authors: Shiguang Wu; Zhiqiang Pu; Zhen Liu; Tenghai Qiu; Jianqiang Yi; Tianle Zhang. Adobe PDF(12862Kb)  |  Views/Downloads: 261/39  |  Submitted: 2022/04/06 |
| Spiking Adaptive Dynamic Programming Based on Poisson Process for Discrete-Time Nonlinear Systems. Journal article, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, Pages: 11. Authors: Wei, Qinglai; Han, Liyuan; Zhang, Tielin. Adobe PDF(2904Kb)  |  Views/Downloads: 192/1  |  Submitted: 2022/01/27. Keywords: Maximum likelihood estimation (MLE); Nonlinear systems; Optimal control; Poisson process; Spike train; Spiking Adaptive dynamic programming (SADP) |
| Neuro-Optimal Trajectory Tracking With Value Iteration of Discrete-Time Nonlinear Dynamics. Journal article, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, Pages: 12. Authors: Wang, Ding; Ha, Mingming; Cheng, Long  |  Views/Downloads: 259/0  |  Submitted: 2022/01/27. Keywords: Trajectory; Heuristic algorithms; Convergence; Trajectory tracking; Stability criteria; Optimal control; Dynamic programming; Adaptive critic design; discrete-time nonlinear plants; neuro-optimal trajectory tracking; uniformly ultimately bounded stability; value iteration |
| Target Tracking Control of a Biomimetic Underwater Vehicle Through Deep Reinforcement Learning. Journal article, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, Pages: 12. Authors: Wang, Yu; Tang, Chong; Wang, Shuo; Cheng, Long; Wang, Rui; Tan, Min; Hou, Zengguang  |  Views/Downloads: 217/0  |  Submitted: 2022/01/27. Keywords: Reinforcement learning; Target tracking; Robots; Sports; Aerospace electronics; Mobile robots; Underwater vehicles; Biomimetic underwater vehicle (BUV); reinforcement learning; target tracking control |