CASIA OpenIR

浏览/检索结果: 共10条,第1-10条 帮助

限定条件                    
已选(0)清除 条数/页:   排序方式:
Dynamic-horizon model-based value estimation with latent imagination 期刊论文
IEEE Transactions on Neural Networks and Learning Systems, 2022, 页码: 1-14
作者:  Wang JJ(王俊杰);  Zhang QC(张启超);  Zhao DB(赵冬斌)
Adobe PDF(2305Kb)  |  收藏  |  浏览/下载:197/69  |  提交时间:2023/05/30
Latent world model  model-based value expansion (MVE)  reinforcement learning  reinforcement learning  
A comprehensive scheme for tattoo text detection 期刊论文
PATTERN RECOGNITION LETTERS, 2022, 卷号: 163, 页码: 168-179
作者:  Banerjee, Ayan;  Shivakumara, Palaiahnakote;  Pal, Umapada;  Raghavendra, Ramachandra;  Liu, Cheng-Lin
收藏  |  浏览/下载:177/0  |  提交时间:2022/11/28
HMDRL: Hierarchical Mixed Deep Reinforcement Learning to Balance Vehicle Supply and Demand 期刊论文
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 卷号: 23, 期号: 11, 页码: 21861-21872
作者:  Xi, Jinhao;  Zhu, Fenghua;  Ye, Peijun;  Lv, Yisheng;  Tang, Haina;  Wang, Fei-Yue
Adobe PDF(3316Kb)  |  收藏  |  浏览/下载:322/42  |  提交时间:2022/09/19
deep reinforcement learning  online ride-hailing system  hierarchical repositioning framework  parallel coordination mechanism  mixed state  
POPO: Pessimistic Offline Policy Optimization 会议论文
, Singapore, Singapore, 23-27 May 2022
作者:  He Q(何强);  Hou XW(侯新文);  Liu Y(刘禹)
Adobe PDF(1200Kb)  |  收藏  |  浏览/下载:215/44  |  提交时间:2022/06/27
reinforcement learning  offline optimization  out-of-distribution  
面向连续控制任务的深度强化学习值函数估计研究 学位论文
工学硕士, 中国科学院自动化研究所: 中国科学院大学, 2022
作者:  何强
Adobe PDF(4687Kb)  |  收藏  |  浏览/下载:233/5  |  提交时间:2022/06/17
深度强化学习  值函数估计  值函数表示  集成强化学习  
基于元学习和强化学习的机器人操作视觉模仿技术研究 学位论文
, 中国科学院自动化研究所: 中国科学院自动化研究所, 2022
作者:  李佳怡
Adobe PDF(33715Kb)  |  收藏  |  浏览/下载:239/15  |  提交时间:2022/06/13
机器人操作学习  视觉模仿  元学习  强化学习  
Online Minimax Q Network Learning for Two-Player Zero-Sum Markov Games 期刊论文
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 卷号: 33, 期号: 3, 页码: 1228-1241
作者:  Zhu, Yuanheng;  Zhao, Dongbin
Adobe PDF(2838Kb)  |  收藏  |  浏览/下载:252/12  |  提交时间:2022/06/10
Games  Nash equilibrium  Mathematical model  Markov processes  Convergence  Dynamic programming  Training  Deep reinforcement learning (DRL)  generalized policy iteration (GPI)  Markov game (MG)  Nash equilibrium  Q network  zero sum  
Supervised assisted deep reinforcement learning for emergency voltage control of power systems 期刊论文
NEUROCOMPUTING, 2022, 卷号: 475, 页码: 69-79
作者:  Li, Xiaoshuang;  Wang, Xiao;  Zheng, Xinhu;  Dai, Yuxin;  Yu, Zhihong;  Zhang, Jun Jason;  Bu, Guangquan;  Wang, Fei-Yue
Adobe PDF(2551Kb)  |  收藏  |  浏览/下载:360/76  |  提交时间:2022/06/06
Deep reinforcement learning  Behavioral cloning  Dynamic demonstration  Emergency control  
Highway Lane Change Decision-Making via Attention-Based Deep Reinforcement Learning 期刊论文
IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2022, 卷号: 9, 期号: 3, 页码: 567-569
作者:  Wang, Junjie;  Zhang, Qichao;  Zhao, Dongbin
Adobe PDF(803Kb)  |  收藏  |  浏览/下载:296/69  |  提交时间:2022/02/16
SADRL: Merging human experience with machine intelligence via supervised assisted deep reinforcement learning 期刊论文
NEUROCOMPUTING, 2022, 卷号: 467, 页码: 300-309
作者:  Li, Xiaoshuang;  Wang, Xiao;  Zheng, Xinhu;  Jin, Junchen;  Huang, Yanhao;  Zhang, Jun Jason;  Wang, Fei-Yue
Adobe PDF(1244Kb)  |  收藏  |  浏览/下载:344/76  |  提交时间:2021/12/28
Deep reinforcement learning  Behavioral cloning  Dynamic demonstration  Double DQN