Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks
Chen, Zhiyang1,2; Zhu, Yousong1; Li, Zhaowen1,2; Yang, Fan1,3; Li, Wei4; Wang, Haixin1,2; Zhao, Chaoyang1; Wu, Liwei4; Zhao, Rui4; Wang, Jinqiao1,2,3; Tang, Ming1
2022-11-01
Conference name: Neural Information Processing Systems
Conference date: 2022-11-28
Conference location: New Orleans, Louisiana & Online
Abstract

Visual tasks vary a lot in their output formats and concerned contents, therefore it is hard to process them with an identical structure. One main obstacle lies in the high-dimensional outputs in object-level visual tasks. In this paper, we propose an object-centric vision framework, Obj2Seq. Obj2Seq takes objects as basic units, and regards most object-level visual tasks as sequence generation problems of objects. Therefore, these visual tasks can be decoupled into two steps. First recognize objects of given categories, and then generate a sequence for each of these objects. The definition of the output sequences varies for different tasks, and the model is supervised by matching these sequences with ground-truth targets. Obj2Seq is able to flexibly determine input categories to satisfy customized requirements, and be easily extended to different visual tasks. When experimenting on MS COCO, Obj2Seq achieves 45.7% AP on object detection, 89.0% AP on multi-label classification and 65.0% AP on human pose estimation. These results demonstrate its potential to be generally applied to different visual tasks.

Keywords: transformer; general visual framework; sequence prediction; multi-task
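Indexed by: EI
Seven major directions, sub-direction classification: Image and video processing and analysis
State Key Laboratory planned direction classification: Visual information processing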
Paper-associated dataset requiring deposit: (no value given)
Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/56593
Collection: Zidong Taichu (紫东太初) Large Model Research Center, Large Model Computing
Author affiliations:
1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
3. Peng Cheng Laboratory
4. SenseTime Research
First author affiliation: National Laboratory of Pattern Recognition
Recommended citation
GB/T 7714
Chen, Zhiyang,Zhu, Yousong,Li, Zhaowen,et al. Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks[C],2022.
Files in this item
File name/size: 1533_obj2seq_formatt (1289KB)
Document type: Conference paper
Access type: Open access
License: CC BY-NC-SA
File name: 1533_obj2seq_formatting_objects_as_.pdf
Format: Adobe PDF