Reparameterizing and dynamically quantizing image features for image generation
Sun, Mingzhen (1,2); Wang, Weining (1); Zhu, Xinxin (1); Liu, Jing (1,2)
Source Publication: PATTERN RECOGNITION
ISSN: 0031-3203
2024-02-01
Volume: 146 Pages: 11
Abstract

For autoregressive image generation, vector-quantized VAEs (VQ-VAEs) quantize image features with discrete codebook entries and reconstruct images from quantized features. However, they treat each codebook entry separately, which causes losses of image details. In this paper, we propose to reparameterize image features with weight vectors to treat all codebook entries as an entity, and present a novel dynamically vector quantized VAE (DVQ-VAE) to quantize reparameterized image features. Specifically, each image feature corresponds to a weight vector and we sum weighted codebook entries to obtain values of image features. In this way, image features can incorporate information from different codebook entries. Additionally, a novel continuous weight regularization loss is proposed to improve the reconstruction of image details. Our method achieves competitive results with prior state-of-the-art works for image generation and extensive experiments are conducted to take a deep insight into our DVQ-VAE.
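
The contrast the abstract draws, between picking one codebook entry per feature and summing weighted codebook entries, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy codebook, the logits, and the use of a softmax to produce the weight vector are all assumptions made for illustration, since the abstract does not say how the weights are computed or quantized.

```python
import math

def nearest_entry(feature, codebook):
    """Standard VQ: replace a feature with its single closest codebook entry."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(codebook, key=lambda entry: sq_dist(feature, entry))

def softmax(logits):
    """Turn arbitrary scores into non-negative weights that sum to 1.
    Using a softmax here is an assumption, not something the abstract states."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, codebook):
    """Reparameterized feature: sum over k of weights[k] * codebook[k]."""
    dim = len(codebook[0])
    return [sum(w * entry[d] for w, entry in zip(weights, codebook))
            for d in range(dim)]

# Hypothetical 3-entry, 2-dimensional codebook and hypothetical scores.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
logits = [0.0, 2.0, 1.0]

weights = softmax(logits)                  # one weight per codebook entry
feature = weighted_sum(weights, codebook)  # mixes information from all entries
hard = nearest_entry(feature, codebook)    # what single-entry VQ would keep

print("weights:", weights)
print("weighted feature:", feature)
print("nearest single entry:", hard)
```

With a one-hot weight vector the weighted sum reduces to a single codebook entry, so single-entry quantization is a special case of this formulation.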

Keyword: Vector quantization; Variational auto-encoder; Unconditional image generation; Text-to-image generation; Autoregressive generation
DOI: 10.1016/j.patcog.2023.109962
Indexed By: SCI
Language: English
Funding Project: National Key Research and Development Program of China [2022ZD0118801]; National Natural Science Foundation of China [U21B2043]; National Natural Science Foundation of China [62102419]; National Natural Science Foundation of China [62102416]
Funding Organization: National Key Research and Development Program of China; National Natural Science Foundation of China
WOS Research Area: Computer Science; Engineering
WOS Subject: Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
WOS ID: WOS:001086812100001
Publisher: ELSEVIER SCI LTD
Sub-direction classification: Multimodal intelligence
Planning direction of the State Key Laboratory: Multimodal collaborative cognition
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/54359
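Collection: 紫东太初大模型研究中心 (Zidong Taichu Large Model Research Center)
紫东太初大模型研究中心 (Zidong Taichu Large Model Research Center)_Image and Video Analysis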
Corresponding Author: Liu, Jing
Affiliation:
1. Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714: Sun, Mingzhen, Wang, Weining, Zhu, Xinxin, et al. Reparameterizing and dynamically quantizing image features for image generation[J]. PATTERN RECOGNITION, 2024, 146: 11.
APA: Sun, Mingzhen, Wang, Weining, Zhu, Xinxin, & Liu, Jing. (2024). Reparameterizing and dynamically quantizing image features for image generation. PATTERN RECOGNITION, 146, 11.
MLA: Sun, Mingzhen, et al. "Reparameterizing and dynamically quantizing image features for image generation". PATTERN RECOGNITION 146 (2024): 11.
Files in This Item
File Name/Size: [RP.2023] Reparamete (3612KB)
DocType: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA
File name: [RP.2023] Reparameterizing and dynamically quantizing image features for image generation.pdf
Format: Adobe PDF

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.