Knowledge Commons of Institute of Automation, CAS
A Bidirectional Hierarchical Skip-Gram Model for Text Topic Embedding | |
Suncong Zheng | |
2016 | |
Conference Name | IJCNN |
Conference Date | 2016 |
Conference Location | Canada |
Place of Publication | Canada |
Publisher | IEEE |
Abstract | Taking advantage of large-scale web corpora to mine the topics within texts effectively and efficiently is an essential problem in the era of big data. We focus on the problem of learning text topic embedding in an unsupervised manner, which enjoys the properties of efficiency and scalability. Text topic embedding represents words and documents in a semantic topic space, in which words and documents with similar topics are embedded close to each other. Compared with conventional topic models, which implicitly capture document-level word co-occurrence patterns, text topic embedding alleviates the data sparsity problem and captures the semantic relevance between different words and documents. To model text topic embedding, we propose a Bidirectional Hierarchical Skip-Gram model (BHSG) based on the skip-gram model. BHSG includes two components: a semantic generation module, which learns the semantic relevance between texts, and a topic enhance module, which produces the text topic embedding from the text embedding learned in the former module. We evaluated our method on two kinds of topic-related tasks: text classification and information retrieval. The experimental results on four public datasets and one dataset we provide all demonstrate that our proposed method can achieve better performance. |
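Document Type | Conference paper |
Item Identifier | http://ir.ia.ac.cn/handle/173211/40650 |
Collection | Laboratory of Cognition and Decision Intelligence for Complex Systems_Auditory Models and Cognitive Computing |
Author Affiliation | CASIA |
First Author Affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended Citation (GB/T 7714) | Suncong Zheng,Hongyun Bao,Jiaming Xu,et al. A Bidirectional Hierarchical Skip-Gram Model for Text Topic Embedding[C]. Canada:IEEE,2016. |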
Files in This Item |
File Name/Size | Document Type | Version Type | Access Type | License |
2016 IJCNN zheng.pdf (442KB) | Conference paper | | Open access | CC BY-NC-SA |