Knowledge Commons of Institute of Automation,CAS
Segmentation of mixed Chinese/English documents based on Chinese radicals recognition and complexity analysis in local segment pattern | |
Xia, Yong; Xiao, Bai-Hua | |
Journal | INTELLIGENT COMPUTING IN SIGNAL PROCESSING AND PATTERN RECOGNITION |
2006 | |
Volume | 345; Pages: 497-506 |
Article type | Article |
Abstract | Segmentation based on character recognition is one of the most popular methods for segmenting mixed Chinese/English documents. However, rejecting outliers has always been the bottleneck of this approach. This paper proposes a new method to alleviate the problem. We assign a language attribute to each segment wherever possible, and then merge or split segments according to that attribute. First, we construct a mixed OCR engine for Chinese radicals, English characters, and some English character pairs. Furthermore, English negative samples are used in training to improve the engine's ability to reject outliers. Finally, the language of each segment is determined using the mixed OCR engine together with a complexity analysis of the local pattern. Test results show encouraging performance. |
WOS headings | Science & Technology ; Technology |
Keywords [WOS] | MULTILAYER PERCEPTRONS |
Indexed by | ISTP ; SCI |
Language | English |
WOS research areas | Automation & Control Systems ; Computer Science ; Engineering |
WOS categories | Automation & Control Systems ; Computer Science, Artificial Intelligence ; Computer Science, Information Systems ; Engineering, Electrical & Electronic |
WOS accession number | WOS:000240385300051 |
Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/9235 |
Collection | Pre-2009 outputs |
Author affiliation | Chinese Acad Sci, Inst Automat, Beijing 100080, Peoples R China |
Recommended citation, GB/T 7714 | Xia, Yong,Xiao, Bai-Hua,Wang, Chun-Heng,et al. Segmentation of mixed Chinese/English documents based on Chinese radicals recognition and complexity analysis in local segment pattern[J]. INTELLIGENT COMPUTING IN SIGNAL PROCESSING AND PATTERN RECOGNITION,2006,345:497-506. |
APA | Xia, Yong.,Xiao, Bai-Hua.,Wang, Chun-Heng.,Li, Yao-Dong.,Huang, DS.,...&Irwin, GW.(2006).Segmentation of mixed Chinese/English documents based on Chinese radicals recognition and complexity analysis in local segment pattern.INTELLIGENT COMPUTING IN SIGNAL PROCESSING AND PATTERN RECOGNITION,345,497-506. |
MLA | Xia, Yong,et al."Segmentation of mixed Chinese/English documents based on Chinese radicals recognition and complexity analysis in local segment pattern".INTELLIGENT COMPUTING IN SIGNAL PROCESSING AND PATTERN RECOGNITION 345(2006):497-506. |
Files in this item | There are no files associated with this item. |