Automatic Knowledge Generation Techniques for Human-Computer Spoken Dialog Systems (人机口语对话系统的知识自动生成技术)
Alternative Title: Automatic Knowledge Generation Technique of Human-Computer Dialog System
黄韵竹
Subtype: Master of Engineering
Thesis Advisor: 李成荣
2011-05-27
Degree Grantor: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems
Keywords: Human-computer Dialog System; Word Class Expansion; First Order Predicate Logic; Dependency Parsing; Knowledge Base Generation
Abstract: Human-computer spoken dialog technology makes human-computer interaction simpler and more natural. Building such a system, however, takes a great deal of manpower and resources. Two open problems in current research are how to automatically collect language model training corpora for a restricted domain, and how to automatically build the knowledge base of a spoken dialog system. Focusing on the domain of everyday chat, this thesis proposes semi-automatic methods for expanding the language model training corpus and for building the spoken dialog knowledge base. The main contents and contributions are as follows:
1. At the word level, a word class expansion method is proposed, and experiments show its contribution to the speech recognition system.
2. A semi-automatic method for generating first order predicate knowledge representations is proposed, based on dependency parsing. Stop words are first removed from each sentence; the sentence is then parsed; the parse result and a keyword list are used to convert the sentence into first order predicate form; finally, the predicate knowledge base is generated. Experiments show that the knowledge base generated by this method achieves a high detection rate.
3. The idea of word classes is applied to the spoken dialog knowledge base. Texts are classified by sentence pattern, and only one sentence is kept for each pattern; the remaining sentences are stored as same-class words in a word class lookup table, which is then further expanded. This greatly reduces the size of the knowledge base and improves the processing speed of the system.
4. Word class corpus expansion and first order predicate knowledge representation are applied to improve the speech globe system.
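The knowledge base compression described in contribution 3 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function name `compress_by_pattern`, the class labels `<FRUIT>` and `<CITY>`, the seed word class table, and the English example sentences are all hypothetical, and the sentences are assumed to be already segmented into tokens.

```python
from collections import defaultdict


def compress_by_pattern(sentences, word_classes):
    """Collapse sentences that share a pattern into a single template.

    sentences    : list of token lists (assumed already segmented)
    word_classes : dict of class label -> set of member words
                   (a hypothetical seed table, not taken from the thesis)
    Returns (templates, lookup): the unique patterns in first-seen order,
    and a lookup table of class label -> words actually observed.
    """
    # Invert the table so that each word points to its class label.
    word_to_class = {
        word: label for label, words in word_classes.items() for word in words
    }

    templates = []
    seen = set()
    lookup = defaultdict(set)

    for tokens in sentences:
        # Replace every class member with its label to obtain the pattern.
        pattern = tuple(word_to_class.get(tok, tok) for tok in tokens)
        # Record which concrete words filled each class slot.
        for tok in tokens:
            if tok in word_to_class:
                lookup[word_to_class[tok]].add(tok)
        # Keep only one entry per sentence pattern.
        if pattern not in seen:
            seen.add(pattern)
            templates.append(pattern)

    return templates, dict(lookup)


sentences = [
    ["I", "like", "apples"],
    ["I", "like", "bananas"],
    ["where", "is", "Beijing"],
    ["where", "is", "Paris"],
]
word_classes = {
    "<FRUIT>": {"apples", "bananas"},
    "<CITY>": {"Beijing", "Paris"},
}
templates, lookup = compress_by_pattern(sentences, word_classes)
```

In this toy run, four sentences reduce to two stored patterns plus a lookup table, which is the size reduction the abstract refers to. Expanding a word class then amounts to adding members to the lookup table without storing any new sentences.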
Shelf Number: XWLW1639
Other Identifier: 200828014628036
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/7569
Collection: Graduates_Master's Theses (毕业生_硕士学位论文)
Recommended Citation
GB/T 7714
黄韵竹. 人机口语对话系统的知识自动生成技术[D]. 中国科学院自动化研究所. 中国科学院研究生院,2011.
Files in This Item:
File Name/Size: CASIA_20082801462803 (691KB)
Access: Not yet open
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.