Thesis Advisor: 徐波; 李成荣
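Degree Grantor: Graduate School of the Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems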
Keyword: Speech Recognition; Voice Command Recognition; Keyword Spotting; Intelligent Speech Robot; Speech Recognition for Children; Children's Speech Database
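Abstract: Speech recognition is one of the key technologies needed for humanity to move toward a highly intelligent and automated information society. After decades of difficult exploration and research, speech recognition has advanced greatly, and some of the more mature techniques are gradually being applied in daily life. Small- and medium-vocabulary speech recognition, represented by command word recognition, keyword spotting, and continuous digit string recognition, is an important direction in research on practical speech recognition.
The work in this thesis concentrates on the research and application of Mandarin command word recognition and keyword spotting. It can be summarized as follows:
1. Developed a high-performance Mandarin command word recognition engine, and designed and implemented a concise, complete, flexible, and easy-to-use application programming interface for it. The engine uses tonal class-triphone acoustic models, represents lexical knowledge with a word tree that is independent of the acoustic model, and applies multi-threshold path pruning in the frame-synchronous Viterbi-Beam search. Tests of the engine show a misrecognition rate below 2% in relatively ideal environments. In real application settings the engine also shows high performance and practical value.
2. Using this command word recognition engine, developed an intelligent speech robot system, intended mainly for human-machine speech interaction in exhibition environments. Environmental noise, the diversity of visitors, and in some cases a large proportion of child visitors are the main problems facing speech recognition in exhibitions. The thesis carries out targeted research on speaker clustering, children's speech recognition, and noise filtering and rejection.
3. While developing the intelligent robot system, collected and built a large-scale children's speech database covering more than a thousand speakers, currently the only one of its kind in China. In addition, a large amount of natural spoken speech data was collected automatically, in an unsupervised manner, in real noisy exhibition hall environments.
4. Made a preliminary study of keyword spotting and developed a basic Mandarin keyword spotting system, implementing syllable-based filler models and continuous-speech Viterbi-Beam search on a word tree.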
Other Abstract: Speech recognition is one of the most indispensable technologies for realizing a highly intelligent and automated information society. Thanks to the efforts of many researchers, the past decades have seen significant progress in speech recognition, and some of its technologies are already applied in daily life. Small- and medium-vocabulary speech recognition technologies, such as voice command recognition, keyword spotting, and continuous digit string recognition, are of great importance in application-oriented speech recognition research.
This thesis focuses on Mandarin voice command recognition and keyword spotting. The work covers four points:
1. Build a high-performance Mandarin voice command recognizer, then design and implement a set of concise but powerful APIs. The recognizer uses tonal class-triphones as acoustic models, represents the pronunciation lexicon with a prefix lexical tree, and applies multi-threshold path pruning in the frame-synchronous Viterbi-Beam search. Our experiments indicate that the recognizer achieves a WER below 2% under favorable conditions.
2. Develop an Intelligent Speech Robot system based on the Mandarin voice command recognizer above. It is a human-machine interactive system used mainly in exhibitions. Since environmental noise, speaker variety, and in some cases children's speech are the main problems for speech recognition in exhibition environments, we study speaker clustering, children's speech recognition, and simple noise rejection with a garbage model.
3. While developing the Intelligent Speech Robot, we collected children's speech data and constructed a corpus of children's speech, which is currently the only one of its kind in China. We also collected a large amount of natural speech from children in noisy exhibition environments in an unsupervised manner.
4. Investigate some aspects of keyword spotting technology and build a basic Mandarin keyword spotting system. We use atonal syllables as filler models for non-keywords and implement frame-synchronous Viterbi-Beam search based on a lexical tree.
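The prefix lexical tree and threshold-based path pruning mentioned in the abstract can be illustrated with a small sketch. This is not the thesis implementation: the lexicon entries, the tonal-syllable units, the function names, and the choice of exactly two thresholds (a score beam and a cap on active paths) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TreeNode:
    """One node of a prefix lexical tree; each edge is labelled with a sub-word unit."""
    children: Dict[str, "TreeNode"] = field(default_factory=dict)
    word: Optional[str] = None  # set when a command word ends at this node


def build_lexical_tree(lexicon: Dict[str, List[str]]) -> TreeNode:
    """Insert every pronunciation so that words with a common prefix share nodes."""
    root = TreeNode()
    for word, units in lexicon.items():
        node = root
        for unit in units:
            node = node.children.setdefault(unit, TreeNode())
        node.word = word
    return root


def count_nodes(node: TreeNode) -> int:
    """Count all nodes in the tree, including the root."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


def prune(hypotheses: Dict[str, float], beam_width: float, max_active: int) -> Dict[str, float]:
    """Apply two pruning thresholds to the active paths of one frame.

    hypotheses maps a path identifier to its log score (higher is better).
    First drop paths scoring more than beam_width below the best path,
    then keep at most max_active of the survivors.
    """
    if not hypotheses:
        return {}
    best = max(hypotheses.values())
    survivors = {p: s for p, s in hypotheses.items() if s >= best - beam_width}
    ranked = sorted(survivors.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:max_active])


# Hypothetical three-word command lexicon written as tonal syllables.
LEXICON = {
    "前进": ["qian2", "jin4"],
    "前面": ["qian2", "mian4"],
    "停止": ["ting2", "zhi3"],
}

tree = build_lexical_tree(LEXICON)
active = prune({"a": -10.0, "b": -12.0, "c": -30.0, "d": -11.0},
               beam_width=5.0, max_active=2)
```

In this sketch the first two words share the node for their common first syllable, so the search evaluates that prefix once per frame instead of once per word. A real recognizer would apply the pruning step inside the frame-synchronous Viterbi recursion, after the acoustic scores of the current frame have been added to each path.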
Other Identifier: 648
Document Type: Thesis
Recommended Citation
GB/T 7714
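马龙 (Ma Long). Research and Application of Mandarin Voice Command Recognition and Keyword Spotting (汉语命令词识别、关键词检测的研究与应用) [D]. Institute of Automation, Chinese Academy of Sciences; Graduate School of the Chinese Academy of Sciences, 2002.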
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.