Research on Error Detection and Human-Computer Dialogue for Error Correction in Cross-language Conversation

Title	跨语言交互中的错误检测及纠错对话问题研究
Alternative title	Research on Error Detection and Human-Computer Dialogue for Error Correction in Cross-language Conversation
Author	Yu Dong (于东)
Date	2011-06-04
Degree type	Doctor of Engineering
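Abstract (translated from the Chinese)	In modern society, the demand for cross-language interaction keeps growing, and computer-aided cross-language interaction systems have become a focus of research. However, system errors caused by limited natural language processing capability seriously degrade system performance and greatly reduce practical usability. Correcting the errors of a cross-language interaction system through human-computer dialogue offers a new way to address this problem. This thesis studies error detection and error-correction dialogue in cross-language interaction, covering spoken language translation quality estimation and error detection, human-computer dialogue management, and utterance generation for error-correction dialogue, and finally builds an error-correction dialogue system that corrects errors during cross-language interaction. The main contributions and innovations are:
1. A method for online quality estimation and error localization of spoken language translation based on round-trip translation features. Confidence features for spoken language translation are extracted from the round-trip translation process, and a machine learning method based on support vector regression (SVR) is fitted to human subjective quality scores of the translations, giving online quality estimation without reference translations. Based on this estimate, a translation error localization method is further proposed. Experiments show that the confidence scores computed by the system correlate highly with human subjective scores. Round-trip translation features significantly improve the accuracy of the confidence computation, and the SVR-based method fits the human scores effectively and generalizes well.
2. A human-computer dialogue management method based on dynamic Bayesian networks (DBN). Dialogue management is formulated as a DBN model that includes the user's utterance input: the dialogue strategy depends not only on the current system state but also on the current user utterance. System state, dialogue strategy, and user utterance are all treated as random variables in the DBN. The model parameters are the conditional probability distributions that express the dependencies among these variables, and they can be learned automatically from a dialogue corpus. In the DBN model, the system state and the user's language can be modeled without relying on a semantic representation of utterances or on task knowledge, so the method can serve as a general, task-independent approach to dialogue management. Experiments on a specific task show that the DBN-based dialogue management model selects dialogue strategies well.
3. A statistical method for generating clarification questions. The method dynamically generates a clarification question that targets the erroneous part of an utterance, and the system can use it to initiate an error-correction dialogue. Two models of clarification question patterns are proposed for sentence planning: a truncation model and an aligned generalized phrase model. A statistical machine translation method then converts clarification question patterns into clarification questions, which implements surface realization. Experiments show that, given error localization information for an utterance, the generation model effectively imitates the forms of clarification questions found in spoken language and produces reasonable clarification questions for different error cases.
4. Combining the results above, an error-correction dialogue system is built to correct errors during cross-language interaction. The system architecture is described in four parts: error detection for speech recognition and spoken language translation, error-correction dialogue strategy generation, parsing of error-correction dialogue utterances, and DBN-based error-correction dialogue management. Error-correction experiments under different error conditions show that the system effectively detects system errors during cross-language interaction and corrects them through error-correction dialogue.
English abstract	In modern society, the need for cross-language conversation is growing rapidly, and computer-aided cross-language interactive systems have become a research hotspot. However, natural language processing errors made by the system seriously affect its performance. Correcting errors in cross-language conversation through human-computer dialogue provides a new way to solve the problem. This thesis presents a new approach to error detection and error-correction dialogue for cross-language conversation systems, discusses speech translation error detection, dialogue management, and clarification question generation, and builds a clarification dialogue system. The contributions and innovations of this thesis are summarized as follows:
1. It presents a Round-trip Translation (RTT) feature based approach to online confidence estimation (CE) and error localization for speech translation without the assistance of reference translations. A number of RTT features are introduced to reflect the quality of speech translation. Support Vector Regression (SVR) is employed to learn human assessment patterns of translation quality. Additionally, an error localization method is proposed that uses the assessment result. Experimental results show that RTT-based features improve the accuracy of CE significantly and that the SVR method models human assessment patterns accurately and robustly. The system achieves high correlation with human assessment results even with a small amount of training data.
2. It presents a new approach to dialogue management based on Dynamic Bayesian Networks (DBN). The problem of dialogue management is described as a DBN model with the user utterance as input. The system action depends not only on the system state but also on the user utterance. System state, system action, and user utterance are treated as stochastic variables. The conditional probability distributions of these variables are the parameters of the DBN, and they can be estimated from a dialogue corpus. Neither the system state nor the user utterance is modeled on the basis of a semantic representation or task knowledge, so the model can be used as a general method of dialogue management. Experimental results on a specific task show that the method performs well in dialogue strategy selection.
3. It presents a new approach to statistical clarification question generation. The proposed method can generate clarification questions dynamically according to the user utterance given s...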
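The second contribution in the abstract states that the system action depends on both the system state and the user utterance, and that the conditional probability distributions are estimated from a dialogue corpus. The following Python sketch illustrates only that one idea under strong simplifying assumptions; it is not the thesis's implementation. It estimates a single table P(action | state, utterance) by counting with add-one smoothing and picks the most probable action. The toy corpus, all state, utterance, and action labels, and the function names `estimate_action_cpd` and `select_action` are hypothetical, and the temporal (state-transition) part of a full DBN is omitted.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (system_state, user_utterance_class, system_action).
# Every label here is invented for illustration; none comes from the thesis.
CORPUS = [
    ("error_detected", "statement", "ask_clarification"),
    ("error_detected", "statement", "ask_clarification"),
    ("error_detected", "statement", "ask_repeat"),
    ("error_detected", "correction", "confirm"),
    ("no_error", "statement", "translate"),
    ("no_error", "statement", "translate"),
    ("awaiting_confirmation", "yes", "translate"),
    ("awaiting_confirmation", "no", "ask_repeat"),
]


def estimate_action_cpd(corpus, alpha=1.0):
    """Estimate P(action | state, utterance) by counting with add-alpha smoothing."""
    actions = sorted({action for _, _, action in corpus})
    counts = defaultdict(Counter)
    for state, utterance, action in corpus:
        counts[(state, utterance)][action] += 1

    def cpd(state, utterance):
        # Unseen (state, utterance) pairs fall back to a uniform distribution.
        seen = counts.get((state, utterance), Counter())
        total = sum(seen.values()) + alpha * len(actions)
        return {a: (seen[a] + alpha) / total for a in actions}

    return cpd


def select_action(cpd, state, utterance):
    """Return the most probable action given the state and the user utterance."""
    dist = cpd(state, utterance)
    return max(dist, key=dist.get)


if __name__ == "__main__":
    cpd = estimate_action_cpd(CORPUS)
    print(cpd("error_detected", "statement"))
    print(select_action(cpd, "error_detected", "statement"))
```

Smoothing keeps every action possible in contexts the corpus never covered, which matters when the table is learned from a small dialogue corpus; the smoothing constant `alpha` is likewise an assumption of this sketch.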
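Keywords	Cross-language Conversation; Error Detection; Human-computer Dialogue for Error Correction; Dialogue Management; Clarification Question
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6392
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Yu Dong. Research on Error Detection and Human-Computer Dialogue for Error Correction in Cross-language Conversation[D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2011.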

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20071801462807 (1954KB)			Not currently open	CC BY-NC-SA