Research on Dialogue Management Methods in Task-Oriented Dialogue Systems
王唯康
2020-05-28
Pages: 110
Degree type: Doctoral
Chinese abstract (translated)

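Task-oriented dialogue systems are human-machine interaction systems that help users complete specific tasks (such as ordering food or booking flights) through conversational interaction. In general, a task-oriented dialogue system consists of language understanding, dialogue management, and language generation modules. The dialogue management module records the user's dialogue state and selects system actions according to a dialogue policy, which makes it the core of the whole system. Dialogue management has advanced considerably in recent years. However, existing work focuses mainly on dialogue management in closed domains: when the system encounters new user actions, it often gives unreasonable responses. In addition, existing dialogue management methods cannot flexibly formulate interaction strategies based on objective knowledge and are only suitable for simple slot-filling tasks. This lack of knowledge modeling ability greatly limits the range of applications of task-oriented dialogue systems. To address these shortcomings, this thesis focuses on improving the maintainability, online learning ability, and knowledge modeling ability of the dialogue management module.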

The main contributions and innovations of the thesis are summarized as follows:

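(1) A dialogue management maintenance method based on a teacher-student framework

To address the difficulty of maintaining reinforcement learning based dialogue management modules, the thesis proposes a maintenance method based on a teacher-student framework. The "teacher" is the set of existing dialogue resources: the original dialogue management module, the human-machine interaction logs, and the dialogue rules written to handle new user actions. The "student" is the dialogue management module under the new ontology. By defining learning constraints for the "student", the method transfers the dialogue knowledge of the "teacher" directly to the "student" and so avoids training a new dialogue management module from scratch. Experiments show that a model extended in this way performs comparably to a model retrained with reinforcement learning, while its training cost is far lower.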

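(2) A design method for task-oriented dialogue systems based on an incremental learning framework

To address the lack of online learning ability in existing dialogue systems, the thesis takes the customer service scenario as an example and proposes a design method for task-oriented dialogue systems based on an incremental learning framework. The method uses an uncertainty estimation module to estimate the system's confidence that it can give a correct response. When confidence is high, the system answers the user's question. Otherwise, a human customer service agent takes over the dialogue, and once the agent has finished answering, the system updates its model parameters through an online learning module. Experiments show that a system designed in this way is more robust to new user actions and can accumulate dialogue knowledge online. More importantly, the uncertainty estimation module guides humans to annotate the most valuable dialogue data, so the dialogue system achieves better results at a lower annotation cost.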

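(3) A design method for task-oriented dialogue systems for identity fraud detection

To address the lack of knowledge modeling ability in existing dialogue management, the thesis takes identity fraud detection in loan applications as an example and explores knowledge modeling techniques for dialogue management. Specifically, the thesis builds a knowledge graph for each loan applicant from information related to the applicant's identity. On top of this knowledge graph, the thesis proposes a structured dialogue management module that consists of a knowledge graph based dialogue state tracker and a hierarchical dialogue policy module. For each applicant, the knowledge graph based dialogue state tracker treats the representations of the nodes related to the applicant's identity information as the dialogue state. The hierarchical dialogue policy module then explores anti-fraud strategies using hierarchical reinforcement learning. Experiments show that a system equipped with the structured dialogue management module identifies identity fraudsters more accurately and in fewer interaction turns.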

English abstract

A task-oriented dialogue system is a human-machine interaction system that assists users in completing specific tasks (e.g., reserving a table or booking tickets) through dialogue. In general, a task-oriented dialogue system consists of a language understanding module, a dialogue manager, and a natural language generation component. The dialogue manager, which is the core of the system, records the dialogue state of the user and selects system actions according to a dialogue policy. Dialogue management has made great progress in recent years. However, existing work has mainly focused on designing dialogue managers for closed domains. When they encounter new user actions, dialogue systems often give unreasonable responses. In addition, existing dialogue management methods cannot develop strategies based on external knowledge and are only suitable for simple slot-filling tasks. This lack of knowledge modeling ability greatly limits the application scope of task-oriented dialogue systems. To address these problems, this thesis focuses on improving the maintainability, online learning ability, and knowledge modeling ability of the dialogue manager.
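
As a rough illustration of the dialogue manager's two roles described above (recording the dialogue state and selecting a system action), the following minimal sketch implements a slot-filling manager with a hand-written policy. The slot names, the action strings, and the `DialogueManager` class are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field

# Hypothetical slots for a restaurant-booking task; not taken from the thesis.
REQUIRED_SLOTS = ("cuisine", "time", "party_size")


@dataclass
class DialogueManager:
    """Toy dialogue manager: a state tracker plus a hand-written policy."""
    state: dict = field(default_factory=dict)

    def track(self, user_act: dict) -> None:
        # State tracking: record every slot value the user has provided.
        self.state.update(user_act.get("slots", {}))

    def policy(self) -> str:
        # Policy: request the first missing slot, otherwise confirm the booking.
        for slot in REQUIRED_SLOTS:
            if slot not in self.state:
                return f"request({slot})"
        return "confirm_booking"

    def step(self, user_act: dict) -> str:
        self.track(user_act)
        return self.policy()


dm = DialogueManager()
first = dm.step({"intent": "inform", "slots": {"cuisine": "Sichuan"}})
second = dm.step({"intent": "inform", "slots": {"time": "19:00", "party_size": 4}})
print(first, second)
```

A fixed rule table like this is the closed-domain setting the abstract criticizes: any user act outside `REQUIRED_SLOTS` is silently ignored.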

The main contributions are summarized as follows:

(1) Teacher-Student Framework for Maintainable Dialogue Manager 

To improve the maintainability of reinforcement learning (RL) based dialogue managers, this thesis proposes a practical teacher-student framework. The "teacher" refers to existing resources: the original dialogue manager, the human-machine interaction logs, and dialogue rules for handling new user actions. The "student" is an extended dialogue manager based on a new ontology. By defining learning constraints for the "student", the method transfers the dialogue knowledge of the "teacher" directly to the "student" and avoids training the new dialogue manager from scratch. Experiments show that the extended model performs comparably to a model trained with RL from scratch, while the training cost of the proposed approach is much lower.
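
The abstract does not spell out the learning constraints, so the following is only a generic distillation-style sketch of the idea: on familiar states the student is asked to match the teacher's action distribution, and on states with a new user action it is asked to follow a hand-written rule. All action names, the state fields, and the functions `teacher_policy`, `rule_for_new_user_act`, and `student_target` are hypothetical.

```python
import math

OLD_ACTIONS = ["request_time", "confirm"]
# Hypothetical action added by the new ontology.
NEW_ACTIONS = OLD_ACTIONS + ["explain_refund"]


def teacher_policy(state):
    """Stand-in for the original dialogue manager: a distribution over OLD_ACTIONS."""
    return [0.9, 0.1] if not state["time_known"] else [0.2, 0.8]


def rule_for_new_user_act(state):
    """Hand-written rule covering a user act the teacher never saw."""
    return "explain_refund" if state["user_act"] == "ask_refund" else None


def student_target(state):
    """Build the distribution the student is constrained to match."""
    forced = rule_for_new_user_act(state)
    if forced is not None:
        # New behaviour: the rule decides, encoded as a one-hot target.
        return [1.0 if a == forced else 0.0 for a in NEW_ACTIONS]
    # Old behaviour: copy the teacher and give the new actions zero mass.
    return teacher_policy(state) + [0.0] * (len(NEW_ACTIONS) - len(OLD_ACTIONS))


def cross_entropy(target, predicted):
    """Loss that is minimised when the student reproduces the target."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


logged_state = {"time_known": False, "user_act": "inform"}
new_state = {"time_known": True, "user_act": "ask_refund"}
old_target = student_target(logged_state)
new_target = student_target(new_state)
# Loss of a hypothetical student output on the new state.
loss = cross_entropy(new_target, [0.1, 0.1, 0.8])
print(old_target, new_target, round(loss, 4))
```

Training against such targets is supervised, which is one plausible reason the approach is cheaper than rerunning RL; the thesis itself should be consulted for the actual constraint definitions.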

(2) Incremental Learning Framework for Task-oriented Dialogue Systems 

To improve the online learning ability of task-oriented dialogue systems, this thesis proposes a novel incremental learning framework for designing such systems in the context of customer service, called the Incremental Dialogue System (IDS). IDS evaluates its confidence in giving a correct response through an uncertainty estimation module. If confidence is high, IDS responds to the user. Otherwise, human customer service staff take over the dialogue, and IDS learns from the human intervention through an online learning module. Experiments show that IDS is more robust to new user actions and can accumulate dialogue knowledge online. More importantly, the uncertainty estimation module guides humans to label only the most valuable data, so IDS attains better performance at a lower annotation cost.
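
The control flow described above (answer when confident, otherwise hand over to a human and learn from the reply) can be sketched as follows. The uncertainty estimate used here, a smoothed relative frequency over stored replies, is a simple stand-in and not the module from the thesis; the class name, the threshold value, and the sample utterance are all hypothetical.

```python
class IncrementalDialogueSystem:
    """Toy system that answers when confident and defers to a human otherwise."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold
        self.memory = {}      # utterance -> {reply: times a human gave it}
        self.human_calls = 0  # how many turns needed human annotation

    def estimate(self, utterance):
        # Stand-in uncertainty estimate: smoothed relative frequency of the
        # most common human reply, so rarely seen utterances score low.
        counts = self.memory.get(utterance, {})
        if not counts:
            return None, 0.0
        best = max(counts, key=counts.get)
        return best, counts[best] / (sum(counts.values()) + 1)

    def learn(self, utterance, reply):
        # Online update: store the human reply as new dialogue knowledge.
        counts = self.memory.setdefault(utterance, {})
        counts[reply] = counts.get(reply, 0) + 1

    def respond(self, utterance, ask_human):
        reply, confidence = self.estimate(utterance)
        if confidence >= self.threshold:
            return reply, "system"
        reply = ask_human(utterance)  # human agent takes over the turn
        self.human_calls += 1
        self.learn(utterance, reply)
        return reply, "human"


def human_agent(utterance):
    # Hypothetical human reply used only for this illustration.
    return "Please contact billing for refund timelines."


ids = IncrementalDialogueSystem(threshold=0.7)
sources = [ids.respond("How long do refunds take?", human_agent)[1] for _ in range(4)]
print(sources, ids.human_calls)
```

In this sketch the human is consulted only while confidence stays below the threshold, which mirrors the claim that annotation effort is spent on the turns the system is least sure about.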

(3) Task-oriented Dialogue System for Identity Fraud Detection 

To improve the knowledge modeling ability of the dialogue manager, this thesis takes identity fraud detection in loan applications as an example and explores knowledge modeling methods for dialogue management. For each loan applicant, the thesis constructs a knowledge graph (KG) from the applicant's personal information. Based on this knowledge graph, the thesis proposes a structured dialogue manager that consists of a knowledge graph based dialogue state tracker (KG-DST) and a hierarchical dialogue policy module (HDP). For each loan applicant, the KG-DST treats the embeddings of the KG nodes related to the applicant's identity information as the dialogue state. The HDP then explores anti-fraud strategies using hierarchical reinforcement learning. Experiments show that a dialogue system equipped with the structured dialogue manager identifies identity fraudsters more accurately and in fewer turns.
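
The structure described above can be illustrated with a small sketch in which the dialogue state is read off the nodes of a per-applicant graph and a two-level policy first chooses which node to verify and then how to probe it. Both policies here are hand-written stand-ins for the learned hierarchical RL policies, and the node names, scores, stopping threshold, and function names are hypothetical.

```python
def build_kg():
    """Hypothetical applicant graph: nodes are personal-information items,
    neighbors are items that can be cross-checked against each other."""
    return {
        "employer": {"neighbors": ["work_address"], "score": 0.0, "asked": False},
        "work_address": {"neighbors": ["employer"], "score": 0.0, "asked": False},
        "university": {"neighbors": [], "score": 0.0, "asked": False},
    }


def dialogue_state(kg):
    """KG-based state: one (score, asked) pair per node, in a fixed node order."""
    return [(kg[n]["score"], kg[n]["asked"]) for n in sorted(kg)]


def high_level_policy(kg):
    """Choose which node to verify next, or None to stop (stand-in for a learned policy)."""
    unasked = [n for n in sorted(kg) if not kg[n]["asked"]]
    return unasked[0] if unasked else None


def low_level_policy(kg, node):
    """Choose how to probe the selected node (stand-in for a learned policy)."""
    if kg[node]["neighbors"]:
        return f"ask_relation({node}, {kg[node]['neighbors'][0]})"
    return f"ask_detail({node})"


def update(kg, node, answered_correctly):
    # State tracking: write the outcome of the question back into the graph.
    kg[node]["asked"] = True
    kg[node]["score"] = 1.0 if answered_correctly else -1.0


def run_dialogue(kg, answers, fraud_threshold=-1.0):
    """Alternate the two policies until every node is checked or fraud is flagged."""
    turns = []
    while (node := high_level_policy(kg)) is not None:
        turns.append(low_level_policy(kg, node))
        update(kg, node, answers[node])
        if sum(v["score"] for v in kg.values()) <= fraud_threshold:
            return "fraud", turns
    return "genuine", turns


kg = build_kg()
answers = {"employer": False, "work_address": True, "university": True}
verdict, turns = run_dialogue(kg, answers)
print(verdict, turns, dialogue_state(kg))
```

Stopping as soon as the accumulated evidence crosses a threshold is one simple way to trade accuracy against dialogue length; the thesis instead learns this trade-off, and learned node embeddings would replace the scalar scores used here.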

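Keywords: natural language processing; task-oriented dialogue system; dialogue management; reinforcement learning; dialogue policy
Language: Chinese
Seven major directions, sub-direction classification: Natural Language Processing
Document type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/39122
Collection: Graduates_Doctoral Dissertations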
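Recommended citation
GB/T 7714
王唯康. Research on Dialogue Management Methods in Task-Oriented Dialogue Systems [D]. Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences, 2020.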
Files in this item
File name/size: Thesis.pdf (2587KB); document type: thesis; access type: restricted; license: CC BY-NC-SA