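Title: 汉语自然语言生成的理论、方法的研究及系统实现
Alternative title: Research on Theory and Methods of Chinese Language Generation and System Realization
Author: 吴华 (Wu Hua)
Degree type: Doctor of Engineering
Advisor: 黄泰翼 (Huang Taiyi)
Date: 2001-03-01
Degree-granting institution: Graduate School of the Chinese Academy of Sciences
Place of conferral: Institute of Automation, Chinese Academy of Sciences
Degree discipline: Pattern Recognition and Intelligent Systems
Keywords: natural language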
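Abstract: This thesis presents a comprehensive, in-depth study of the theory and methods of Chinese natural language generation, in particular domain-independent syntactic realization and text planning, and proposes a framework for a general-purpose Chinese generation system. On this basis, a generation system for spoken language translation and a Chinese discourse generation system for information query were designed and implemented.
The first contribution is a study of syntactic realization, especially general-purpose syntactic realization methods, resulting in a Chinese syntactic realization system that is independent of any specific domain. First, we adopt systemic functional grammar to build a Chinese generation grammar. Second, so that the system can generate both written and spoken language, we combine templates with feature-based rules, which keeps the syntactic realization system general and flexible while also improving generation efficiency. We then design an intermediate semantic model so that the Chinese syntactic realization system can be applied to different domains. The model is a semantic hierarchy in which each domain concept can find its corresponding superordinate semantic type; based on these semantic types, the syntactic realization system finds the corresponding syntactic rules and generates Chinese text suited to the specific domain, which addresses the problem of making the Chinese syntactic realization system independent of the application domain. In addition, in light of characteristics particular to Chinese, we discuss the insertion of Chinese function words during generation and propose corresponding strategies.
The second contribution is a study of applying Chinese generation methods in a spoken language translation system. First, based on the characteristics of spoken language translation systems, we design an interlingua suited to spoken language translation. The interlingua is grounded in speech act theory and accurately captures the communicative intentions of both parties in a dialogue, providing fairly ample knowledge for the subsequent Chinese generation. Second, for target-language (Chinese) generation we use a domain-specific microplanner together with a general-purpose Chinese syntactic generator; this combination lets the generation system handle phenomena peculiar to a specific domain while remaining easy to port to other domains. In implementing the system we also give preliminary consideration to the robustness of generation and adopt two strategies: first, the microplanner uses two kinds of rules, general rules and detailed rules; second, the Chinese generator relaxes the constraints on participant constituents. These strategies handle well some of the errors introduced during recognition and understanding.
The third contribution is a study of text planning methods and the construction of a Chinese discourse generation system. The essential difference between discourse generation and sentence generation is that discourse generation requires a well-developed text planner to organize the text. The main role of text planning is to determine the content to be generated and the logical relations among the pieces of content, and the planned content is in turn influenced by the user model. Accordingly, this thesis first builds a user model and, based on it, adopts a hybrid text planning strategy that combines the Schema method and the Process method, which, together with the syntactic realization system, is used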
Alternative abstract: The theory and methods of Chinese generation are investigated in this thesis, in particular methods for general Chinese surface realization and text planning, and a general Chinese generation framework is proposed. Using these methods, we have designed and implemented a Chinese generation system for spoken language translation and a Chinese text generation system for knowledge retrieval.
The first aspect of this thesis is to investigate methods of general surface realization and to develop a general Chinese syntactic realization system. First, systemic functional grammar is used to build the Chinese generation grammar. Second, to make the syntactic system suitable for generating both written and spoken Chinese, the system combines the template method with feature-based deep generation technology, which maintains both flexibility and efficiency. Finally, an intermediate semantic model is designed so that the generation system can be used in different domains. This model is a semantic hierarchy in which every specific domain concept has a corresponding super semantic class. According to these classes, the generation grammar can select suitable rules to generate Chinese texts. This resolves the mismatch between the general Chinese realizer and different application domains. In addition, given the characteristics of the Chinese language, we discuss the problem of adding function words during the generation process and develop corresponding methods to solve it.
The second aspect of this thesis is to investigate target language generation in spoken language translation systems. First, according to the features of spoken language translation, we developed an interlingua. This interlingua is based on speech act theory; it captures the intent of the speaker and therefore provides much information for target language generation. Second, a task-oriented microplanner and a general Chinese syntactic realizer are used for target language generation, which can handle some domain-specific problems and makes the system easy to port to other domains. Experiments showed that the generator performed well. In spoken language translation, robustness is important for target language generation. In our generator, two strategies were designed to deal with this problem. In the microplanner, two kinds of rules are used: general rules and specific rules. In the Chinese realizer, some constraints are relaxed to tolerate some errors. These methods handled some recognition and parsing errors very well.
The third aspect of this thesis is to investigate text planning methods and to build a Chinese text generation system. The essential difference between text generation and sentence generation is that a well-developed text planner is needed in text generation systems. The main task of the text planner is t
Call number: XWLW653
Other identifier: 653
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/5717
Collection: Graduates_Doctoral Dissertations
Recommended citation
GB/T 7714
吴华. 汉语自然语言生成的理论、方法的研究及系统实现 [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences, 2001.