SCESS: a WFSA-based automated simplified Chinese essay scoring system with incremental latent semantic analysis
Hao, Shudong (1); Xu, Yanyan (1); Ke, Dengfeng (2); Su, Kaile (3); Peng, Hengli (4)
Abstract: Writing in language tests is regarded as an important indicator for assessing language skills of test takers. As Chinese language tests become popular, scoring a large number of essays becomes a heavy and expensive task for the organizers of these tests. In the past several years, some efforts have been made to develop automated simplified Chinese essay scoring systems, reducing both costs and evaluation time. In this paper, we introduce a system called SCESS (automated Simplified Chinese Essay Scoring System) based on Weighted Finite State Automata (WFSA) and using Incremental Latent Semantic Analysis (ILSA) to deal with a large number of essays. First, SCESS uses an n-gram language model to construct a WFSA to perform text pre-processing. At this stage, the system integrates a Confusing-Character Table, a Part-Of-Speech Table, beam search and heuristic search to perform automated word segmentation and correction of essays. Experimental results show that this pre-processing procedure is effective, with a Recall Rate of 88.50%, a Detection Precision of 92.31% and a Correction Precision of 88.46%. After text pre-processing, SCESS uses ILSA to perform automated essay scoring. We have carried out experiments to compare the ILSA method with the traditional LSA method on the corpora of essays from the MHK test (the Chinese proficiency test for minorities). Experimental results indicate that ILSA has a significant advantage over LSA, in terms of both running time and memory usage. Furthermore, experimental results also show that SCESS is quite effective with a scoring performance of 89.50%.
Keyword: Automatic Essay Scoring ; Latent Semantic Analysis
WOS Headings: Science & Technology ; Social Sciences ; Technology
Indexed By: SCI ; SSCI
Funding Organization: Beijing Higher Education Young Elite Teacher Project (YETP0768) ; Fundamental Research Funds for the Central Universities (YX2014-18) ; National Natural Science Foundation of China (61103152 ; 61472369)
WOS Research Area: Computer Science ; Linguistics
WOS Subject: Computer Science, Artificial Intelligence ; Linguistics ; Language & Linguistics
WOS ID: WOS:000370862900005
Document Type: Journal article
Affiliation:
1.Beijing Forestry Univ, Sch Informat Sci & Technol, Beijing, Peoples R China
2.Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
3.Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld 4111, Australia
4.Beijing Language & Culture Univ, Inst Educ Measurement, Beijing, Peoples R China
Recommended Citation
GB/T 7714: Hao, Shudong, Xu, Yanyan, Ke, Dengfeng, et al. SCESS: a WFSA-based automated simplified Chinese essay scoring system with incremental latent semantic analysis[J]. NATURAL LANGUAGE ENGINEERING, 2016, 22(2): 291-319.
APA: Hao, Shudong, Xu, Yanyan, Ke, Dengfeng, Su, Kaile, & Peng, Hengli. (2016). SCESS: a WFSA-based automated simplified Chinese essay scoring system with incremental latent semantic analysis. NATURAL LANGUAGE ENGINEERING, 22(2), 291-319.
MLA: Hao, Shudong, et al. "SCESS: a WFSA-based automated simplified Chinese essay scoring system with incremental latent semantic analysis". NATURAL LANGUAGE ENGINEERING 22.2 (2016): 291-319.
Files in This Item:
File Name/Size | DocType | Version | Access | License
S1351324914000138a.p (1929KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA
File name: S1351324914000138a.pdf
Format: Adobe PDF
