With the rapid growth of the mobile internet, online shopping has made life considerably more convenient for consumers. Against this background, many e-commerce websites provide online review platforms where consumers share their purchase experiences and opinions on products. These reviews are of great value to both consumers and businesses. However, manually reading through large volumes of review text is an arduous task, and automatic opinion mining systems have emerged in response. In general, such systems summarize consumers' opinions by automatically analyzing review texts. This thesis focuses on mining opinion words (terms that indicate sentiment polarity) and opinion targets (typically attributes or functions of products). Conventional opinion mining methods often rely on syntactic dependency parsing to capture the modification relations between opinion words and opinion targets, an approach that can have many limitations. This thesis proposes several opinion mining methods that overcome the shortcomings of conventional syntax-based systems. The main contents and contributions are as follows. (1) This thesis proposes a two-stage method to improve conventional syntax-based opinion mining. Previous work often uses many syntactic patterns to mine opinion words and opinion targets. However, some of these patterns are of low quality and may introduce many noise terms. To alleviate this issue, we incorporate the syntactic patterns into a Sentiment Graph and apply a random walk on the graph to estimate the confidence of each pattern. Low-quality patterns then receive low confidence, which improves extraction accuracy. In addition, previous work tends to rank candidates by term frequency, which may introduce high-frequency noise terms and lose low-frequency opinion terms. 
To solve this problem, we refine opinion targets with a semi-supervised binary classifier that does not rely on term frequencies to rank candidates. Experimental results show that the first stage effectively improves precision and that the second stage significantly reduces the adverse effects of term frequency. (2) This thesis introduces a monolingual word alignment model that replaces the syntactic parser for capturing opinion relations. Current syntactic parsers are easily degraded by the informal expressions found in online reviews. To tackle this problem, instead of using syntactic parsers, this thesis employs an unsupervised monol...
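The pattern-confidence step in contribution (1) can be illustrated with a minimal sketch. The abstract does not specify how the Sentiment Graph is built or which random walk variant is used, so everything below is an assumption made for illustration: the graph is a weighted undirected graph linking candidate terms to the syntactic patterns that extracted them, the walk is a random walk with restart from a small set of trusted seed terms, and the names `random_walk_confidence`, `P_good`, and `P_noisy` are hypothetical.

```python
from collections import defaultdict


def random_walk_confidence(edges, seeds, restart=0.15, iterations=50):
    """Score graph nodes with a random walk with restart (illustrative only).

    edges: iterable of (node_a, node_b, weight) tuples with positive weights,
           e.g. (term, pattern, co-occurrence count). Assumed input format.
    seeds: trusted nodes the walker jumps back to with probability `restart`.
    """
    # Build a symmetric weighted adjacency map.
    neighbors = defaultdict(dict)
    for a, b, w in edges:
        neighbors[a][b] = neighbors[a].get(b, 0.0) + w
        neighbors[b][a] = neighbors[b].get(a, 0.0) + w

    nodes = list(neighbors)
    seed_nodes = {s for s in seeds if s in neighbors}
    if not seed_nodes:
        raise ValueError("at least one seed must appear in the graph")

    # Restart distribution: uniform over the seed nodes.
    prior = {n: (1.0 / len(seed_nodes) if n in seed_nodes else 0.0) for n in nodes}
    score = dict(prior)

    for _ in range(iterations):
        nxt = {n: restart * prior[n] for n in nodes}
        for n in nodes:
            total = sum(neighbors[n].values())
            for m, w in neighbors[n].items():
                # Move score along each edge in proportion to its weight.
                nxt[m] += (1.0 - restart) * score[n] * w / total
        score = nxt
    return score


# Toy graph: P_good mostly links plausible opinion terms,
# P_noisy mostly links unrelated terms.
toy_edges = [
    ("great", "P_good", 3.0),
    ("screen", "P_good", 3.0),
    ("battery", "P_good", 2.0),
    ("thing", "P_noisy", 2.0),
    ("yesterday", "P_noisy", 2.0),
    ("great", "P_noisy", 1.0),
]
scores = random_walk_confidence(toy_edges, seeds=["great", "screen"])
for pattern in ("P_good", "P_noisy"):
    print(pattern, round(scores[pattern], 4))
```

In this toy setting the pattern that is strongly connected to the seed terms ends up with the higher score, which matches the intuition in the abstract that low-quality patterns should receive low confidence. A real system would need a principled choice of seeds, edge weights, and a threshold for discarding patterns, none of which are given here.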