With the rapid development of Internet technologies, a huge amount of open-source data has become available on the Web. This mass of data makes information retrieval considerably more difficult, so automatic methods that effectively discover useful information are in great demand. Learning to rank, a promising approach that applies machine learning to automatic knowledge discovery, has demonstrated great potential for automating the retrieval and discovery process and has become a hot research topic in recent years. The listwise approach is the most advanced learning-to-rank method and overcomes the limitations of previous methods. This thesis focuses on the listwise approach, and its contributions are as follows. First, we review existing work on listwise approaches and analyze their limitations in practical web search applications. To address these limitations, we put forward a method that obtains weak ranking results from single features and then aggregates the weak ranking lists into a strong ranking list. In addition, we propose a more efficient weak ranking method based on KNN to explore the “hidden order” implied in single features. Second, to address the specific problem of constructing a strong ranking model by aggregating weak models, we propose the FeatureRank algorithm, which is inspired by the idea of boosting. FeatureRank obtains the strong ranking model by optimizing the value of an evaluation measure on the ranking results. To eliminate the “vibration” that occurs during the FeatureRank process, we further propose an improved algorithm, BL-FeatureRank, which achieves a better ranking model and better results. Third, we propose another solution for constructing a strong ranking model by aggregating weak models: the DiffRank algorithm. DiffRank converts ranking orders into ranking scores in order to learn the ranking model.
We prove theoretically that DiffRank has the distinct advantage of bounded learning error. Our experiments also show that the performance of DiffRank is independent of how the hidden order is produced. We conduct experiments to verify the effectiveness of each proposed algorithm and compare their ranking results with those of state-of-the-art listwise approaches. The experimental results show that our algorithms demonstrate great improvement and ef...