The rapid growth of Internet technology has made communication between users convenient, and instant messaging tools such as QQ, MSN and ICQ play an important role in this. Besides editing and displaying chat text, most current instant messengers can transmit cartoon expressions (emoticons) and multimedia items such as videos and images to enrich the user experience. However, software that effectively associates these multi-modal elements with one another and keeps them synchronized is rare. We believe such a function could greatly improve communication between users who speak different languages. Imagine a Chinese tourist who wants to know the bus route to the Merlion statue: if the system can intelligently return images such as “Merlion”, “Bus” and “Map”, service providers who cannot speak Chinese can understand the tourist’s question and give accurate answers and directions.

To achieve this goal, cross-media association, especially the association between text and images, is the key issue in our study. We discuss the problem from two aspects. The first is how to associate keywords with images so that the images accurately reflect the content of the users’ conversation; here we investigate “verb + noun” phrases. The second is how to organize and display the images related to the keywords and to sentiment information, so that users facing a language barrier can communicate fluently.

This thesis investigates semantic cross-media association in multimedia chatting. The main contributions of this dissertation are summarized as follows: 1. We propose concept detection based on “verb + noun” phrases, addressing the problem that traditional noun-based concept detection cannot reflect the content of the verb-object phrases used in spoken language.
The basic principle of our method is to split the slope (the weight vector) in the expression of the Support Vector Machine (SVM) classification plane into two parts: one part represents the individual action, and the other stands for the noun object shared by all concepts in the group. The classifiers in a group can then be trained jointly to obtain the detectors. 2. To verify the validity of our method, we implement a visual user chatting platform. When users input text, the system will utilize the Natural Language Pro...
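
The decomposition described in contribution 1 can be made concrete with a short worked formulation. This is a sketch under assumptions, not the thesis's exact model: it uses the common shared-plus-specific form of a jointly trained linear soft-margin SVM, and the symbols $w_0$, $v_t$, $b_t$, $\xi_{ti}$, $m_t$, $\lambda_0$ and $\lambda_1$ are introduced here purely for illustration.

```latex
% Concept group: T "verb + noun" concepts that share one noun object.
% The detector for action t splits the weight vector (the "slope") into
% a shared noun part w_0 and an action-specific part v_t:
\[
  f_t(x) = (w_0 + v_t)^{\top} x + b_t , \qquad t = 1, \dots, T .
\]
% Joint training over the whole group (assumed soft-margin objective);
% concept t has m_t training examples (x_{ti}, y_{ti}) with y_{ti} in {-1, +1}:
\[
  \min_{w_0,\, v_t,\, b_t,\, \xi_{ti}} \;
  \lambda_0 \lVert w_0 \rVert^2
  + \frac{\lambda_1}{T} \sum_{t=1}^{T} \lVert v_t \rVert^2
  + \sum_{t=1}^{T} \sum_{i=1}^{m_t} \xi_{ti}
\]
\[
  \text{s.t.} \quad
  y_{ti} \bigl( (w_0 + v_t)^{\top} x_{ti} + b_t \bigr) \ge 1 - \xi_{ti},
  \qquad \xi_{ti} \ge 0 .
\]
```

Under this reading, a large $\lambda_1$ relative to $\lambda_0$ shrinks every $v_t$ toward zero, so the group collapses toward a single noun detector, while a small ratio lets each "verb + noun" detector behave almost independently.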