CASIA OpenIR > Graduates > Master's Theses
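Alternative Title: Design and Implementation of Multi-Modality Music Search Engine
Thesis Advisor: 胡包钢 (Hu Baogang)
Degree Grantor: Graduate University of the Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: Search Engine; Inverted Index; Query By Humming; Data Retrieval; Open Source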
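Abstract: With today's highly developed Internet, search technology lets people quickly find useful information in massive amounts of data, and it therefore plays an increasingly important role in modern life. Text search is already very mature, whereas multimedia search mostly still relies on searching text attributes and annotations; search based on the multimedia content itself is still being actively explored.

This thesis studies and develops a music search engine system. The system offers several ways to search for music: not only traditional search by text attributes (such as song title, singer, author, and lyrics) but also search by melody. For melody search, the system provides two melody input interfaces: playing on a piano keyboard and humming.

The main work and contributions of this thesis are as follows.

A complete music search system was designed and developed. The system uses a Web-based B/S (browser/server) architecture and covers every stage: collecting, analyzing, and indexing music information, and searching it online. Three user search interfaces were implemented: text, humming, and keyboard playing.

For melody indexing, a "melody snippet" method is proposed, which unifies the indexing and searching of melodies with the way text is processed. An inverted index of melodies built on melody snippets significantly improves search efficiency.

For humming analysis, the method for analyzing the hummed signal was improved, making recognition of the hummed melody more robust and more accurate.

For data collection, a rule-based, targeted Web data crawling system was developed. It crawls data from Web sites efficiently and accurately with minimal network traffic, and automatically classifies, formats, and stores the data.

The system is developed on the Java platform, runs on a Linux server, and uses open source software for its runtime environment. The minimal client is highly cross-platform: it runs on Windows and Linux, and it can be ported to embedded devices with few changes.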
Other Abstract: With the development of the Internet, search technology plays an increasingly important role in people's lives by helping users find useful information in massive amounts of data. Text-based search is very mature, while in multimedia search most current technologies still depend on text attributes and text annotations. Researchers are actively pursuing content-based multimedia search.

This thesis studies and develops a music search engine system. The system provides multiple modalities for music search: it includes not only search by text attributes (song name, singer name, author name, lyrics, etc.) but also search by melody. For melody-based search, two interfaces are provided for melody input: piano playing and humming.

The major work and contributions are as follows.

A complete music search system was designed and developed, based on a Web interface and a B/S architecture. The system implements all stages of collecting, analyzing, and indexing music information and of searching it online. It implements three user interfaces: text keyword, humming, and piano playing.

For melody indexing, this thesis proposes the "Contour Snippet" method, which unifies the indexing and searching procedures for melody and text content. An inverted index based on Contour Snippets remarkably improves the efficiency of music search.

For humming analysis, this thesis improves the analysis method for the humming signal, increasing the robustness and accuracy of humming melody recognition.

For data retrieval, this thesis develops a rule-based, targeted Web data retrieval system. It crawls the Web efficiently and accurately with minimal network traffic, then categorizes the data and saves it in a well-defined format.

The system is developed on the Java platform and runs on a Linux server. All software platforms are built with open source software. The minimized client has good cross-platform compatibility: it works on both Windows and Linux, and it can also run on embedded devices with little modification.
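The abstract does not say how a Contour Snippet is formed, so the following is only a minimal sketch of the general idea of indexing melodies the way text is indexed. It assumes (these are illustrative assumptions, not the thesis's method) that a melody is reduced to an up/down/same contour string and cut into overlapping fixed-length snippets, which then serve as the "words" of an ordinary inverted index. All function names, the snippet length `n`, and the sample note data are hypothetical.

```python
from collections import defaultdict

def contour(pitches):
    """Encode a pitch sequence (e.g. MIDI note numbers) as a U/D/S string.

    Assumption: only the direction of each step is kept, so the encoding
    does not change when the melody is transposed to another key.
    """
    symbols = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            symbols.append("U")
        elif cur < prev:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)

def snippets(contour_str, n=4):
    """Split a contour string into overlapping length-n snippets (n-grams)."""
    return [contour_str[i:i + n] for i in range(len(contour_str) - n + 1)]

def build_index(songs, n=4):
    """Inverted index: snippet -> set of ids of songs that contain it."""
    index = defaultdict(set)
    for song_id, pitches in songs.items():
        for snip in snippets(contour(pitches), n):
            index[snip].add(song_id)
    return index

def search(index, query_pitches, n=4):
    """Rank songs by how many distinct query snippets they contain."""
    scores = defaultdict(int)
    for snip in set(snippets(contour(query_pitches), n)):
        for song_id in index.get(snip, ()):
            scores[song_id] += 1
    # Highest score first; ties broken by song id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample data: opening notes of two tunes as MIDI numbers.
SONGS = {
    "twinkle": [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60],
    "ode_to_joy": [64, 64, 65, 67, 67, 65, 64, 62, 60, 60, 62, 64, 64, 62, 62],
}

if __name__ == "__main__":
    index = build_index(SONGS)
    # The query is the first tune's opening, played a whole tone higher.
    print(search(index, [62, 62, 69, 69, 71, 71, 69]))
```

Because each snippet is looked up directly in the index, a query touches only the songs that share at least one snippet with it instead of scanning every melody, which is the kind of efficiency gain an inverted index offers for text as well.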
Other Identifier: 200528014628031
Document Type: Thesis
GB/T 7714
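陈路佳 (Chen Lujia). 多途径音乐搜索引擎设计与实现 (Design and Implementation of Multi-Modality Music Search Engine) [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of the Chinese Academy of Sciences, 2008.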
