Content-Based Audio Humming Recognition and Retrieval System
Published: 2018-10-04 22:17
[Abstract]: In the era of digital audio and video, multimedia such as digital film and television, digital music, and digital animation has become a large part of everyday life. In databases, multimedia files such as songs are indexed by title, author, singer, and similar text fields, yet people usually remember a song's melody far better than its title, author, or singer. As multimedia databases grow, the number of text index entries (titles, authors, and so on) grows with them, and nobody can remember them all. Content-based query therefore becomes both important and necessary. This thesis describes the development of a humming recognition system for digital audio together with the related theoretical research, discusses in detail the key techniques of each stage of humming recognition, and implements a humming recognition demo system.
The work was carried out on two platforms: a PC and the Altera DE2 embedded platform. We first implemented humming recognition over a database of 20 songs on both the PC and the DE2 board, ran extensive experiments and parameter tuning, and addressed feature extraction, noise removal, and feature recognition; the DE2 board ultimately achieved a relatively high recognition rate and good running time. Subsequent development was mainly on the PC, where we built a partial humming recognition system on a database of roughly 30 pieces of music and studied a base-pitch normalization algorithm and an improved DTW (dynamic time warping) algorithm. Based on the prior condition of "head and tail proximity" (首尾靠近), we propose a partial-matching recognition algorithm that runs DTW twice, once forward and once backward, and analyze its time complexity, effectiveness, and compatibility in depth.
The results are fairly satisfactory: on the PC platform with 52 music segments, the partial-matching algorithm reaches a search success rate of about 85%, a large improvement over the 48% recognition rate obtained without partial-matching support. The forward-backward DTW method also costs little in running time, taking only about 1.5 times as long as the whole-sequence matching method, and it remains fully compatible with whole-sequence matching, which meets practical requirements.
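The abstract does not give the details of the forward-backward DTW algorithm, so the following is only a minimal sketch of one plausible reading, not the author's implementation. It assumes that melodies are lists of pitch values in semitones, that the "head and tail proximity" prior means the hummed fragment starts near the beginning or ends near the end of a stored segment, and that partial matching is done with an open-end DTW run once forward and once on the reversed sequences. The function names `open_end_dtw`, `full_dtw`, `forward_backward_dtw`, and `search` are hypothetical, and base-pitch normalization is left out.

```python
INF = float("inf")


def _dtw_last_row(query, ref):
    """Return the last row of the accumulated-cost matrix.

    Entry j (1-based) is the cheapest cost of aligning all of `query`
    to the prefix ref[0:j], with both sequences starting together.
    """
    m = len(ref)
    prev = [INF] * (m + 1)
    prev[0] = 0.0  # empty query aligned to empty prefix
    for q in query:
        cur = [INF] * (m + 1)
        for j in range(1, m + 1):
            dist = abs(q - ref[j - 1])  # local cost: pitch difference
            cur[j] = dist + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev


def full_dtw(query, ref):
    """Whole-sequence DTW: the query must cover the entire reference."""
    return _dtw_last_row(query, ref)[len(ref)] / len(query)


def open_end_dtw(query, ref):
    """DTW with the start anchored and the end free (assumed variant)."""
    return min(_dtw_last_row(query, ref)[1:]) / len(query)


def forward_backward_dtw(query, ref):
    """Two passes: anchored at the head, then anchored at the tail."""
    forward = open_end_dtw(query, ref)
    backward = open_end_dtw(query[::-1], ref[::-1])
    return min(forward, backward)


def search(query, database):
    """Return the name of the stored melody with the lowest cost."""
    return min(database, key=lambda name: forward_backward_dtw(query, database[name]))


if __name__ == "__main__":
    # Illustrative data only, not taken from the thesis.
    database = {
        "scale_up": [60, 62, 64, 65, 67, 69, 71, 72],
        "scale_down": [72, 71, 69, 67, 65, 64, 62, 60],
    }
    hummed_tail = [69, 71, 72]  # fragment taken from the end of "scale_up"
    print(search(hummed_tail, database))
    print(full_dtw(hummed_tail, database["scale_up"]))
```

In this sketch a whole-sequence query still scores zero against its own melody in the forward pass, which is one way compatibility with whole-sequence matching could be kept. The two passes make it roughly twice the work of a single DTW, so the reported factor of about 1.5 suggests the thesis uses an optimization that this sketch does not attempt to reproduce.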
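[Degree-granting institution]: Shanghai Jiao Tong University
[Degree level]: Master's
[Year degree granted]: 2008
[Classification number]: TP391.42
Article ID: 2252086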
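[Similar literature]
Related master's theses: top 1
1 Chen Xu (陈旭); Content-Based Audio Humming Recognition and Retrieval System [D]; Shanghai Jiao Tong University; 2008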
Article link: https://www.wllwen.com/wenyilunwen/dongmansheji/2252086.html