Research on Humming-Based MIDI Audio Retrieval Algorithms

Published: 2018-06-29 14:47

Topic of this paper: query by humming + MIDI; source: Shandong University of Science and Technology, 2017 master's thesis

[Abstract]: With the explosive growth of music databases, traditional text-based audio retrieval has become highly inconvenient for users. Humming-based MIDI music retrieval is a content-based music retrieval method: the user does not need to know the lyrics and can find the desired song simply by humming its melody. The goal of this thesis is to build complete humming-based MIDI audio retrieval algorithms and to test their feasibility. The main research contents are as follows:
1. Audio feature extraction. The time-domain, frequency-domain and cepstral features of audio signals are analyzed, several basic representations of the melodic contour are introduced, and the methods for extracting features from audio signals are described.
2. A humming retrieval algorithm based on HMM. A note-based HMM is built, which avoids explicit note segmentation. The pitch is transposed, and the transposed pitch sequence is used as the pitch feature of the melody, which overcomes the differences caused by each singer's humming habits and vocal range. The algorithm is tested on a test set of 500 hummed clips and reaches a top-3 recognition rate of 78%.
3. A humming retrieval algorithm based on deep learning. A three-layer DBN is used to obtain the key features of each song, so that the melody data describe the song's melody accurately and the instability of the melody features is resolved. A clustering-based method is used for nearest-neighbor retrieval of the melody features. A music library of 200 songs in MIDI format is built, and 42 hummed query files in WAV format are used to verify the algorithm, which reaches a top-3 recognition rate of 81.0%. A comparison experiment between the DBN-based and an LSH-based humming retrieval algorithm is also included and shows the better performance of the DBN-based algorithm.
The core of both algorithms consists of melody feature extraction and melody feature matching, which is also where each retrieval algorithm concentrates its effort. The techniques for extracting melody features from the MIDI music database and from the hummed query are studied in depth for each algorithm.
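The abstract does not give the exact pitch conversion or the scoring procedure, so the following is only a minimal sketch under stated assumptions. It assumes that the pitch track is a list of fundamental-frequency values in Hz with non-positive values marking unvoiced frames, that pitch is converted to semitones on the standard MIDI scale (A4 = 440 Hz = note 69), and that differences in vocal range are removed by subtracting the mean pitch. The function names `hz_to_semitones`, `key_invariant_pitch` and `top_k_rate` are hypothetical and are not taken from the thesis.

```python
import math


def hz_to_semitones(f0_hz):
    """Map a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(f0_hz / 440.0)


def key_invariant_pitch(f0_track_hz):
    """Return a pitch sequence that does not depend on the key of the hum.

    Assumption: values <= 0 mark unvoiced frames and are skipped.
    Subtracting the mean removes any uniform transposition, so the same
    melody hummed higher or lower gives the same sequence.
    """
    semitones = [hz_to_semitones(f) for f in f0_track_hz if f > 0]
    if not semitones:
        return []
    mean = sum(semitones) / len(semitones)
    return [s - mean for s in semitones]


def top_k_rate(ranked_results, truths, k=3):
    """Fraction of queries whose true song appears among the first k results."""
    hits = sum(
        1 for ranked, truth in zip(ranked_results, truths) if truth in ranked[:k]
    )
    return hits / len(truths)
```

With this definition of the top-3 rate, the reported 81.0% on 42 queries is consistent with 34 correct retrievals (34/42 ≈ 0.8095), and 78% on 500 clips corresponds to 390 correct retrievals.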
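[Degree-granting institution]: Shandong University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TN912.3;TP391.3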

[References]