Music Information Retrieval Based on Deep Neural Networks

Published: 2018-09-01 20:49

[Abstract]: Music classification is essentially a pattern recognition problem with two main parts: feature extraction and classification. Audio data is typically highly redundant and high-dimensional, so feature extraction is needed to reduce the redundancy and dimensionality of the signal effectively. Feature extraction analyzes the audio signal to obtain a set of feature parameters that characterize how the acoustic signal varies over time. The feature parameters produced by different extraction methods directly affect the performance of the subsequent classification, which makes feature extraction a key step in the music classification task. Deep learning, as a new feature extraction technique, has achieved a series of successes in speech signal processing. Drawing on those results and combining music classification with deep learning theory, this thesis studies how to use the strong feature extraction capability of deep learning to discover acoustic features better suited to music classification. The thesis first introduces the concept and common methods of music information retrieval, then the principles of deep learning and its typical models. It then investigates how to apply deep neural networks to music information retrieval. It proposes an algorithm that uses a deep belief network to classify music by emotion and, drawing on convolutional neural networks, a deep belief network that incorporates convolution operations. In the experiments, features extracted by the deep belief network are compared with MFCC features, and the former are shown to achieve better results on the music emotion classification task.
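The role of feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the thesis's method and computes neither MFCCs nor deep belief network features. It only assumes NumPy and shows how a raw waveform is cut into frames and reduced to a small per-frame feature vector. The function names, frame length, hop size, and number of bands are hypothetical choices made for illustration.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def log_band_energies(signal, frame_len=1024, hop=512, n_bands=20):
    """Return an (n_frames, n_bands) matrix of log band energies."""
    # Window each frame to limit spectral leakage.
    frames = frame_signal(signal, frame_len, hop) * np.hanning(frame_len)
    # Power spectrum per frame: shape (n_frames, frame_len // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Group frequency bins into equal-width linear bands (not mel-spaced).
    bands = np.array_split(power, n_bands, axis=1)
    energies = np.stack([b.sum(axis=1) for b in bands], axis=1)
    # Small constant avoids log(0) for silent bands.
    return np.log(energies + 1e-10)

# One second of a 440 Hz tone at an assumed 22050 Hz sampling rate.
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
features = log_band_energies(tone)
print(features.shape)
```

Each row of `features` describes one short time frame, so the matrix tracks how the spectrum changes over time while using 20 values per frame instead of 1024 raw samples. A learned extractor such as a deep belief network would take the place of the fixed band summation in this sketch.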
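[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2015
[Classification number]: TN912.3;TP183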
