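Weibo Big Data Analysis Using a Deep Boltzmann Machine Based on Multimodal Learning
Published: 2018-03-22 04:08
Topic: deep learning  Entry point: Weibo data  Source: Hainan University, 2016 master's thesis  Document type: degree thesis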
[Abstract]: Deep learning is an important field within machine learning and pattern recognition, and it applies well to language recognition, computer vision, and natural language processing. In the current era of big data, it can serve as a driver of big data analysis: the larger the volume of data, the better a deep model can be trained. At the core of the deep learning architecture is the restricted Boltzmann machine. Common Boltzmann machine structures include two-layer, symmetrically connected stochastic neural networks without self-feedback and single-layer feedback networks, covering the general Boltzmann machine, the semi-restricted Boltzmann machine, and the restricted Boltzmann machine. In recent years, analyzing users' psychological stress from Weibo data has attracted wide attention. Current methods for analyzing Weibo data mainly target plain text and ignore image data, even though images also reflect a user's psychological stress state to a large extent. The challenge in analyzing psychological stress from multimodal Weibo data is how to give the features of two different modalities a unified representation. To address this problem, the thesis proposes using a deep Boltzmann machine (DBM) model based on multimodal learning to process and analyze Weibo image and text data. Within the model, low-level features of text and images are transformed into sparse, high-level abstract features, and a joint layer then represents the fused features of the two modalities. In addition, the model finds that the input features of the two modalities are highly nonlinear at the low level. The experimental results demonstrate the effectiveness of the proposed method.
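The architecture described in the abstract (one pathway per modality, with a joint layer on top) can be illustrated with a minimal sketch. This is not the thesis's implementation: the layer sizes, the random untrained weights, the toy input vectors, and all function names are assumptions made for illustration. The sketch performs only a single bottom-up pass, whereas a real DBM is undirected and is trained and inferred with both bottom-up and top-down signals. No sparsity constraint is applied here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one layer (untrained, illustrative)."""
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def layer_up(v, layer):
    """Activation probabilities of the units above, given the layer below."""
    weights, biases = layer
    return [
        sigmoid(biases[j] + sum(v[i] * weights[i][j] for i in range(len(v))))
        for j in range(len(biases))
    ]


def pathway_up(v, layers):
    """Propagate one modality's features through its own stack of layers."""
    for layer in layers:
        v = layer_up(v, layer)
    return v


def joint_layer(h_img, h_txt, img_to_joint, txt_to_joint):
    """Joint units sum the input from the top of both modality pathways."""
    w_img, biases = img_to_joint
    w_txt, _ = txt_to_joint  # only one bias vector is used for the joint layer
    return [
        sigmoid(
            biases[k]
            + sum(h * w_img[i][k] for i, h in enumerate(h_img))
            + sum(h * w_txt[i][k] for i, h in enumerate(h_txt))
        )
        for k in range(len(biases))
    ]


rng = random.Random(0)

# Hypothetical sizes: 32 image features and 50 text features, fused into 4 joint units.
image_layers = [init_layer(32, 16, rng), init_layer(16, 8, rng)]
text_layers = [init_layer(50, 16, rng), init_layer(16, 8, rng)]
img_to_joint = init_layer(8, 4, rng)
txt_to_joint = init_layer(8, 4, rng)

# Toy inputs: real-valued image features and a sparse binary bag-of-words vector.
image_feats = [rng.random() for _ in range(32)]
text_feats = [1.0 if rng.random() < 0.1 else 0.0 for _ in range(50)]

joint = joint_layer(
    pathway_up(image_feats, image_layers),
    pathway_up(text_feats, text_layers),
    img_to_joint,
    txt_to_joint,
)
print(len(joint))
```

Keeping the two pathways separate until the top lets each modality build its own higher-level features before fusion, which is the motivation the abstract gives for the joint layer.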
[Degree-granting institution]: Hainan University
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP393.092; TP311.13
Article ID: 1646975
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1646975.html