
Research on CTR Prediction Based on Deep Learning

Published: 2018-09-04 06:27
[Abstract]: With the rapid development of the Internet, cloud computing, the Internet of Things and related technologies, the volume of network data is growing sharply, and the information society has gradually entered the "big data" era. Online advertising delivery systems are built on big data: they use machine learning to analyze and mine massive amounts of user behavior and push suitable advertisements to users in real time. Click-through rate (CTR) prediction is the core technology of such a system and matters greatly for its operating efficiency. Accurate CTR prediction is key to sound e-commerce marketing decisions; it directly affects users' online experience and the operating costs of Internet companies. CTR prediction therefore has high commercial and research value. To meet the accuracy and timeliness requirements of online advertising delivery systems, this thesis studies feature selection, feature learning, classification prediction and applied techniques from two angles, shallow learning and deep learning. Using a real online advertising data set as the experimental object, shallow learning models and deep learning models are built separately. To validate the deep learning models thoroughly, comprehensive comparative experiments from multiple perspectives are used to demonstrate the potential of deep learning. The research work covers the following five aspects:
(1) Data processing and feature engineering. Starting from the real data set, the thesis explores how class imbalance affects the prediction model and studies resampling techniques for imbalanced data.
(2) A comparative study of shallow learning and deep learning theory and applied techniques, motivated by the highly nonlinear nature of the data features. A deep learning model is built to overcome the limited capacity of shallow models on complex problems. The experiments show that, compared with shallow learning, deep learning improves the prediction effect by about 21%, a clear advantage.
(3) To eliminate the influence of class imbalance on the prediction model, an improved deep neural network (DNN) model is proposed, called SDNN (Deep Neural Network based on Sampling). Using GPU-based parallel computing, model construction and algorithm implementation verify that SDNN shortens the training time of the prediction model by about 73.28% without degrading the prediction effect, substantially improving the computational efficiency of DNN. Given the system's demands for accuracy and timeliness, SDNN is shown to be a more efficient prediction method for big data.
(4) A study of how the Sigmoid and ReLU activation functions affect the DNN prediction model. Building and implementing DNN and SDNN models with each activation shows that, compared with Sigmoid, ReLU is better suited to deeper network models, and that ReLU-based DNN and SDNN are better suited to modeling complex problems.
(5) To avoid the limitations of training a single SDNN and to improve the model's generalization ability, a sensitivity analysis of the key dropout parameter is carried out.
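The resampling idea behind items (1) and (3) can be illustrated with a minimal sketch. The abstract does not say which sampling scheme SDNN uses, so random undersampling of the majority (non-click) class is an assumption here, and the function name `undersample_majority`, the `ratio` parameter and the toy data are all hypothetical. The sketch only shows why sampling can shrink the training set, and hence training time, on a CTR-like data set where clicks are rare.

```python
import numpy as np

def undersample_majority(X, y, ratio=1.0, seed=0):
    """Randomly drop non-click rows so that negatives ~= ratio * positives.

    Assumption: y holds binary labels (1 = click, 0 = no click) and
    clicks are the minority class. This is NOT the thesis's SDNN code.
    """
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    # Keep at most `ratio` negatives per positive, never more than exist.
    n_keep = min(len(neg_idx), int(round(ratio * len(pos_idx))))
    kept_neg = rng.choice(neg_idx, size=n_keep, replace=False)
    idx = np.concatenate([pos_idx, kept_neg])
    rng.shuffle(idx)  # avoid feeding all positives first
    return X[idx], y[idx]

# Toy data: 10,000 impressions with roughly a 5% click rate (made up).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(10_000, 8))
y = (data_rng.random(10_000) < 0.05).astype(int)

X_bal, y_bal = undersample_majority(X, y, ratio=1.0)
print(len(y), "->", len(y_bal), "rows; click rate", y_bal.mean())
```

With `ratio=1.0` the balanced set keeps every click and an equal number of non-clicks, so a network trained on it sees far fewer rows per epoch. Whether prediction quality is preserved, as the abstract reports for SDNN, depends on the data and has to be checked experimentally.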
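The claim in item (4) that ReLU suits deeper networks better than Sigmoid is usually explained by the standard vanishing-gradient argument, sketched below. This is textbook reasoning, not the thesis's own derivation; the symbols $\sigma$, $z$ and the depth $n$ are introduced here for illustration only.

```latex
% Sigmoid and its derivative: the derivative never exceeds 1/4.
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
\sigma'(z) = \sigma(z)\bigl(1 - \sigma(z)\bigr) \le \frac{1}{4}

% ReLU and its derivative: exactly 1 on the active side.
\mathrm{ReLU}(z) = \max(0, z), \qquad
\mathrm{ReLU}'(z) =
\begin{cases}
1, & z > 0 \\
0, & z < 0
\end{cases}

% Along one backpropagation path through n layers, the activation
% derivatives multiply. For Sigmoid this factor is bounded by
\prod_{l=1}^{n} \sigma'(z_l) \le \left(\frac{1}{4}\right)^{n},
\qquad \text{e.g. } n = 10:\ 4^{-10} \approx 9.5 \times 10^{-7}
```

For ReLU the same product equals 1 along any path whose units are all active, so the activation itself does not shrink the gradient as depth grows. The weight matrices also scale the gradient in both cases, so this is only one part of the picture.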
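[Degree-granting institution]: Chongqing Technology and Business University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP181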




Article ID: 2221286


Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/2221286.html

