Design and Implementation of a Display Advertising Click-Through Rate Prediction Platform
Published: 2018-09-11 14:37
[Abstract]: As the Internet industry matures and its user base grows, the marketing value of Internet advertising keeps rising. If the natural advantages of Internet advertising can be exploited, and click-through rate (CTR) prediction can correctly track a user's preference for a given ad, the benefits are multi-dimensional: higher conversions for advertisers, a better experience for users, and more revenue for publishers. This thesis therefore studies the display advertising CTR prediction problem in computational advertising and, in the course of that research, builds a display advertising CTR prediction platform that provides a complete machine learning solution to the problem. The thesis systematically describes how the platform was built. It first analyzes the significance and current state of research on CTR prediction, which leads to the research content of the thesis. Next, following the steps of a machine learning solution to CTR prediction, it carries out a business analysis of the problem and specifies the platform's functional and performance requirements: for the massive data of computational advertising, the platform supports using multiple models independently or in combination, offers two training modes (offline batch learning and online learning), and gives users a one-stop service from feature engineering, model training, model evaluation, and model prediction through to result analysis. The thesis then presents the detailed design and implementation of the platform based on the requirements analysis; the design and implementation of feature engineering and model training are the focus of the work.
Because the data for CTR prediction usually comes from real online service logs, the thesis uses systematic feature engineering to uncover effective features buried in large amounts of noise, and aims to achieve the best prediction performance with the fewest features. For model training, it selects a logistic regression model, suited to discrete high-dimensional features, and a factorization machine model, suited to sparse features, and fuses each of them with an upstream GBDT model through the Stacking ensemble algorithm to improve prediction performance. After the initial implementation, functional and performance tests are used to find problems in the platform, and further optimization and iteration complete its construction. Finally, the thesis summarizes its content and outlines directions for improving the platform.
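The abstract names the factorization machine as the model chosen for sparse features but gives no formula. As a minimal sketch of what a second-order factorization machine computes, assuming the standard formulation with a bias, per-feature linear weights, and k-dimensional latent vectors (the names `fm_score`, `predict_ctr`, `w0`, `w`, and `v` are illustrative and not taken from the thesis), the score for a sparse input can be evaluated in time linear in the number of non-zero features:

```python
import math
from typing import Dict, List


def fm_score(x: Dict[int, float], w0: float, w: List[float],
             v: List[List[float]]) -> float:
    """Raw second-order FM score for a sparse input x (feature index -> value).

    score = w0 + sum_i w[i]*x[i] + sum_{i<j} <v[i], v[j]> * x[i] * x[j]
    The pairwise term uses the identity
    sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f ((sum_i v[i][f]*x[i])^2 - sum_i (v[i][f]*x[i])^2),
    so only the non-zero features are visited.
    """
    linear = sum(w[i] * xi for i, xi in x.items())
    k = len(v[0])  # latent dimension (assumed equal for every feature)
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * xi for i, xi in x.items())
        s_sq = sum((v[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def predict_ctr(x: Dict[int, float], w0: float, w: List[float],
                v: List[List[float]]) -> float:
    """Map the raw FM score to a click probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-fm_score(x, w0, w, v)))


# Hypothetical toy parameters: 3 features, latent dimension 2.
w = [0.1, 0.2, 0.3]
v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
x = {0: 1.0, 2: 1.0}  # only features 0 and 2 are active
p = predict_ctr(x, 0.0, w, v)
```

With one-hot encoded log features, `x` holds only a handful of active indices per impression, which is why the sparse dictionary form is used here. How the thesis actually trains the model, or feeds it GBDT outputs, is not specified in the abstract, and this sketch does not attempt to reproduce that.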
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.52
Article ID: 2236975
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2236975.html