Kepler联盟广告数据统计与分析平台的设计与实现
发布时间:2018-05-08 16:16
本文选题:互联网广告 + Storm ; 参考:《南京大学》2017年硕士论文
【摘要】:互联网广告经过近20多年的发展,取得了显著的成果。2015年10月9日,美团网和大众点评网正式合并成立新的公司新美大,当前,新美大在020领域已成为最大的线上平台和场景入口,包括团购、外卖、广告、休闲娱乐、亲子、结婚、丽人等等业务,几乎涵盖所有生活服务场景,其中包含各类数据:跨行业种类数据、交易支付数据、用户浏览数据等等。联盟广告业务部是美团点评广告平台部的一个子部门,它定位于外部媒体,为美团点评商户采买流量,提供推广服务。基于此,联盟广告需要设计一套针对本部门的广告数据统计与分析平台,以数据图表的方式展现广告数据,它能够作多维度的离线统计和实时监控,为后续广告的精细化的投放、定向、决策提供数据依据,同时可以及时发现、规避和解决可能潜藏的问题和风险。本文主要介绍Kepler联盟广告数据统计与分析平台的设计与实现。离线数据模型使用Hive、Shell和Python处理,通过内部调度器去实现任务调度,实时数据模型使用Storm和Kafka的组合,数据库采用MySQL和Redis,后端业务框架采用Spring、Spring MVC、Mybatis。采用Hive处理离线数据,相比Hadoop,不需要书写复杂的MapReduce函数进行编程,更多的关注HQL,减轻开发难度和成本。而实时数据处理使用Storm和Kafka,因为它非常适合广告日志的处理。最终项目正常运行并上线,对平台的运营和广告实时竞价的决策具有非常重要的意义。
[Abstract]:After nearly 20 years of development, Internet advertising has achieved remarkable results. On October 9, 2015, Meituan and Dianping formally merged to form a new company Meituan-Dianping. Meituan-Dianping has become the largest online platform and scene portal in the 020 field, including group buying, take-out, advertising, leisure and entertainment, parenthood, marriage, beauty and so on, covering almost all life service scenarios. It contains all kinds of data: cross-industry data, transaction payment data, user browsing data and so on. Alliance Advertising Business Department is a sub-department of Meituan Dianping Advertising platform Department, which is located in external media, buys traffic for Meituan Dianping merchants, and provides promotion services. Based on this, alliance advertising needs to design a set of advertising data statistics and analysis platform for the department, display advertising data in the form of data chart, it can make multi-dimensional off-line statistics and real-time monitoring. It can provide the data basis for the meticulous placement, orientation and decision making of the follow-up advertisement, and at the same time, it can find, avoid and solve the possible hidden problems and risks in time. This paper mainly introduces the design and implementation of Kepler Alliance Advertising data Statistics and Analysis platform. The offline data model uses Hiveer Shell and Python to process the task, the real-time data model uses the combination of Storm and Kafka, the database uses MySQL and Rediss, and the back-end business framework uses Spring Spring MV C to implement task scheduling. The real-time data model uses the combination of Storm and Kafka, and the back-end business framework uses Spring Spring MV C to implement task scheduling. Using Hive to deal with offline data, compared with Hadoop, it does not need to write complex MapReduce functions for programming, and pays more attention to HQLs, and reduces the difficulty and cost of development. Real-time data processing uses Storm and Kafka because it is very suitable for advertising log processing. It is of great significance for platform operation and advertising real-time bidding decision to run the project normally and go online.
【学位授予单位】:南京大学
【学位级别】:硕士
【学位授予年份】:2017
【分类号】:TP311.52
,
本文编号:1862091
本文链接:https://www.wllwen.com/wenyilunwen/guanggaoshejilunwen/1862091.html