
Design and Implementation of a Spark-Based Online Advertising Transaction Billing System

Published: 2018-03-22 05:16

  Topic: distributed systems  Focus: ad billing  Source: Harbin Institute of Technology, 2016 master's thesis  Document type: degree thesis


[Abstract]: In recent years the online advertising market has grown rapidly, and the major Internet companies are all building their own online advertising trading platforms. The billing system is an important and indispensable part of the online ad transaction workflow. Based on actual business requirements, this thesis designs and implements an online advertising transaction billing system to support the billing needs of an ad trading platform. The system is developed in Java and Scala, and the work is divided into two parts: ad anti-cheating and ad billing. The anti-cheating part identifies cheating ads; the system does not charge for them, which protects advertisers' interests. The thesis proposes cheating-detection rules based on statistical methods to filter cheating ads; because purely statistical rules can produce overly arbitrary verdicts, it also proposes a scoring algorithm that computes the likelihood that an ad is cheating, so that cheating ads are filtered smoothly. The billing part is implemented on Spark, an in-memory, scalable, fault-tolerant distributed computing framework. It processes ad data in a distributed environment, filters cheating ads, computes charge amounts, and generates charge logs, taking full advantage of the efficiency and fault tolerance of distributed systems to provide a scalable, highly available billing service. To prevent excessive load on a single node from slowing down an entire job, the thesis proposes sharding large data sets so that the data volume of each shard stays within a reasonable range and the data can be distributed evenly across nodes. To address the performance bottleneck in network access, asynchronous interfaces are used to improve system performance. Because all the data handled by the system relates to money, a system failure directly causes billing losses. To minimize losses and mitigate risk, the system monitors multiple metrics and raises alerts promptly when anomalies occur. Testing and actual production operation show that the system filters cheating ads effectively and processes ad data at the hundred-million scale every day; its designed capacity exceeds the average production load, so it can absorb short-lived data spikes. Throughout processing, important data metrics are monitored and key operations are logged, which makes troubleshooting easier if an anomaly occurs. The system is scalable, fault-tolerant, and highly available; it supports the ad billing requirements well and has high practical value.
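The abstract does not say how the sharding is carried out, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It assumes records arrive as (key, value) pairs, caps each shard at a hypothetical `max_shard_size`, and spreads the resulting shards over nodes round-robin. The names `shard_records`, `assign_to_nodes`, and `max_shard_size` are illustrative and do not come from the thesis or from Spark.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

ShardKey = Tuple[Hashable, int]


def shard_records(
    records: Iterable[Tuple[Hashable, object]],
    max_shard_size: int = 3,  # hypothetical cap, chosen only for illustration
) -> Dict[ShardKey, List[object]]:
    """Split the records of each key into shards of at most max_shard_size."""
    seen_per_key: Dict[Hashable, int] = defaultdict(int)
    shards: Dict[ShardKey, List[object]] = defaultdict(list)
    for key, value in records:
        # The n-th record of a key lands in shard n // max_shard_size,
        # so a "hot" key is broken up instead of piling onto one shard.
        shard_index = seen_per_key[key] // max_shard_size
        shards[(key, shard_index)].append(value)
        seen_per_key[key] += 1
    return dict(shards)


def assign_to_nodes(
    shards: Dict[ShardKey, List[object]], num_nodes: int
) -> List[List[ShardKey]]:
    """Distribute shard keys over nodes round-robin, in sorted key order."""
    nodes: List[List[ShardKey]] = [[] for _ in range(num_nodes)]
    for i, shard_key in enumerate(sorted(shards)):
        nodes[i % num_nodes].append(shard_key)
    return nodes


if __name__ == "__main__":
    # One key with many records ("hot") and one with few ("cold").
    records = [("hot", i) for i in range(7)] + [("cold", 0)]
    shards = shard_records(records, max_shard_size=3)
    for node_id, shard_keys in enumerate(assign_to_nodes(shards, num_nodes=2)):
        print(node_id, shard_keys)
```

Because every shard is bounded in size, no single node receives all the records of a heavily loaded key, which is the effect the abstract describes. In a real Spark job the same effect would more likely come from the framework's own partitioning facilities than from hand-written grouping like this.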
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP311.52


Article ID: 1647227


Article link: https://www.wllwen.com/wenyilunwen/guanggaoshejilunwen/1647227.html

