
Real-Time Monitoring System for Order Big Data Based on Storm

Published: 2018-04-10 11:31

  Topics: big data + real-time monitoring; Source: Donghua University, master's thesis, 2017


[Abstract]: In the big data era, collecting massive volumes of order data effectively and in real time helps enterprises extract the information they need from that data and set the direction for developing and improving their business in a more effective, orderly way. By reusing and further summarizing and processing this data, an enterprise can arrive at designs that better match user needs and better suit the market. This system uses Storm as an efficient, secure, real-time big data processing engine, while the NoSQL databases Elasticsearch and MongoDB provide efficient storage and querying of massive data. This thesis designs and implements a real-time monitoring system for order big data based on Storm, Kafka, and the NoSQL databases Elasticsearch and MongoDB. The system stores its main data in Elasticsearch and keeps some configuration parameters and timestamp data in MongoDB; HDFS serves as the file system; the code is written in Scala; and the distributed message queue Kafka connects Storm topologies with different functions, which improves the system's reliability. During data processing, the names of the operations executed, their time points, and some intermediate results are recorded in the system log to help resolve system errors and improve performance. Important results produced by the platform are presented to users on Web pages as a variety of charts and tables. The presentation site includes a search engine that can find target data by keyword or by statements written in specific rules, and it also supports quick lookup by clicking icons and tags. The whole system is deployed on a distributed cluster and offers high real-time performance, high efficiency, high fault tolerance, and scalability. The results site is feature-rich, with a clear and simple interface and a good user experience. The platform can monitor changes in massive data in real time and, through intelligent methods, mine valuable information from large-scale, complex data of mixed structure, thereby creating economic and social value.
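
The data flow described in the abstract (one topology feeding another through a message queue, results stored for search, and each step written to a system log) can be illustrated with a small sketch. This is not the thesis's code, which is written in Scala on Storm; it is a dependency-free Python illustration under stated assumptions. In-memory deques stand in for Kafka topics, plain dicts stand in for Elasticsearch and MongoDB, and every name here (`raw_orders`, `clean_orders`, `cleaning_topology`, `indexing_topology`, the order fields) is hypothetical.

```python
import json
import time
from collections import deque

# In-memory stand-ins (assumptions, not the thesis's components):
topics = {"raw_orders": deque(), "clean_orders": deque()}  # plays the role of Kafka topics
es_index = {}      # plays the role of the Elasticsearch index holding the main order data
mongo_state = {}   # plays the role of MongoDB documents for config and timestamps
system_log = []    # operation name, time point, and a small intermediate result


def log_step(operation, detail):
    """Record what was done, when, and on what, as the abstract describes."""
    system_log.append({"op": operation, "ts": time.time(), "detail": detail})


def cleaning_topology():
    """First topology: parse raw messages, reject malformed ones, republish the rest."""
    while topics["raw_orders"]:
        raw = topics["raw_orders"].popleft()
        try:
            order = json.loads(raw)
            order_id = order["order_id"]
            order["amount"] = float(order["amount"])
            order["ts"] = int(order["ts"])
        except (ValueError, KeyError, TypeError):
            log_step("clean.reject", raw)
            continue
        topics["clean_orders"].append(json.dumps(order))
        log_step("clean.accept", order_id)


def indexing_topology():
    """Second topology: consume cleaned orders, store them, and track progress."""
    while topics["clean_orders"]:
        order = json.loads(topics["clean_orders"].popleft())
        es_index[order["order_id"]] = order
        mongo_state["last_indexed_ts"] = order["ts"]
        log_step("index.store", order["order_id"])


# Example run with made-up messages, one of them malformed.
topics["raw_orders"].extend([
    '{"order_id": "A1", "amount": "19.90", "ts": 1}',
    'not json',
    '{"order_id": "A2", "amount": 5, "ts": 2}',
])
cleaning_topology()
indexing_topology()
```

Because the two stages share only the intermediate queue, either one can stop or restart without losing messages already published, which is one plausible reading of how the queue between topologies helps reliability. In a real deployment the queue would also be durable and replicated, which this sketch does not model.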
[Degree-granting institution]: Donghua University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP277





Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1731020.html

