
Design and Implementation of the Data Warehouse Management Module for a Mobile Reading Platform

Published: 2019-05-26 20:42
[Abstract]: As information technology develops, enterprises face increasingly rich and complex data, and their computer systems generate ever more of it. The maturing of data warehouse technology offers an effective solution for enterprise data management. As a data warehouse is built and used, and as massive volumes of data are collected, loaded, and processed, the organization of data assets and the complexity of day-to-day access have become a focus of attention. Data consumers face multiple heterogeneous data sources, indicators whose definitions and interpretations differ, inconsistent statistical criteria, and a gap between what business staff understand and what developers actually implement. Business and technical staff must deal with multiple systems, business term definitions that have not kept pace with the evolution of the business and its systems, and the lack of a standard platform for recording this information. Building a sound data warehouse management system is therefore a key problem for enterprises facing complex data assets.

The Hadoop Hive data warehouse platform of the China Mobile Mobile Reading Base has been running stably for nearly a year. Development work has been almost entirely at the warehouse's application layer, serving business application needs, while management and maintenance of the warehouse itself have not formed a complete feature set, so users must obtain the information they need by manually reading through large amounts of text. To support systematic management and maintenance of the whole platform going forward, this thesis designs and implements a data warehouse management module suited to the platform's own characteristics. A well-designed management module not only helps IT, technical, and maintenance staff manage and use warehouse resources better, but also does much to help ordinary business staff make flexible use of the massive data the warehouse provides.

The management system developed for the Mobile Reading Base's Hadoop Hive data warehouse provides three functions: metadata management, task scheduling monitoring, and data lineage analysis. Metadata management lets users efficiently find the data they care about, and is also the basis for task scheduling monitoring and data lineage analysis. Task scheduling monitoring lets users obtain the running status of Hive on the production network in real time and see the information and running status of upstream and downstream nodes. Data lineage analysis provides users with a data map, so they can easily see where data comes from and where it goes, and it supplies reliable data to support later optimization of the warehouse structure.

The thesis is organized as follows. Chapter 1 is the introduction, covering the research background, research content, and research significance of the project, as well as its concrete application in the mobile reading project. Chapter 2 briefly introduces the relevant background and technologies, including the mobile reading BI (business intelligence) platform, data warehouses, metadata, Oozie, and the current state of data warehouse applications. Chapter 3 presents the requirements analysis and overall design of the data warehouse management module for the mobile reading platform. Chapter 4 describes the detailed design and implementation of the module, including the implementation of each key component. Chapter 5 covers system testing and analysis: each function is tested, and the situation before and after the system was put into use is compared. Chapter 6 summarizes the work and outlines directions for further research.
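The data map that the abstract attributes to data lineage analysis can be pictured as a directed graph whose edges run from a source table to the tables derived from it. The following is a minimal sketch under that assumption only; it is not the thesis's implementation, and the class name, method names, and table names are all hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Table-level lineage as a directed graph (hypothetical sketch)."""

    def __init__(self):
        self._downstream = defaultdict(set)  # source table -> derived tables
        self._upstream = defaultdict(set)    # derived table -> source tables

    def add_edge(self, source, target):
        """Record that `target` is computed from `source`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _reach(start, adjacency):
        """Breadth-first search: every node reachable from `start`."""
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adjacency.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def sources_of(self, table):
        """All direct and indirect upstream tables (where the data comes from)."""
        return self._reach(table, self._upstream)

    def consumers_of(self, table):
        """All direct and indirect downstream tables (where the data goes)."""
        return self._reach(table, self._downstream)


# Hypothetical table names, for illustration only.
graph = LineageGraph()
graph.add_edge("ods_read_log", "dw_user_read_daily")
graph.add_edge("ods_book_info", "dw_user_read_daily")
graph.add_edge("dw_user_read_daily", "rpt_reading_summary")

print(sorted(graph.sources_of("rpt_reading_summary")))
print(sorted(graph.consumers_of("ods_read_log")))
```

Keeping both an upstream and a downstream index makes either direction of the trace a single traversal, which is also the kind of lookup a scheduling monitor needs when showing a task's upstream and downstream nodes.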
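[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP311.52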



