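Research and Implementation of HBase Auxiliary Index Technology for Real-Time Traffic Flow Data
Published: 2018-06-12 18:25
Topic: traffic flow data + HBase; Source: North China University of Technology, 2017 master's thesis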
[Abstract]: With the rapid development of urban intelligent transportation and the popularity of ride-hailing apps and shared bicycles, traffic data is growing exponentially. Sound analysis of traffic data can predict people's travel habits and popular routes, which helps urban traffic management and supports revenue generation for related Internet companies. HBase has an inherent advantage in storing large-scale data and is widely used in the transportation field. However, HBase can only build an index on the row key; it cannot directly build efficient multi-dimensional indexes or non-row-key indexes. As a result, it performs poorly on spatio-temporal queries over traffic data and cannot answer non-row-key queries quickly. Addressing these problems requires building auxiliary indexes for HBase. Based on the characteristics of real-time traffic flow data and extensive survey work, this thesis studies and designs two auxiliary index techniques for HBase, spatio-temporal indexes and secondary indexes, and implements a query system based on HBase auxiliary indexes. The main research contents are as follows. First, to address the three-dimensional spatio-temporal nature of traffic data and improve HBase's spatio-temporal query performance on it, the thesis uses Geohash dimensionality reduction to study and design three HBase-based spatio-temporal indexes, and analyzes their advantages, disadvantages, and applicable scenarios. Second, to address the poor performance of non-row-key queries in HBase and make HBase's support for traffic data queries more flexible, the thesis uses the idea of the inverted index to study and design two HBase secondary indexes, and analyzes them. Third, a query system based on HBase auxiliary indexes is implemented. The system implements the best scheme from each of the two kinds of auxiliary index designed in the thesis, and also provides SQL parsing, index construction, index matching, and index management, which simplifies users' query operations on the HBase database. Fourth, experiments verify that the proposed index schemes effectively improve query performance on real-time traffic flow data, and verify the effectiveness of index management.
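The two ideas named in the abstract, Geohash dimensionality reduction and an inverted-index style secondary index, can be illustrated with a minimal Python sketch. The abstract gives no schema details, so this is not the thesis's actual design: the row key layouts, the `_` and `|` separators, the function names, and the in-memory sorted list standing in for an HBase table are all assumptions made for illustration.

```python
import bisect

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def geohash_encode(lat, lon, precision=6):
    """Interleave longitude/latitude bisection bits into a base-32 string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, value = [], 0, 0
    use_lon = True  # Geohash starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | bit
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def spatiotemporal_rowkey(lat, lon, ts, precision=6):
    """Hypothetical layout: <geohash>_<zero-padded epoch seconds>."""
    return f"{geohash_encode(lat, lon, precision)}_{ts:010d}"


def index_rowkey(column, value, main_rowkey):
    """Hypothetical index-table layout: <column>=<value>|<main row key>."""
    return f"{column}={value}|{main_rowkey}"


def prefix_scan(sorted_keys, prefix):
    """Emulate a row-key prefix scan over a lexicographically sorted key list."""
    start = bisect.bisect_left(sorted_keys, prefix)
    hits = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        hits.append(key)
    return hits


# Two records in the same Geohash cell, one elsewhere (sample values only).
main_keys = sorted([
    spatiotemporal_rowkey(57.64911, 10.40744, 1497254400),
    spatiotemporal_rowkey(57.64911, 10.40744, 1497254460),
    spatiotemporal_rowkey(39.9042, 116.4074, 1497254400),
])
same_cell = prefix_scan(main_keys, geohash_encode(57.64911, 10.40744) + "_")

# Secondary index on a hypothetical non-row-key column "plate".
index_keys = sorted(index_rowkey("plate", p, k)
                    for p, k in zip(["A1", "B2", "A1"], main_keys))
plate_hits = [k.split("|", 1)[1] for k in prefix_scan(index_keys, "plate=A1|")]
```

Because nearby points share a Geohash prefix, a spatial range query becomes a prefix scan over sorted row keys, and a non-row-key lookup becomes a prefix scan over the index table followed by reads of the recovered main row keys. A real deployment would also need to handle neighbouring Geohash cells at query boundaries and keep the index table consistent with writes, which this sketch leaves out.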
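[Degree-granting institution]: North China University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: U495; TP311.13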
Article ID: 2010651
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2010651.html