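TMS: A Novel Algorithm for Top-k Queries with Multi-Dimensional Selection on Massive Data
Published: 2018-03-18 06:26
Topic: TMS algorithm  Focus: sorted lists  Source: Journal of Computer Research and Development (《计算机研究与发展》), 2017, Issue 03  Paper type: journal article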
[Abstract]: Top-k is an important query type in many applications: it returns the small number of results the user is interested in from a potentially huge data space. Top-k queries usually carry specified multi-dimensional selection conditions. Analysis shows that existing algorithms cannot efficiently process top-k queries with multi-dimensional selection on massive data. This paper proposes TMS (top-k with multi-dimensional selection), an algorithm based on sorted lists that efficiently computes top-k results with multi-dimensional selection on massive data. TMS uses a hierarchically structured selection-attribute grid to partition the original table horizontally; the tuples of each partition are stored in a column-oriented layout, and the list for each measure attribute is sorted in descending order of attribute value. Given a multi-dimensional selection condition, TMS uses the selection-attribute grid to determine the relevant grid cells, which effectively reduces the number of tuples that must be read. A double-sorting method is proposed to evaluate the multi-dimensional selection progressively, and an effective pruning operation is proposed to discard candidate tuples that fail the selection condition or the score requirement. Experimental results show that TMS outperforms existing algorithms.
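The two ideas the abstract combines, restricting the scan to grid cells that overlap the selection condition and stopping early on score-sorted data, can be illustrated with a minimal Python sketch. This is not the TMS algorithm itself: it assumes a flat uniform grid instead of a hierarchical one, row-wise storage instead of column-oriented lists, and a score defined as the sum of the measure attributes. The names `build_grid`, `top_k`, and `cell_width` are hypothetical and do not come from the paper.

```python
import heapq
from collections import defaultdict
from itertools import product


def build_grid(rows, sel_dims, cell_width):
    """Partition rows horizontally by the grid cell of their selection attributes.

    rows: iterable of (selection_values, measure_values) pairs of numeric tuples.
    Assumption: the score is the sum of the measure values; each cell keeps
    its rows sorted by descending score.
    """
    grid = defaultdict(list)
    for sel, meas in rows:
        cell = tuple(int(sel[d] // cell_width) for d in range(sel_dims))
        grid[cell].append((sum(meas), sel, meas))
    for cell_rows in grid.values():
        cell_rows.sort(key=lambda r: r[0], reverse=True)
    return grid


def top_k(grid, ranges, k, cell_width):
    """Return the k highest-scoring rows whose selection values fall in ranges.

    ranges: one inclusive (lo, hi) pair per selection attribute.
    """
    # Only cells that overlap the selection ranges are visited.
    cell_ranges = [
        range(int(lo // cell_width), int(hi // cell_width) + 1)
        for lo, hi in ranges
    ]
    heap = []  # min-heap holding the best k (score, sel, meas) rows so far
    for cell in product(*cell_ranges):
        for score, sel, meas in grid.get(cell, ()):
            # Rows are score-sorted, so once a row cannot beat the current
            # k-th best score, the rest of this cell can be skipped.
            if len(heap) == k and score <= heap[0][0]:
                break
            # Discard candidates that fail the multi-dimensional selection.
            if all(lo <= v <= hi for v, (lo, hi) in zip(sel, ranges)):
                if len(heap) < k:
                    heapq.heappush(heap, (score, sel, meas))
                else:
                    heapq.heapreplace(heap, (score, sel, meas))
    return sorted(heap, reverse=True)
```

A call such as `top_k(build_grid(rows, 2, 5), [(0, 4), (0, 4)], 2, 5)` reads only the cells that intersect the query box, so rows elsewhere in the table are never touched. The paper's hierarchical grid and double-sorting method refine both steps beyond what this sketch shows.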
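[Author affiliation]: School of Computer Science and Technology, Harbin Institute of Technology
[Funding]: National Key Basic Research and Development Program of China ("973" Program) (2012CB316200); National Natural Science Foundation of China (61502121, 61402130, 61272046, 61190115, 61173022, 61033015); Natural Science Foundation of Shandong Province (ZR2013FQ028); Shandong Province Science and Technology Major Project (2015ZDXX0210B02)
[Classification number]: TP311.13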
Article ID: 1628363
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1628363.html