
Application of a Large-Scale Streaming Data Clustering Method to Traffic Hot-Spot Analysis

Published: 2018-10-09 17:06
[Abstract]: To improve the efficiency of traffic hot-spot analysis on large-scale streaming data, a two-stage clustering method for streaming data is proposed. In the first stage, an improved Canopy algorithm performs coarse clustering and produces macro-clusters. In the second stage, the K-means algorithm performs fine clustering, guided by the number of macro-clusters and the positions of their cluster centers from the coarse stage, to produce more accurate micro-cluster results. In the experiments, the two-stage method was applied to positioning data from Beijing taxis, and the clustering results were visualized with heat maps overlaid on electronic maps. The final heat-map analysis makes the areas and routes with frequent taxi activity directly visible, and these agree with everyday travel experience. The experimental results show that the algorithm can cluster streaming data in real time and that its output can be queried and analyzed over any time window, providing a useful theoretical reference for real-time analysis of traffic activity, traffic planning, and congestion management.
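The two-stage idea in the abstract can be illustrated with a minimal batch sketch. Several things here are assumptions, not the paper's method: the paper's improvement to Canopy is not described in this listing, so the plain Canopy procedure is used instead; the streaming and time-window handling is omitted; points are treated as planar coordinates with Euclidean distance; and the function names (`canopy_centers`, `kmeans`, `two_stage_cluster`) and thresholds `t1`, `t2` are hypothetical.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def canopy_centers(points, t1, t2):
    """Stage 1 (plain Canopy, not the paper's improved variant).

    t1 is the loose threshold and t2 the tight one (t1 > t2). Returns one
    center per canopy, so the number of centers doubles as the cluster count.
    """
    if t1 <= t2:
        raise ValueError("t1 must be larger than t2")
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining.pop(0)
        members = [seed]
        survivors = []
        for p in remaining:
            d = math.dist(seed, p)
            if d < t1:
                members.append(p)    # loosely close: joins this canopy
            if d >= t2:
                survivors.append(p)  # not tightly close: may seed a later canopy
        centers.append(mean(members))
        remaining = survivors
    return centers


def kmeans(points, initial_centers, max_iter=100):
    """Stage 2: Lloyd's K-means started from the given centers."""
    centers = list(initial_centers)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty group keeps its previous center.
        updated = [mean(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers


def two_stage_cluster(points, t1, t2):
    """Coarse Canopy pass fixes k and the seeds; K-means then refines them."""
    seeds = canopy_centers(points, t1, t2)
    centers = kmeans(points, seeds)
    labels = [nearest(p, centers) for p in points]
    return centers, labels


if __name__ == "__main__":
    # Made-up planar points forming two well-separated groups.
    sample = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.2), (8.0, 8.0), (8.1, 7.9)]
    centers, labels = two_stage_cluster(sample, t1=3.0, t2=1.5)
    print(centers, labels)
```

The point of the design is that K-means does not need a hand-picked `k` or random initialization: both come from the coarse pass. For real GPS data, a geodesic distance (or a projection to planar coordinates) would likely be a better fit than Euclidean distance on raw longitude and latitude.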
[Affiliation]: School of Transportation Management, Dalian Maritime University
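[Funding]: Jointly supported by the National Natural Science Foundation of China (71271034, 61473053), the Science and Technology Research Project of the Education Department of Liaoning Province (L2014203), the Liaoning Provincial Social Science Planning Fund (L14BGL012), and the Fundamental Research Funds for the Central Universities (3132016046)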
[Classification number]: TP311.13; U495


Article ID: 2260104


Article link: https://www.wllwen.com/kejilunwen/daoluqiaoliang/2260104.html

