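User Important Location Identification and Regional Function Discovery Based on Communication Data (基于通信数据的用户重要位置识别及区域功能发现)
Published: 2018-01-16 16:35
Keywords: user important location identification and regional function discovery based on communication data. Source: Zhejiang University, 2017 master's thesis. Document type: degree thesis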
[Abstract]: With the spread of smartphones, more and more people carry their phones at all times, while the phones themselves keep gaining functions and integrating more sensors. Using a smartphone generates a large volume of usage records. Among these, activities that rely on operator services, such as voice calls, text messages, and mobile data traffic, produce corresponding log records on the operator side. The logs include the time, location, and service type of each use, but not the content of the communication. The greatest advantage of this kind of data is that it covers a very large user population and a very large activity space; although it is somewhat sparse in both time and space, it still provides enough information for studying user behavior. This thesis proposes a scheme for identifying users' important locations from operator-side communication data. The scheme relies on no prior knowledge, is purely data-driven, and still performs well even though the communication data is sparse. First, we extract features from each user's communication data at each location to characterize the user's behavior there. We then run a cluster analysis on these per-location behavior features. The clustering shows that users' communication behavior is very sparse at most locations but dense at a few, and that the behavior patterns at these dense locations differ from one another. We call locations with dense behavior and distinct behavior patterns special locations. Based on an analysis of the special locations and on information provided by friendly users, we conclude that a user's important locations lie near these special locations. Second, using ground-truth home and work locations provided by friendly users, we build a classifier that identifies home and work locations; for 90% of locations, the prediction error is less than 1600 meters. We also analyze the behavior features associated with these two important locations. The results indicate that users are more likely to be at home between 0:00 and 8:00 and show no pronounced behavior there, and more likely to be at work between 12:00 and 20:00, where their behavior leans toward communication such as calls and text messages. Finally, we extend the home and work classifier to all users on the network and use the predictions to analyze the distribution of residential and office functions in Shanghai. Work areas in Shanghai are more concentrated than residential areas, but overall the two distributions largely coincide.
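The abstract does not say which features the thesis extracts or how the prediction error is computed, so the following is only a minimal sketch under stated assumptions. It assumes a hypothetical record layout of (user_id, location_id, hour_of_day, service_type), uses a 24-bin hourly histogram as one plausible per-location feature, and measures error as great-circle distance. All function names are illustrative and do not come from the thesis.

```python
import math
from collections import defaultdict


def location_features(records):
    """Build a 24-bin hourly activity histogram per (user, location).

    Each record is assumed to be (user_id, location_id, hour_of_day,
    service_type); this layout is hypothetical, not taken from the thesis.
    """
    feats = defaultdict(lambda: [0] * 24)
    for user, loc, hour, _service in records:
        feats[(user, loc)][hour] += 1
    return dict(feats)


def share_in_window(hist, start, end):
    """Fraction of a location's records that fall in hours [start, end)."""
    total = sum(hist)
    return sum(hist[start:end]) / total if total else 0.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def share_within(errors_m, threshold_m):
    """Fraction of prediction errors strictly below the threshold."""
    return sum(e < threshold_m for e in errors_m) / len(errors_m)


# Toy records, invented purely for illustration.
records = [
    ("u1", "cellA", 2, "data"),
    ("u1", "cellA", 6, "data"),
    ("u1", "cellA", 22, "voice"),
    ("u1", "cellB", 14, "voice"),
    ("u1", "cellB", 15, "sms"),
]
feats = location_features(records)
night_share = share_in_window(feats[("u1", "cellA")], 0, 8)    # 2 of 3 records
day_share = share_in_window(feats[("u1", "cellB")], 12, 20)    # 2 of 2 records
```

In this toy data, cellA is mostly active in the 0:00 to 8:00 window and cellB only in the 12:00 to 20:00 window, matching the home and work patterns the abstract reports. A statement such as "90% of locations have an error under 1600 meters" corresponds to `share_within(errors, 1600) >= 0.9`, where `errors` holds `haversine_m` distances between predicted and ground-truth locations.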
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.56;TN929.5
Thesis record: Wang Chenghao (王程浩); User Important Location Identification and Regional Function Discovery Based on Communication Data [D]; Zhejiang University; 2017
Article ID: 1433944
Link: https://www.wllwen.com/kejilunwen/xinxigongchenglunwen/1433944.html