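User Characteristic Mining Based on Large-Scale Smartphone Sensing Data
Published: 2018-05-30 21:32
Topics: smartphones + user characteristic mining; Source: Zhejiang University, doctoral dissertation, 2017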
[Abstract]: Smartphones have become an indispensable part of daily life. As the primary users of their phones, people generate large amounts of personal historical data through frequent use. These data fall into four categories: 1) location signals, i.e., geographic position obtained via GPS, cell towers, WiFi, and similar sources; 2) usage signals, which record when, where, and for what purpose a user used the phone; 3) social signals, implicit in CDRs (call detail records), GPS, WiFi/Bluetooth connections, and contact lists; 4) personal behavior signals, captured by sensors such as the accelerometer, gyroscope, and camera. Because a smartphone is usually used by a single person, these data carry a great deal of personal information, such as gender, age, occupation, and marital status, and to some extent reflect the user's habits and interests. Smartphones therefore offer a new information channel for inferring user attributes and characteristics and for understanding users.

Understanding users through smartphone sensing data has commercial value and can also help users understand themselves better. First, it can be used to improve devices, applications, and services. For example, knowing a user's interests and basic attributes makes it possible to personalize applications such as web search and recommendation, which in turn increases commercial returns. Second, the data recorded by a phone can help users understand themselves more completely and objectively, including aspects of themselves they were not aware of. Human memory is limited, whereas a phone's record is effectively unlimited and can capture behavior continuously over long periods. A fuller self-understanding can help users change unhealthy habits and thereby improve their quality of life.

Based on real smartphone sensing data and grounded in theoretical research, this thesis studies users' mobility, life patterns, interests, preferences, and habits from three angles: location information, App installation information, and App usage information. Because mobility information reveals the basic "when" and "where" of a user's daily life, we first infer a dynamic user attribute, mobility, from anonymized WiFi scan lists. Second, we mine static user attributes, such as age, gender, interests, and preferences, from App installation lists. Finally, we use App usage information to understand the similarities and differences among users and to discover distinct user groups. The specific research content and contributions are as follows:

(1) User mobility pattern analysis based on anonymized WiFi scan lists. We infer users' movement trajectories from anonymized WiFi scan lists and, on that basis, discover their lifestyles. After extracting stay locations from the WiFi scan lists, we use graph theory to build a mobility graph for each user that describes his or her movement trajectory. Within the mobility graph, we infer the user's activity areas by community detection. Based on the discovered activity areas, we define two metrics, activeness and diversity, to measure a user's mobility. In addition, we identify two important places, home and workplace, and learn the user's habits at each, for example the average time spent at home, how active the user is in going out at night, and working hours on weekdays and on weekends. We validate our method on the Device Analyzer dataset, which contains detailed phone usage information from more than 17,000 users.

(2) User attribute mining based on App installation lists. Besides inferring the dynamic attribute of mobility, we also mine static user attributes, such as gender, age, interests, and preferences, from users' App installation lists. We propose an attribute-specific representation to describe user characteristics and model the relationship between Apps and specific attributes. To validate the method, we ran extensive experiments on a dataset containing the App lists of more than 100,000 users. Across 12 predefined user attributes, our method achieves an average equal error rate of 16.4%. To our knowledge, this is the first work to mine user attributes from App installation lists.

(3) User group discovery based on App usage records. Finally, we analyze App usage to understand the differences and similarities among users and thereby discover multiple user groups. We analyzed one month of App usage information from 106,672 Android users. Using our proposed two-step clustering method and feature ranking method, we discovered 382 distinct user groups based on similarity in App usage behavior. Our findings have far-reaching implications for generalizable research, for the design and development of mobile applications, and for App pre-installation decisions targeting different user groups.
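As a rough illustration of the mobility graph in contribution (1), the sketch below turns a time-ordered sequence of stay locations into a weighted directed graph. The abstract does not specify the graph construction, so everything here is an assumption made for illustration: the function name `build_mobility_graph`, the `(place_id, start_hour, end_hour)` record layout, and the choice of transition counts as edge weights. Stay-location extraction from WiFi scans and community detection are not shown.

```python
from collections import Counter, defaultdict


def build_mobility_graph(stays):
    """Build a weighted directed graph from time-ordered stay records.

    stays: list of (place_id, start_hour, end_hour) tuples (hypothetical layout).
    Returns (edges, dwell): edges maps (src, dst) to the number of observed
    transitions, dwell maps place_id to the total hours spent there.
    """
    edges = Counter()
    dwell = defaultdict(float)
    previous = None
    for place, start, end in stays:
        dwell[place] += end - start
        # Count a transition only when the user actually changed place.
        if previous is not None and previous != place:
            edges[(previous, place)] += 1
        previous = place
    return dict(edges), dict(dwell)


# Toy two-day trace; places are anonymous labels, hours count from the start.
stays = [
    ("A", 0.0, 8.0),
    ("B", 9.0, 17.0),
    ("C", 17.5, 18.5),
    ("A", 19.0, 24.0),
    ("B", 33.0, 41.0),
    ("A", 42.0, 48.0),
]
edges, dwell = build_mobility_graph(stays)
```

In a graph like this, a community detection algorithm could group densely connected places into candidate activity areas, and dwell time split by time of day is one plausible signal for telling home from workplace. Both are possibilities only, not the thesis's stated procedure.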
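The equal error rate (EER) reported in contribution (2) is the error at the decision threshold where the false acceptance rate equals the false rejection rate. The abstract does not describe how the thesis computes it, so the following is only a minimal sketch of the standard definition for one binary attribute. The function name `equal_error_rate` and the toy scores and labels are hypothetical.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping thresholds over the observed scores.

    scores: classifier scores, higher means "more likely positive".
    labels: 1 for positive users, 0 for negative users.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    best_gap, best_eer = None, None
    for threshold in sorted(set(scores)):
        # FAR: negatives wrongly accepted; FRR: positives wrongly rejected.
        far = sum(s >= threshold for s in negatives) / len(negatives)
        frr = sum(s < threshold for s in positives) / len(positives)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Made-up scores for eight users and one binary attribute.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
eer = equal_error_rate(scores, labels)
```

An average EER over several attributes, such as the 16.4% quoted above, would then be the mean of the per-attribute values.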
[Degree-granting institution]: Zhejiang University
[Degree level]: Doctorate
[Year degree granted]: 2017
[Classification number]: TP311.13; TP311.56
Article ID: 1956895
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1956895.html