Topic Identification and Evolution of Customer Service Hotline Texts in the Telecommunications Industry

Published: 2018-07-20 16:29
[Abstract]: The customer service hotline is an important channel through which enterprises hear from their users in a timely way. For a long time, limited by the available technology, hotline data analysis has covered only structured data such as call volume and user satisfaction, while the unstructured speech-to-text data, which holds potential value, has not been mined in depth. With the explosive growth of hotline call volume and the widening categories and scope of user complaints, how to quickly identify complaint topics from massive hotline text data and assess how the sentiment of complaints evolves has become an important practical problem that customer service staff urgently need to solve. Hotline text mining falls within the scope of opinion mining. Existing opinion mining work mostly targets Internet text, and studies of hotline text held inside an enterprise are still rare. This study therefore has theoretical value in extending the scope of opinion mining research and in verifying the applicability of the related theories and methods. Drawing on the theory and methods of opinion mining, and using R as the programming tool, the thesis analyzes the hotline work-order text of a Chinese telecommunications operator over the 13 months from September 2013 to September 2014 at both the semantic and the sentiment level, achieving automatic identification of hotline text topics and prediction of each topic's sentiment trend. Specifically, at the semantic level, the Structural Topic Modeling (STM) algorithm is applied to classify more than 700,000 text records automatically into 20 topics. At the sentiment level, a sentiment lexicon for the telecommunications domain is first built and a text sentiment polarity intensity algorithm is designed, and the distribution of topic content and sentiment orientation across the hotline text is summarized. Time series autoregressive analysis is then applied to predict the sentiment trend of the 20 topics and to summarize how sentiment evolves for different types of hotline topics. Through this work, the thesis first constructs an opinion mining analysis framework suited to customer service hotline text in the telecommunications industry. Second, it verifies the applicability of the structural topic modeling algorithm, the text sentiment polarity intensity algorithm, and the sentiment-polarity-based time series autoregressive prediction method to semantic and sentiment mining of hotline text. On the practical side, the program developed already performs automatic identification and classification of hotline text topics and predicts the evolution of each topic's sentiment orientation, giving operators' customer service departments a new way to base decisions on hotline text data. Future research can improve on two fronts, the diversity of analysis dimensions and the near-real-time character of the analysis. On the one hand, other work-order metadata, such as the complainant, the complaint location, and the problem severity level, could be added to the topic model to enrich the semantic mining. On the other hand, the current single-machine R program could be combined with a distributed system such as Spark to bring the analysis closer to real time.
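The abstract names a sentiment polarity intensity algorithm and an autoregressive forecast of per-topic sentiment but does not give their details. The block below is a minimal sketch of the general idea only, not the thesis's method: it is written in Python rather than the R used in the thesis, the lexicon entries, intensifier weights, and negation rule are hypothetical toy values, tokens are assumed to be already segmented, and the forecast uses a first-order autoregression fitted by ordinary least squares as a stand-in for whatever model order the author chose.

```python
from statistics import mean

# Hypothetical toy lexicon: token -> polarity weight (negative = complaint-like).
LEXICON = {"slow": -1.0, "dropped": -2.0, "overcharged": -2.0,
           "thanks": 1.0, "resolved": 1.5}
# Hypothetical modifiers: intensifiers scale a weight, negators flip its sign.
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
NEGATORS = {"not", "never"}


def polarity_intensity(tokens):
    """Sum lexicon weights, adjusting each by the token immediately before it."""
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        weight = LEXICON[tok]
        prev = tokens[i - 1] if i > 0 else None
        if prev in INTENSIFIERS:
            weight *= INTENSIFIERS[prev]
        elif prev in NEGATORS:
            weight = -weight
        score += weight
    return score


def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_{t-1}; returns (c, phi)."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    x_prev, x_next = series[:-1], series[1:]
    mean_prev, mean_next = mean(x_prev), mean(x_next)
    var_prev = sum((p - mean_prev) ** 2 for p in x_prev)
    if var_prev == 0:
        raise ValueError("series is constant; phi is not identifiable")
    cov = sum((p - mean_prev) * (n - mean_next)
              for p, n in zip(x_prev, x_next))
    phi = cov / var_prev
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_ar1(series, steps):
    """Iterate the fitted AR(1) recursion forward from the last observation."""
    c, phi = fit_ar1(series)
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


if __name__ == "__main__":
    print(polarity_intensity(["signal", "very", "slow"]))
    # Hypothetical monthly mean sentiment for one topic.
    monthly = [0.0, 1.0, 1.5, 1.75, 1.875]
    print(forecast_ar1(monthly, steps=2))
```

In practice each record's score would be averaged per topic and per month to form the series passed to `forecast_ar1`; how the thesis aggregates scores is not stated in the abstract.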
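[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.1;F626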

