Research and Application of Customer-Oriented Catalog Segmentation Algorithms
Published: 2018-05-24 16:07
Topics: data mining + catalog segmentation; Source: South China University of Technology, master's thesis, 2013
[Abstract]: Data mining is now widely applied in microeconomics, and catalog segmentation is one of its important research directions. Earlier work on catalog segmentation judged the quality of a catalog mainly by the number of catalog products that were purchased. As research has deepened and practical applications have multiplied, the variant whose goal is to attract more customers to visit a physical or online store (customer-oriented catalog segmentation, for short) has become increasingly important. It is used mainly in customer classification, product recommendation, and customer relationship management.

Building on an in-depth analysis of existing customer-oriented catalog segmentation algorithms, this thesis revises the scoring function that the Best-Product-fit (BPF) algorithm uses to select the "best product". It also proposes a new customer-oriented catalog segmentation algorithm with two main steps: first, find frequent itemsets in the historical transaction data; second, create suitable product catalogs and partition the customers according to them. Finally, a product profit measure is introduced into the customer-oriented catalog segmentation model, yielding a new generalized model. The thesis focuses on several key issues, including data normalization, database representation, and profit-weighted computation.

The main contributions of the thesis are as follows:
(1) The two parts of the BPF scoring function are unbalanced at interest degree t2 [sic; the relational operator appears to have been lost in the source]. A dynamic weighting scheme is proposed to obtain more balanced product scores. Experiments show that, with the improved scoring function, the algorithm covers more customers than with the traditional one.
(2) BPF cannot handle cases in which the enterprise's return is negative. A notion of risk degree is introduced and a new scoring function is given, which makes the algorithm applicable to service-oriented enterprises.
(3) Based on the relationship between the customer interest degree and the minimum support used in frequent pattern mining, the thesis proposes creating product catalogs from frequent patterns, and presents a catalog segmentation algorithm based on frequent pattern mining.
(4) A profit measure is introduced into customer-oriented catalog segmentation, with two ways of applying profit weights: direct weighting and indirect weighting. Experiments show that the resulting model covers more high-value customers.
(5) Using transaction data from a real e-commerce site, the improved customer-oriented catalog segmentation algorithm is compared with the classical BPF algorithm, and the two profit weighting schemes are compared with each other. Indirect profit weighting gives the better results.
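The two-step procedure described in the abstract (mine frequent itemsets, then build catalogs and assign customers to them) can be sketched roughly as follows. This is a minimal illustration, not the thesis's algorithm: the abstract gives no pseudocode, so the brute-force itemset counting (used here in place of Apriori or FP-growth), the greedy catalog selection, the rule that a catalog covers a customer when it contains at least t of that customer's purchased items, and all names (`frequent_itemsets`, `greedy_catalogs`, `weights`) are assumptions. The optional per-customer weights stand in for profit weighting in general and are not meant to reproduce either the direct or the indirect scheme.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, size):
    """Return every itemset of `size` items bought together by at least
    `min_support` customers (brute-force counting, fine for toy data)."""
    counts = {}
    for basket in transactions:
        for combo in combinations(sorted(basket), size):
            counts[combo] = counts.get(combo, 0) + 1
    return {frozenset(c) for c, n in counts.items() if n >= min_support}


def covered(basket, catalog, t):
    """Assumed coverage rule: the catalog holds at least t items the customer bought."""
    return len(basket & catalog) >= t


def greedy_catalogs(transactions, candidates, k, t, weights=None):
    """Pick up to k catalogs from `candidates`, each time taking the one that
    adds the largest total weight of not-yet-covered customers."""
    if weights is None:
        weights = [1.0] * len(transactions)  # unweighted: every customer counts once
    uncovered = set(range(len(transactions)))
    pool = sorted(candidates, key=sorted)  # fixed order so ties break deterministically
    chosen = []
    for _ in range(k):
        best, best_gain, best_ids = None, 0.0, set()
        for catalog in pool:
            ids = {i for i in uncovered if covered(transactions[i], catalog, t)}
            gain = sum(weights[i] for i in ids)
            if gain > best_gain:
                best, best_gain, best_ids = catalog, gain, ids
        if best is None:  # no remaining catalog covers anyone new
            break
        chosen.append(best)
        uncovered -= best_ids
        pool.remove(best)
    return chosen, uncovered


# Hypothetical toy data: one set of purchased items per customer.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "d"},
    {"c", "d"},
    {"c", "d", "e"},
    {"e"},
]

# Step 1: frequent itemsets become candidate catalogs of size 2.
candidates = frequent_itemsets(transactions, min_support=2, size=2)

# Step 2: choose catalogs and see which customers remain uncovered.
chosen, uncovered = greedy_catalogs(transactions, candidates, k=2, t=2)

# Weighted variant: customers 3 and 4 are assumed to be five times as profitable.
profit = [1.0, 1.0, 1.0, 5.0, 5.0, 1.0]
weighted_chosen, _ = greedy_catalogs(transactions, candidates, k=1, t=2, weights=profit)
```

On this toy data the unweighted run picks the catalog {a, b} first because it covers three customers, while the weighted run with a single catalog picks {c, d} because its two customers carry more profit. That is the kind of shift toward high-value customers that contribution (4) reports, shown here only under the assumed weighting.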
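[Degree-granting institution]: South China University of Technology
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP311.13;TP391.41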
Article ID: 1929712
Article link: https://www.wllwen.com/guanlilunwen/kehuguanxiguanli/1929712.html