Research and Design of a Data Warehouse for B2C E-Commerce
[Abstract]: A B2C e-commerce website generates large volumes of product transaction data and access log data every day. These data contain valuable information, such as the sources of orders, customer behavior, and visitor interests. Analyzing them not only helps decision makers guide the operation of the website and attract more users, but also reflects the state of the enterprise's marketing, sales promotion, after-sales service, and financial management. In short, in-depth and effective analysis of these data can help managers improve customer relations and strengthen the enterprise's competitiveness. After describing the relevant theory, including the concept, basic characteristics, and architecture of the data warehouse, the concept of B2C e-commerce, and OLAP multidimensional data analysis, this thesis proposes a comprehensive data warehouse model for B2C e-commerce. The main work is as follows:
1. Based on an analysis of user requirements for a B2C e-commerce data warehouse, a multi-level conceptual model is proposed, and the related dimension models and fact sets are designed. Based on this model, the physical design of some dimension tables and fact tables is completed.
2. The data sources of the B2C e-commerce data warehouse are analyzed, and the processing of semi-structured data sources is discussed. Building on preprocessing methods for semi-structured data, an improved session identification algorithm for Web access logs is proposed, in which the time threshold depends on the page media type: a different threshold calculation is applied to each type of URL page. Compared with existing approaches that apply a single a priori threshold to all pages, and with existing dynamic-threshold calculations, this method reflects user sessions more faithfully and substantially improves identification accuracy, providing efficient and accurate data for subsequent analysis.
3. Based on the proposed model, an experimental B2C e-commerce data warehouse project is built. Taking the Zen Cart website system as an example, the analysis subjects are determined; following the idea of multidimensional modeling, dimensions of different granularities and data marts are established; and an ETL architecture is designed, including the ETL scheduling scheme and the data preprocessing methods. Finally, online analysis of order data is carried out to demonstrate the value of the B2C e-commerce data warehouse.
The proposed B2C e-commerce data warehouse model has the following characteristics:
1. The model is targeted and practical. It covers all the main aspects of the enterprise's internal and external e-commerce activities, including page clicks, product sales, orders, users' comments on products, sales profits, warehousing, ordered products, and logistics and distribution.
2. The model adopts a multi-level dimension design and, through rational and effective concept hierarchies, gives enterprise decision makers a better perspective.
Finally, the validity of the model is verified by experiments.
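The abstract does not give the details of the page-type-dependent threshold, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. The page-type categories, the classification rules in `page_type`, and the threshold values in `THRESHOLDS` are all hypothetical and chosen purely for illustration. A new session is started whenever the gap since a user's previous request exceeds the threshold assigned to the type of the previously viewed page.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical per-page-type thresholds; the abstract does not state the real values.
THRESHOLDS = {
    "media": timedelta(minutes=30),
    "content": timedelta(minutes=10),
    "navigation": timedelta(minutes=2),
}


def page_type(url):
    """Classify a URL into a coarse page type (illustrative rules only)."""
    path = url.split("?", 1)[0].lower()
    if path.endswith((".mp4", ".flv", ".swf", ".mp3")):
        return "media"
    if "product" in path or "article" in path:
        return "content"
    return "navigation"


def identify_sessions(records):
    """Split (user, timestamp, url) records into per-user sessions.

    A new session starts when the gap since the previous request exceeds
    the threshold associated with the previous page's type.
    Returns a list of (user, [(timestamp, url), ...]) tuples.
    """
    by_user = defaultdict(list)
    for user, ts, url in records:
        by_user[user].append((ts, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # chronological order per user
        current = [hits[0]]
        for prev, cur in zip(hits, hits[1:]):
            if cur[0] - prev[0] > THRESHOLDS[page_type(prev[1])]:
                sessions.append((user, current))
                current = []
            current.append(cur)
        sessions.append((user, current))
    return sessions


if __name__ == "__main__":
    t0 = datetime(2012, 1, 1, 9, 0)
    log = [
        ("u1", t0, "/index.php"),
        ("u1", t0 + timedelta(minutes=1), "/product_info.php?id=3"),
        ("u1", t0 + timedelta(minutes=9), "/index.php"),
        ("u1", t0 + timedelta(minutes=15), "/index.php"),
    ]
    for user, hits in identify_sessions(log):
        print(user, [url for _, url in hits])
```

In this toy log, the 8-minute pause after the product page stays within the assumed "content" threshold, so it does not end the session, whereas a single fixed threshold of, say, 5 minutes would have split the session at that point.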
[Degree-granting institution]: Guangdong University of Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP311.13