Research on Frequent Access Pattern Discovery over a Heterogeneous Semantic Log Knowledge Base

Published: 2018-07-22 17:27

[Abstract]: The Semantic Web attaches machine-interpretable semantics to Web content, which can make Web usage mining more efficient, and it has become an important research direction in artificial intelligence. Frequent access pattern discovery is a core task of Web usage mining: it extracts, from large volumes of Web usage data, the access behaviors that users frequently exhibit in different situations. The results are valuable for developing e-commerce, improving website management, and enhancing personalized services. Starting from Semantic Web ontologies and rules, this thesis studies how to combine the two and how to discover frequent Web access patterns over the resulting heterogeneous semantic log knowledge base. The main contributions are:

1. An improved formal description of the log ontology. Taking the event as the core, the thesis refines the layered formal description of the log ontology, defines the ontology as a six-tuple, and represents its domain relations as application rules. Domain relations in current log ontologies are mostly defined by domain experts, which neither guarantees comprehensive coverage nor adapts well to changing application scenarios. The improvement streamlines the log ontology and fits practical needs better.

2. Construction of a semantic log knowledge base that combines the log ontology with rules using a heterogeneous approach. Datalog heterogeneous rules represent domain relations and user access behavior and, subject to the Datalog safety condition, are combined with the log ontology to build the heterogeneous semantic log knowledge base. This overcomes the weak reasoning ability of ontologies and their limited capacity to express dynamic semantics, lets the two formalisms complement each other, and improves the expressive and reasoning power of the knowledge base.

3. A method for mining frequent Web access patterns over the Datalog heterogeneous semantic log knowledge base. Based on inductive logic programming (ILP) theory, the method takes a core event refE as input, expands the space of Web access patterns, builds the set of candidate access patterns, checks each pattern's validity, and computes its support, thereby discovering users' frequent access patterns on the Web.

4. Design and implementation of a frequent Web access pattern mining system based on the heterogeneous semantic log knowledge base. The system is written in Java and has two parts: knowledge base construction and frequent access pattern mining. An ontology parser generates the log ontology; a rule parser, together with a safety check for heterogeneous rules, produces the application rules; the two are combined to build the heterogeneous semantic log knowledge base. The mining method is then applied to the knowledge base to discover users' frequent access patterns. Experiments confirm the feasibility of the theoretical work.
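The Datalog safety constraint mentioned in the abstract can be illustrated with a minimal Java sketch. It checks only the classic Datalog condition, namely that every variable in a rule head also occurs in a positive body atom. The thesis's condition for heterogeneous rules may be stricter, and the class names, the predicates (visits, hasCategory, interestedIn) and the upper-case-variable convention are all assumptions made for illustration, not taken from the thesis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DatalogSafety {

    /** An atom such as visits(U, P): a predicate name plus its terms. */
    static final class Atom {
        final String predicate;
        final List<String> terms;

        Atom(String predicate, String... terms) {
            this.predicate = predicate;
            this.terms = Arrays.asList(terms);
        }
    }

    /** A rule "head :- body" with positive body atoms only (no negation in this sketch). */
    static final class Rule {
        final Atom head;
        final List<Atom> body;

        Rule(Atom head, Atom... body) {
            this.head = head;
            this.body = new ArrayList<>(Arrays.asList(body));
        }
    }

    // Assumed convention: a term starting with an upper-case letter is a variable.
    static boolean isVariable(String term) {
        return !term.isEmpty() && Character.isUpperCase(term.charAt(0));
    }

    /** Returns the head variables that no body atom binds; an empty set means the rule is safe. */
    static Set<String> unboundHeadVariables(Rule rule) {
        Set<String> bound = new LinkedHashSet<>();
        for (Atom atom : rule.body) {
            for (String term : atom.terms) {
                if (isVariable(term)) {
                    bound.add(term);
                }
            }
        }
        Set<String> unbound = new LinkedHashSet<>();
        for (String term : rule.head.terms) {
            if (isVariable(term) && !bound.contains(term)) {
                unbound.add(term);
            }
        }
        return unbound;
    }

    static boolean isSafe(Rule rule) {
        return unboundHeadVariables(rule).isEmpty();
    }

    public static void main(String[] args) {
        // interestedIn(U, C) :- visits(U, P), hasCategory(P, C).
        Rule safe = new Rule(new Atom("interestedIn", "U", "C"),
                new Atom("visits", "U", "P"),
                new Atom("hasCategory", "P", "C"));
        // interestedIn(U, C) :- visits(U, P).   C is never bound in the body.
        Rule unsafe = new Rule(new Atom("interestedIn", "U", "C"),
                new Atom("visits", "U", "P"));

        System.out.println(isSafe(safe));                 // prints "true"
        System.out.println(unboundHeadVariables(unsafe)); // prints "[C]"
    }
}
```

Rejecting unsafe rules before they enter the knowledge base keeps every derived fact ground, which is what makes bottom-up evaluation of the rule set terminate over a finite log.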
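[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.1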
