Research on Knowledge Discovery and Representation Based on Fuzzy Rules

Published: 2018-03-04 22:22

Topic: fuzzy rules　Focus: fuzzy decision trees　Source: Dalian University of Technology, doctoral dissertation, 2015　Document type: degree thesis

[Abstract]: In the information age of the digital revolution, discovering knowledge from data is becoming increasingly important. Knowledge representation and reasoning have long been regarded as the core problems of knowledge engineering. Fuzzy rule-based systems, an important knowledge discovery technique, address both problems effectively: they represent knowledge as fuzzy rules and apply it through fuzzy logic inference. Such systems can carry out a wide range of knowledge discovery and application tasks, including regression, clustering, classification, and prediction. They also offer several advantages, among them high accuracy and semantic interpretations that are easy to understand. This thesis studies fuzzy rule-based systems, covering fuzzy decision trees, fuzzy rule-based clustering, and data granulation. The main contributions are as follows.

1. A new decision tree based on fuzzy rules, the fuzzy rule decision tree (FRDT), is proposed. An association rule extraction algorithm (AREA) is designed using fuzzy consistency, and the FRDT is built with AREA. A traditional decision tree considers only one feature at each node, whereas the proposed FRDT considers, at each node, a fuzzy rule composed of several features. This yields high-purity leaf nodes and reduces the size of the tree. The FRDT also overcomes the lack of semantic interpretation in oblique decision trees, which use hyperplanes as decision functions. Experiments on data from the UCI Machine Learning Repository are used to analyze the performance of the FRDT. In a comparison with traditional decision trees (C4.5, LADtree, BFTree, SimpleCart, NBTree), statistical hypothesis tests indicate that the FRDT outperforms them in both accuracy and tree size.

2. Within the framework of axiomatic fuzzy set (AFS) theory, a new clustering method, AFS fuzzy clustering, is proposed. First, a set of fuzzy semantic terms is selected for each distinct sample. Second, using AFS theory, complex concepts are constructed from the selected fuzzy terms to serve as the fuzzy description of each sample. Finally, samples with identical or similar descriptions are placed in the same class, forming a fuzzy description of that class, and the sample data are clustered according to the class descriptions. Experiments on UCI data test the interpretability of the AFS fuzzy clustering algorithm, and its clustering accuracy is comparable to that of other rule-based clustering algorithms and of the classical clustering algorithms FCM and K-means.

3. A fast data granulation method (FRCGC) is proposed, and a semantic interpretation is given for each information granule obtained. First, attributes are selected with the proposed unsupervised feature selection method. Next, fuzzy rules are constructed to describe the samples, and canonical descriptions are picked out according to the importance of the sample descriptions. Finally, the data are granulated by means of the canonical descriptions. The definition of a sample description can be adjusted to the problem at hand, which makes the granulation method easy to adapt for handling complex problems. Experiments on UCI data verify the interpretability and effectiveness of the proposed data granulation method.
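
The contrast drawn in the first contribution, between a node that tests a single feature and a node that tests a fuzzy rule over several features, can be illustrated with a small sketch. This is not the FRDT or AREA algorithm from the thesis, whose details the abstract does not give. The triangular membership functions, the use of the minimum as the fuzzy AND, the 0.5 routing threshold, and every name below (`triangular`, `FuzzyRule`, `route`, the features `length` and `width`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Membership = Callable[[float], float]


def triangular(a: float, b: float, c: float) -> Membership:
    """Triangular membership function: 0 at a and c, 1 at the peak b (a < b < c)."""
    def mu(x: float) -> float:
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


@dataclass
class FuzzyRule:
    """Hypothetical node test: a conjunction of fuzzy terms, one per feature."""
    terms: Dict[str, Membership]

    def strength(self, sample: Dict[str, float]) -> float:
        # Assumption: the fuzzy AND is modeled with the minimum t-norm.
        return min(mu(sample[name]) for name, mu in self.terms.items())


def route(rule: FuzzyRule, sample: Dict[str, float], threshold: float = 0.5) -> str:
    """Send the sample down the 'match' branch when the rule fires strongly enough."""
    return "match" if rule.strength(sample) >= threshold else "no_match"


# Rule read as: "length is medium AND width is small".
rule = FuzzyRule(terms={
    "length": triangular(2.0, 4.0, 6.0),
    "width": triangular(0.0, 1.0, 2.0),
})

print(route(rule, {"length": 4.0, "width": 1.0}))
print(route(rule, {"length": 5.5, "width": 1.0}))
```

Because each term is a named linguistic label on one feature, the node test stays readable as a sentence. That readability is the property the abstract says a hyperplane split in an oblique tree lacks.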
[Degree-granting institution]: Dalian University of Technology
[Degree level]: Doctoral
[Year degree granted]: 2015
[Classification number]: O159;TP311.13

[Similar literature]