Research and Implementation of a Bond Knowledge Service Based on Heterogeneous Information

Published: 2018-03-01 00:13

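Keywords: heterogeneous information; retrieval result evaluation method; ontology rule adaptation; imbalanced classification
Source: Harbin Institute of Technology, master's thesis, 2013
Document type: degree thesis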

[Abstract]: With the rapid development of the financial industry, online knowledge service platforms for financial products are gaining increasing acceptance among investors. Taking bonds as an example, the large volume of heterogeneous bond information available on the web provides a data source for building an automated bond knowledge service platform. This thesis therefore studies how to acquire heterogeneous information about financial products, how to process it, and how to classify and fuse it, and applies the integrated result to a bond knowledge service platform. The main research content covers the following three aspects.

Acquisition of heterogeneous bond information: this covers the acquisition and preprocessing of structured bond data and unstructured web page data. The data come from two kinds of sources, fixed financial websites and search engines. For the search engine part, the thesis proposes RDMDRR, a search-engine-based model for evaluating domain-specific retrieval results, which further improves the accuracy and coverage of bond announcement acquisition.

Extraction of heterogeneous bond information: an ontology rule base of bond features is first built with the WHISK algorithm. The rules are then pruned using an ontology rule adaptation method to obtain a refined rule base, which is applied to the extraction of bond entity information and supplies data for the bond knowledge service.

Classification and fusion of bond information: depending on the bond category, rule-based and machine learning methods are used for classification. To address the imbalanced distribution of classes, the thesis proposes a new feature weighting method that improves on the original TF-IDF and applies it to imbalanced classification, raising the recognition rate of minority classes. Bond information is thereby categorized accurately and then fused with other bond information to form a fairly complete bond knowledge base.

After processing and fusion through these three stages, the heterogeneous information yields complete bond knowledge, which is integrated into the bond knowledge service platform. Experiments show that the platform changes the knowledge expansion model of traditional knowledge service platforms, that the accuracy and recall of knowledge acquisition improve at each processing stage, and that the platform has been well received by bond investors.
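The abstract does not give the formula for the improved feature weight, so the following is only a minimal sketch of the general idea, not the thesis's method. It computes plain TF-IDF as the baseline and then applies a hypothetical class-aware boost: each term's weight is multiplied by one plus the share of its document frequency that falls in the minority class. The function names, the boost formula, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Baseline TF-IDF: tf = count / document length, idf = log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return weights


def class_aware_tfidf(docs, labels, minority):
    """Hypothetical variant (an assumption, not the thesis's formula):
    boost each term by 1 + (its document frequency in the minority class
    divided by its overall document frequency)."""
    df = Counter(term for doc in docs for term in set(doc))
    df_min = Counter(
        term
        for doc, label in zip(docs, labels)
        if label == minority
        for term in set(doc)
    )
    base = tfidf(docs)
    return [
        {t: w * (1.0 + df_min[t] / df[t]) for t, w in doc_weights.items()}
        for doc_weights in base
    ]


# Toy tokenized documents; "risk" is the minority class.
docs = [["bond", "coupon"], ["bond", "rate"], ["bond", "default"]]
labels = ["normal", "normal", "risk"]

base = tfidf(docs)
aware = class_aware_tfidf(docs, labels, minority="risk")
```

In this toy data the term "default" occurs only in the minority-class document, so its weight is doubled relative to the baseline, while "coupon", which never occurs in the minority class, keeps its baseline weight. A term such as "bond" that appears in every document has an IDF of zero under either scheme.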
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3
