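Research on Exploratory Search Based on a Medical Knowledge Graph
Published: 2018-04-25 15:43
Topic: knowledge graph + information retrieval; Source: Xiangtan University, master's thesis, 2017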
[Abstract]: With the maturation of the Internet and the mobile Internet and the explosive growth of online data, obtaining needed information from massive collections quickly, conveniently, and accurately has become a challenging problem. The one-shot "query-response" interaction model of today's mainstream search engines struggles to meet users' need to explore knowledge conveniently: to reach an exploratory goal, users must analyze and interpret the query results, then revise their keywords and query again. This process is inefficient and requires users to apply search strategies of their own, which leads to a poor user experience. Such strategies can instead be implemented algorithmically by the search engine, making them transparent to the user. In addition, as industries have digitized, information resources of all kinds in the biomedical field are now stored in digital form. The wealth of information they contain supports progress in medicine, but how to mine key information from it, so that medical researchers can use these massive resources to find research topics of interest, remains an urgent problem. Medical information retrieval requires medical background knowledge, and using a knowledge graph to retain and process expert knowledge is one way to put domain data to good use. To address these problems, this thesis makes the following contributions:
(1) To compensate for the shortcomings of one-shot interaction, we build a semantic graph from co-occurrence relations, linking knowledge concepts through semantic relations so that users can browse the knowledge network quickly. We also propose a novel graph-based exploratory search algorithm for mining associations among multiple targets. By flattening the graph to compress it and inverse-flattening to decompress it, the algorithm quickly and effectively finds the nodes and paths that are strongly associated with multiple targets, from which the user's search intent is inferred. Experimental results show that the associations mined by our method are better than those mined by other methods.
(2) Taking medical text as the object of study, we build knowledge graphs at different relation granularities from Medline citation data and from CT imaging report text, and propose a knowledge graph construction method based on conditional random fields (CRF) and rule derivation that is finer-grained than co-occurrence relations. Tests show that the entity associations mined from graphs of different granularity perform well across different application scenarios.
(3) We build a prototype exploratory search engine for medical information. The system uses an edge-based indexing mechanism that facilitates operations on relation sets. We also propose a highly scalable distributed relation extraction algorithm that raises the system's computational throughput to meet the demands of massive data.
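The abstract names two building blocks without giving details: a graph built from co-occurrence relations, and an edge-based index that makes set operations on relations easy. The following is a minimal Python sketch of both ideas, not the thesis's implementation. Every name (build_cooccurrence_index, edge, shared_evidence, neighbors, common_associates) and the sample documents are hypothetical. In particular, common_associates is a plain common-neighbor intersection used as a stand-in; it is not the flattening and inverse-flattening search algorithm proposed in the thesis.

```python
from collections import defaultdict
from itertools import combinations


def edge(a, b):
    """Return a canonical key for the undirected edge between two terms."""
    return (a, b) if a <= b else (b, a)


def build_cooccurrence_index(documents):
    """Map each edge to the set of document ids in which both terms occur.

    Keying the index by edge (rather than by node) is an assumption about
    what "edge-based indexing" could look like.
    """
    edge_index = defaultdict(set)
    for doc_id, terms in documents.items():
        # sorted(set(...)) yields each unordered pair once, in canonical order.
        for a, b in combinations(sorted(set(terms)), 2):
            edge_index[(a, b)].add(doc_id)
    return edge_index


def shared_evidence(edge_index, edge_one, edge_two):
    """Documents that support both relations: a set intersection on edges."""
    return edge_index.get(edge_one, set()) & edge_index.get(edge_two, set())


def neighbors(edge_index, term, min_support=1):
    """Terms co-occurring with `term`, mapped to their document counts."""
    result = {}
    for (a, b), docs in edge_index.items():
        if len(docs) >= min_support and term in (a, b):
            result[b if a == term else a] = len(docs)
    return result


def common_associates(edge_index, targets):
    """Rank terms linked to every target by their weakest link (stand-in only)."""
    if not targets:
        return []
    per_target = [neighbors(edge_index, t) for t in targets]
    shared = set.intersection(*(set(n) for n in per_target)) - set(targets)
    scored = [(node, min(n[node] for n in per_target)) for node in shared]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Invented sample data, for illustration only.
documents = {
    "doc1": ["cirrhosis", "ascites", "splenomegaly"],
    "doc2": ["cirrhosis", "ascites"],
    "doc3": ["cirrhosis", "portal hypertension", "splenomegaly"],
}
index = build_cooccurrence_index(documents)
print(common_associates(index, ["ascites", "splenomegaly"]))
```

Because each edge carries its own set of supporting documents, questions such as "which documents support both relations" reduce to a single set intersection, which is one plausible reading of why an edge-keyed index helps with relation-set operations.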
[Degree-granting institution]: Xiangtan University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.3
Article ID: 1801957
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1801957.html