Research on Ranking Algorithms for Mathematical Search

Published: 2018-08-02 14:48
[Abstract]: The amount of mathematical information on the Web is growing steadily, and mathematical search has become a focus of attention. In recent years, the problems of displaying and storing mathematical formulas in browsers have gradually been solved, which provides a good foundation for the research and development of search engines oriented to mathematical formulas. Although mathematical formulas can be stored in web documents, searching for them on the Web still has limitations. Mathematical formulas have a complex two-dimensional structure and carry complex mathematical meaning: formulas with different descriptions may have the same meaning, the same formula may be written in several forms, and a user's query formula may be a subformula of another formula. Traditional text retrieval systems are therefore inadequate for searching mathematical formulas. Existing mathematical formula retrieval systems, and those under study internationally, have made gradual progress in index construction, but for ranking the returned result set most of them still apply the ranking algorithms of text search engines, and ranking algorithms designed for formula search results have not been studied in depth. Building on a close study of the principles of existing text search engine ranking algorithms, this thesis therefore attempts to propose a ranking algorithm for mathematical formula search that combines the characteristics of mathematical formulas with the relationships between formulas (equivalence, algebraic relatedness, subformula, and so on). The thesis combines a computer algebra system (CAS) with a mathematical formula search engine to mine the relationships between formulas. This not only provides a more reasonable and reliable measure for computing the relevance between a query formula and a web page, but also improves the system's capacity for semantic retrieval of mathematical formulas.
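The idea of ranking results by the relationship between the query formula and each indexed formula can be illustrated with a small sketch. This is not the thesis's algorithm: the abstract gives neither the scoring scheme nor the CAS calls it uses. The tuple-based formula representation, the `SCORES` weights, and the function names below are all hypothetical. A simple canonicalization that sorts the operands of commutative operators stands in for the CAS, so only commutativity-based equivalence and the subformula relation are detected. The "algebraically related" case would need a real computer algebra system.

```python
from typing import Iterator, List, Tuple, Union

# Hypothetical representation: a formula is a symbol or number (str),
# or a tuple of the form (operator, operand, operand, ...).
Formula = Union[str, tuple]

# Assumption: only these operators may have their operands reordered.
COMMUTATIVE = {"+", "*"}

# Hypothetical relevance weights; the abstract does not specify any values.
SCORES = {"identical": 1.0, "equivalent": 0.9, "subformula": 0.6, "unrelated": 0.0}


def canonical(f: Formula) -> Formula:
    """Return a canonical form in which commutative operands are sorted."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [canonical(a) for a in args]
    if op in COMMUTATIVE:
        args.sort(key=repr)  # repr gives a total order over str and tuple
    return (op, *args)


def subformulas(f: Formula) -> Iterator[Formula]:
    """Yield the formula itself and every subtree below it."""
    yield f
    if isinstance(f, tuple):
        for arg in f[1:]:
            yield from subformulas(arg)


def relation(query: Formula, candidate: Formula) -> str:
    """Classify how the query relates to a candidate formula."""
    if query == candidate:
        return "identical"
    cq, cc = canonical(query), canonical(candidate)
    if cq == cc:
        return "equivalent"  # same formula up to commutative reordering
    if any(cq == sub for sub in subformulas(cc)):
        return "subformula"  # the query occurs inside the candidate
    return "unrelated"


def rank(query: Formula, candidates: List[Formula]) -> List[Tuple[float, Formula]]:
    """Return (score, candidate) pairs, highest score first (stable sort)."""
    scored = [(SCORES[relation(query, c)], c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


query = ("+", "a", "b")
candidates = [
    ("*", "a", "b"),               # different operator
    ("^", ("+", "b", "a"), "2"),   # contains the query, reordered
    ("+", "b", "a"),               # the query with operands swapped
    ("+", "a", "b"),               # exact match
]
for score, formula in rank(query, candidates):
    print(score, formula)
```

In a fuller system, the per-formula score would be one signal among several when computing the relevance of a whole web page, and the canonicalization step would be replaced by equivalence checks delegated to the CAS.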
[Degree-granting institution]: Lanzhou University
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP391.3;O223





