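Web Data Mining Based on Social Network Analysis Methods
Published: 2018-04-24 02:17
Topic of this thesis: data mining + search engines; Reference: Jilin University, 2012 master's thesis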
[Abstract]: In today's era of information explosion, the number of pages on the Web is very large and still growing rapidly. Search engines help us find relevant information online, but most of what they return is not what we are actually looking for, and the accuracy of the information they provide still has to be assessed. In this situation, the best way to obtain information is through authoritative web pages. When users search the Internet with a search engine, authoritative pages can supply the needed information directly, which greatly improves the efficiency and quality of search results.
This thesis studies web data mining based on social network analysis, including a study of related technologies and the mining of authoritative web pages from web resources.
The goal of the thesis is to discover authoritative pages among web resources, so that people can find them and obtain useful information more accurately. The method used is social network analysis, applied to the relationships among related web pages.
We studied a number of related topics: data mining, data mining techniques, the characteristics of Web data, web mining, search engines, Google, authoritative web pages, social network analysis, degree centrality, and the social network analysis software UCINET6. These studies provide the theoretical basis for the experiments in the thesis.
The main method used in the experiments is degree centrality, a social network analysis measure that is used here to analyze the relationships among web pages. An authoritative page can be regarded as a page that is referenced many times by other pages, and is therefore trustworthy and widely accepted. In the experiments, UCINET6 is used to compute the degree of each web page and thereby identify several authoritative pages. The experimental results support the argument of the thesis: with the principles and methods it proposes, suitable authoritative pages can be found. Several extensions can be developed on this basis in future work, such as relevance computation, removal of duplicate links, and expansion of the data set.
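The idea of ranking pages by degree can be illustrated with a minimal sketch. It is not the thesis's procedure, which relies on UCINET6. It assumes that "degree" here means in-degree (the number of distinct pages linking to a page), which the abstract does not state explicitly, and it normalizes by n - 1, the usual convention for degree centrality. The function names and the toy link list are hypothetical. Duplicate links are collapsed first, echoing the "removal of duplicate links" extension mentioned in the abstract.

```python
from collections import defaultdict


def in_degree_centrality(links):
    """Return the normalized in-degree of every page in a directed link list.

    `links` is an iterable of (source_page, target_page) pairs.
    Duplicate pairs and self-links are ignored (an assumption of this sketch).
    """
    unique_links = {(src, dst) for src, dst in links if src != dst}
    pages = {page for link in unique_links for page in link}
    in_degree = defaultdict(int)
    for _, dst in unique_links:
        in_degree[dst] += 1
    # Normalize by the largest possible in-degree, n - 1.
    denom = max(len(pages) - 1, 1)
    return {page: in_degree[page] / denom for page in pages}


def top_authorities(links, k=3):
    """Return the k pages with the highest in-degree centrality."""
    scores = in_degree_centrality(links)
    # Sort by descending score, then by name so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical toy data: three pages link to hub.example.
links = [
    ("a.example", "hub.example"),
    ("b.example", "hub.example"),
    ("b.example", "hub.example"),    # duplicate link, counted once
    ("c.example", "hub.example"),
    ("a.example", "b.example"),
    ("hub.example", "hub.example"),  # self-link, ignored
]

for page, score in top_authorities(links, k=2):
    print(f"{page}: {score:.3f}")
```

In this toy graph, hub.example is linked to by all three other pages and so ranks first. A real data set would also need the relevance filtering that the abstract lists as future work.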
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP311.13
Article ID: 1794688
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1794688.html