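Research on a Web Page Advertisement Loading Model Based on Semantic Annotation
Topic: semantic annotation + web page advertisement loading model; Source: Wuhan University, 2010 doctoral dissertation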
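【Abstract】: With the rapid growth of the Internet, web page advertising has also developed quickly. Compared with advertising in traditional media, web page advertisements can be displayed around the clock and worldwide through multimedia, at low cost and with strong interactivity. They are increasingly favored by advertisers and have become a research subject for many scholars.
In recent years, web page advertising has advanced greatly in both commercial application and research. However, with the explosive growth in ad placement, the click-through rate of banner ads has kept falling, and the prospects of web page advertising have been widely questioned. To improve the service efficiency of web page advertising, finding the "best match" between the current user and web page advertisements in the context of a specific target web page has become a research focus in both industry and academia.
This thesis defines "best match" as follows: on the one hand, the target web page and the advertisement are relevant to each other; on the other hand, the advertisement fits the user's interests to some extent. Under this definition, and in view of the current mismatch among target web pages, user interests and web page advertisements during ad loading, the thesis starts from a formal analysis of the ad ranking process, proposes a semantic-annotation-based method to solve the relevance matching problem among the three advertising entities, builds a systematic model of the whole ad loading process, and then systematically analyzes its key algorithms. The key algorithms studied are a web-content-based ad ranking algorithm and a user-interest-based ad re-ranking algorithm.
The main work and contributions of the thesis are as follows.
First, a semantic-annotation-based method is proposed to solve the relevance matching problem among the advertising entities during ad loading. The process is: first, the target web page, the user interest and the web page advertisements are each semantically annotated and their semantic features extracted; next, relevance features between the target web page and the advertisements are extracted and the advertisements are ranked by that relevance, giving the first-round ranking; finally, relevance features between the user interest and the advertisements are extracted and the first-round ranking is re-ranked by that relevance, giving the final ranking. The semantic-annotation-based method turns relevance matching among the advertising entities into a text semantic relevance ranking problem, so that mature text processing techniques can be used to analyze and solve it.
Second, building on the formal model of the ad ranking process, a web-content-based ad ranking algorithm is proposed. This stage studies relevance matching between the target web page and the advertisements, in two steps. First, matching features between the target web page and the advertisements are extracted. Unlike previous methods, the proposed method uses not only the traditional vector space model matching features and semantic association matching features, but also newly introduced statistical matching features and latent topic matching features. Then the advertisements are ranked on the basis of these multiple matching features. The thesis uses an RSVM ranking model to learn the web-content-based ad ranking algorithm; this model effectively fuses the various matching features and improves ranking performance.
Third, a user-interest-based ad re-ranking algorithm is proposed, formalizing the personalized recommendation of web page advertisements as a user-interest-based re-ranking problem. The steps are: first, the relationship between user browsing behavior and user interest is analyzed and a user interest model is built, with a cluster interest model representing user interest and a centroid model and a Gaussian model quantifying it; next, based on the user interest model, user interest is extracted through Web log mining; finally, relevance features between the user interest and the advertisements are extracted, and the web-content-based ranking is re-ranked by that relevance to give the final ranking. Re-ranking effectively balances relevance matching among target web page content, user interest and advertisements, and offers a practical solution to the problem of matching an advertisement to the context in which it is loaded.
Finally, based on the complete set of methods proposed for semantic annotation of advertising entities and for ad ranking, the thesis designs and implements a semantics-based prototype system for web page advertisement loading. The system follows a layered design and can be integrated effectively with other systems.
In summary, studying the web page advertisement loading model with a semantic-annotation-based method is a useful attempt at applying natural language processing techniques to computational advertising. It helps achieve deep consistency among target web page, user interest and web page advertisement, and is also of some significance for search engine optimization and Internet information retrieval.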
[Abstract]: With the rapid development of the Internet, web page advertising has also developed rapidly. Compared with traditional media advertising, web page advertisements can be displayed around the clock and worldwide through multimedia, and are characterized by low cost and strong interactivity. Web page advertising is increasingly popular with advertisers and has become a research subject for many scholars.
In recent years, web page advertising has developed greatly in both commercial and research fields. However, with the explosive growth in ad placement, the click-through rate of banner ads has kept declining, and the prospects of web page advertising have been questioned. To improve the service efficiency of web page advertising, finding the "best match" between the current user and web page advertisements in the context of a specific target web page is a research focus in both industry and academia.
This thesis defines "best match" as follows: on the one hand, the target web page and the web page advertisement are related; on the other hand, the advertisement fits the user's interest to a certain extent. Under this definition, and given the current mismatch among target web page, user interest and web page advertisement during ad loading, the thesis starts with a formal analysis of the ad ranking process, proposes a method based on semantic annotation to solve the relevance matching problem among the three advertising entities, models the whole ad loading process, and then systematically analyzes its key algorithms. These key algorithms are the web-content-based ad ranking algorithm and the user-interest-based ad re-ranking algorithm.
The main work and contributions of the thesis are as follows.
First, a semantic annotation method is proposed to solve the relevance matching problem among the advertising entities in the ad loading process. The process is as follows: first, the target web page, the user interest and the web page advertisements are each semantically annotated, and their semantic features are extracted;
second, relevance features between the target web page and the advertisements are extracted, and the advertisements are ranked by that relevance to obtain the first-round ranking;
finally, relevance features between the user interest and the advertisements are extracted, and the first-round ranking is re-ranked by that relevance to obtain the final ranking. The method thereby turns relevance matching among the advertising entities into a text semantic relevance ranking problem that mature text processing techniques can analyze and solve.
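The two-stage process just described (rank by page-ad relevance, then re-rank by user-ad relevance) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: `annotate` is a hypothetical stand-in that uses a plain bag of words instead of real semantic annotation, cosine similarity stands in for the relevance measure, and the mixing weight `alpha` is an invented parameter for combining the two relevance scores.

```python
import math
from collections import Counter


def annotate(text):
    """Hypothetical stand-in for semantic annotation: a lowercase bag of words."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def load_ads(page, user_interest, ads, alpha=0.5):
    """Return ad ids ordered by the two-stage scheme.

    ads maps ad id -> ad text. alpha (an assumption of this sketch) is the
    weight given to user-ad relevance in the second stage; alpha=0 keeps the
    first-round order.
    """
    page_vec, user_vec = annotate(page), annotate(user_interest)
    ad_vecs = {ad_id: annotate(text) for ad_id, text in ads.items()}

    # Stage 1: first-round ranking by page-ad relevance.
    first_round = sorted(
        ad_vecs, key=lambda a: cosine(page_vec, ad_vecs[a]), reverse=True
    )

    # Stage 2: re-rank the first-round list, mixing in user-ad relevance.
    def final_score(a):
        return (1 - alpha) * cosine(page_vec, ad_vecs[a]) + alpha * cosine(
            user_vec, ad_vecs[a]
        )

    return sorted(first_round, key=final_score, reverse=True)


# Toy data, invented for illustration only.
PAGE = "cheap flights to paris hotels"
INTEREST = "budget hostels"
ADS = {
    "a1": "paris hotels luxury",
    "a2": "paris budget hostels",
    "a3": "car insurance",
}
```

With the toy data, the luxury-hotel ad leads on page content alone, while the hostel ad moves to the top once the user's interest is mixed in.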
Second, based on the formal model of the ad ranking process, a web-content-based ad ranking algorithm is proposed. This stage studies relevance matching between the target web page and the web page advertisements, in two steps. First, matching features between the target web page and the advertisements are extracted; unlike previous methods, the proposed method uses newly introduced statistical matching features and latent topic matching features in addition to the traditional vector space model and semantic association matching features.
Then the advertisements are ranked on the basis of these multiple matching features. An RSVM ranking model is used to learn the web-content-based ad ranking algorithm; this model effectively fuses the various matching features and improves ranking performance.
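As a rough illustration of how a ranking model can fuse several matching features, the sketch below trains a linear pairwise ranker with hinge loss. It assumes that "RSVM" denotes a Ranking SVM, and it swaps the usual quadratic-programming solver for plain stochastic subgradient descent to stay dependency-free. The four feature slots (vector space, semantic association, statistical, latent topic) follow the feature families named in the abstract, but all feature values and labels are invented.

```python
import random


def train_rank_svm(groups, dim, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Learn weights w so that w.x_i > w.x_j whenever ad i is more relevant than ad j.

    groups: one list per target page of (feature_vector, relevance_label) pairs.
    Minimises L2-regularised pairwise hinge loss by stochastic subgradient descent.
    """
    rng = random.Random(seed)
    # Build difference vectors x_i - x_j for every preference pair within a page.
    pairs = [
        [a - b for a, b in zip(xi, xj)]
        for group in groups
        for xi, yi in group
        for xj, yj in group
        if yi > yj
    ]
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            violated = margin < 1.0
            for k in range(dim):
                # Subgradient of 0.5*reg*|w|^2 + max(0, 1 - w.d).
                grad = reg * w[k] - (d[k] if violated else 0.0)
                w[k] -= lr * grad
    return w


def rank(w, ads):
    """ads: ad id -> feature vector. Returns ids by descending fused score."""
    return sorted(
        ads, key=lambda a: sum(wk * xk for wk, xk in zip(w, ads[a])), reverse=True
    )


# Invented training data: [vsm, semantic, statistical, topic] features per ad,
# with a graded relevance label; one inner list per target page.
TRAIN = [
    [([0.9, 0.8, 0.7, 0.9], 2), ([0.5, 0.4, 0.6, 0.3], 1), ([0.1, 0.2, 0.1, 0.0], 0)],
    [([0.8, 0.9, 0.6, 0.7], 1), ([0.2, 0.1, 0.3, 0.2], 0)],
]
W = train_rank_svm(TRAIN, dim=4)
```

Calling `rank(W, candidate_ads)` then orders unseen ads by the learned weighted sum of their matching features.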
Third, a user-interest-based ad re-ranking algorithm is proposed, which formalizes the personalized recommendation of web page advertisements as a user-interest-based re-ranking problem. The steps are as follows: first, the relationship between user browsing behavior and user interest is analyzed and a user interest model is established, with a cluster interest model representing user interest and a centroid model and a Gaussian model quantifying it;
then, based on the user interest model, user interest is extracted through Web log mining;
finally, relevance features between the user interest and the advertisements are extracted, and the web-content-based ranking is re-ranked by that relevance to obtain the final ranking. Re-ranking effectively balances relevance matching among target web page content, user interest and advertisements, and offers a practical solution to matching an advertisement to the context in which it is loaded.
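The abstract does not spell out how the cluster, centroid and Gaussian models are defined, so the following is only one plausible reading, offered as a sketch. It assumes the pages a user visited (taken from web logs) are already grouped into interest clusters and represented as fixed-length vectors; each cluster is summarized by its mean (centroid model) and per-dimension variance (a diagonal Gaussian), weighted by its share of visits. All function names, the variance floor `min_var`, and the data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def gaussian_affinity(x, mean, var):
    """Unnormalised diagonal-Gaussian kernel, a value in (0, 1]."""
    return math.exp(
        -0.5 * sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    )


def build_interest_model(clusters, min_var=1e-2):
    """clusters: cluster name -> list of vectors for pages the user visited."""
    total = sum(len(vecs) for vecs in clusters.values())
    model = {}
    for name, vecs in clusters.items():
        n = len(vecs)
        mean = [sum(col) / n for col in zip(*vecs)]
        # Floor the variance so single-page clusters do not divide by zero.
        var = [
            max(sum((v[k] - mean[k]) ** 2 for v in vecs) / n, min_var)
            for k in range(len(mean))
        ]
        model[name] = {"weight": n / total, "mean": mean, "var": var}
    return model


def user_ad_relevance(model, ad_vec, use_gaussian=False):
    """Best cluster match for the ad, scaled by how dominant that cluster is."""
    best = 0.0
    for c in model.values():
        if use_gaussian:
            affinity = gaussian_affinity(ad_vec, c["mean"], c["var"])
        else:
            affinity = cosine(ad_vec, c["mean"])
        best = max(best, c["weight"] * affinity)
    return best


def rerank(first_round, ad_vecs, model, use_gaussian=False):
    """Re-rank the content-based list by user-ad relevance.

    sorted() is stable, so ads with equal relevance keep their first-round order.
    """
    return sorted(
        first_round,
        key=lambda a: user_ad_relevance(model, ad_vecs[a], use_gaussian),
        reverse=True,
    )


# Invented two-dimensional example: axis 0 ~ "travel", axis 1 ~ "sports".
CLUSTERS = {"travel": [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]], "sports": [[0.0, 1.0]]}
AD_VECS = {"a_sport": [0.1, 0.9], "a_mixed": [0.5, 0.5], "a_travel": [0.9, 0.1]}
FIRST_ROUND = ["a_sport", "a_mixed", "a_travel"]
MODEL = build_interest_model(CLUSTERS)
```

In this toy setup both variants move the travel ad to the top, but they disagree on the rest: the cosine-based centroid score still rewards the mixed ad, while the Gaussian kernel penalizes it sharply for lying far from every cluster mean.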
Finally, based on the complete set of methods for semantic annotation of advertising entities and for ad ranking, the thesis designs and implements a semantics-based prototype system for web page advertisement loading. The system follows a layered design and can be integrated effectively with other systems.
In conclusion, using a semantic annotation method to study the web page advertisement loading model is a useful attempt at applying natural language processing techniques to computational advertising. It helps achieve deep consistency among target web page, user interest and web page advertisement, and is of some significance for search engine optimization and Internet information retrieval.
【Degree-granting institution】: Wuhan University
【Degree level】: Doctoral
【Year degree granted】: 2010
【Classification number】: TP393.092
Article ID: 1911690
Article link: https://www.wllwen.com/wenyilunwen/guanggaoshejilunwen/1911690.html