Web Service Tag Optimization Based on Semantic Similarity and Information Content

Published: 2018-02-24 05:39

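Keywords: Web service; similarity calculation; tag; semantic similarity; information content. Source: Zhengzhou University, 2014 master's thesis. Type: degree thesis.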

[Abstract]: With the rapid development of cloud computing, Web services, one of its key technologies, have been widely adopted, and the number of Web services available on the Internet is growing quickly. Locating Web services quickly and accurately for service discovery and composition has therefore become both necessary and difficult. Most Web services published on the Internet today are described with WSDL, so making effective use of WSDL for service discovery is especially important. Because WSDL lacks a semantic description of Web services, similarity matching based on it has low accuracy; moreover, many WSDL documents are not structured in a standard way, so existing Web service similarity calculation methods cannot adequately meet the need. Web service tags are keywords that users attach to a Web service to describe its functions or attributes. They supply additional information that compensates for the limited information in WSDL, which can raise the accuracy of Web service similarity matching and in turn improve service discovery, service composition, and service clustering. However, the proportion of inaccurate or even wrong (invalid) tags is currently high, which degrades the quality of service similarity matching. To address the lack of standardization in WSDL structure and the high proportion of invalid tags, this thesis proposes a Web service tag optimization model, WS-TOM, which consists of two modules: Web service similarity calculation and Web service tag optimization. In the similarity calculation module, a large number of WSDL documents are first analyzed, and a feature extraction scheme that takes programming style and naming conventions into account is given for computing Web service similarity. In the tag optimization module, a tag ranking algorithm is given that ranks tags by combining the semantic similarity between a tag and the WSDL with the information content of the tag; inaccurate tags are then filtered out according to the power-law distribution, reducing their negative impact. Experimental results and analysis verify the effectiveness of WS-TOM: the similarity calculation method performs well even when the WSDL structure is non-standard and improves the accuracy of similarity matching to some extent, and tag optimization filters out inaccurate tags, further improving the accuracy of Web service matching.
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.09

[References]