A Study on the Usability of Preference Query Results

Published: 2018-04-30 06:46

Topic: database usability + the Causality and Responsibility problems; source: Zhejiang University, 2017 doctoral dissertation

[Abstract]: Preference queries (such as reverse top-k queries and reverse skyline queries) are an important class of queries in the database field. They perform personalized retrieval according to user-specified preferences and return the results that match those preferences, and they have broad application prospects in multi-criteria decision support and personalized recommendation. Current database queries, preference queries included, return only the query result. If the result is not what the user wants, existing database systems can neither explain why that result was produced nor offer effective suggestions for obtaining a satisfactory one. Research on query result usability addresses this shortcoming: it aims to explain to users why the current result was produced and to help them obtain a result they are satisfied with, so that they can use the database more efficiently and conveniently and are more satisfied with it. Query result usability, however, depends on the specific query, and different queries call for different solutions. Most existing work focuses on relational database queries, so existing techniques cannot effectively solve usability problems for preference queries. This dissertation therefore takes reverse top-k queries and reverse skyline queries as examples and studies the usability of preference query results in depth. The main research contents are:

(1) Causality and Responsibility problems on reverse top-k and reverse skyline queries. When the query result contains objects the user does not want, or omits objects the user does want, the user may wish to know the causes of that outcome and how much responsibility each cause bears, in order to understand the original query better. The dissertation studies the Causality and Responsibility problems on reverse top-k and reverse skyline queries to help users identify the causes of an unwanted result appearing or a wanted result being absent (Causality) and to compute the responsibility of each cause (Responsibility).

(2) Why-not and Why questions on reverse top-k queries. Answering Causality and Responsibility problems only explains why the current result was produced; it does not help users obtain the result they want. The dissertation therefore studies Why-not and Why questions on reverse top-k queries. The Why-not question aims to bring into the query result the objects that the user wants but that are missing from it; the Why question aims to exclude from the query result the objects that the user does not want but that appear in it.

(3) Why-few and Why-many questions on reverse top-k and reverse skyline queries. In practice, a query may return too many or too few (possibly zero) answer objects. With too many answers the user cannot easily choose among them, and with too few the user has little or nothing to choose from. Neither case is what the user wants. The dissertation therefore studies Why-few and Why-many questions on reverse top-k and reverse skyline queries. The Why-few question targets results with too few answers, or none at all, and helps the user obtain more answer objects; the Why-many question targets results with too many answers and helps the user reduce them.

(4) A usability analysis system for reverse top-k query results. Integrating the results above, the dissertation develops a usability analysis system for reverse top-k query results. Based on user feedback, the system explains why the current query result was produced and gives suggestions that let the user obtain a satisfactory result.
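The abstract does not define its queries formally, so the following is only a minimal sketch of the kind of problem it describes, not the dissertation's method. It assumes the common bichromatic formulation of a reverse top-k query (linear scoring, lower scores preferred) and borrows the counterfactual notion of responsibility, 1 / (1 + size of the smallest contingency set), from the general database causality literature. The function names, the sample data, and the tie-handling rule are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
Weight = Tuple[float, ...]


def score(w: Sequence[float], p: Sequence[float]) -> float:
    """Linear score of point p under weighting vector w (assumed: lower is better)."""
    return sum(wi * pi for wi, pi in zip(w, p))


def better_than(q: Point, w: Weight, points: List[Point]) -> List[Point]:
    """Points that score strictly better (lower) than q under w."""
    sq = score(w, q)
    return [p for p in points if score(w, p) < sq]


def reverse_top_k(q: Point, points: List[Point],
                  weights: List[Weight], k: int) -> List[Weight]:
    """Weighting vectors for which q ranks in the top k, i.e. fewer than
    k points score strictly better than q (assumed tie rule)."""
    return [w for w in weights if len(better_than(q, w, points)) < k]


def why_not_responsibility(q: Point, w: Weight, points: List[Point],
                           k: int) -> Dict[Point, float]:
    """For a weighting vector w missing from the result, treat every point
    that beats q as a cause. If m points beat q, removing m - k of the
    others makes the remaining cause counterfactual, so under the assumed
    definition each cause has responsibility 1 / (m - k + 1)."""
    blockers = better_than(q, w, points)
    m = len(blockers)
    if m < k:  # w is already in the result, nothing to explain
        return {}
    return {tuple(p): 1.0 / (m - k + 1) for p in blockers}


# Hypothetical data: four products, one query product, three user preferences.
points = [(1, 9), (2, 2), (5, 1), (9, 8)]
q = (3, 2)
weights = [(0.5, 0.5), (0.9, 0.1), (0.1, 0.9)]

print(reverse_top_k(q, points, weights, k=2))
print(why_not_responsibility(q, (0.9, 0.1), points, k=2))
```

A Why-not answer in the sense of item (2) would go one step further and suggest how to change q, k, or the weighting vector so that the missing vector enters the result; that refinement step is not sketched here.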

[Degree-granting institution]: Zhejiang University
[Degree level]: Doctoral
[Year degree conferred]: 2017
[Classification number]: TP311.13
