A k-Anonymity Algorithm Based on KD-Tree Optimal Projection Partitioning
Published: 2018-12-12 11:29
[Abstract]: To address the "locally optimal" partitioning problem in existing privacy-preserving algorithms for data publishing, a k-anonymity algorithm based on KD-tree optimal projection partitioning is proposed. First, every attribute dimension is traversed globally; the dispersion of each dimension is measured by the variance of its projection distances, and the optimal dimension is selected. Then, on the optimal attribute dimension, the partition coefficient is computed and the optimal partition point is determined. An improved KD-tree structure is also introduced: unlike a traditional KD tree, in which each node is a single data point, each node in the new tree is a set. A hyperplane that passes through the partition point and is perpendicular to the optimal dimension divides a node into two parts, which become its left and right child nodes. Finally, the correctness of the algorithm is proved by theoretical analysis, and its performance is compared and verified experimentally. The experimental results show that the proposed algorithm reduces the average generalization range by 10% to 22%, achieving better partitioning and better data set utility.
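The partitioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the partition coefficient, so the sketch substitutes a plain median split as a stand-in, and it measures dispersion with the ordinary population variance of each attribute. All names (`Node`, `most_dispersed_dim`, `build`, `leaves`, `generalize`) are hypothetical.

```python
from dataclasses import dataclass
from statistics import median, pvariance
from typing import List, Optional, Sequence, Tuple

Record = Sequence[float]  # one row of numeric quasi-identifier values


@dataclass
class Node:
    """A tree node that holds a whole set of records, not a single point."""
    records: List[Record]
    split_dim: Optional[int] = None
    split_value: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def most_dispersed_dim(records: List[Record]) -> int:
    """Return the dimension whose projected values have the largest variance."""
    n_dims = len(records[0])
    return max(range(n_dims), key=lambda d: pvariance([r[d] for r in records]))


def build(records: List[Record], k: int) -> Node:
    """Split a set recursively while both halves keep at least k records."""
    node = Node(records)
    if len(records) < 2 * k:
        return node  # too small to split into two groups of size >= k
    dim = most_dispersed_dim(records)
    # Assumption: median split point; the paper's partition coefficient
    # is not specified in the abstract.
    cut = median(r[dim] for r in records)
    left = [r for r in records if r[dim] <= cut]
    right = [r for r in records if r[dim] > cut]
    if len(left) >= k and len(right) >= k:
        node.split_dim, node.split_value = dim, cut
        node.left, node.right = build(left, k), build(right, k)
    return node


def leaves(node: Node) -> List[List[Record]]:
    """Collect the record sets stored in the leaf nodes."""
    if node.left is None or node.right is None:
        return [node.records]
    return leaves(node.left) + leaves(node.right)


def generalize(group: List[Record]) -> List[Tuple[float, float]]:
    """Replace a group by the (min, max) range of each attribute."""
    n_dims = len(group[0])
    return [(min(r[d] for r in group), max(r[d] for r in group))
            for d in range(n_dims)]


if __name__ == "__main__":
    # Made-up (age, income) records, for illustration only.
    data = [(25, 50000), (27, 52000), (31, 48000), (35, 91000),
            (42, 60000), (47, 75000), (51, 58000), (58, 66000)]
    for group in leaves(build(data, k=2)):
        print(len(group), generalize(group))
```

Each leaf holds at least k records, so publishing only the per-leaf attribute ranges makes every record indistinguishable from at least k - 1 others on these attributes. Attributes on very different scales would likely need normalizing before their variances are compared; the sketch skips that step.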
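[Affiliations]: School of Mathematics and Computer Science, Anhui Normal University; Engineering Research Center of Network and Information Security, Anhui Normal University
[Funding]: National Natural Science Foundation of China (61672039, 61370050); Anhui Provincial Natural Science Foundation (1508085QF133)
[Classification Number]: TP309
Article ID: 2374478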
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2374478.html