Research on Data Storage and Optimization Techniques for Distributed Online Social Networks
[Abstract]: In recent years, online social networks (OSNs) have achieved great success, attracting billions of users worldwide. Through an OSN, users can make new friends and share information with existing friends. Under the centralized storage architecture of a centralized online social network (COSN), all user data is stored on servers operated and maintained by the service provider. The provider can use and analyze this data, and can even sell it directly to third parties, violating user privacy. Against this background, the distributed online social network (DOSN) was proposed to address the leakage of private user data. Although DOSNs are not yet as popular or mature as COSNs, research on them is very active. In a DOSN, to protect privacy, user data is stored and forwarded directly within the circle of friends, bypassing the server. Although this prevents service providers from leaking private user data, it causes low data availability: when a user is offline, other users cannot access the data stored at that offline user. To improve data availability under the constraint of data privacy protection, data storage schemes and corresponding optimization strategies must be designed for DOSN scenarios, which is one of the biggest challenges in DOSN research. Among the characteristics of DOSN, social data is mainly small data and is rarely modified. An in-depth survey of existing work on DOSN data storage and storage optimization finds that it focuses mainly on user dynamics while ignoring the impact of other characteristics on storage optimization goals. This thesis systematically studies DOSN data storage and storage optimization, with the main objective of improving data availability under the constraint of data privacy protection. It covers the following aspects:
1. Storage-capacity-sensitive modeling and analysis of DOSN data availability. Existing DOSN data storage schemes usually assume that friends always provide enough storage capacity to hold the data published by users. This assumption is inappropriate in DOSN. To avoid disclosing user privacy, unprotected private user data can only be stored within the circle of friends, and user devices usually have limited storage capacity. Intuitively, a limited total storage capacity across friends reduces data availability, but this rough conclusion is not enough: we also need to know how strongly storage capacity affects data availability in order to decide whether data storage optimization is necessary. Before designing a DOSN data storage scheme, it is therefore necessary to quantitatively analyze the relationship between the total storage capacity contributed by the circle of friends and the data availability that can be achieved; this is the first problem addressed in this thesis. In addition, the highly dynamic online behavior of friends changes the total storage capacity that the circle of friends can contribute, which in turn makes data availability highly dynamic. To address this, the thesis predicts real-time data availability by predicting the total real-time storage capacity of the circle of friends. Finally, extensive experiments verify the validity of the storage-capacity-sensitive data availability model. Based on this model, given an expected data availability, the minimum total storage capacity required of the circle of friends can be determined, and from it the average minimum storage capacity that each friend needs to contribute, which provides a basis for allocating storage capacity in applications. Conversely, given the total storage capacity of the circle of friends, the maximum data availability that the circle can achieve can be determined, thereby determining the expected data availability.
2. Cadros, a cloud-assisted DOSN data storage scheme. As mentioned above, to ensure that user privacy is not leaked, unprotected data in a DOSN can only be stored redundantly within the circle of friends. However, a DOSN is a highly dynamic network: users can add and delete friends at any time, and friends can go online and offline at any time, so both the set of online friends and the total storage capacity contributed by friends are limited and change dynamically. Designing a data storage scheme suited to DOSN under these conditions is the second key problem addressed in this thesis. To solve it, a cloud-assisted DOSN data storage scheme, Cadros, is proposed on the basis of the storage-capacity-sensitive data availability model. Cloud servers are introduced to improve data availability when the circle of friends cannot meet the data storage needs. Cadros is designed to prevent cloud service providers from obtaining the original data, thereby protecting user data privacy. The thesis quantitatively studies the data storage capability of Cadros and analyzes its data availability, theoretically demonstrating the feasibility and validity of the scheme. In addition, a probabilistic model of the dynamic behavior of friends in the circle of friends is established. By predicting the future data storage capacity and storage requirements of the circle of friends, a real-time data availability prediction model for Cadros is built, which provides the basis for designing the data storage strategy in the next step.
3. Storage optimization techniques for social data in the DOSN. The real-time data availability prediction results only show that Cadros is capable of achieving the corresponding data availability given the total storage capacity of the circle of friends; whether that availability is actually achieved also depends on the data storage strategy. Even if the circle of friends can provide enough storage capacity, the ideal data availability cannot be achieved without a good data storage strategy. The question in the Cadros scheme is how to design a suitable data storage strategy, based on the real-time data availability predictions, for the behavioral characteristics of DOSN users. To answer it, the thesis further optimizes the Cadros data storage scheme and studies storage optimization techniques for social data in the DOSN. First, an overhead-sensitive data partitioning method and storage strategy are proposed to determine which data is stored on friends and which on cloud servers, making full use of the available storage capacity of the circle of friends to minimize system overhead. Then, an availability-driven DOSN data replica placement method is proposed that places data within the circle of friends so as to achieve the expected data availability while balancing system load and reducing the maintenance overhead of data availability.
4. Storage optimization techniques for social data on cloud servers. As mentioned above, the Cadros scheme not only stores user data redundantly in the circle of friends but also stores part of the data on cloud servers when the circle of friends cannot meet the storage requirements. Cloud servers offer long-term high availability, so the availability of data stored on them is approximately 100% and there is no data availability problem. However, users experience poor access performance when accessing social data on cloud servers. Social data is mainly small data and is rarely modified. How to improve access performance for small social data on cloud servers is the fourth key problem addressed in this thesis. To solve it, the thesis first studies the performance bottlenecks of distributed file systems when handling large amounts of small social data, and then proposes a lightweight file system, iFlatLFS. iFlatLFS greatly simplifies the metadata structure and the data access process. The new metadata amounts to only a small fraction of the original metadata and can be cached in server memory, eliminating the addressing overhead for small data and improving performance. Finally, a prototype of iFlatLFS is implemented on the CentOS 5.5 operating system and integrated into the open-source distributed file system TFS. Extensive experiments show that iFlatLFS can optimize the storage of large amounts of small social data and greatly improve data access performance.
In summary, the thesis first quantitatively analyzes the relationship between the total storage capacity contributed by the circle of friends and the achievable data availability. On this basis, it proposes Cadros, a cloud-assisted DOSN data storage scheme, which addresses the low data availability caused by the limited total storage capacity of the circle of friends and improves data availability under privacy protection. It theoretically demonstrates the feasibility and validity of Cadros and establishes a real-time data availability prediction model. It then studies storage optimization of social data in the circle of friends, proposing, based on the prediction results, an overhead-sensitive data partitioning method and storage strategy as well as an availability-driven data placement method, which achieve the expected data availability while balancing system load and reducing the maintenance overhead of data availability. Finally, it studies storage optimization of social data on cloud servers and designs an efficient lightweight file system, iFlatLFS, to improve access performance for social data on cloud servers.
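The two-way relationship between storage capacity and data availability described under aspect 1 can be illustrated with a deliberately simplified sketch. It is not the thesis's storage-capacity-sensitive model, whose details the abstract does not give. It assumes that friends are online independently of one another, that a data item is available whenever at least one friend holding a replica is online, and that every replica costs the same amount of storage. The function names (`availability`, `min_replicas`, `min_total_capacity`) are hypothetical.

```python
import math


def availability(online_probs):
    """Probability that at least one replica holder is online.

    Assumption (not from the thesis): friends go online independently,
    and online_probs[i] is the probability that friend i is online.
    """
    all_offline = 1.0
    for p in online_probs:
        all_offline *= 1.0 - p
    return 1.0 - all_offline


def min_replicas(target, p):
    """Smallest replica count r such that 1 - (1 - p)**r >= target.

    Assumption: every friend has the same online probability p.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


def min_total_capacity(target, p, data_size):
    """Total storage the friend circle must contribute for one data item.

    Assumption: each replica is a full copy of size data_size.
    """
    return min_replicas(target, p) * data_size
```

Under these assumptions, if each friend is online half of the time, a target availability of 0.99 needs 7 replicas, so a 10 MB item needs 70 MB of total friend storage. Read the other way, a circle that can only afford 2 replicas reaches an availability of 0.75.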
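[Author]: 付松龄 (Fu Songling)
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Doctoral
[Year degree granted]: 2014
[Classification number]: TP393.09; TP333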
Article No.: 2211281
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2211281.html