
Research on Distributed Storage and Indexing of FITS Files Based on Bitmap Indexes

Published: 2018-01-08 20:15

  Source: Kunming University of Science and Technology, master's thesis, 2014. Document type: degree thesis


  Keywords: FITS files; distributed storage; bitmap index; FITS file retrieval


[Abstract]: Most data produced by astronomical observation is stored as FITS (Flexible Image Transport System) files, a format used worldwide to store and exchange data. With the widespread use of large multi-channel, multi-band telescopes, the number of FITS files produced by observations has surged, which makes storing and quickly retrieving them a challenge. In the past, these FITS files were not indexed. They were written directly to hard disks or other storage media. When a disk filled up, it was swapped for a new one, and the replaced disk was moved to a warehouse dedicated to used disks. The swapping had to be done by hand, which wasted labor. Because the replaced disks were offline, querying the files stored on them was difficult: results were reasonably easy to obtain only when the query condition was a date or a time range, whereas complex conditions such as a cone search were very hard to satisfy. The problems caused by the growing number of FITS files were at one time addressed with database management systems (DBMS) such as MySQL and Oracle. As the number of files grows ever faster, however, traditional DBMSs cannot keep pace, and indexing and querying take longer and longer.
This thesis describes how a distributed storage system can solve the FITS file storage problem, and it introduces and experimentally compares several distributed file systems. Analysis of the results leads to the conclusion that distributed file systems with good write performance, such as GlusterFS and Lustre, are better suited to storing the massive number of FITS files generated continuously during ongoing observation. GlusterFS was ultimately chosen as the distributed file system for the FITS file distributed storage system.
For retrieval, the thesis proposes using bitmap indexes to accelerate FITS file retrieval. By applying FastBit bitmap indexing in a distributed setting, it develops a distributed FITS file indexing system that provides fast indexing and querying of massive numbers of FITS files. Experiments show that FastBit bitmap indexing has a performance advantage for indexing massive numbers of FITS files, and that, when the files are stored in a distributed fashion, the FastBit-based indexing and query system makes good use of multi-machine cooperation and considerably improves retrieval speed.
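To make the bitmap-index idea concrete, the following is a minimal sketch in Python. It is not the thesis's system and it does not use the FastBit API; it only illustrates the general technique of keeping one bitmap per distinct column value and answering multi-condition queries with bitwise operations. The header records, file names, and the `BitmapIndex` class are hypothetical, and real implementations such as FastBit additionally compress the bitmaps and bin numeric values.

```python
from collections import defaultdict


class BitmapIndex:
    """Equality-encoded bitmap index over one column; bitmaps are Python ints."""

    def __init__(self, values):
        self.bitmaps = defaultdict(int)
        for row, value in enumerate(values):
            # Set bit `row` in the bitmap that belongs to `value`.
            self.bitmaps[value] |= 1 << row

    def equals(self, value):
        """Bitmap of rows whose column value equals `value` (0 if none)."""
        return self.bitmaps.get(value, 0)

    def in_range(self, low, high):
        """OR together the bitmaps of every distinct value in [low, high]."""
        result = 0
        for value, bitmap in self.bitmaps.items():
            if low <= value <= high:
                result |= bitmap
        return result


def rows_of(bitmap):
    """Return the row numbers whose bits are set, in ascending order."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


# Hypothetical header records extracted from five FITS files.
headers = [
    {"file": "a.fits", "FILTER": "Ha", "EXPTIME": 30},
    {"file": "b.fits", "FILTER": "TiO", "EXPTIME": 10},
    {"file": "c.fits", "FILTER": "Ha", "EXPTIME": 60},
    {"file": "d.fits", "FILTER": "Ha", "EXPTIME": 10},
    {"file": "e.fits", "FILTER": "TiO", "EXPTIME": 30},
]

filter_index = BitmapIndex([h["FILTER"] for h in headers])
exptime_index = BitmapIndex([h["EXPTIME"] for h in headers])

# FILTER == "Ha" AND 20 <= EXPTIME <= 60, evaluated as one bitwise AND.
hits = filter_index.equals("Ha") & exptime_index.in_range(20, 60)
matching_files = [headers[row]["file"] for row in rows_of(hits)]
print(matching_files)  # ['a.fits', 'c.fits']
```

In a distributed setting of the kind the abstract describes, each storage node could plausibly build such an index over its own files and a coordinator could merge the per-node result lists; that arrangement is an assumption here, not a description of the thesis's design.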

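[Degree-granting institution]: Kunming University of Science and Technology
[Degree level]: Master's
[Year conferred]: 2014
[Classification number]: TP333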

