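Research on Key Technologies for Constructing a Chinese Standard Address Database
Published: 2018-05-06 10:59
Topic: address parsing + address model; Source: Nanjing Normal University, 2013 master's thesis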
[Abstract]: An address is a structured description of the location of a natural or human geographic entity at a specific spatial position, and it provides a means of locating people, structures, and other spatial objects. Building a standard address database on the basis of place-name/address information makes it possible, within a unified geographic reference framework, to establish organic links between standard and non-standard address information, to fuse spatial and non-spatial information, to meet the requirement for direct, real-time sharing of address data, and to integrate different address data types and different systems. In view of the characteristics of current Chinese address descriptions and the needs of address matching services, this thesis explores, fairly systematically, the key technical problems of constructing a standard address database from the aspects of intelligent address model construction, address coding, and standard data management, and develops a corresponding prototype system. The main research contents and conclusions are as follows:
(1) Adaptive address model: Drawing on national and industry classification standards, the classification of address elements is refined. A feature lexicon of geographic elements is built from frequency statistics of the words used to describe address elements. An adaptive address model construction method based on association rules and finite automata is proposed, which solves the problem of expressing the descriptive elements and structure of place names in a normalized way.
(2) Dual address coding: Addresses have diverse descriptive names but a unique spatial location, and they evolve jointly in "time, space, and name". To reflect this, a "dual" address coding scheme consisting of a spatial code and an attribute code is proposed. The spatial code expresses spatial uniqueness, which facilitates sharing address data across systems and industries. The attribute code accommodates the variability in the element composition of Chinese address models, which facilitates efficient query and retrieval of address elements. Taking Changzhou address data as an example, the quality of the address coding is evaluated quantitatively and qualitatively.
(3) Prototype system development: Based on the hierarchical relationships among address elements and on data consistency constraints, a standard address data model is built. Following an analysis of the address life cycle and validity status, and using the spatial code as the link, historical and current address data are updated in association with each other. A combined address-element query mode supports both exact and fuzzy queries of address data. On this basis, a prototype Chinese standard address database system is developed, with functions including address collection, address query, address update, and address matching.
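To make the idea of combining a feature lexicon with a finite automaton more concrete, the following is a minimal sketch and not the thesis's actual method. The lexicon `FEATURE_WORDS`, the transition table `TRANSITIONS`, the function `parse_address`, and the sample address are all hypothetical and chosen only for illustration. The thesis derives its lexicon from word-frequency statistics and its model from association rules, neither of which is reproduced here.

```python
# Hypothetical feature lexicon: a trailing character identifies the element level.
FEATURE_WORDS = {
    "省": "province",
    "市": "city",
    "区": "district",
    "县": "district",
    "路": "road",
    "街": "road",
    "号": "number",
}

# Hypothetical automaton: which element levels may follow the current state.
TRANSITIONS = {
    "start": {"province", "city", "district", "road"},
    "province": {"city"},
    "city": {"district", "road"},
    "district": {"road"},
    "road": {"number"},
    "number": set(),
}


def parse_address(text):
    """Split an address into (element, level) pairs.

    Raises ValueError if the element order is not accepted by the
    automaton or if text remains that ends in no feature word.
    """
    state = "start"
    elements = []
    buffer = ""
    for ch in text:
        buffer += ch
        level = FEATURE_WORDS.get(ch)
        if level is None:
            continue  # still inside the name part of an element
        if level not in TRANSITIONS[state]:
            raise ValueError(f"unexpected {level} after {state}: {buffer!r}")
        elements.append((buffer, level))
        state = level
        buffer = ""
    if buffer:
        raise ValueError(f"trailing text without a feature word: {buffer!r}")
    return elements


if __name__ == "__main__":
    for element, level in parse_address("江苏省常州市钟楼区延陵西路123号"):
        print(level, element)
```

This sketch splits on every feature character, so a name that itself contains one (for example a road name containing "市") would be split incorrectly. Handling such cases is where a statistically built lexicon and learned rules would be needed.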
[Degree-granting institution]: Nanjing Normal University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: P208
Article ID: 1852046
Article link: https://www.wllwen.com/kejilunwen/dizhicehuilunwen/1852046.html