Research on MSER-Based Text Localization in Natural Scene Images
Published: 2019-06-18 19:03
[Abstract]: Text in natural scene images carries rich semantic information and is an important complement to the visual content of the scene. With the spread of smartphones, tablets, and digital cameras, high-quality scene images are increasingly easy to capture. Extracting text from such images helps people understand a scene in greater depth and is valuable for retrieval, query, and visual assistance systems. Accurate extraction presupposes precise localization of the text regions, and natural scene text localization must cope with complex backgrounds, diverse fonts, occlusion, and blur, which makes it a highly challenging research topic. This thesis explores techniques for natural scene text localization and proposes a new algorithm framework based on maximally stable extremal regions (MSER). The main contributions are as follows:
(1) To address repeated detection of text candidate regions by the MSER detector, a rule based on the region variation rate is proposed for deleting duplicate nested MSERs. The image is first preprocessed and MSERs are extracted from each color channel; duplicate detections are then deleted according to each region's variation rate and the containment relationships between regions.
(2) In low-resolution or shadowed images, the edges of adjacent characters tend to stick together. To address this, edge-enhanced MSERs are used as character candidate regions, and a coarse-to-fine verification scheme is designed on top of them. Heuristic rules based on the shape features of a region first screen the candidate character regions; the stroke width transform of each region is then combined with a support vector machine (SVM) to confirm character regions.
(3) A text line construction method based on the similarity of character region features is designed, merging the character regions extracted from multiple channels into text lines that express complete semantic information.
To evaluate the proposed algorithm, experiments were carried out on three public datasets, ICDAR 2003, ICDAR 2013, and SVT, with good results.
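The duplicate-removal idea in contribution (1) can be illustrated with a small sketch. The abstract does not give the thesis's exact rule, so everything below is an assumption made for illustration: regions are summarized by bounding boxes, `variation` stands in for the region variation rate (lower is taken to mean more stable), and the `min_area_ratio` threshold of 0.7 is a hypothetical value. Two regions are treated as duplicates when one box contains the other and their areas are similar; the more stable one is kept.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """A candidate region summarized by its bounding box and a stability score."""
    x: int
    y: int
    w: int
    h: int
    variation: float  # assumed convention: lower means more stable

    @property
    def area(self) -> int:
        return self.w * self.h


def contains(outer: Region, inner: Region) -> bool:
    """Return True when inner's bounding box lies entirely inside outer's."""
    return (outer.x <= inner.x and outer.y <= inner.y
            and inner.x + inner.w <= outer.x + outer.w
            and inner.y + inner.h <= outer.y + outer.h)


def remove_nested_duplicates(regions: List[Region],
                             min_area_ratio: float = 0.7) -> List[Region]:
    """Drop near-duplicate nested regions, keeping the most stable of each pair.

    min_area_ratio is a hypothetical threshold, not a value from the thesis.
    """
    # Visit the most stable regions first so that they are the ones that survive.
    ordered = sorted(regions, key=lambda r: r.variation)
    kept: List[Region] = []
    for cand in ordered:
        duplicate = False
        for k in kept:
            nested = contains(k, cand) or contains(cand, k)
            # Guard against zero-area boxes with a minimum denominator of 1.
            ratio = min(k.area, cand.area) / max(k.area, cand.area, 1)
            if nested and ratio >= min_area_ratio:
                duplicate = True
                break
        if not duplicate:
            kept.append(cand)
    return kept


if __name__ == "__main__":
    candidates = [
        Region(10, 10, 20, 30, variation=0.20),  # loose detection of a character
        Region(11, 11, 19, 28, variation=0.10),  # nested near-duplicate, more stable
        Region(12, 14, 4, 5, variation=0.05),    # small nested part, much smaller area
        Region(60, 10, 18, 30, variation=0.15),  # a separate character
    ]
    for region in remove_nested_duplicates(candidates):
        print(region)
```

In this sketch the small nested region survives because its area differs strongly from that of its parent, so it is not treated as a repeated detection of the same character. A real pipeline would obtain the regions and their stability scores from an MSER detector run on each color channel.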
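[Degree-granting institution]: Xi'an University of Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.1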
Article ID: 2501713
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2501713.html