Research on Text Localization and Character Recognition Methods for Scene Images
[Abstract]: Text in scene images carries rich and precise information and is in demand in applications such as industrial automation, traffic management, automatic translation, and assistive services for people with disabilities. However, non-uniform lighting, background texture, and the diversity of text keep the accuracy of scene text extraction low, so extracting text accurately from scene images has become a research focus in pattern recognition. This research has practical value for improving the accuracy and robustness of scene image text recognition systems. The main work and contributions of this thesis are as follows. First, a scene image text localization method based on a convolutional neural network (CNN) and support vector machine (SVM) output scores is proposed. It builds on three properties of text regions: the gray values of characters are consistent, the gradient magnitude in the x direction has a convex distribution, and text characters lie close to their nearest neighbors. Using the convex distribution of the x-direction gradient magnitude and the consistency of character gray values, typical points in text regions are detected, and candidate connected components are extracted by clustering on typical point position and gray value. In the regions outside these candidate connected components, further candidate connected components are extracted with k-means clustering. A CNN-based SVM classifier for text connected components is then applied: the CNN extracts texture features of the connected components, the SVM output score suppresses non-text connected components, and nearest-neighbor connected components are merged into candidate text regions.
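The k-means step mentioned above can be illustrated on scalar gray values. The following is only a minimal sketch under stated assumptions, not the thesis implementation: the function name `kmeans_gray`, the evenly spaced initial centers, and the toy gray values are all hypothetical, and the thesis does not say how many clusters it uses or which features it clusters at this stage.

```python
def kmeans_gray(values, k, max_iter=100):
    """Cluster scalar gray values into k groups with plain k-means.

    Hypothetical helper for illustration; returns (labels, centers).
    """
    if k < 1 or not values:
        raise ValueError("need k >= 1 and at least one value")
    lo, hi = min(values), max(values)
    # Assumption: deterministic start with centers spread evenly over the gray range.
    centers = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        new_labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_labels == labels and new_centers == centers:
            break  # converged: nothing changed in this iteration
        labels, centers = new_labels, new_centers
    return labels, centers


# Toy example: dark character pixels versus bright background pixels.
labels, centers = kmeans_gray([10, 12, 11, 200, 205, 198], k=2)
print(labels, centers)
```

In a real pipeline the pixels sharing a label would form one layer, and connected components would then be extracted from each layer.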
Finally, an SVM verifies each candidate region using its histogram of oriented gradients (HOG) features. On the ICDAR2011 and ICDAR2013 scene text image datasets, the localization method achieves F-values of 76% and 78%, respectively, which shows that it can effectively suppress interference from complex background textures. Second, based on the similarity of character colors within a text line, a character segmentation method for text regions based on color clustering and gradient vector flow is proposed. First, k-means clustering is applied to the spatial distribution of pixel colors to obtain k candidate layers, and geometric features of connected components, such as fill ratio and aspect ratio, are used to select the layer that contains the candidate character connected components. In homogeneous regions, points far from edges are taken as candidate segmentation pixels, and the cutting path with the lowest cumulative cost is found using the squared gray difference as the cost. On the ICDAR2013 scene image text dataset, this method achieves an F-value of 87.9%. The experimental results show that color clustering can effectively suppress interference from non-uniform lighting and occlusion. Third, based on the rotation invariance of character structure, a multi-orientation single-character recognition model is proposed. A deformed HOG operator and concentric circular template sampling are used to extract local joint HOG texture features and quadrant structure features between the sampling points, and the two are combined into the character feature. A bag-of-words model of characters is then built by learning a feature dictionary, and characters are recognized with an SVM.
Character recognition experiments are carried out on the ICDAR character dataset, the Chars74K dataset, and a manually collected dataset. The proposed method achieves accuracies of 82%, 87%, and 73%, respectively, which shows that the proposed model is robust to rotation.
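The lowest-cumulative-cost cutting path used for character segmentation in the abstract can be sketched with dynamic programming. This is a rough illustration under stated assumptions rather than the thesis algorithm: the function name `min_cost_cut_path` is hypothetical, the path is assumed to run from the top row to the bottom row and to move at most one column per row, the step cost is taken as the squared gray difference between consecutive path pixels, and the selection of candidate segmentation points by gradient vector flow is omitted.

```python
def min_cost_cut_path(gray):
    """Return (columns, total_cost) of the cheapest top-to-bottom cutting path.

    gray is a list of rows of gray values. columns[r] is the column the
    path occupies in row r. Hypothetical helper for illustration only.
    """
    rows, cols = len(gray), len(gray[0])
    acc = [[0] * cols]       # acc[r][c]: lowest cumulative cost of a path ending at (r, c)
    back = [[None] * cols]   # back[r][c]: column used in row r - 1 on that path
    for r in range(1, rows):
        acc_row, back_row = [], []
        for c in range(cols):
            best_cost, best_prev = None, None
            # Assumption: the path may shift by at most one column per row.
            for pc in (c - 1, c, c + 1):
                if 0 <= pc < cols:
                    step = (gray[r][c] - gray[r - 1][pc]) ** 2
                    total = acc[r - 1][pc] + step
                    if best_cost is None or total < best_cost:
                        best_cost, best_prev = total, pc
            acc_row.append(best_cost)
            back_row.append(best_prev)
        acc.append(acc_row)
        back.append(back_row)
    # Pick the cheapest end point in the bottom row, then walk back to the top.
    end = min(range(cols), key=lambda c: acc[-1][c])
    path = [end]
    for r in range(rows - 1, 0, -1):
        path.append(back[r][path[-1]])
    path.reverse()
    return path, acc[-1][end]


# Toy image: column 2 is a uniform gap between two textured "characters".
image = [
    [10, 200, 128, 30, 220],
    [90, 60, 128, 250, 5],
    [180, 20, 128, 70, 140],
]
print(min_cost_cut_path(image))  # → ([2, 2, 2], 0)
```

Because the cost grows wherever the path crosses a gray-level change, the cheapest path tends to stay inside homogeneous gaps between characters, which matches the intent described in the abstract.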
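[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.41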
Article ID: 2484739
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2484739.html