
Research on Lip-Reading Recognition under Variable Illumination

Published: 2018-11-02 19:43
[Abstract]: Lip reading has considerable research value and broad application prospects. In recent years a growing number of lip localization and lip-motion recognition algorithms have been proposed, but most of this work is limited to frontal, ideal lighting, whereas a practical lip-reading system has to operate under changing illumination. This thesis therefore studies lip-reading recognition under variable illumination, aiming to reduce the influence of external lighting on lip reading and to improve the robustness of lip localization and lip-motion feature extraction. A lip-reading database is the foundation of this work. The thesis first surveys and compares existing lip-reading databases from China and abroad to draw on their construction methods, and on that basis builds a variable-illumination lip-reading database for the subsequent experiments. To locate and segment the lip region accurately, a three-stage lip localization algorithm is designed: the face is first detected with Haar-like features and the AdaBoost algorithm; the lips are then coarsely located from the inherent structural features of the face; and for the final precise segmentation, the thesis proposes an algorithm based on the H component of the HSV color space. Experiments show that the method still locates the lip region accurately under variable illumination. To reduce the effect of illumination changes on lip-motion feature extraction, robustness is strengthened in two ways: illumination-removal preprocessing and extraction of illumination-invariant features. The preprocessing chain consists of median filtering, Gamma correction, multi-scale Retinex filtering, and contrast equalization, and it effectively filters out part of the illumination noise. On top of this, the traditional LBP feature extraction algorithm is extended and improved, and an improved LBP histogram feature is extracted from the lip region. This feature has a degree of illumination invariance and further raises the lip-reading recognition rate under variable illumination. Lip-motion recognition is performed with SVM. Because a basic SVM is a binary classifier, a one-vs-one extension strategy is adopted so that multiple words can be recognized; because SVM requires input feature vectors of fixed dimension, a length-normalization algorithm is designed that maps lip-motion feature sequences to a uniform dimension. Finally, SVM-based experiments under both natural and variable illumination verify the soundness and effectiveness of the proposed lip-motion feature extraction algorithm.
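As a rough illustration of two ideas from the abstract (an LBP histogram as a lighting-tolerant feature, and mapping variable-length feature sequences to a fixed dimension before SVM classification), the following is a minimal sketch in Python. It rests on assumptions: it uses the classic 3x3 LBP rather than the thesis's improved variant, and plain linear interpolation for length normalization, since neither algorithm is specified in the abstract. The names `lbp_histogram`, `resample_sequence`, and `utterance_vector` are hypothetical.

```python
from typing import List, Sequence

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def lbp_histogram(img: Image) -> List[float]:
    """Classic 3x3 LBP (assumed variant): normalized 256-bin histogram
    computed over the interior pixels of the image."""
    # Neighbor offsets in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Each neighbor contributes one bit: 1 if it is >= the center.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def resample_sequence(frames: Sequence[Sequence[float]],
                      target_len: int) -> List[List[float]]:
    """Linearly interpolate a per-frame feature sequence to target_len frames
    (an assumed stand-in for the thesis's length-normalization step)."""
    if not frames or target_len < 1:
        raise ValueError("need at least one frame and target_len >= 1")
    n = len(frames)
    if n == 1 or target_len == 1:
        return [list(frames[0]) for _ in range(target_len)]
    out = []
    for i in range(target_len):
        pos = i * (n - 1) / (target_len - 1)  # fractional source index
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b
                    for a, b in zip(frames[lo], frames[hi])])
    return out


def utterance_vector(lip_frames: Sequence[Image],
                     target_len: int = 10) -> List[float]:
    """Fixed-length vector for one utterance: per-frame LBP histograms,
    resampled to target_len frames and concatenated."""
    seq = resample_sequence([lbp_histogram(f) for f in lip_frames], target_len)
    return [v for frame in seq for v in frame]
```

Because LBP only compares each neighbor with the center pixel, any strictly increasing brightness change (for example `2 * p + 10` applied to every pixel) leaves the histogram unchanged. Spatially non-uniform lighting breaks this property, so it is only a partial form of illumination invariance. With `target_len=10`, every utterance yields a vector of 10 × 256 values regardless of its frame count, which is the fixed-dimension input an SVM needs.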
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41


