Development of an Automatic Assessment System for Articulation and Phonological Disorders Based on Speech Recognition

Published: 2018-06-23 00:37

Topic: speech recognition + articulation disorder assessment; Source: East China Normal University, 2014 doctoral dissertation

[Abstract]: China has a large population of people with articulation and phonological disorders, yet the methods used to assess these disorders rely mainly on subjective auditory perception and therefore lack objectivity and stability. In recent years, speech recognition technology has been widely applied in many fields, and research on its use in speech and language education has produced some results. In the field of speech disorder assessment and rehabilitation, however, work based on speech recognition remains scarce and has not received sufficient attention. Drawing on the current state and development trends of assessment methods for articulation and phonological disorders in China and abroad, and on the results of applying speech recognition technology in speech and language education, this study carries out an exploratory investigation into the automatic assessment of articulation disorders and phonological disorders.
This study first proposes the basic idea of automatic articulation disorder assessment based on speech recognition: a computer or similar device automatically evaluates a patient's articulation function in three respects, namely content, tone, and disorder type. To verify the feasibility of this approach, a feasibility analysis system for automatic articulation disorder assessment was developed on the Speech SDK released by Microsoft, which ships with its own recognition engine. The system's assessments of the articulation ability of normal-hearing children aged 3-6 were compared with those obtained by subjective auditory perception; the system reached an average recognition accuracy of 83.5% across all subjects. This result gives a preliminary indication that speech-recognition-based assessment of articulation disorders is feasible, but this level of performance still falls short of the practical clinical requirements of speech disorder assessment and rehabilitation.
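The abstract does not say how the average recognition accuracy was computed. A minimal sketch, assuming it is the per-child agreement rate between the recognizer's output and the listener's reference label, averaged over children, might look like this. The function names and the sample data are hypothetical.

```python
def accuracy(recognized, reference):
    """Share of items on which the recognizer agrees with the listener's reference label."""
    if len(recognized) != len(reference) or not reference:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(r == t for r, t in zip(recognized, reference)) / len(reference)


def mean_accuracy(subjects):
    """Average of per-subject accuracies; `subjects` maps an id to (recognized, reference)."""
    if not subjects:
        raise ValueError("need at least one subject")
    per_subject = [accuracy(rec, ref) for rec, ref in subjects.values()]
    return sum(per_subject) / len(per_subject)


# Hypothetical data, not taken from the dissertation.
subjects = {
    "child_01": (["ba", "pa", "ma", "da"], ["ba", "pa", "ma", "ta"]),  # 3 of 4 agree
    "child_02": (["ge", "ke"], ["ge", "ke"]),                          # 2 of 2 agree
}
print(mean_accuracy(subjects))  # (0.75 + 1.0) / 2 = 0.875
```

Averaging per child rather than pooling all items gives each child equal weight regardless of how many words that child produced; whether the dissertation did so is not stated.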
The study therefore proposes an automatic assessment method for articulation disorders based on an improved technical scheme, in which a self-built recognition engine and Microsoft's built-in recognition engine together form a "dual recognition engine" assessment system. First, a speech recognition algorithm based on hidden Markov models (HMMs) is adopted; it seeks the recognition result that is optimal on average and, within a statistical framework, selects the vocabulary entry that maximizes the model score as the recognition result. A standard acoustic model was built from 39-dimensional parameters extracted from the speech of 62 normal-hearing children aged 3-6 reading a specified word list. Next, drawing on earlier research into the assessment of articulation disorders in hearing-impaired children, the common articulation errors affecting Mandarin initials and finals were identified, the word forms that these errors produce were compiled into a table, and a module for detecting the type of articulation disorder was implemented on the Microsoft Speech SDK. Finally, 4-dimensional fundamental-frequency features were extracted from the corpus with the SMDSF algorithm to build a standard model for tone recognition. The system was validated in two ways. First, its recognition performance was tested on the same normal-hearing children's corpus used in the feasibility experiment, where recognition accuracy exceeded 98%. Second, the articulation ability of hearing-impaired children aged 3-5 was assessed by both subjective auditory perception and speech recognition, and the two sets of results were compared. There was no significant difference between the results of the two methods, and the values of the four articulation clarity indices were essentially the same, indicating that the system can, for the most part, assess articulation disorders automatically.
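The HMM recognition step can be illustrated with a small sketch: one model per vocabulary entry, each scored with the forward algorithm, and the highest-scoring entry returned. This is not the dissertation's implementation. It assumes discrete observation symbols for brevity, whereas the system described uses 39-dimensional continuous features, and the two toy word models and all names are hypothetical.

```python
import numpy as np


def make_model(pi, A, B):
    """Store a discrete HMM as log-probabilities (all entries must be > 0 here)."""
    return np.log(np.array(pi)), np.log(np.array(A)), np.log(np.array(B))


def log_forward(obs, log_pi, log_A, log_B):
    """Return log P(obs | model) using the forward algorithm in the log domain."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # For each current state, log-sum over all previous states, then emit symbol o.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return float(np.logaddexp.reduce(alpha))


def recognize(obs, models):
    """Score every word model and return the best-scoring word with all scores."""
    scores = {word: log_forward(obs, *m) for word, m in models.items()}
    return max(scores, key=scores.get), scores


# Two hypothetical 2-state word models over 2 observation symbols.
pi = [0.9, 0.1]
A = [[0.7, 0.3], [0.1, 0.9]]
models = {
    "ba": make_model(pi, A, [[0.9, 0.1], [0.8, 0.2]]),  # mostly emits symbol 0
    "pa": make_model(pi, A, [[0.1, 0.9], [0.2, 0.8]]),  # mostly emits symbol 1
}

best, scores = recognize([0, 0, 1, 0], models)
print(best)  # "ba": its states favor symbol 0, which dominates the sequence
```

In a system with continuous features, the emission table `log_B` would be replaced by per-state density evaluations (for example Gaussian mixtures); the argmax over word models stays the same.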
Building on the automatic assessment system for articulation disorders, the study proposes an automatic assessment system for phonological disorders by improving the recognition algorithm. Using the self-built standard acoustic model, it introduces the Speech Repetition Ability Test Word List and the Speech Switching Ability Test Word List, together with an automatic assessment method for phonological disorders based on these lists. Detection of speech repetition and speech switching disorders is then implemented, again using an API function of the Microsoft Speech SDK that can give recognition priority to specified vocabulary entries. Finally, the system was validated with hearing-impaired children by comparing subjective auditory perception assessment with recognition-based assessment. The results of the automatic system did not differ significantly from the subjective assessment results, and the assessment indices obtained by the two methods were essentially the same, indicating that the system can, for the most part, automatically assess speech repetition ability and speech switching ability.
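The abstract reports no significant difference between automatic and subjective scores but does not name the statistical test. As one possible illustration, assuming each child receives one score from each method, a paired t statistic could be computed as below. The function name and the sample scores are hypothetical.

```python
import math
import statistics


def paired_t(auto_scores, subjective_scores):
    """Paired t statistic for per-child score differences (auto minus subjective)."""
    if len(auto_scores) != len(subjective_scores) or len(auto_scores) < 2:
        raise ValueError("need two score lists of equal length with at least 2 items")
    diffs = [a - s for a, s in zip(auto_scores, subjective_scores)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical clarity scores (percent) for five children, not from the dissertation.
auto = [62.0, 71.0, 55.0, 80.0, 67.0]
subjective = [60.0, 73.0, 56.0, 79.0, 68.0]
print(paired_t(auto, subjective))
```

The statistic is compared with the critical value of the t distribution with n - 1 degrees of freedom; a value inside the acceptance region is what "no significant difference" would mean under this assumed test.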
[Degree-granting institution]: East China Normal University
[Degree level]: Doctorate
[Year degree awarded]: 2014
[Classification number]: TN912.3
