
Accent Identification Based on an Improved Extreme Learning Machine

Published: 2021-01-07 21:10
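  When non-native speakers speak English, they exhibit distinct accents, or traits of a non-native English accent, and these traits can be used to identify both the speaker's accent and native language. Automatic foreign accent identification plays an important role in many speech systems, such as speaker identification, digital learning, telephone banking, voicemail, voice conversion, and immigration screening, and it is also important for keeping automatic speech recognition (ASR) systems robust. Automatic accent identification nevertheless faces many difficulties: accent features are usually entangled with linguistic content, prosody, environmental noise, and the speaker's own voice characteristics, so complex, nonlinear accent identification models are required. In addition, building an accent corpus with a large number of samples takes considerable time and effort. This thesis studies linguistic articulation to obtain pronunciation features that effectively reflect the speaker's native language, adopts improved extreme learning machine algorithms, and draws on relatively authoritative and rich English accent corpora to implement a binary and a multi-class accent identification model, both with good identification results. The thesis first studies how Arabic speakers differ in pronouncing English consonants and proposes an accent identification model based on the extreme learning machine (ELM). Mel-frequency cepstral coefficients (MFCCs) of the segmented consonant phonemes serve as the acoustic feature input for training the ELM classifier. Evaluated with K-fold validation, the classifier learns faster and performs better, reaching an accuracy of 88% with a standard deviation of 0.0167, whereas the SVM and DBN classifiers reach only 76% and 6...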
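The training and evaluation loop described in the abstract can be sketched as follows. This is a minimal illustration of a basic ELM (random, fixed hidden layer; output weights solved by least squares) scored with K-fold validation, not the thesis's improved variant or its actual code. The names `ELMClassifier`, `kfold_accuracy`, and `make_toy_features` are hypothetical, and the synthetic 13-dimensional Gaussian features are an assumption standing in for MFCC vectors of segmented consonants.

```python
import numpy as np


class ELMClassifier:
    """Basic ELM: random hidden layer, least-squares output weights."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer activations; the weights are never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One-hot target matrix, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights via the Moore-Penrose pseudoinverse of H.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


def kfold_accuracy(X, y, k=5, n_hidden=50, seed=0):
    """Return the per-fold test accuracies of an ELM under k-fold validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = ELMClassifier(n_hidden, seed).fit(X[train], y[train])
        scores.append(np.mean(model.predict(X[test]) == y[test]))
    return np.array(scores)


def make_toy_features(n_per_class=100, n_features=13, seed=0):
    """Synthetic two-class data (assumption: a stand-in for MFCC vectors)."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(loc=-1.0, size=(n_per_class, n_features))
    X1 = rng.normal(loc=1.0, size=(n_per_class, n_features))
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_features()
    scores = kfold_accuracy(X, y, k=5)
    # Mean accuracy and standard deviation across folds.
    print(f"accuracy {scores.mean():.3f}, std {scores.std():.4f}")
```

Because only the output weights are solved, and in closed form, training needs no iterative optimization, which is the usual explanation for the fast learning attributed to ELMs. The accuracy and standard deviation printed here describe the toy data only and say nothing about the thesis's reported figures.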

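【Source】: Donghua University, Shanghai (Project 211 university, directly under the Ministry of Education)

【Pages】: 70

【Degree level】: Master's

【Table of contents】: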
Dedication
Acknowledgement 1
Abstract (in Chinese)
Abstract
Chapter 1 Introduction and Background
    1.1 Introduction of Accent Identification (AID)
    1.2 Work Background
        1.2.1 Binary vs. Multiple Accent Identification (AID)
        1.2.2 Methods used in AID
        1.2.3 Linguistic characteristics
        1.2.4 Features Extraction level
    1.3 Research Outline
    1.4 Thesis Outline
Chapter 2 Literature Review
    2.1 Linguistic basics
        2.1.1 Language Transfer
        2.1.2 Acoustic Analysis of Consonants
    2.2 Features Extraction Technique
        2.2.1 MFCC (Mel-frequency Cepstral Coefficients)
        2.2.2 Prosodic Speech Features
    2.3 Classification Techniques Overview
        2.3.1 Support Vector Machine Model (SVM)
        2.3.2 LSTM classifier Model
        2.3.3 Extreme Learning Machine Model (ELM)
        2.3.4 Kernel based Extreme Learning Machine Model
Chapter 3 The System Model
    3.1 Introduction
    3.2 AID Model Design
        3.2.1 Speech Corpus
    3.3 Pre-processing
    3.4 Features Extraction
        3.4.1 Mel-frequency Cepstral Coefficients (MFCCs)
        3.4.2 Prosodic features
    3.5 Features Reduction
    3.6 Classification Phase
        3.6.1 Classification algorithms
        3.6.2 Training and Testing Phase
        3.6.3 Evaluating a classifier
        3.6.4 Performance index
    3.7 Summary
Chapter 4 Consonant Phonemes based ELM Model for Foreign Accent Identification
    4.1 Introduction
    4.2 Consonant Phonemes based discriminative features
        4.2.1 Bilabial stop /b/ vs /p/ pronunciation
        4.2.2 Alveolar plosive /t/ vs /d/ pronunciation
        4.2.3 Pronunciation of Velar plosive /k/ vs /g/
    4.3 ELM Model Framework
    4.4 Experimental setup
        4.4.1 Algorithm Implementation and Classification
    4.5 Comparative experiments
        4.5.1 Time-consuming performance
        4.5.2 Accuracies Comparison
    4.6 Conclusion
    4.7 Summary of the chapter
Chapter 5 MKELM based Multi-Classification Model for Foreign Accent Identification
    5.1 Introduction
    5.2 Model Design
    5.3 Weighted scheme for Multi-classification
        5.3.1 Weighted classification
        5.3.2 Accent decision
    5.4 Derivation of Multi-Kernel ELM
    5.5 Experimental Setup
        5.5.1 Software and hardware setup
        5.5.2 Experimental procedure and Results
        5.5.3 Comparative experiments
        5.5.4 Time-consuming performance comparison
        5.5.5 Comparison of accent classification results
    5.6 Conclusion
Chapter 6 Conclusions and Future work
    6.1 Conclusions
    6.2 Future work
Acknowledgement 2
Bibliography



Article ID: 2963248


Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/2963248.html

