Sentence-embedding and Similarity via Hybrid Bidirectional-L
Published: 2022-01-19 14:08
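Over the past decade, text understanding and information retrieval, together with the analysis of sentence similarity in natural language processing, have attracted great attention from researchers. Traditional approaches to building similarity systems depend entirely on handcrafted features. More recently, neural networks have received considerable attention in sentence similarity measurement because of their success in handling semantic composition. However, existing neural network methods are not effective enough at capturing the most important semantic information hidden in a sentence. In addition, the growing use of deep neural networks has shifted interest from the word level to coarser-grained units of text, such as sentence embeddings. To address this problem, this thesis proposes a new weighted-pooling attention layer that retains the most salient attention vectors and ordered patterns while ignoring irrelevant words. Long short-term memory (LSTM) networks and convolutional neural networks (CNNs) are known to be highly capable of accumulating the rich patterns that make up the semantic representation of a whole sentence, and combining the two improves the model's ability to extract comprehensive contextual information. First, sentence representations are generated by a model structure based on a bidirectional LSTM and a CNN. Next, the weighted-pooling attention layer is applied to obtain attention vectors. Finally, the information in the attention vectors is used to compute the sentence similarity score. The study shows that the proposed method outperforms state-of-the-art methods on the datasets of two tasks, namely semantic relatedness and Microsoft Research paraphrase identification. Experiments were carried out with different parameter values for the LSTM cell units, including the dropout probability, and with other existing ...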
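The last two steps of the abstract (pooling per-token features into one attention vector, then scoring two sentences) can be illustrated with a minimal sketch. This listing does not give the thesis's actual weighted-pooling attention formulation, so the code below assumes a generic softmax attention pooling followed by cosine similarity. The names `attention_pool` and `query`, and the toy token vectors standing in for BiLSTM/CNN outputs, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vectors, query):
    """Collapse per-token vectors into a single sentence vector.

    token_vectors: list of equal-length float lists; here they stand in
        for the BiLSTM/CNN features of each token (assumption).
    query: hypothetical scoring vector; in a trained model this would be
        a learned parameter rather than a hand-picked constant.
    Returns the pooled vector and the attention weights.
    """
    # Score each token by its dot product with the query vector.
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in token_vectors]
    weights = softmax(scores)
    dim = len(token_vectors[0])
    # Weighted sum of token vectors: low-weight tokens contribute little.
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return pooled, weights


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy inputs: two "sentences" with 2-dimensional token features.
query = [1.0, 0.0]
sentence_a = [[2.0, 0.1], [0.0, 1.0], [1.5, 0.2]]
sentence_b = [[1.8, 0.0], [0.1, 0.9]]

vec_a, weights_a = attention_pool(sentence_a, query)
vec_b, weights_b = attention_pool(sentence_b, query)
score = cosine_similarity(vec_a, vec_b)
```

In this sketch the second token of `sentence_a` scores lowest against `query` and therefore receives the smallest weight, which is one simple way of "ignoring irrelevant words". A real system would learn the scoring parameters jointly with the encoder.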
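【Source】: Dalian University of Technology, Liaoning Province; Project 211 university; Project 985 university; directly under the Ministry of Education
【Pages】: 60
【Degree level】: Master's
【Table of contents】:
Abstract (Chinese)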
Abstract
1 Introduction
1.1 Research Background
1.2 Research Motivation and Objective
1.2.1 Research Motivation
1.2.2 Research Objective
1.3 Domestic and Overseas Progress
1.3.1 Domestic Progress
1.3.2 Overseas Progress
1.4 Problem Statement
1.5 Main Content and Research Methods
1.5.1 Main Content
1.5.2 Research Methods
1.6 The Structure of Thesis
2 Theoretical and Model Analysis
2.1 Semantic Similarity
2.1.1 Vector Space Model
2.1.2 Corpus and Knowledge-based Methods
2.2 Machine Learning
2.2.1 Support Vector Machine
2.2.2 Artificial Neural Network (ANN)
2.2.3 Deep Learning
2.3 Word2Vec
2.3.1 Continuous bag-of-words
2.3.2 Skip-gram Model
2.3.3 Word Mover’s Distance Model
2.4 Related Work
3 Proposed Framework for Sentence Similarity
3.1 Proposed Model
3.1.1 Input Layer
3.1.2 Embedding Layer
3.1.3 Bidirectional LSTM
3.1.4 Convolutional neural network
3.1.5 Weighted-pooling attention
3.2 Proposed Algorithm
4 Experiments and Discussion
4.1 Experimental Setup
4.1.1 Datasets
4.1.2 Pre-Trained-Embedding
4.1.3 Comparison systems
4.1.4 Experimental Parameters
4.2 Results and Discussion
5 Conclusion and Future Direction
5.1 Main Contributions
5.2 Conclusion
5.3 Future Direction
References
Research Projects and Publications in Master Study
Acknowledgement
Article ID: 3596992
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/3596992.html