Improved Attentional Seq2seq with Policy Gradient for Text Summarization
Published: 2023-04-10 18:59
With the explosive growth of digital information, text summarization technology has permeated every corner of our lives. When we open news apps such as Toutiao or Tencent News, we often see headlines like "BAT leads, with a market value of 800 billion...", "This year, China will complete 8 major tasks", and "Get off the plane! The plane is about to explode! The plane has made an emergency landing in Russia". At a glance, such headlines let us grasp the content of a news story without opening it and reading it word by word. Automatic text summarization also has many application scenarios, such as news headline generation, scientific document summarization, search result snippet generation, and product review summarization. In an era of exploding online information, conveying the main content of a document in a few short sentences would undoubtedly help alleviate information overload. Mainstream text summarization techniques fall into three categories: compressive, extractive, and abstractive. Compressive summarization builds a summary by extracting and simplifying important sentences from the original text; extractive summarization directly selects existing sentences from the original text; abstractive summarization rewrites or reorganizes the original content into the final summary. Compressive and extractive methods are therefore closely related, while abstractive summarization better matches human reasoning. Traditional approaches have focused mainly on extractive summarization, such as TextRank and PageRank, which can use BM25 or TF-IDF to compute term frequencies or semantic similarity when selecting summary sentences. All...
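As a rough illustration of the traditional extractive approach described above, the sketch below scores each sentence by the sum of its TF-IDF term weights and keeps the highest-scoring ones in their original order. This is a minimal sketch assuming scikit-learn is available; the function name and the toy document are illustrative, not taken from the thesis.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def extractive_summary(sentences, top_k=2):
    # Score each sentence by the sum of its TF-IDF term weights,
    # then return the top_k highest-scoring sentences in original order.
    tfidf = TfidfVectorizer().fit_transform(sentences)   # (n_sentences, vocab_size)
    scores = np.asarray(tfidf.sum(axis=1)).ravel()       # one score per sentence
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_k])]

doc = [
    "The plane made an emergency landing in Russia after an alarm.",
    "Passengers were asked to leave the aircraft immediately.",
    "Officials later said the alarm was caused by a faulty sensor.",
]
print(extractive_summary(doc, top_k=2))
```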
Pages: 60
Degree level: Master's
Table of Contents:
Acknowledgements
Abstract
Chapter 1 Introduction
1.1 Definition of Text Summarization
1.2 Why We Need Text Summarization
1.3 Mainstream Methods for Text Summarization
1.3.1 Extractive Summarization
1.3.2 Compressive Summarization
1.3.3 Abstractive Summarization
1.4 Related Work
1.5 Challenges
Chapter 2 Basic Techniques
2.1 TextRank
2.2 Sequence to Sequence
2.3 The Evaluation Methods
Chapter 3 Improved Attentional Seq2seq with Policy Gradient for Text Summarization
3.1 Word Embedding
3.2 Add Dropout in LSTM Cells of Encoder
3.3 Use Soft Attention Mechanism in Decoder
3.4 Mini-Batch Gradient Descent
3.5 Use Beam Search Algorithm to Generate the Summary
3.6 Add the Policy Gradient in Attentional Seq2seq Model
3.7 Use Scheduled Sampling in Decoder
Chapter 4 Experiments
4.1 Experimental Environment
4.2 Experimental Design
4.3 Dataset
4.3.1 Data Visualization
4.3.2 Data Preprocessing
4.4 Analysis of the Mini-batch Gradient Descent Method
4.5 Parameter Setting
4.6 Results
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References
Appendix A
Abstract (in Chinese)
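Sections 3.3 and 3.6 in the outline above name the two key additions: a soft attention mechanism in the decoder and a policy-gradient term on top of the attentional seq2seq model. The NumPy sketch below shows one soft-attention decoding step; it uses a dot-product alignment score for brevity (the thesis may use an additive score), and every name and shape here is an assumption, not the thesis's implementation.

```python
import numpy as np

def soft_attention_step(decoder_state, encoder_states):
    """One soft-attention step: weight every encoder hidden state by its
    relevance to the current decoder state and return the context vector.
    Assumed shapes: decoder_state (d,), encoder_states (T, d)."""
    scores = encoder_states @ decoder_state   # (T,) dot-product alignment scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over the T source positions
    context = weights @ encoder_states        # (d,) attention context vector
    return context, weights

# Toy usage with random states.
rng = np.random.default_rng(0)
h_dec = rng.standard_normal(8)        # current decoder hidden state
H_enc = rng.standard_normal((5, 8))   # five encoder hidden states
context, weights = soft_attention_step(h_dec, H_enc)
print(weights.sum())                  # 1.0: the weights form a distribution
```

In the policy-gradient setting of Section 3.6, a sampled summary would typically be scored with a sequence-level reward such as ROUGE, which then weights the log-probabilities of the sampled tokens in the training loss.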
Document ID: 3788653
Link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/3788653.html