%0 Journal Article
%T A Method of Generating Chinese News Text Summaries Based on the T5-PEGASUS-PGN Model
%A 曹一平
%A 张胜男
%J Computer Science and Application
%P 10-19
%@ 2161-881X
%D 2024
%I Hans Publishing
%R 10.12677/CSA.2024.143053
%X To address the mismatch between the training tasks of pretrained models and the downstream summary generation task, as well as the poor readability caused by repeated content in generated text, an automatic summarization model called T5-PEGASUS-PGN is proposed, based on T5-PEGASUS and the pointer-generator network. The model first uses T5-PEGASUS to obtain the word vector representations that best match the semantics of the source text. A pointer-generator network with a coverage mechanism then generates high-quality, readable final summaries. Experimental results on the public long-text dataset NLPCC2017 show that, compared with models such as PGN and BERT-PGN, the T5-PEGASUS-PGN model, which incorporates a pretrained model better suited to the downstream summarization task, generates summaries that are more consistent with the semantics of the source text and richer in content, and effectively suppresses the generation of repeated content. The Rouge-1, Rouge-2, and Rouge-L scores rise to 44.26%, 23.97%, and 34.81%, respectively.
%K Abstractive Summarization Model
%K Pre-Trained Language Model
%K Pointer Generator Network (PGN)
%K Coverage Mechanism
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=82451