

Journal of Hunan University (Natural Sciences), Vol. 50, No. 2, Feb. 2023

Text Deduplication Algorithm Based on Event Heterogeneous Graph Representation

AI Wei, XU Jia, XIE Canhao, MENG Tao†
(School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha 410018, China)

Abstract: Graph-based text representation methods perform well in news text deduplication. However, existing methods of this kind cannot represent all of the information in a text, and they ignore the semantic information of the graph, which reduces deduplication quality for news text. To address this, this study proposes a text deduplication algorithm based on event heterogeneous graph representation. The algorithm first represents the global semantic and structural information of a news text as an event heterogeneous graph, and then proposes a dual-label graph kernel algorithm to characterize the event heterogeneous graph, capturing both the structure and the semantics of the graph in depth. Experimental results show that the proposed algorithm improves the F1-score by 10% compared with existing deduplication methods based on graph-structured text representation. Overall, the algorithm improves the deduplication of news text.

Keywords: graph kernel algorithm; event heterogeneous graph; news text; text representation; text deduplication

CLC number: T391.1    Document code: A

...
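The abstract does not say how the dual-label graph kernel is built, so the following is only a minimal sketch of the general idea of comparing two labelled graphs with a graph kernel. It is not the authors' algorithm. As a stand-in it uses Weisfeiler-Lehman-style neighbourhood relabelling, and it assumes that each node carries two labels, a node type and a semantic label. The function names, the 0.8 threshold, and the toy graphs are all hypothetical.

```python
import math
from collections import Counter


def wl_features(nodes, edges, iterations=2):
    """Count Weisfeiler-Lehman-style labels of an undirected graph.

    nodes: dict mapping node id -> (type_label, semantic_label)  [assumed format]
    edges: list of (u, v) pairs
    """
    neighbors = {n: [] for n in nodes}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # Initial label combines both labels of a node (the "dual label" assumption).
    labels = {n: f"{t}|{s}" for n, (t, s) in nodes.items()}
    features = Counter(labels.values())

    for _ in range(iterations):
        new_labels = {}
        for n in nodes:
            # New label = own label plus the sorted labels of the neighbours.
            neigh = sorted(labels[m] for m in neighbors[n])
            new_labels[n] = labels[n] + "(" + ",".join(neigh) + ")"
        labels = new_labels
        features.update(labels.values())
    return features


def kernel_similarity(f1, f2):
    """Normalised kernel value (cosine of the two label-count vectors)."""
    dot = sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())
    sq1 = sum(v * v for v in f1.values())
    sq2 = sum(v * v for v in f2.values())
    return dot / math.sqrt(sq1 * sq2) if sq1 and sq2 else 0.0


def is_duplicate(g1, g2, threshold=0.8):
    """Flag two graphs as duplicates; the threshold is an arbitrary example value."""
    return kernel_similarity(wl_features(*g1), wl_features(*g2)) >= threshold


# Toy graphs: one event node linked to its entity nodes (hypothetical data).
graph_a = (
    {"e1": ("event", "acquire"), "n1": ("entity", "CompanyA"), "n2": ("entity", "CompanyB")},
    [("e1", "n1"), ("e1", "n2")],
)
graph_b = (
    {"x": ("event", "acquire"), "y": ("entity", "CompanyB"), "z": ("entity", "CompanyA")},
    [("x", "y"), ("x", "z")],
)
graph_c = (
    {"e1": ("event", "earthquake"), "n1": ("entity", "CityC")},
    [("e1", "n1")],
)

print(is_duplicate(graph_a, graph_b))
print(is_duplicate(graph_a, graph_c))
```

In this toy setup, graph_a and graph_b describe the same event with different node ids and are flagged as duplicates, while graph_c shares no labels with graph_a and is not. A real system would first have to extract events and entities from the news text to build the graphs, a step this sketch leaves out.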
