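Wang Lijun, Liu Yiying, Zheng Ming, Li Xue, Wang Xinyue. Triple Extraction Model of Scientific and Technical Literature Based on Machine Reading Comprehension[J]. Digital Library Forum, 2025, 21(4): 21-32 |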
Triple Extraction Model of Scientific and Technical Literature Based on Machine Reading Comprehension |
Submitted: 2025-02-24 |
DOI:10.3772/j.issn.1673-2286.2025.04.003 |
Keywords: Scientific and Technical Literature; Open Information; Fact Triple; Key Triple; Machine Reading Comprehension |
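Funding: This research was supported by the key work project of the Institute of Scientific and Technical Information of China, "Research and Application of an Intelligent Information Technology Engine for Strategic Decision-Making" (No. ZD2025-08). |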
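Author | Affiliation |
Wang Lijun | Institute of Scientific and Technical Information of China; Key Laboratory of Rich-media Knowledge Organization and Service of Digital Publishing Content |
Liu Yiying | Institute of Scientific and Technical Information of China; Key Laboratory of Rich-media Knowledge Organization and Service of Digital Publishing Content |
Zheng Ming | Institute of Scientific and Technical Information of China; Key Laboratory of Rich-media Knowledge Organization and Service of Digital Publishing Content |
Li Xue | School of Computer and Communication Engineering, University of Science and Technology Beijing |
Wang Xinyue | School of Computer and Communication Engineering, University of Science and Technology Beijing |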
Abstract: |
Scientific and technical literature is an important resource for advancing scientific research and technological progress. As the volume of literature grows rapidly, however, researchers face the challenge of quickly obtaining key information from it. This paper proposes MMOIE (Multi-Answer Machine-Reading-Comprehension Open Information Extraction), an open information extraction model based on machine reading comprehension, for efficiently extracting triples from scientific and technical literature. The model combines the SIFRank+ model with the ELMo pre-trained language model to compute an importance weight for each keyword, and then selects the fact triples that contain at least one keyword. Experimental results show that, compared with existing methods such as ZORE, SpanOIE, MGD-GNN, and TPOIE, MMOIE achieves a recall of 64.78% and an F1 score of 55.62% on key triple extraction. According to the authors, the model improves the efficiency and quality of key information extraction, captures the entity relationships in the literature effectively, and helps researchers obtain key information quickly. |
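The selection step described in the abstract (keep the fact triples that contain at least one keyword) might look roughly like the sketch below. This is not the authors' implementation: the `Triple` type, the `select_key_triples` function, the `min_weight` threshold, and the case-insensitive substring match are all assumptions made for illustration, and the keyword weights are taken as given rather than computed with SIFRank+ and ELMo.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """A fact triple (hypothetical container, not from the paper)."""
    subject: str
    relation: str
    obj: str


def select_key_triples(
    triples: List[Triple],
    keyword_weights: Dict[str, float],
    min_weight: float = 0.0,
) -> List[Triple]:
    """Keep triples in which any of the three fields mentions at least one
    keyword whose weight is strictly above min_weight.

    Matching is a case-insensitive substring test, which is an assumption;
    the paper may match keywords differently.
    """
    keywords = [k.lower() for k, w in keyword_weights.items() if w > min_weight]
    selected = []
    for t in triples:
        text = f"{t.subject} {t.relation} {t.obj}".lower()
        if any(k in text for k in keywords):
            selected.append(t)
    return selected


# Illustrative data only; these triples and weights are made up.
example_triples = [
    Triple("MMOIE", "extracts", "fact triples"),
    Triple("the authors", "thank", "the reviewers"),
]
example_weights = {"fact triples": 0.8, "reviewers": 0.1}
key_triples = select_key_triples(example_triples, example_weights, min_weight=0.5)
```

With the threshold at 0.5, only the first example triple is kept, because "reviewers" falls below the threshold. Lowering `min_weight` to 0.0 keeps both.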