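Article Abstract
Huang Yi, Wei Ling, Zhang Kai. Patent Technology Topic Identification and Evolution Analysis Based on PaECTER-BERTopic and Large Model: A Case Study of Generative Artificial Intelligence[J]. Digital Library Forum, 2025, 21(2): 1-11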
Patent Technology Topic Identification and Evolution Analysis Based on PaECTER-BERTopic and Large Model: A Case Study of Generative Artificial Intelligence
Submitted: 2024-10-15
DOI:10.3772/j.issn.1673-2286.2025.02.001
Keywords: Patent Text; Technology Topic Identification; Technology Evolution Analysis; PaECTER-BERTopic; Large Model
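Funding: This research was supported by the Young Scientists Fund of the National Natural Science Foundation of China, project "Research on Methods for Identifying and Predicting Emerging Technology Evolution Paths Based on Multi-View Scientific and Technological Knowledge Graph Fusion" (No. 72304176).
Authors and affiliations
Huang Yi (黄怡), School of Information, Shanxi University of Finance and Economics
Wei Ling (隗玲), School of Information, Shanxi University of Finance and Economics
Zhang Kai (张凯), School of Information, Shanxi University of Finance and Economics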
Abstract:
      To address the poor vectorized representation of patent texts and the limited interpretability of patent technology topic identification results, this paper proposes a method for patent technology topic identification and evolution analysis based on the PaECTER patent pre-trained language model, BERTopic, and a large model. First, the PaECTER patent pre-trained language model is used to vectorize the patent texts. Second, patent technology topics are identified with the BERTopic model combined with KeyBERT, and the GPT-4o large model is used to analyze the identified topics systematically. Third, similarity between patent technology topics is computed based on PaECTER to generate patent technology evolution paths. Finally, the method is validated on the domain of generative artificial intelligence. The experimental results show that, compared with the traditional BERTopic model, the proposed method improves the interpretability, consistency, and diversity of patent technology topics and accurately identifies patent technology evolution paths. It also reveals the development status and evolution paths of technologies in generative artificial intelligence, providing a theoretical reference for related research.
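The similarity-based linking step described in the abstract can be illustrated with a minimal sketch. It assumes each topic in each time window is already represented by one embedding vector (in the paper these would come from PaECTER; here they are made-up toy vectors), and that two topics in adjacent windows are linked when their cosine similarity reaches a threshold. The function names `cosine` and `link_topics`, the threshold value 0.8, and the topic labels are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_topics(periods, threshold=0.8):
    """Link topics in adjacent time windows whose similarity meets the threshold.

    periods: list of {topic_label: vector} dicts in chronological order.
    threshold: assumed cut-off, chosen here only for illustration.
    Returns a list of (window_index, source_topic, target_topic, similarity).
    """
    links = []
    for t in range(len(periods) - 1):
        for src, u in periods[t].items():
            for dst, v in periods[t + 1].items():
                sim = cosine(u, v)
                if sim >= threshold:
                    links.append((t, src, dst, round(sim, 3)))
    return links


# Toy data: two time windows with placeholder topic labels and vectors.
periods = [
    {"A1": [1.0, 0.0, 0.0], "A2": [0.0, 1.0, 0.0]},
    {"B1": [0.9, 0.1, 0.0], "B2": [0.0, 0.9, 0.1], "B3": [0.0, 0.0, 1.0]},
]
links = link_topics(periods)
for window, src, dst, sim in links:
    print(f"window {window}: {src} -> {dst} (similarity {sim})")
```

In this toy input, A1 links to B1 and A2 links to B2, while B3 has no predecessor and would appear as a newly emerging topic. Chaining such links across more windows gives one simple way to trace an evolution path; the paper's actual procedure may differ.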