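LI Changrong (李长荣), JI Xuemei (纪雪梅). Construction of Domain Sentiment Lexicon for Online Public Opinion Analysis in Public Emergencies [J]. Digital Library Forum (数字图书馆论坛), 2020, (9): 32-40 |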
Construction of Domain Sentiment Lexicon for Online Public Opinion Analysis in Public Emergencies (original Chinese title: 面向突发公共事件网络舆情分析的领域情感词典构建研究) |
Submitted: 2020-07-20 |
DOI:10.3772/j.issn.1673-2286.2020.09.005 |
Keywords: Public Emergencies; Sentiment Lexicon; Online Public Opinion; Word2Vec model |
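Funding: This research was supported by the National Social Science Fund of China Youth Project "Characteristics and Driving Factors of Social Media Users' Emotional Expression Behavior in Emergency Situations" (Grant No. 16CTQ027). |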
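Author | Affiliation |
Li Changrong | Institute of Scientific and Technical Information, Shandong University of Technology |
Ji Xuemei | Institute of Scientific and Technical Information, Shandong University of Technology |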
Abstract: |
To analyze public sentiment in online public opinion on public emergencies, this paper constructs a domain sentiment lexicon for online public opinion analysis with good accuracy and reliability. First, starting from existing general-purpose sentiment lexicons, sentiment words are identified and corrected against a large-scale online public opinion corpus, divided into 7 major categories and 21 subcategories, and annotated with polarity and intensity to form a seed lexicon. Second, the seed lexicon is expanded using the Word2Vec model and cosine similarity to obtain new sentiment words. Third, the new sentiment words are annotated with category, polarity, and intensity, yielding the final domain sentiment lexicon. Finally, microblog (Weibo) comments on the COVID-19 epidemic are used as the corpus for experimental validation. The resulting lexicon identifies sentiment words with a precision of 0.85, a recall of 0.90, and an F1 score of 0.87, and can be used effectively to identify the type and intensity of emotions in online public opinion on public emergencies. |
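The expansion step and the reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 3-dimensional vectors, the English example words, the `expand_lexicon` helper, and the 0.8 similarity threshold are all assumptions made for illustration. In practice the vectors would come from a Word2Vec model trained on the public opinion corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def expand_lexicon(seeds, vectors, threshold=0.8):
    """Return non-seed words whose best similarity to any seed meets the threshold.

    Maps each candidate to (closest seed, similarity). No category, polarity,
    or intensity is assigned here; the paper annotates new words in a later step.
    """
    candidates = {}
    for word, vec in vectors.items():
        if word in seeds:
            continue
        best_seed = max(seeds, key=lambda s: cosine(vec, vectors[s]))
        score = cosine(vec, vectors[best_seed])
        if score >= threshold:
            candidates[word] = (best_seed, score)
    return candidates

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy vectors; real Word2Vec embeddings are learned from a corpus.
vectors = {
    "fear": [0.9, 0.1, 0.0],
    "panic": [0.8, 0.2, 0.1],
    "joy": [0.0, 0.9, 0.2],
    "table": [0.1, 0.1, 0.9],
}
seeds = {"fear", "joy"}  # stand-in for the seed lexicon

new_words = expand_lexicon(seeds, vectors, threshold=0.8)
print(new_words)  # only "panic" passes, matched to the seed "fear"

# Consistency check on the reported figures: P = 0.85, R = 0.90.
print(round(f1_score(0.85, 0.90), 2))
```

With the reported precision and recall, the harmonic mean is 1.53 / 1.75 ≈ 0.874, which rounds to the stated F1 of 0.87.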