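Abstract
With the widespread adoption of the Internet, social media has developed rapidly, allowing people to communicate with one another in many ways, anytime and anywhere. It has also made online text both far more abundant and far more complex. We therefore designed a simple emotional corpus annotation and prediction system. A large number of varied emotional corpus samples are processed and analyzed manually, and the tendency values of emotional words are stored in a database that serves as the basis for predicting emotional tendency values. The basic idea is: (1) create an emotional annotation sample library for manual tagging of emotional tendency; (2) split each sample into words with a word segmentation component, and store each word together with its polarity value in the corresponding emotional vocabulary database; (3) compare the input text under test with the relevant database and compute a suitable predicted value of emotional tendency. This thesis concentrates on the manual annotation and analysis of corpus samples, in order to build an emotional vocabulary database large enough to be compared against the text under test.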
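Keywords annotation and prediction system, emotional tendency value prediction, emotional vocabulary database
Foreign-Language Abstract of the Graduation Project Report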
Title An Annotation and Prediction System of Emotional Classification Corpus
Abstract
With the widespread adoption of the Internet, social media has developed rapidly, allowing people to communicate with one another in many ways, at any time and in any place. It has also made online text both far more abundant and far more complex. We therefore designed a simple annotation and prediction system. Through manual processing and analysis of a large number of varied emotional corpus samples, we store the tendency values of emotional words in a database, which serves as the basis for predicting emotional tendency. The basic idea is as follows: (1) create an emotional annotation sample library for manual tagging of emotional tendency; (2) split each sample into words with the word segmentation module, and store each word together with its polarity value in the corresponding emotional vocabulary database; (3) compare the input text under test with the relevant database and compute a suitable predicted value of emotional tendency. This thesis concentrates on manually labeling and analyzing corpus samples, in order to build an emotional vocabulary database large enough to be compared against the text under test.
Keywords annotation and prediction system, predicted value of emotional tendency, emotional vocabulary database
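The three steps outlined in the abstract can be illustrated with a minimal sketch. It is not the system's actual implementation: the whitespace `tokenize` function stands in for the word segmentation component, the `lexicon` table and its columns are hypothetical names, and averaging labels per word and per text is only one simple way to derive and combine polarity values.

```python
import sqlite3
from collections import defaultdict


def tokenize(text):
    # Placeholder segmenter (assumption): splits on whitespace.
    # A Chinese system would call a word segmentation component here.
    return text.lower().split()


def build_lexicon(conn, labeled_samples):
    """Steps 1-2: store each word's average polarity from labeled samples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, label in labeled_samples:
        for word in tokenize(text):
            totals[word] += label
            counts[word] += 1
    # Hypothetical schema: one row per word with its averaged polarity.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lexicon (word TEXT PRIMARY KEY, polarity REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO lexicon (word, polarity) VALUES (?, ?)",
        [(word, totals[word] / counts[word]) for word in totals],
    )
    conn.commit()


def predict(conn, text):
    """Step 3: average the stored polarity of known words (0.0 if none)."""
    scores = []
    for word in tokenize(text):
        row = conn.execute(
            "SELECT polarity FROM lexicon WHERE word = ?", (word,)
        ).fetchone()
        if row is not None:
            scores.append(row[0])
    return sum(scores) / len(scores) if scores else 0.0


# Manually labeled samples: +1 for positive, -1 for negative.
conn = sqlite3.connect(":memory:")
samples = [("great phone love it", 1), ("terrible battery hate it", -1)]
build_lexicon(conn, samples)
score = predict(conn, "love the great battery")
```

In this sketch a word that appears in both positive and negative samples, such as "it", averages out to a neutral polarity, and words missing from the database are simply ignored during prediction.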
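Contents
1 Introduction
1.1 Research background of emotional classification
1.2 Research significance of emotional classification
1.3 Importance of test (training) datasets
1.4 Current state of research on emotional classification of Chinese text
2 Related concepts and technical analysis of the system
2.1 Emotional tendency classification
2.2 Chinese text word segmentation
2.3 Emotional tendency classification prediction
2.4 Technical overview
3 System design analysis
3.1 System design requirements
3.2 Requirements analysis
3.3 Process analysis
4 Detailed system design
4.1 Database overview
4.3 Interface implementation of the corpus annotation system
4.4 Implementation and significance of the prediction algorithm
5 Conclusion
6 Acknowledgements
7 References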
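1 Introduction
In recent years, as Internet services such as e-commerce sites and social networking sites have rapidly become part of everyday life, the number of Internet users has grown explosively. At the same time, the volume of online reviews and analyses of products, trending topics, and similar subjects has been increasing steadily. Understanding the emotional tendency of what users post online therefore helps businesses decide how to improve their products, and helps governments monitor public opinion and formulate timely strategies for guiding it.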