Semi-Supervised Method for Multi-Category Emotion Recognition in Tweets
Each tweet is limited to 140 characters. Surprisingly, this constraint makes Twitter a more spontaneous platform for expressing emotions. Detecting emotions and classifying them correctly and automatically is an increasingly important task if we want to understand how large groups of people feel about an event or relevant topic. However, constructing supervised classifiers can be daunting because of high manual annotation costs. We propose constructing emotion classifiers from a minimal amount of initial knowledge (e.g., a general-purpose emotion lexicon) and using a semi-supervised learning method to extend it to correctly detect more emotional tweets within a specific domain. Additionally, we show that our algorithm, Balanced Weighted Voting (BWV), is able to overcome the imbalanced distribution of emotions in the initial labeled data. Our validation experiments show that BWV improves the performance of three initial classifiers, at least in the specific domain of sports. Furthermore, a comparison with two other learning strategies reveals its superiority in terms of macro F1-score, as well as more stable performance across emotion categories.
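The macro F1-score used in the comparison above averages the per-category F1 scores without weighting them by category size, which is why it is sensitive to poor performance on rare emotions. The following is a minimal sketch of that standard metric only; it is not the authors' evaluation code, and the function name `macro_f1` and the emotion labels in the usage example are hypothetical.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Return the unweighted mean of per-category F1 scores."""
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for category g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true category but was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: a classifier that always predicts the
# majority emotion is right on 8 of 10 tweets, yet its macro F1 is low
# because the minority category contributes an F1 of 0.
gold = ["joy"] * 8 + ["anger"] * 2
predicted = ["joy"] * 10
print(round(macro_f1(gold, predicted), 3))
```

In this toy case the majority-only classifier has 80% accuracy, while the minority category pulls the macro average down, which illustrates why the metric suits imbalanced emotion distributions.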
Record created on 2015-08-24, modified on 2016-08-09