Twitter Sentiment Analysis (Almost) from Scratch

Lebret, Rémi; Pinheiro, Pedro H. O.; Collobert, Ronan

2016

Abstract

A popular application of Natural Language Processing (NLP) is Sentiment Analysis (SA), i.e., the task of extracting contextual polarity from a given text. The social network Twitter provides an immense amount of user-generated text (called tweets), each limited to 140 characters. In this project, we plan to learn a tweet representation from publicly available tweet data in order to infer sentiment from tweets. One challenge in this task is that tweets are written by very different users, which makes the data very heterogeneous (unlike regular text written in proper English). Another challenge is the large scale of the problem. We propose a deep learning sentence representation (called a tweet representation), learned from user-generated data, to infer sentiment from tweets. This representation is learned from scratch (directly from the words in tweets) over a large unlabeled corpus of tweets. We demonstrate that this approach achieves state-of-the-art results for SA on tweets.

Details

Title Twitter Sentiment Analysis (Almost) from Scratch

Author(s) Lebret, Rémi ; Pinheiro, Pedro H. O. ; Collobert, Ronan

Date 2016

Publisher Idiap

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Work produced at EPFL
Technical Reports

Record creation date 2016-05-19
