Infoscience

Working paper

It was easy, when apples and blackberries were only fruits

Ambiguities in company names are omnipresent. This is not accidental: companies deliberately choose ambiguous brand names as part of their marketing and branding strategy. This practice creates new challenges when searching for information about a company on the Web. This paper addresses the task of classifying Twitter messages according to whether they are related to a given company: for example, given a set of Twitter messages containing the keyword "apple", we decide for each message whether it refers to the company Apple Inc. Our technique is essentially an SVM classifier that uses a simple representation of relevant and irrelevant information in the form of keywords, grouped into specific profiles. We developed a simple technique to construct such classifiers for previously unseen companies, for which no training set is available, by training the meta-features of the classifier with the help of a general test set. Our techniques achieve high accuracy on the WePS-3 dataset.
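To make the idea of keyword profiles concrete, the following is a minimal sketch of how a tweet might be turned into profile-overlap features. It is not the paper's implementation: the profile contents, the function names, and the weights are all hypothetical, and a hand-set linear rule stands in for the trained SVM, which the abstract does not specify in detail.

```python
import re

# Hypothetical keyword profiles for the company "Apple Inc."; the actual
# profiles used in the paper are not given in the abstract.
POSITIVE_PROFILE = {"iphone", "ipad", "mac", "macbook", "ios", "itunes"}
NEGATIVE_PROFILE = {"pie", "juice", "tree", "orchard", "cider", "fruit"}


def tokenize(tweet):
    """Lowercase the tweet and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", tweet.lower())


def profile_features(tweet):
    """Return (positive_hits, negative_hits): distinct tokens found in each profile."""
    tokens = set(tokenize(tweet))
    return len(tokens & POSITIVE_PROFILE), len(tokens & NEGATIVE_PROFILE)


def is_related(tweet, w_pos=1.0, w_neg=-1.0, bias=0.0):
    """Linear decision rule standing in for a trained SVM (weights are illustrative)."""
    pos, neg = profile_features(tweet)
    return w_pos * pos + w_neg * neg + bias > 0


print(is_related("Just bought the new iPhone at the Apple store"))
print(is_related("Grandma's apple pie with fresh apple juice"))
```

In a trained system, the weights would be learned from labeled tweets rather than set by hand, and the feature vector would likely carry more than two overlap counts.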
