Mass Opinion - Extraction, Classification, and Measurement


Wednesdays@NICO Seminar, Noon, February 2 2007, Chambers Hall, Lower Level

Prof. Daniel Diermeier, Bei Yu, and Stefan Kaufmann, Northwestern University

Abstract 

In this talk we introduce opinion classification, a subarea of text classification, and our exploratory study of opinion classification in the political and business domains. Unlike topic classification, opinion classification aims to assign a tone (positive, negative, or neutral) to a piece of text. We define three dimensions for a taxonomy of opinion texts: 1) single or multiple subjects of comment; 2) subjective or objective writing style; and 3) consistent or conflicting ideologies in the author community. Pioneering opinion classification research used online customer reviews as its testbed; these are characterized by a single subject of comment, subjective writing, and consistent ideology among authors. Consequently, previous research focused on extracting subjective adjectives and on constructing document-level opinions with a simple additive model of individual opinion indicators. Recently, opinion classification has been extended to other domains, such as public comments in eRulemaking and business news. Our preliminary work on opinion classification of congressional speeches and business news articles demonstrates that the characteristics of opinion expression in these domains (e.g., objective writing and conflicting ideologies) strongly affect the choice of classification strategy. After comparing classifiers learned from movie reviews with classifiers learned from speeches in Senate and House debates, we found that "issue words," mostly nouns, are better indicators for classifying opinions in texts with conflicting ideologies and objective writing styles. This result also points to the difficulty of learning a general-purpose opinion classifier based on a bag-of-words representation. We need to explore more complex linguistic features in order to capture generic opinion expressions.
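
For readers unfamiliar with the "simple additive model of individual opinion indicators" mentioned in the abstract, the following is a minimal sketch of the general idea, not the speakers' implementation. The word list `LEXICON`, its weights, and the function name `additive_tone` are hypothetical and chosen only for illustration: each indicator word contributes +1 or -1, and the sign of the sum gives the document's tone.

```python
# Toy illustration of an additive opinion model.
# LEXICON and its weights are hypothetical, not taken from the talk.
LEXICON = {
    "great": 1, "excellent": 1, "enjoyable": 1,
    "boring": -1, "awful": -1, "poor": -1,
}


def additive_tone(text, lexicon=LEXICON):
    """Sum the weights of known indicator words and map the total to a tone."""
    # Crude tokenization: split on whitespace, strip punctuation, lowercase.
    tokens = [tok.strip(".,;:!?\"'()").lower() for tok in text.split()]
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(additive_tone("An excellent, enjoyable film!"))
    print(additive_tone("A great cast, but a boring and awful plot."))
    # Objective writing with no subjective adjectives yields no signal.
    print(additive_tone("The committee will vote on the bill tomorrow."))
```

A sentence written in an objective style, like the last one, contains no adjective indicators and therefore falls back to "neutral" under this sketch, which is consistent with the abstract's point that adjective-based strategies transfer poorly to such domains.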