How Words Move Hearts: Interpretable Machine Learning Models of Bias, Engagement, and Influence in Socio-Political Systems
We study socio-political systems in representative democracies. Motivated by problems that affect the proper functioning of these systems, we build computational methods to answer research questions about the phenomena that occur in them. For each phenomenon, we curate novel datasets and propose interpretable models that build upon prior work on distributional representations of text, topic models, and discrete choice models. These models provide insights and enable the construction of tools that could help solve some of the problems affecting these systems.
First, we look at the problem of subjective bias in documents on the Web and in the media. We curate a novel dataset based on Wikipedia's revision history that contains pairs of versions of the same Wikipedia article, where the subjective bias in one version has been corrected to produce the other version. We train a Bradley-Terry model that uses text features to perform a pairwise comparison of bias between these versions. We show that the parameters of the model can be interpreted to discover the words most indicative of bias. Our model also learns to compute a real-valued bias score for documents. We show that this score can be used as a measure of bias across topics and domains not seen in training, including the media, political speeches, law amendments, and tweets.
Second, we infer effective strategies for improving user engagement in social media campaigns, taking the example of tweets about climate change. We build an interpretable model that ranks tweets by predicted engagement, using their topic and metadata features. The ranking framework enables us to avoid the influence of confounding factors such as author popularity. Based on the learned model parameters, we make several recommendations for optimizing engagement, such as talking about mitigation and adaptation strategies instead of projections and effects.
Third, we study the influence of interest groups (lobbies) on parliaments, taking the European Parliament (EP) as an example. We curate novel datasets of the position papers of lobbies and the speeches of members of the EP (MEPs), and we match them to discover interpretable links between lobbies and MEPs. In the absence of ground-truth data, we validate the discovered links indirectly by comparing them with a dataset of retweet links between lobbies and MEPs, which we also curate, and with the publicly disclosed meetings of MEPs. An aggregate analysis of the discovered links reveals patterns that follow ideology (e.g., the center-left political group is more associated with social causes).
Finally, we study the law-making process within the EP. We mine a rich dataset of edits to law proposals and develop models that predict their acceptance by parliamentary committees. Our models use textual and metadata features of the edit, as well as latent features that capture the interaction between parliamentarians and laws. We show that the model accurately predicts the acceptance of the edits. Furthermore, we interpret the parameters of the learned model to derive insights into the legislative process. For example, we show that edits that narrow the scope of the law and insert recommendations for actions are more likely to be accepted than those that insert obligations.
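To make the pairwise bias comparison described in the abstract more concrete, the following is a minimal sketch of a Bradley-Terry model with text features. It is not the implementation used in the thesis. The bag-of-words features, the function names (`featurize`, `score`, `train`), the hyperparameters, and the toy sentence pairs are all assumptions chosen for illustration. The sketch assumes that the probability that one version is more biased than the other is the logistic function of the difference between their scores, and that each score is a weighted sum of word counts.

```python
import math
from collections import Counter


def featurize(text):
    """Bag-of-words counts (an assumed stand-in for the thesis's text features)."""
    return Counter(text.lower().split())


def score(weights, text):
    """Real-valued bias score: weighted sum of the word counts of the text."""
    return sum(weights.get(w, 0.0) * c for w, c in featurize(text).items())


def train(pairs, epochs=200, lr=0.1, l2=0.01):
    """Fit word weights by stochastic gradient ascent on the pairwise log-likelihood.

    Each pair is (biased_text, neutral_text). The assumed model is
    P(biased_text is the more biased one) = sigmoid(score(biased) - score(neutral)).
    """
    weights = {}
    for _ in range(epochs):
        for biased, neutral in pairs:
            # Feature difference between the two versions (counts may be negative).
            diff = featurize(biased)
            diff.subtract(featurize(neutral))
            margin = sum(weights.get(w, 0.0) * c for w, c in diff.items())
            p = 1.0 / (1.0 + math.exp(-margin))
            for w, c in diff.items():
                if c == 0:
                    continue  # words shared by both versions carry no signal
                grad = (1.0 - p) * c - l2 * weights.get(w, 0.0)
                weights[w] = weights.get(w, 0.0) + lr * grad
    return weights


# Hypothetical toy pairs: (version before the bias was corrected, corrected version).
pairs = [
    ("the senator gave a brilliant speech", "the senator gave a speech"),
    ("the notorious policy was passed", "the policy was passed"),
    ("a brilliant and legendary scientist", "a scientist"),
]
weights = train(pairs)

# Interpreting the parameters: words with the largest weights are the
# ones the model treats as most indicative of bias.
top_words = sorted(weights, key=weights.get, reverse=True)[:3]
print(top_words)
print(score(weights, "a brilliant plan"), score(weights, "a plan"))
```

In this sketch, words that appear in both versions of a pair cancel out in the feature difference, so only the words that were added or removed during the correction receive a weight. The learned weights can then be used to score a single unseen document, which mirrors the real-valued bias score mentioned in the abstract.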