
Multi-label prediction for political text-as-data

Erlich, Aaron, Dantas, Stefano G., Bagozzi, Benjamin E., Berliner, Daniel ORCID: 0000-0002-0285-0215 and Palmer-Rubin, Brian (2022) Multi-label prediction for political text-as-data. Political Analysis, 30 (4). 463 - 480. ISSN 1047-1987

Text (Multitarget_Prediction_final_beforetypesetting) - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Identification Number: 10.1017/pan.2021.15


Political scientists increasingly use supervised machine learning to code multiple relevant labels from a single set of texts. The current "best practice" of individually applying supervised machine learning to each label ignores information on inter-label association(s), and is likely to under-perform as a result. We introduce multi-label prediction as a solution to this problem. After reviewing the multi-label prediction framework, we apply it to code multiple features of (i) access to information requests made to the Mexican government and (ii) country-year human rights reports. We find that multi-label prediction outperforms standard supervised learning approaches, even in instances where the correlations among one's multiple labels are low.
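
The contrast drawn in the abstract, between coding each label on its own and predicting labels jointly, can be illustrated with a toy sketch. This is not the authors' method or data. The label pairs below are hypothetical, no text features are used, and "label powerset" (treating each observed label combination as one class) stands in here as just one simple way of using inter-label association.

```python
from collections import Counter

# Hypothetical toy data: each tuple is the (label_a, label_b) coding of one text.
# Both labels are common on their own, but they rarely occur together.
labels = [(1, 0)] * 4 + [(0, 1)] * 3 + [(1, 1)] * 2


def binary_relevance_prediction(rows):
    """Predict each label separately, using only that label's majority class."""
    n_labels = len(rows[0])
    prediction = []
    for j in range(n_labels):
        positives = sum(row[j] for row in rows)
        prediction.append(1 if positives * 2 > len(rows) else 0)
    return tuple(prediction)


def label_powerset_prediction(rows):
    """Predict the most frequent joint combination of labels."""
    return Counter(rows).most_common(1)[0][0]


def exact_match_rate(prediction, rows):
    """Share of rows whose full label set equals the prediction."""
    return sum(row == prediction for row in rows) / len(rows)


br = binary_relevance_prediction(labels)
lp = label_powerset_prediction(labels)
print("per-label prediction:", br, exact_match_rate(br, labels))
print("joint prediction:    ", lp, exact_match_rate(lp, labels))
```

In this sketch the per-label approach predicts that both labels are present, because each is individually in the majority, even though that combination is the rarest one in the data. The joint approach avoids this by predicting a combination that actually occurs often.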

Item Type: Article
Additional Information: © 2021 The Authors
Divisions: Government
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
J Political Science > JA Political science (General)
Date Deposited: 02 Jul 2021 10:09
Last Modified: 12 Jul 2024 17:06
