TY - JOUR
T1 - Toward a framework for creating trustworthy measures with supervised machine learning for text
AU - Park, Ju Yeon
AU - Montgomery, Jacob M.
N1 - Publisher Copyright: © The Author(s), 2025. Published by Cambridge University Press on behalf of EPS Academic Ltd.
PY - 2025
Y1 - 2025
AB - Supervised learning is increasingly used in social science research to quantify abstract concepts in textual data. However, a review of recent studies reveals inconsistencies in reporting practices and validation standards. To address this issue, we propose a framework that systematically outlines the process of transforming text into a quantitative measure, emphasizing key reporting decisions at each stage. Clear and comprehensive validation is crucial, enabling readers to critically evaluate both the methodology and the resulting measure. To illustrate our framework, we develop and validate a measure assessing the tone of questions posed to nominees during U.S. Senate confirmation hearings. This study contributes to the growing literature advocating for transparency and rigor in applying machine learning methods within computational social sciences.
KW - confirmation hearings
KW - Senate
KW - supervised learning
KW - text analysis
KW - US Congress
UR - https://www.scopus.com/pages/publications/105017464792
U2 - 10.1017/psrm.2025.10042
DO - 10.1017/psrm.2025.10042
M3 - Article
AN - SCOPUS:105017464792
SN - 2049-8470
JO - Political Science Research and Methods
JF - Political Science Research and Methods
ER -