Toward a framework for creating trustworthy measures with supervised machine learning for text

  • Ju Yeon Park
  • Jacob M. Montgomery

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Supervised learning is increasingly used in social science research to quantify abstract concepts in textual data. However, a review of recent studies reveals inconsistencies in reporting practices and validation standards. To address this issue, we propose a framework that systematically outlines the process of transforming text into a quantitative measure, emphasizing key reporting decisions at each stage. Clear and comprehensive validation is crucial, enabling readers to critically evaluate both the methodology and the resulting measure. To illustrate our framework, we develop and validate a measure assessing the tone of questions posed to nominees during U.S. Senate confirmation hearings. This study contributes to the growing literature advocating for transparency and rigor in applying machine learning methods within computational social sciences.
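    To make the workflow the abstract describes concrete, here is a minimal sketch of a supervised text-measurement pipeline, using scikit-learn with invented toy data. This is an illustration of the general approach, not the authors' actual method or data: the example questions, labels, and the TF-IDF plus logistic regression model are all assumptions chosen for brevity.

    ```python
    # Illustrative sketch (not the paper's pipeline): turn labeled text into a
    # quantitative tone measure with a supervised classifier.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical hand-labeled training questions: 1 = hostile tone, 0 = friendly.
    texts = [
        "Why did you repeatedly mislead this committee?",
        "Your record on this issue is deeply troubling.",
        "We thank you for your distinguished service.",
        "Congratulations on a well-deserved nomination.",
    ]
    labels = [1, 1, 0, 0]

    # Represent text numerically (TF-IDF) and fit a supervised model.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)

    # Apply the fitted model to unlabeled text to produce the measure: here,
    # the predicted probability that a new question is hostile in tone.
    new_question = "Can you explain why you concealed these documents?"
    score = model.predict_proba([new_question])[0, 1]
    print(round(score, 2))
    ```

    In a real application, each stage of this pipeline (labeling, feature construction, model choice, and out-of-sample validation) would need to be reported and validated along the lines the framework proposes.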

    Original language: English
    Journal: Political Science Research and Methods
    State: Accepted/In press - 2025

    Keywords

    • confirmation hearings
    • Senate
    • supervised learning
    • text analysis
    • US Congress
