America's racial framework of superiority and Americanness embedded in natural language

  • Messi H.J. Lee
  • Jacob M. Montgomery
  • Calvin K. Lai

    Research output: Contribution to journal, Article, peer-review

    6 Scopus citations

    Abstract

    America's racial framework can be summarized using two distinct dimensions: superiority/inferiority and Americanness/foreignness. We investigated America's racial framework in a corpus of spoken and written language using word embeddings. Word embeddings place words in a low-dimensional space where words with similar meanings are proximate, allowing researchers to test whether the positions of group and attribute words in a semantic space reflect stereotypes. We trained a word embedding model on the Corpus of Contemporary American English - a corpus of 1 billion words that span 30 years and 8 text categories - and compared the positions of racial/ethnic groups with respect to superiority and Americanness. We found that America's racial framework is embedded in American English. We also captured an additional nuance: Asian people were stereotyped as more American than Hispanic people. These results are empirical evidence that America's racial framework is embedded in American English.
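
    The abstract does not state how positions in the embedding space were compared. As a rough illustration only, one common way to score a word against an attribute dimension is the difference between its mean cosine similarity to words at one pole and its mean cosine similarity to words at the other. The sketch below assumes that approach; the function names (`cosine`, `axis_score`), the placeholder labels, and the 3-dimensional toy vectors are all hypothetical and are not taken from the article or its data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def axis_score(target, pole_a, pole_b):
    """Mean cosine to pole_a vectors minus mean cosine to pole_b vectors.

    A positive score means the target sits closer to pole_a,
    a negative score means it sits closer to pole_b.
    """
    sim_a = sum(cosine(target, v) for v in pole_a) / len(pole_a)
    sim_b = sum(cosine(target, v) for v in pole_b) / len(pole_b)
    return sim_a - sim_b


# Toy vectors invented for illustration; a real analysis would look these
# up in a trained embedding model instead.
group_x = [0.9, 0.1, 0.2]
group_y = [0.1, 0.9, 0.2]
pole_a = [[1.0, 0.0, 0.1], [0.8, 0.2, 0.0]]  # hypothetical attribute words, pole A
pole_b = [[0.0, 1.0, 0.1], [0.2, 0.8, 0.0]]  # hypothetical attribute words, pole B

score_x = axis_score(group_x, pole_a, pole_b)
score_y = axis_score(group_y, pole_a, pole_b)
print(f"group_x: {score_x:+.3f}, group_y: {score_y:+.3f}")
```

    In this toy setup `group_x` scores positive and `group_y` scores negative, meaning they sit on opposite sides of the hypothetical attribute axis. Comparing such scores across group words is the kind of positional test the abstract describes, under the stated assumptions.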

    Original language: English
    Article number: pgad485
    Journal: PNAS Nexus
    Volume: 3
    Issue number: 1
    State: Published - Jan 1 2024

    Keywords

    • ethnicity
    • natural language processing
    • race
    • stereotypes
    • word embeddings
