Population modeling with machine learning can enhance measures of mental health - Open-data replication

Ty Easley, Ruiqi Chen, Kayla Hannon, Rosie Dutt, Janine Bijsterbosch

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Efforts to predict trait phenotypes based on functional MRI data from large cohorts have been hampered by low prediction accuracy and/or small effect sizes. Although these findings are highly replicable, the small effect sizes are somewhat surprising given the presumed brain basis of phenotypic traits such as neuroticism and fluid intelligence. We aim to replicate previous work and additionally test multiple data manipulations that may improve prediction accuracy by addressing data pollution challenges. Specifically, we extended the set of fMRI features, averaged the target phenotype across multiple measurements to obtain more accurate estimates of the underlying trait, balanced the target phenotype's distribution by undersampling majority scores, and identified data-driven subtypes to investigate the impact of between-participant heterogeneity. Our results replicated the findings of Dadi et al. (2021) in a larger sample. Each data manipulation led to a further small but consistent improvement in prediction accuracy, and these improvements were largely additive when multiple manipulations were combined. Combining data manipulations (i.e., extended fMRI features, averaged target phenotype, balanced target phenotype distribution) led to a three-fold increase in prediction accuracy for fluid intelligence compared to prior work. These findings highlight the benefit of several relatively easy and low-cost data manipulations, which may positively impact future work.
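Two of the manipulations named in the abstract, averaging the target phenotype across repeated measurements and undersampling majority scores, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the `max_per_bin` cap, the binning by rounded score, and the toy participant data are all assumptions made for illustration, since the abstract does not specify how either step was implemented.

```python
import random
from collections import defaultdict
from statistics import mean


def average_repeated_scores(visits):
    """Average each participant's score over the visits where it was recorded.

    `visits` maps a participant ID to a list of scores, with None marking a
    missing visit. Participants with no recorded score are dropped.
    """
    averaged = {}
    for participant, scores in visits.items():
        recorded = [s for s in scores if s is not None]
        if recorded:
            averaged[participant] = mean(recorded)
    return averaged


def undersample_majority(scores, max_per_bin, seed=0):
    """Keep at most `max_per_bin` participants per rounded score.

    Over-represented (majority) scores are randomly thinned, while rarer
    scores are kept in full, which flattens the target distribution.
    The cap and the rounding-based bins are illustrative assumptions.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for participant, score in scores.items():
        bins[round(score)].append(participant)

    kept = {}
    for members in bins.values():
        members = sorted(members)  # fixed order so the seed is reproducible
        if len(members) > max_per_bin:
            members = rng.sample(members, max_per_bin)
        for participant in members:
            kept[participant] = scores[participant]
    return kept


# Toy data: hypothetical participant IDs and scores, not taken from the study.
visits = {
    "p1": [5, 7], "p2": [6, None], "p3": [6, 6], "p4": [6, 6],
    "p5": [2, 2], "p6": [10, None], "p7": [None, None],
}
averaged = average_repeated_scores(visits)
balanced = undersample_majority(averaged, max_per_bin=2)
```

In this toy example the score 6 is the majority value (four participants), so it is thinned to two, while the rarer scores 2 and 10 are retained. A real analysis would choose the bin width and cap from the observed phenotype distribution.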

Original language: English
Article number: 100163
Journal: Neuroimage: Reports
Issue number: 2
State: Published - Jun 2023


Keywords

  • Data pollution
  • Intelligence
  • Neuroticism
  • Prediction
  • Replication
  • Resting state fMRI

