Is the reduction of dimensionality to a small number of features always necessary in constructing predictive models for analysis of complex diseases or behaviours?

Amin Zollanvari, Nancy L. Saccone, Laura J. Bierut, Marco F. Ramoni, Gil Alterovitz

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

Gene expression and genome-wide association data have given researchers the opportunity to study many complex traits and diseases. When designing prognostic and predictive models for phenotypic classification in this area, substantial reduction of dimensionality through stringent filtering and/or feature selection is often deemed imperative. This work challenges that presumption through both theoretical and empirical analysis. It demonstrates that, with a proper compromise between the structure of the selected model and the number of features, one can achieve better performance even in high dimensions. Including many genes/variants in the classification rules can help shed new light on the analysis of complex traits, which are typically determined by many causal variants with small effect sizes.
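
The trade-off described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' method or data: it assumes a synthetic setting in which every feature carries the same small class effect with independent unit-variance Gaussian noise, and it uses a deliberately simple linear rule (one weight per feature, no covariance estimation). The function name `simulate` and all parameter values are hypothetical choices made for illustration. It compares a rule built on a handful of filtered features with the same rule built on all features.

```python
import random


def simulate(p=200, effect=0.3, n_train=100, n_test=500, k_few=5, seed=0):
    """Toy comparison: a few filtered features vs. all features.

    Assumption (not from the paper): every feature is informative, with the
    same small mean shift `effect` between classes and independent,
    unit-variance Gaussian noise.
    """
    rng = random.Random(seed)

    def sample(label):
        # Class 0 is centred at -effect/2, class 1 at +effect/2 on every feature.
        shift = effect / 2 if label == 1 else -effect / 2
        return [rng.gauss(shift, 1.0) for _ in range(p)]

    def column_means(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    train0 = [sample(0) for _ in range(n_train)]
    train1 = [sample(1) for _ in range(n_train)]
    m0, m1 = column_means(train0), column_means(train1)

    # Simple model structure: a linear rule with one weight per feature
    # (the estimated mean difference) and no covariance estimation.
    weights = [b - a for a, b in zip(m0, m1)]
    midpoints = [(a + b) / 2 for a, b in zip(m0, m1)]

    # "Stringent filtering": keep only the k_few features with the largest
    # estimated mean difference on the training data.
    ranked = sorted(range(p), key=lambda j: abs(weights[j]), reverse=True)
    few, everything = ranked[:k_few], list(range(p))

    def predict(x, features):
        score = sum(weights[j] * (x[j] - midpoints[j]) for j in features)
        return 1 if score > 0 else 0

    test = [(sample(y), y) for y in (0, 1) for _ in range(n_test)]

    def accuracy(features):
        hits = sum(predict(x, features) == y for x, y in test)
        return hits / len(test)

    return accuracy(few), accuracy(everything)


acc_few, acc_all = simulate()
print(f"top-5 features: {acc_few:.3f}   all 200 features: {acc_all:.3f}")
```

Because every feature is constructed to be weakly informative, this setting favors the many-feature rule by design. It only illustrates the kind of regime the abstract refers to (many causal variants with small effect sizes) and says nothing about the paper's actual results.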

Original language: English
Title of host publication: 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS 2011
Pages: 3573-3576
Number of pages: 4
State: Published - 2011
Event: 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS 2011 - Boston, MA, United States
Duration: Aug 30 2011 to Sep 3 2011

Publication series

Name: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
ISSN (Print): 1557-170X

Conference

Conference: 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS 2011
Country/Territory: United States
City: Boston, MA
Period: 08/30/11 to 09/3/11
