Correlative hierarchical clustering-based low-rank dimensionality reduction of radiomics-driven phenotype in non-small cell lung cancer

Bardia Yousefi, Nariman Jahani, Michael J. Lariviere, Eric Cohen, Meng Kang Hsieh, José Marcio Luna, Rhea D. Chitalia, Jeffrey C. Thompson, Erica L. Carpenter, Sharyn I. Katz, Despina Kontos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations


Background: Lung cancer is one of the most common cancers in the United States and the most fatal, with 142,670 deaths in 2019. Accurately determining tumor response is critical to clinical treatment decisions, ultimately impacting patient survival. To better differentiate between non-small cell lung cancer (NSCLC) responders and non-responders to therapy, radiomic analysis is emerging as a promising approach to identify associated imaging features undetectable by the human eye. However, the large number of variables extracted from an image may undermine the performance of computer-aided prognostic assessment, a problem known as the curse of dimensionality. In the present study, we show that correlation-driven hierarchical clustering improves high-dimensional radiomics-based feature selection and dimensionality reduction, ultimately predicting overall survival in NSCLC patients.

Methods: To select features for high-dimensional radiomics data, a correlation-incorporated hierarchical clustering algorithm automatically categorizes features into several groups. The truncation distance in the resulting dendrogram is used to control the categorization of the features, initiating low-rank dimensionality reduction in each cluster and providing descriptive features for Cox proportional hazards (CPH)-based survival analysis. Using a publicly available NSCLC radiogenomic dataset of 204 patients' CT images, 429 established radiomics features were extracted. Low-rank dimensionality reduction via principal component analysis (PCA) was employed (o= o, o < o) to find the representative components of each cluster of features, and cluster robustness was calculated using the relative weighted consistency metric.

Results: Hierarchical clustering categorized radiomic features into several groups without prior specification of the number of clusters, using the correlation distance metric (as a function) to truncate the resulting dendrogram at different distances. The dimensionality was reduced from 429 to 67 features (for a truncation distance of 0.1). The robustness of the features within clusters varied from -1.12 to -30.02 for truncation distances of 0.1 to 1.8, respectively, indicating that robustness decreases as the truncation distance increases and a smaller number of feature classes (i.e., clusters) is selected. The best multivariate CPH survival model had a C-statistic of 0.71 for a truncation distance of 0.1, outperforming conventional PCA approaches by 0.04, even when the same number of principal components was considered for feature dimensionality.

Conclusions: The truncation distance of the correlative hierarchical clustering algorithm is directly associated with the robustness of the selected feature clusters, and the algorithm can effectively reduce feature dimensionality while improving outcome prediction.
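The clustering and per-cluster reduction described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the average-linkage rule, the per-feature standardization, the choice of one principal component per cluster, the function name `cluster_then_pca`, and the synthetic data are all assumptions made for illustration. The relative weighted consistency metric and the CPH model are not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_then_pca(X, truncation_distance=0.1, n_components=1):
    """Group correlated feature columns, then summarize each group with PCA.

    X is an (n_patients, n_features) array. Returns (labels, Z), where
    labels[j] is the cluster id of feature j and Z stacks the leading
    principal-component scores of every cluster column-wise.
    """
    # Standardize each feature so PCA is not dominated by scale (assumption).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Correlation distance (1 - Pearson r) between features, i.e. columns.
    d = pdist(Xs.T, metric="correlation")

    # Agglomerative clustering; average linkage is an assumption of this sketch.
    tree = linkage(d, method="average")

    # Cut the dendrogram at the truncation distance; no cluster count is preset.
    labels = fcluster(tree, t=truncation_distance, criterion="distance")

    blocks = []
    for c in np.unique(labels):
        block = Xs[:, labels == c]
        k = min(n_components, block.shape[1])
        # PCA via SVD of the centered block: component scores are U * S.
        U, S, _ = np.linalg.svd(block, full_matrices=False)
        blocks.append(U[:, :k] * S[:k])
    return labels, np.hstack(blocks)


# Synthetic stand-in for a radiomics matrix: 3 latent signals, 4 noisy copies each.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 3))
X = np.repeat(latent, 4, axis=1) + 0.01 * rng.normal(size=(50, 12))

labels, Z = cluster_then_pca(X, truncation_distance=0.1)
print(len(np.unique(labels)), Z.shape)
```

On this toy matrix the cut at 0.1 should group the twelve columns by their underlying signal, so `Z` has one column per group. Raising `truncation_distance` merges groups and yields fewer, broader clusters, which mirrors the trade-off the Results describe between truncation distance and the number of feature classes.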

Original language: English
Title of host publication: Medical Imaging 2019
Subtitle of host publication: Imaging Informatics for Healthcare, Research, and Applications
Editors: Po-Hao Chen, Peter R. Bak
ISBN (Electronic): 9781510625556
State: Published - 2019
Event: Medical Imaging 2019: Imaging Informatics for Healthcare, Research, and Applications - San Diego, United States
Duration: Feb 17 2019 → Feb 18 2019

Publication series

Name: Progress in Biomedical Optics and Imaging - Proceedings of SPIE
ISSN (Print): 1605-7422


Conference: Medical Imaging 2019: Imaging Informatics for Healthcare, Research, and Applications
Country/Territory: United States
City: San Diego


  • Cox proportional hazard (CPH) model
  • Dimensionality reduction
  • Feature robustness
  • Feature selection
  • Hierarchical clustering
  • Non-small cell lung cancer
  • Radiomic features
  • Survival analysis

