Chronic obstructive pulmonary disease phenotypes using cluster analysis of electronic medical records

Rodrigo Vazquez Guillamet, Oleg Ursu, Gary Iwamoto, Pope L. Moseley, Tudor Oprea

Research output: Contribution to journal, article, peer-reviewed

20 Scopus citations


Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease. In this retrospective study, we hypothesized that clinically relevant phenotypes can be identified by applying clustering methods to electronic medical records. We included all patients older than 40 years with a diagnosis of COPD who were admitted to the University of New Mexico Hospital between 1 January 2011 and 1 May 2014. We collected data on admissions, demographics, comorbidities, severity markers and treatments. A total of 3144 patients met the inclusion criteria: 46% were older than 65 years and 52% were male. The median Charlson score was 2 (interquartile range: 1–4), and the most frequent comorbidities were depression (36%), congestive heart failure (25%), obesity (19%), cancer (19%) and mild liver disease (18%). Using the sphere exclusion method, nine clusters were obtained: depression–COPD, coronary artery disease–COPD, cerebrovascular disease–COPD, malignancy–COPD, advanced malignancy–COPD, diabetes mellitus–chronic kidney disease–COPD, young age–few comorbidities–high readmission rates–COPD, atopy–COPD, and advanced disease–COPD. These clusters will need to be validated prospectively.
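For readers unfamiliar with the sphere exclusion method named in the abstract, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes each patient is represented as a set of comorbidity labels, uses Jaccard (Tanimoto) distance, and takes the first unassigned record as each new cluster centre. The distance measure, the radius value, the function names and the toy records are all illustrative assumptions; the abstract does not specify any of them.

```python
from typing import List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def sphere_exclusion(records: List[Set[str]], radius: float) -> List[int]:
    """Assign a cluster label to every record (hypothetical helper).

    Repeatedly take the first unassigned record as a centre, put every
    unassigned record within `radius` of it into the same cluster, and
    exclude those records from further consideration.
    """
    labels = [-1] * len(records)  # -1 marks "not yet assigned"
    cluster = 0
    for i, center in enumerate(records):
        if labels[i] != -1:
            continue  # already absorbed by an earlier sphere
        labels[i] = cluster
        for j in range(i + 1, len(records)):
            if labels[j] == -1 and jaccard_distance(center, records[j]) <= radius:
                labels[j] = cluster
        cluster += 1
    return labels


# Toy records invented for illustration only; not data from the study.
patients = [
    {"depression"},
    {"depression", "obesity"},
    {"diabetes", "chronic kidney disease"},
    {"diabetes", "chronic kidney disease", "obesity"},
    {"cancer"},
]
labels = sphere_exclusion(patients, radius=0.6)
print(labels)
```

In this sketch the number of clusters is not fixed in advance: it follows from the chosen radius and from the order in which records are visited, so a different radius or ordering can give a different partition.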

Original language: English
Pages (from-to): 394-409
Number of pages: 16
Journal: Health Informatics Journal
Issue number: 4
State: Published - Dec 1 2018


  • asthma
  • chronic obstructive pulmonary disease
  • comorbidity
  • epidemiology
  • factor analysis
  • phenotype

