Objectives: This study seeks to (1) demonstrate how machine learning (ML) can be used for prediction modeling by predicting the treatment that patients with T1-2, N0-N1 oropharyngeal squamous cell carcinoma (OPSCC) receive and (2) assess the impact that patient, socioeconomic, regional, and institutional factors have on the treatment of this population. Methods: A retrospective cohort of adults diagnosed with T1-2, N0-N1 OPSCC from 2004 to 2013 was obtained from the National Cancer Database. The data were split 80/20 into training and testing sets, respectively. Various ML algorithms were explored for model development. Area under the curve (AUC), accuracy, precision, and recall were calculated for the final model. Results: Among the 19,111 patients in the study, the mean (standard deviation) age was 61.3 (10.8) years, 14,034 (73%) were male, and 17,292 (91%) were white. Surgery was the primary treatment in 9,533 (50%) cases and radiation in 9,578 (50%) cases. The model relied most heavily on T-stage, primary site, N-stage, grade, and type of treatment facility to predict the primary treatment modality. The final model yielded an AUC of 78% (95% CI, 77-79%), accuracy of 71%, precision of 72%, and recall of 71%. Conclusion: This study created an ML model that uses clinical variables to predict primary treatment modality for T1-2, N0-N1 OPSCC. It demonstrates how ML can be used for prediction modeling and highlights that tumor- and facility-related variables influence the decision-making process at a national level.
- Artificial intelligence
- Decision forest
- Machine learning
- Oropharyngeal squamous cell carcinoma
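
The evaluation outlined in the Methods (an 80/20 split followed by AUC, accuracy, precision, and recall) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the fixed seed, the 0.5 decision threshold, the coding of surgery as the positive class (1), and the toy labels and scores are all assumptions made for illustration, and no National Cancer Database data are used.

```python
import random

def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in an 80/20 ratio."""
    rng = random.Random(seed)          # fixed seed is an assumption, for repeatability
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 8 // 10      # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, 1 = positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

def auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (made-up values): 1 = surgery, 0 = radiation.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.7, 0.1]          # hypothetical model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

train, test = split_80_20(list(range(100)))
print(len(train), len(test))
print(accuracy(y_true, y_pred), precision(y_true, y_pred),
      recall(y_true, y_pred), auc(y_true, scores))
```

The pairwise AUC here runs in quadratic time, which is fine for a small illustration; on a cohort of this size a rank-based implementation from an established library would be the more practical choice.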