Purpose: To identify dosimetric variables predictive of the severity of radiation pneumonitis (RP) using machine learning approaches, and to assess the performance of predictive models built on those variables. Methods: We analyzed 209 non‐small cell lung carcinoma (NSCLC) patients treated at Washington University School of Medicine between 1991 and 2001. All RP events were graded according to the Washington University Lung Toxicity Criteria, and the patients were categorized into three groups by grade. We employed machine learning‐based multi‐categorical classification methods to predict the severity of RP: support vector machine (SVM), linear discriminant analysis (LDA), kernel LDA, and fast kernel discriminant analysis (FKDA), which was published in our previous study (Oh et al., IEEE/ACM TCBB 2011). For an unbiased performance test, leave‐one‐out cross validation (LOOCV) was used with each method. At each iteration of LOOCV, an analysis of variance (ANOVA) test was used to rank variables. After all iterations were complete, the variables were ranked by their frequency of occurrence in the best models. Results: The number of patients with grade 0 to 5 RP was 102 (48.8%), 59 (28.2%), 26 (12.5%), 13 (6.2%), 5 (2.4%), and 4 (1.9%), respectively. Among the several grouping scenarios tested, grouping grades as 0–1 vs 2 vs 3–5 and using FKDA yielded the best performance (Spearman correlation coefficient Rs_cv = 0.356, p < 0.001) with lung D30 (minimum dose to the hottest 30% of the lung volume) and heart D10. Another grouping (0–1 vs 2–3 vs 4–5) produced slightly worse performance (Rs_cv = 0.336, p < 0.001) with lung D30, heart D10, and heart V75 (percentage volume of the heart receiving at least 75 Gy). Conclusion: Multi‐categorical RP status was predicted using machine learning methods, which helped identify relevant variables and build predictive models.
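The evaluation loop described in Methods can be illustrated with a minimal sketch. It is not the study's implementation: a nearest-centroid classifier stands in for FKDA, which is not reproduced here, and the ten-patient table is made-up toy data whose columns (nominally lung D30, heart D10, and a noise variable) are hypothetical. All function names, and the choice to keep the top two variables, are assumptions for illustration. The sketch shows ANOVA ranking redone inside each LOOCV fold on the training patients only, a count of how often each variable is selected, and a Spearman correlation between predicted and observed class.

```python
from collections import Counter
from statistics import mean


def group_grade(grade):
    """Map RP grade 0-5 to three ordinal classes: 0-1 -> 0, 2 -> 1, 3-5 -> 2."""
    return 0 if grade <= 1 else (1 if grade == 2 else 2)


def anova_f(values, labels):
    """One-way ANOVA F statistic of one variable across the classes."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    k, n, grand = len(groups), len(values), mean(values)
    if k < 2 or n <= k:
        return 0.0
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT FKDA): pick the class with the closest mean."""
    by_class = {}
    for row, lab in zip(train_X, train_y):
        by_class.setdefault(lab, []).append(row)

    def sq_dist(lab):
        centroid = [mean(col) for col in zip(*by_class[lab])]
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(sorted(by_class), key=sq_dist)


def loocv(X, y, n_keep=2):
    """Leave-one-out CV with variable ranking repeated inside every fold."""
    n_vars = len(X[0])
    preds, picked = [], Counter()
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        # Rank variables on the training patients only, so the held-out
        # patient never influences which variables are chosen.
        scores = [anova_f([r[j] for r in train_X], train_y) for j in range(n_vars)]
        top = sorted(range(n_vars), key=lambda j: -scores[j])[:n_keep]
        picked.update(top)
        sub_X = [[r[j] for j in top] for r in train_X]
        preds.append(nearest_centroid_predict(sub_X, train_y, [X[i][j] for j in top]))
    return preds, picked


def ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r, i = [0.0] * len(v), 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            r[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((p - ma) * (q - mb) for p, q in zip(ra, rb))
    va = sum((p - ma) ** 2 for p in ra)
    vb = sum((q - mb) ** 2 for q in rb)
    # Undefined when either input is constant; reported as 0.0 in this sketch.
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


# Made-up toy data: one row per patient, columns are hypothetical variables.
grades = [0, 1, 0, 1, 2, 2, 2, 3, 4, 5]
X = [[10, 5, 1], [11, 6, 2], [12, 5, 3], [13, 6, 1], [20, 15, 2],
     [21, 16, 3], [22, 15, 1], [30, 25, 2], [31, 26, 3], [32, 25, 1]]
y = [group_grade(g) for g in grades]
preds, picked = loocv(X, y, n_keep=2)
print("selection counts per variable index:", dict(picked))
print("Spearman correlation, predicted vs observed class:", spearman(preds, y))
```

Repeating the ranking inside each fold is the point of the design: if variables were ranked once on all patients, the held-out patient would leak into variable selection and the cross-validated correlation would be optimistic. The selection counts returned by `loocv` correspond to the frequency-of-occurrence ranking mentioned in Methods, under the simplifying assumption that the "best model" in each fold is simply the top-ranked variables.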