A comparative evaluation of machine learning algorithms for predicting syngas fermentation outcomes

Garrett W. Roell, Ashik Sathish, Ni Wan, Qianshun Cheng, Zhiyou Wen, Yinjie J. Tang, Forrest Sheng Bao

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Clostridium carboxidivorans can use syngas to produce acids and alcohols. However, simulating gas fermentation dynamics remains challenging. This study employed data transformation and machine learning (ML) approaches to predict syngas fermentation behavior. Syngas composition and fermentative metabolite concentrations (features) were paired with the production rates (prediction targets) of acetate, ethanol, butyrate, and butanol at each time point. This transformation avoided the use of time as a feature. Data augmentation by polynomial smoothing of experimental measurements was used to create a database of 836 rate instances from 10 gas compositions for supervised learning. Seven families of ML algorithms were compared: neural networks, support vector machines, random forests, elastic nets, lasso regressors, k-nearest neighbors, and Bayesian ridge regressors. These algorithms predicted production rates for the training data with high Pearson correlation (R² > 0.9), but they performed worse on unseen test data. Among the algorithms, random forests and support vector machines produced the most accurate predictions for the test data, and these predictions could regenerate product concentration curves (R² ≈ 0.85). In contrast, neural networks had a higher risk of overfitting. Additionally, ML-based feature importance analysis highlighted the significant impacts of CO and H₂ on alcohol production, which offers guidance for model predictive control. Together, these findings can help direct future applications of ML algorithms to complex bioprocesses with limited data.
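The data transformation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the polynomial degree, the number of resampled points, the function names, and the example measurements are all assumptions made for illustration. The sketch smooths each measured time course with a polynomial, differentiates the fit to obtain production rates, and pairs gas composition and smoothed concentrations (features) with one product's rate (target), so that time itself never appears as a feature.

```python
import numpy as np


def smooth_and_differentiate(t, conc, degree=3, n_points=20):
    """Fit a polynomial to one measured time course.

    Returns the smoothed concentrations and their time derivative
    (the production rate) on a dense, evenly spaced time grid.
    The degree and grid size are illustrative assumptions.
    """
    poly = np.polynomial.Polynomial.fit(t, conc, degree)
    t_dense = np.linspace(t.min(), t.max(), n_points)
    return poly(t_dense), poly.deriv()(t_dense)


def build_rate_instances(t, metabolites, gas, target, degree=3, n_points=20):
    """Build (features, target) instances for one fermentation run.

    metabolites: dict of metabolite name -> measured concentrations
    gas:         dict of gas name -> fraction in the syngas feed
    target:      metabolite whose production rate is to be predicted
    """
    t = np.asarray(t, dtype=float)
    smoothed, rates = {}, {}
    for name, values in metabolites.items():
        conc, rate = smooth_and_differentiate(
            t, np.asarray(values, dtype=float), degree, n_points
        )
        smoothed[name], rates[name] = conc, rate

    gas_names = sorted(gas)
    met_names = sorted(metabolites)
    # Gas composition is constant within a run, so repeat it for every row.
    gas_cols = [np.full(n_points, gas[g]) for g in gas_names]
    met_cols = [smoothed[m] for m in met_names]
    X = np.column_stack(gas_cols + met_cols)  # note: no time column
    y = rates[target]
    return X, y, gas_names + met_names


# Hypothetical measurements for a single gas composition (not from the paper).
t_hours = [0, 12, 24, 36, 48, 60]
metabolites = {
    "acetate": [0.0, 0.8, 1.5, 1.9, 2.0, 1.8],
    "ethanol": [0.0, 0.1, 0.4, 0.9, 1.5, 2.0],
}
gas = {"CO": 0.4, "H2": 0.3, "CO2": 0.3}

X, y, columns = build_rate_instances(t_hours, metabolites, gas, target="ethanol")
print(columns, X.shape, y.shape)
```

Stacking the instances from every gas composition would give a table that any regressor in the comparison could be trained on. Holding out entire gas compositions, rather than random rows, is one way to check performance on genuinely unseen conditions.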

Original language: English
Article number: 108578
Journal: Biochemical Engineering Journal
State: Published - Aug 2022


Keywords

  • Clostridium carboxidivorans
  • Data transformation
  • Model predictive control
  • Neural network
  • Random forest
  • Support vector machine

