A machine learning approach for clustered data

Research output: Contribution to journal › Article › peer-review

Abstract

Artificial neural networks (NNs) are machine learning algorithms that have been used as a convenient alternative to conventional statistical models, such as regression, for prediction and classification, because they can model complex relationships between dependent and independent variables without a priori assumptions about the model form or variable distributions. However, traditional NNs cannot incorporate the dependencies of data with a clustering or nesting structure, as arise in longitudinal studies and cluster sampling. This research is intended to fill this gap in the literature by integrating a random-effects structure into NNs to account for within-cluster correlations. The proposed NN method incorporating random effects (NNRE) is trained by minimizing the cost function using the backpropagation algorithm combined with the quasi-Newton and gradient descent algorithms. Model overfitting is controlled with L2 regularization. The trained NNRE model is evaluated for prediction accuracy using leave-one-out cross-validation on both simulated and real data. Prediction accuracy is compared between NNRE and two existing models, the conventional generalized linear mixed model (GLIMMIX) and the generalized neural network mixed model (GNMM), using simulations and real data. Results show that the proposed NNRE achieves higher accuracy than both GLIMMIX and GNMM.
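The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea: a small feed-forward network augmented with per-cluster random intercepts, trained by plain gradient descent on a squared-error cost with an L2 penalty (the paper combines backpropagation with quasi-Newton and gradient descent; plain gradient descent is substituted here for brevity). All names, the network size, and the hyperparameters are assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated clustered data: 5 clusters, 20 observations each,
# y = sin(x) + cluster-specific random intercept + noise.
n_clusters, n_per = 5, 20
cluster = np.repeat(np.arange(n_clusters), n_per)
x = rng.uniform(-2, 2, n_clusters * n_per)
u_true = rng.normal(0, 0.5, n_clusters)            # true random effects
y = np.sin(x) + u_true[cluster] + rng.normal(0, 0.1, x.size)

def fit_nnre(x, y, cluster, n_hidden=8, lam=1e-3, lr=0.05, epochs=2000):
    """One-hidden-layer NN plus per-cluster random intercepts,
    trained by gradient descent on MSE + L2 penalty (illustrative only)."""
    rng = np.random.default_rng(1)
    W1 = rng.normal(0, 0.5, n_hidden)              # input -> hidden (scalar input)
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.5, n_hidden)              # hidden -> output
    b2 = 0.0
    u = np.zeros(cluster.max() + 1)                # random intercepts, one per cluster
    n = x.size
    for _ in range(epochs):
        h = np.tanh(np.outer(x, W1) + b1)          # (n, n_hidden) hidden activations
        pred = h @ W2 + b2 + u[cluster]            # fixed part + random intercept
        err = pred - y
        # Gradients of mean squared error + L2 penalty on all parameters.
        gW2 = h.T @ err / n + lam * W2
        gb2 = err.mean()
        gh = np.outer(err, W2) * (1 - h ** 2)      # backprop through tanh
        gW1 = (gh * x[:, None]).sum(0) / n + lam * W1
        gb1 = gh.sum(0) / n
        gu = np.bincount(cluster, weights=err, minlength=u.size) / n + lam * u
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
        u -= lr * gu                               # shrinkage via lam mimics a random effect
    return W1, b1, W2, b2, u

def predict(params, x, cluster):
    W1, b1, W2, b2, u = params
    h = np.tanh(np.outer(x, W1) + b1)
    return h @ W2 + b2 + u[cluster]

params = fit_nnre(x, y, cluster)
mse = np.mean((predict(params, x, cluster) - y) ** 2)
print(f"training MSE: {mse:.4f}")
```

The L2 penalty on the intercepts `u` shrinks them toward zero, loosely analogous to how a mixed model shrinks estimated random effects toward the population mean; a faithful NNRE implementation would instead estimate the random-effects variance as part of the cost function.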

Original language: English
Pages (from-to): 406-416
Number of pages: 11
Journal: Communications in Statistics: Simulation and Computation
Volume: 54
Issue number: 2
State: Published - 2025

Keywords

  • Artificial neural networks
  • Backpropagation algorithm
  • Clustered data
  • Generalized linear mixed model
  • Leave-one-out cross-validation
  • Overfitting regularization

