Sampling-based estimation for massive survival data with additive hazards model

Lulu Zuo, Haixiang Zhang, Hai Ying Wang, Lei Liu

Research output: Contribution to journal, Article, peer-review

14 Scopus citations


For massive survival data, we propose a subsampling algorithm to efficiently approximate the estimates of the regression parameters in the additive hazards model. We establish consistency and asymptotic normality of the subsample-based estimator given the full data. The optimal subsampling probabilities are obtained by minimizing the asymptotic variance of the resulting estimator. The subsample-based procedure can substantially reduce the computational cost compared with the full-data method. In numerical simulations, our method has low bias and satisfactory coverage probabilities. We provide an illustrative example on the survival analysis of patients with lymphoma from the Surveillance, Epidemiology, and End Results Program.
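The abstract does not give the estimator or the optimal subsampling probabilities, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the standard additive hazards form lambda(t | Z) = lambda0(t) + beta'Z with time-independent covariates, uses a closed-form Lin-Ying-type estimator, and weights each sampled observation by the inverse of its sampling probability. The uniform probabilities, the function name `weighted_additive_hazards`, and the simulated data are all illustrative placeholders, not the authors' procedure.

```python
import numpy as np

def weighted_additive_hazards(obs_time, event, Z, w):
    """Weighted Lin-Ying-type estimate for lambda(t | Z) = lambda0(t) + beta'Z.

    Assumes time-independent covariates and no tied observation times.
    """
    order = np.argsort(obs_time)
    t, d, Zs, ws = obs_time[order], event[order], Z[order], w[order]
    # Reverse cumulative sums give weighted risk-set totals at each sorted time.
    S0 = np.cumsum(ws[::-1])[::-1]
    S1 = np.cumsum((ws[:, None] * Zs)[::-1], axis=0)[::-1]
    outer = ws[:, None, None] * Zs[:, :, None] * Zs[:, None, :]
    S2 = np.cumsum(outer[::-1], axis=0)[::-1]
    Zbar = S1 / S0[:, None]                      # weighted risk-set covariate mean
    dt = np.diff(np.concatenate(([0.0], t)))     # interval lengths between sorted times
    centered = S2 - S0[:, None, None] * Zbar[:, :, None] * Zbar[:, None, :]
    A = np.einsum("k,kij->ij", dt, centered)     # integrated centered second moment
    b = ((ws * d)[:, None] * (Zs - Zbar)).sum(axis=0)  # centered covariates at events
    return np.linalg.solve(A, b)

# Simulated data (hypothetical): constant baseline hazard 1, beta = 1.
rng = np.random.default_rng(0)
n, r, beta_true = 20000, 2000, 1.0
Z = rng.uniform(0.0, 1.0, size=(n, 1))
T = rng.exponential(scale=1.0 / (1.0 + beta_true * Z[:, 0]))
C = rng.exponential(scale=2.0, size=n)           # independent censoring times
obs_time = np.minimum(T, C)
event = (T <= C).astype(float)

# Full-data estimate (unit weights).
beta_full = weighted_additive_hazards(obs_time, event, Z, np.ones(n))

# Subsample with replacement; uniform probabilities stand in for the optimal ones.
pi = np.full(n, 1.0 / n)
idx = rng.choice(n, size=r, replace=True, p=pi)
uniq, counts = np.unique(idx, return_counts=True)  # merge repeated draws
w_sub = counts / (r * pi[uniq])                    # inverse-probability weights
beta_sub = weighted_additive_hazards(obs_time[uniq], event[uniq], Z[uniq], w_sub)
```

Repeated draws are merged into a single row with a summed weight so that the no-ties assumption still holds for the subsample. Replacing `pi` with non-uniform probabilities would only change the weights; the estimator code stays the same.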

Original language: English
Pages (from-to): 441-450
Number of pages: 10
Journal: Statistics in Medicine
Issue number: 2
State: Published - Jan 30 2021


  • additive hazards model
  • big data
  • subsample-based estimator
  • subsampling probabilities
  • survival analysis

