Direct regression modelling of high-order moments in big data

Ruibin Xi, Nan Lin

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Big data problems present great challenges to statistical analyses, especially from the computational side. In this paper, we consider regression estimation of high-order moments in big data problems based on the U-statistics-based functional regression model (U-FRM). The U-FRM is a nonparametric method that allows direct estimation of high-order moments without imposing parametric assumptions on them. Despite this modeling advantage, its estimation relies on a U-statistics-based estimating equation whose computational complexity is generally too high for big data. We propose using the "divide-and-conquer" strategy to construct a computationally more succinct surrogate estimating equation. Through both theoretical proof and simulations, we show that our method significantly reduces the computational time while enjoying the same asymptotic behavior as the original estimation method. We then apply our method to a genomic problem to illustrate its performance on real data.
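
The computational point in the abstract can be illustrated with a toy sketch. It is not the paper's surrogate estimating equation for the U-FRM. It only shows, under simplifying assumptions, how a divide-and-conquer average of block-wise U-statistics can stand in for a full pairwise computation. The kernel h(a, b) = (a - b)^2 / 2 (whose U-statistic is the unbiased sample variance), the function names `u_statistic` and `dc_u_statistic`, the weighting by block size, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def u_statistic(x):
    """Order-2 U-statistic with kernel h(a, b) = (a - b)^2 / 2.

    Evaluates all pairs, so time and memory grow as O(n^2).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = x[:, None] - x[None, :]
    # Sum over ordered pairs i != j (the diagonal contributes zero),
    # then divide by the number of ordered pairs n * (n - 1).
    return (diffs ** 2 / 2.0).sum() / (n * (n - 1))


def dc_u_statistic(x, n_blocks):
    """Divide-and-conquer surrogate (illustrative assumption).

    Splits the data into blocks, evaluates the U-statistic within each
    block, and averages the block results weighted by block size.
    With K equal blocks the pairwise work drops to roughly O(n^2 / K).
    """
    x = np.asarray(x, dtype=float)
    blocks = np.array_split(x, n_blocks)
    sizes = np.array([b.size for b in blocks])
    values = np.array([u_statistic(b) for b in blocks])
    return np.average(values, weights=sizes)


# Simulated data (hypothetical): true variance is 4.
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=2000)

full = u_statistic(x)                  # all n(n-1) ordered pairs
approx = dc_u_statistic(x, n_blocks=20)  # pairs within each block only

print(f"full U-statistic:       {full:.4f}")
print(f"divide-and-conquer avg: {approx:.4f}")
```

In this sketch the full statistic coincides with the usual unbiased sample variance, and the block average only ever forms pairs inside a block, which is where the saving comes from. Whether such a surrogate keeps the same asymptotic behavior in a regression setting is what the paper establishes for its own estimating equation; the sketch makes no such claim.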

Original language: English
Pages (from-to): 445-452
Number of pages: 8
Journal: Statistics and its Interface
Volume: 9
Issue number: 4
State: Published - 2016

Keywords

  • Aggregation
  • Asymptotic normality
  • Big data
  • Consistency
  • Data cube
  • Divide-and-conquer
  • Estimating equation
  • Higher-order moment
  • U-statistics
