Aggregated estimating equation estimation

  • Nan Lin
  • Ruibin Xi

Research output: Contribution to journal › Article › peer-review

Abstract

Motivated by recent active research on online analytical processing (OLAP), we develop a computation- and storage-efficient algorithm for estimating equation (EE) estimation in massive data sets using a "divide-and-conquer" strategy. In each partition of the data set, we compress the raw data into low-dimensional statistics and then discard the raw data. We then obtain an approximation to the EE estimator, the aggregated EE (AEE) estimator, by solving an equation aggregated from the low-dimensional statistics saved across all partitions. These statistics are the EE estimates and the first-order derivatives of the estimating equations in each partition. We show that, under proper partitioning and some regularity conditions, the AEE estimator is strongly consistent and asymptotically equivalent to the EE estimator. A major application of the AEE technique is to support fast OLAP of EE estimations for data warehousing technologies such as data cubes and data streams. It can also be used to reduce computation time and to overcome the memory constraints posed by massive data sets. Simulation studies show that the AEE estimator provides efficient storage and a remarkable reduction in computational time, especially in its applications to data cubes and data streams.
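
The divide-and-conquer idea described in the abstract can be illustrated with a minimal sketch. It assumes, as one plausible reading of "solving an equation aggregated from the saved low dimensional statistics", that the aggregated equation is the linearized form sum_k A_k (beta - beta_k) = 0, where beta_k is the EE estimate in partition k and A_k is the negative first-order derivative of the estimating equation evaluated at beta_k. Logistic regression, the function names `fit_partition` and `aggregate`, and the simulated data are illustrative choices, not taken from the article.

```python
import numpy as np

def fit_partition(X, y, n_iter=25):
    """Solve the logistic score equation on one partition by Newton's method.

    Returns the two low-dimensional statistics kept for the partition:
    the local estimate beta_k and the matrix A_k (negative derivative of
    the score at beta_k). The raw data can be discarded afterwards.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)                      # estimating equation
        A = (X * (p * (1.0 - p))[:, None]).T @ X   # negative derivative
        beta = beta + np.linalg.solve(A, score)    # Newton step
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    A = (X * (p * (1.0 - p))[:, None]).T @ X       # A_k at the final beta_k
    return beta, A

def aggregate(stats):
    """Solve sum_k A_k (beta - beta_k) = 0 (assumed linearized form)."""
    A_sum = sum(A for _, A in stats)
    rhs = sum(A @ beta_k for beta_k, A in stats)
    return np.linalg.solve(A_sum, rhs)

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, K = 20000, 10
beta_true = np.array([0.5, -1.0, 0.25])
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

# Compress each partition, then aggregate the saved statistics.
stats = [fit_partition(Xk, yk)
         for Xk, yk in zip(np.array_split(X, K), np.array_split(y, K))]
beta_aee = aggregate(stats)

# Full-data estimate, computed here only for comparison.
beta_full, _ = fit_partition(X, y)
print("aggregated:", beta_aee)
print("full data: ", beta_full)
```

Each partition contributes only a p-vector and a p-by-p matrix, so storage does not grow with the partition size, and a new partition (for example, a new block of a data stream) can be folded in by adding its pair to the running sums.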

Original language: English
Pages (from-to): 73-84
Number of pages: 12
Journal: Statistics and its Interface
Volume: 4
Issue number: 1
State: Published - 2011

Keywords

  • Aggregation
  • Asymptotic normality
  • Consistency
  • Data compression
  • Data cube
  • Estimating equation
  • Massive data sets
