A fast and efficient algorithm for finding frequent items over data stream

  • Ling Chen
  • , Yixin Chen
  • , Li Tu

Research output: Contribution to journalArticlepeer-review

4 Scopus citations

Abstract

We investigate the problem of finding the frequent items in a continuous data stream. We present an algorithm called λ-Count for computing frequency counts over a user specified threshold on a data stream. To emphasize the importance of the more recent data items, a fading factor λ is used. Our algorithm can detect ε-approximate frequent items of a data stream using O(logλε) memory space and O(1) time to process each data record. The computation time for answering each query is O(logλ ε), and for answering a query about the frequentness of a given data item is O(1). Experimental study shows that λ-Count outperforms other methods in terms of accuracy, memory requirement, and processing speed.

Original languageEnglish
Pages (from-to)1545-1554
Number of pages10
JournalJournal of Computers (Finland)
Volume7
Issue number7
DOIs
StatePublished - 2012

Keywords

  • Data mining
  • Data stream
  • Frequent items

Fingerprint

Dive into the research topics of 'A fast and efficient algorithm for finding frequent items over data stream'. Together they form a unique fingerprint.

Cite this