TY - JOUR
T1 - Detection and Analysis of Spikes in a Random Sequence
AU - Dasgupta, Anirban
AU - Li, Bo
N1 - Publisher Copyright:
© 2018, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2018/12/1
Y1 - 2018/12/1
N2 - Motivated by the more frequent natural and anthropogenic hazards, we revisit the problem of assessing whether an apparent temporal clustering in a sequence of randomly occurring events is a genuine surprise and should call for an examination. We study the problem in both discrete and continuous time formulation. In the discrete formulation, the problem reduces to deriving the probability that p independent people all have birthdays within d days of each other. We provide an analytical expression for a warning limit such that if a subset of p people among n are observed to have birthdays within d days of each other and d is smaller than our warning limit, then it should be treated as a surprising cluster. In the continuous time framework, three different sets of results are given. First, we provide an asymptotic analysis of the problem by embedding it into an extreme value problem for high order spacings of iid samples from the U[0, 1] density. Second, a novel analytical nonasymptotic bound is derived by using certain tools of empirical process theory. Finally, the required probability is approximated by using various bounds and asymptotic results on the supremum of the scanning process of a one dimensional stationary Poisson process. We apply the theories to climate change related datasets, datasets on temperatures, and mass shooting records in the United States. These real data applications of our theoretical methods lead to supporting evidence for climate change and recent spikes in gun violence.
AB - Motivated by the more frequent natural and anthropogenic hazards, we revisit the problem of assessing whether an apparent temporal clustering in a sequence of randomly occurring events is a genuine surprise and should call for an examination. We study the problem in both discrete and continuous time formulation. In the discrete formulation, the problem reduces to deriving the probability that p independent people all have birthdays within d days of each other. We provide an analytical expression for a warning limit such that if a subset of p people among n are observed to have birthdays within d days of each other and d is smaller than our warning limit, then it should be treated as a surprising cluster. In the continuous time framework, three different sets of results are given. First, we provide an asymptotic analysis of the problem by embedding it into an extreme value problem for high order spacings of iid samples from the U[0, 1] density. Second, a novel analytical nonasymptotic bound is derived by using certain tools of empirical process theory. Finally, the required probability is approximated by using various bounds and asymptotic results on the supremum of the scanning process of a one dimensional stationary Poisson process. We apply the theories to climate change related datasets, datasets on temperatures, and mass shooting records in the United States. These real data applications of our theoretical methods lead to supporting evidence for climate change and recent spikes in gun violence.
KW - Poisson process
KW - Probability
KW - Random sequence
KW - Scan statistic
UR - https://www.scopus.com/pages/publications/85045926425
U2 - 10.1007/s11009-018-9637-0
DO - 10.1007/s11009-018-9637-0
M3 - Article
AN - SCOPUS:85045926425
SN - 1387-5841
VL - 20
SP - 1429
EP - 1451
JO - Methodology and Computing in Applied Probability
JF - Methodology and Computing in Applied Probability
IS - 4
ER -