TY - JOUR
T1 - Conditional or unconditional logistic regression for frequency matched case-control design?
AU - Wan, Fei
N1 - Funding Information:
We thank Dr. Peter Imrey from the Department of Quantitative Health Sciences at Cleveland Clinic for his critical review of the manuscript. We also thank the anonymous referees for their useful suggestions.
Publisher Copyright:
© 2022 John Wiley & Sons Ltd.
PY - 2022/3/15
Y1 - 2022/3/15
N2 - Frequency matching is commonly used in epidemiological case-control studies to balance the distributions of the matching factors between the case and control groups and to improve the efficiency of case-control designs. Applied researchers commonly hold that unconditional logistic regression (ULR) should be used to analyze frequency-matched designs and that conditional logistic regression (CLR) is unnecessary. However, the justification for this view is unclear. To compare the performance of ULR and CLR in terms of simplicity, unbiasedness, and efficiency in a more intuitive way, we viewed frequency matching from the perspective of weighted sampling and derived the outcome models describing how the exposure and matching factors are associated with the outcome in the matched data, separately in two scenarios: (1) only categorical variables are used for matching; (2) continuous variables are categorized for matching. In either scenario, the derived outcome model is a logit model with stratum-specific intercepts. Correctly specified ULR can be more efficient than CLR, particularly when continuous matching factors are used, whereas CLR is a more practical approach because it is less dependent on modeling choices.
AB - Frequency matching is commonly used in epidemiological case-control studies to balance the distributions of the matching factors between the case and control groups and to improve the efficiency of case-control designs. Applied researchers commonly hold that unconditional logistic regression (ULR) should be used to analyze frequency-matched designs and that conditional logistic regression (CLR) is unnecessary. However, the justification for this view is unclear. To compare the performance of ULR and CLR in terms of simplicity, unbiasedness, and efficiency in a more intuitive way, we viewed frequency matching from the perspective of weighted sampling and derived the outcome models describing how the exposure and matching factors are associated with the outcome in the matched data, separately in two scenarios: (1) only categorical variables are used for matching; (2) continuous variables are categorized for matching. In either scenario, the derived outcome model is a logit model with stratum-specific intercepts. Correctly specified ULR can be more efficient than CLR, particularly when continuous matching factors are used, whereas CLR is a more practical approach because it is less dependent on modeling choices.
KW - bias
KW - case-control design
KW - conditional logistic regression
KW - frequency matching
KW - unconditional logistic regression
UR - http://www.scopus.com/inward/record.url?scp=85123470166&partnerID=8YFLogxK
U2 - 10.1002/sim.9313
DO - 10.1002/sim.9313
M3 - Article
C2 - 35067958
AN - SCOPUS:85123470166
SN - 0277-6715
VL - 41
SP - 1023
EP - 1041
JO - Statistics in Medicine
JF - Statistics in Medicine
IS - 6
ER -