TY - JOUR
T1 - A strategy for assembling samples of adult twin pairs in the United States
AU - Goldberg, Jack
AU - Henderson, William G.
AU - Eisen, Seth A.
AU - True, William
AU - Ramakrishnan, Viswanathan
AU - Lyons, Michael J.
AU - Tsuang, Ming T.
PY - 1993/9/30
Y1 - 1993/9/30
N2 - In this paper we develop a methodology for the identification of large numbers of U.S. adult twin pairs. Data for this study derive from the U.S. Department of Defense and the Vietnam Era Twin (VET) Registry. The Department of Defense identified potential male twins (n = 10,002) using a computerized record linkage algorithm based on the same last name, same date of birth, and the same first five digits of the Social Security number. Twinship was confirmed by comparison with the Vietnam Era Twin Registry. We developed a logistic regression model that predicts the probability that a paired record identifies twins based on the absolute difference in the last four digits in the Social Security number, the age of issuance of the Social Security number, and the frequency of occurrence of the last name. We used the estimated coefficients derived from this regression model to assign predicted probabilities of being a twin to each matched record. There is a close correspondence between the observed and expected number of twins when evaluated across deciles of predicted probabilities of being a twin; the value of the Harrell's c index (c = 0·68 ± 0·0004) indicates the overall predictive accuracy of the regression equation. The results from this study demonstrate the feasibility of identifying adult male–male twin pairs from any large computerized database that contains name, date of birth and Social Security number. However, the selection criteria used in the creation of the computer database must be clearly specified to avoid constructing a biased sample of twins.
AB - In this paper we develop a methodology for the identification of large numbers of U.S. adult twin pairs. Data for this study derive from the U.S. Department of Defense and the Vietnam Era Twin (VET) Registry. The Department of Defense identified potential male twins (n = 10,002) using a computerized record linkage algorithm based on the same last name, same date of birth, and the same first five digits of the Social Security number. Twinship was confirmed by comparison with the Vietnam Era Twin Registry. We developed a logistic regression model that predicts the probability that a paired record identifies twins based on the absolute difference in the last four digits in the Social Security number, the age of issuance of the Social Security number, and the frequency of occurrence of the last name. We used the estimated coefficients derived from this regression model to assign predicted probabilities of being a twin to each matched record. There is a close correspondence between the observed and expected number of twins when evaluated across deciles of predicted probabilities of being a twin; the value of the Harrell's c index (c = 0·68 ± 0·0004) indicates the overall predictive accuracy of the regression equation. The results from this study demonstrate the feasibility of identifying adult male–male twin pairs from any large computerized database that contains name, date of birth and Social Security number. However, the selection criteria used in the creation of the computer database must be clearly specified to avoid constructing a biased sample of twins.
UR - http://www.scopus.com/inward/record.url?scp=0027270817&partnerID=8YFLogxK
U2 - 10.1002/sim.4780121805
DO - 10.1002/sim.4780121805
M3 - Article
C2 - 8248662
AN - SCOPUS:0027270817
SN - 0277-6715
VL - 12
SP - 1693
EP - 1702
JO - Statistics in Medicine
JF - Statistics in Medicine
IS - 18
ER -