This study demonstrates a novel approach to testing associations between highly heterogeneous genetic loci and complex phenotypes. Previous investigations of the relationship between Cytochrome P450 2A6 (CYP2A6) genotype and smoking phenotypes compared subjects by dividing them into broad categories, relying on assumptions that simplify the range of function of different CYP2A6 alleles, their numerous possible diplotype combinations, and non-additive allele effects. A predictive model that translates CYP2A6 diplotype into a single continuous variable was previously derived from an in vivo metabolism experiment in 189 European Americans. Here, we apply this model to assess associations between genotype, inferred nicotine metabolism, and smoking behaviors in larger samples that lack direct nicotine metabolism measurements. CYP2A6 genotype is not associated with nicotine dependence, as defined by the Fagerström Test of Nicotine Dependence, demonstrating that cigarettes smoked per day (CPD) and nicotine dependence have distinct genetic correlates. The predicted metric is significantly associated with CPD among African Americans and European American dependent smokers. Individual slow-metabolizing genotypes are associated with lower CPD, but the predicted metric is the best predictor of CPD. Furthermore, optimizing the predictive model by including additional CYP2A6 alleles improves the fit of the model in an independent data set and provides a novel method of predicting the functional impact of alleles without direct metabolism measurements. Lastly, comprehensive genotyping and in vivo metabolism data are used to demonstrate that genome-wide significant associations between CPD and single nucleotide polymorphisms are the result of synthetic associations.