Background: Data used for evaluating quality of medical care need to be of high reliability to ensure valid quality assessment and benchmarking. The American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) has continually emphasized the collection of highly reliable clinical data through its program infrastructure. Study Design: We provide a detailed description of the various mechanisms used in ACS NSQIP to assure collection of high quality data, including training of data collectors (surgical clinical reviewers) and ongoing audits of data reliability. For the 2005 through 2008 calendar years, inter-rater reliability was calculated overall and for individual variables using percentages of agreement between the data collector and the auditor. Variables with > 5% disagreement are flagged for educational efforts to improve accurate collection. Cohen's kappa was estimated for selected variables from the 2007 audit year. Results: Inter-rater reliability audits show that overall disagreement rates on variables have fallen from 3.15% in 2005 (the first year of public enrollment in ACS NSQIP) to 1.56% in 2008. In addition, disagreement levels for individual variables have continually improved, with 26 individual variables demonstrating > 5% disagreement in 2005, to only 2 such variables in 2008. Estimated kappa values suggest substantial or almost perfect agreement for most variables. Conclusions: The ACS NSQIP has implemented training and audit procedures for its hospital participants that are highly effective in collecting robust data. Audit results show that data have been reliable since the program's inception and that reliability has improved every year.