TY - GEN
T1 - Affymetrix® mismatch (MM) probes
T2 - 2012 ASE International Conference on BioMedical Computing, BioMedCom 2012
AU - Flight, Robert M.
AU - Eteleeb, Abdallah M.
AU - Rouchka, Eric C.
PY - 2012
Y1 - 2012
N2 - Affymetrix® GeneChip® microarray design defines probe sets consisting of 11, 16, or 20 distinct 25 base pair (BP) probes for determining mRNA expression for a specific gene, which may be covered by one or more probe sets. Each probe has a corresponding perfect match (PM) and mismatch (MM) set. Traditional analytical techniques have either used the MM probes to determine the level of cross-hybridization or the reliability of the PM probe, or have ignored them completely. Given the availability of reference genome sequences, we have reanalyzed the mapping of both PM and MM probes to reference genomes in transcript regions. Our results suggest that, depending on the species of interest, 66%-93% of the PM probes can be used reliably in terms of single unique matches to the genome, while a small number of the MM probes (typically less than 1%) could be incorporated into the analysis. In addition, we have examined the mapping of PM and MM probes to five different human genome projects, resulting in approximately a 70% overlap of uniquely mapping PM probes, and a subset of 51 uniquely mapping MM probes commonly found in all five projects, 24 of which are found within annotated exonic regions. These results suggest that individual variation in transcriptome regions adds further complexity to microarray data analysis. Given these results, we conclude that the development of custom chip definition files (CDFs) should include MM probe sequences to provide the most effective means of transcriptome analysis of Affymetrix® GeneChip® arrays.
KW - Bioinformatics
KW - custom definition files
KW - microarray
KW - probe set
UR - http://www.scopus.com/inward/record.url?scp=84878716955&partnerID=8YFLogxK
U2 - 10.1109/BioMedCom.2012.8
DO - 10.1109/BioMedCom.2012.8
M3 - Conference contribution
AN - SCOPUS:84878716955
SN - 9780769549385
T3 - Proceedings of the 2012 ASE International Conference on BioMedical Computing, BioMedCom 2012
SP - 6
EP - 13
BT - Proceedings of the 2012 ASE International Conference on BioMedical Computing, BioMedCom 2012
PB - IEEE Computer Society
Y2 - 14 December 2012 through 16 December 2012
ER -