TY - GEN
T1 - Detecting coevolution of functionally related proteins for automated protein annotation
AU - Kwan, Alan L.
AU - Dutcher, Susan K.
AU - Stormo, Gary D.
N1 - Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2010
Y1 - 2010
N2 - Sequence-similarity-based protein clustering methods organize proteins into families of similar sequences, a task that continues to be critical for automated protein characterization. However, many protein families cannot be automatically characterized further because little is known about the function of any protein in a family of similar sequences. We present a novel phylogenetic profile comparison (PPC) method called Automated Protein Annotation by Coordinate Evolution (APACE) that facilitates the automated characterization of proteins beyond their homology to other similar sequences. Our method implements a new approach for the normalization of similarity scores among multiple species and automates the characterization of proteins by their patterns of co-evolution with other proteins that do not necessarily share a similar sequence. We demonstrate that our method is able to recapitulate the topology of the latest, unresolved, composite deep eukaryotic phylogeny and is able to quantify the as-yet-unresolved branch lengths. We further demonstrate that our method is able to detect more functionally related proteins, given the same starting data, than existing methods. Finally, we demonstrate that our method can be successfully applied to much larger comparative genomic problem instances where existing methods often fail.
UR - http://www.scopus.com/inward/record.url?scp=77956156585&partnerID=8YFLogxK
U2 - 10.1109/BIBE.2010.24
DO - 10.1109/BIBE.2010.24
M3 - Conference contribution
C2 - 21655203
AN - SCOPUS:77956156585
SN - 9780769540832
T3 - 10th IEEE International Conference on Bioinformatics and Bioengineering 2010, BIBE 2010
SP - 99
EP - 105
BT - 10th IEEE International Conference on Bioinformatics and Bioengineering 2010, BIBE 2010
T2 - 10th IEEE International Conference on Bioinformatics and Bioengineering, BIBE-2010
Y2 - 31 May 2010 through 3 June 2010
ER -