TY - JOUR
T1 - A domain-based model for predicting large and complex pseudoknotted structures
AU - Cao, Song
AU - Chen, Shi Jie
N1 - Funding Information:
This research was supported by NIH through grant GM063732 and NSF through grants MCB0920411 and MCB0920067. Most of the numerical calculations involved in this research were performed on the HPC resources at the University of Missouri Bioinformatics Consortium (UMBC).
PY - 2012/2
Y1 - 2012/2
N2 - Pseudoknotted structures play important structural and functional roles in RNA cellular functions at the level of transcription, splicing and translation. However, the problem of computational prediction for large pseudoknotted folds remains open. Here we develop a domain-based method for predicting complex and large pseudoknotted structures from RNA sequences. The model is based on the observation that large RNAs can be separated into different structural domains. The basic idea is to first identify the domains and then predict the structure of each domain. Assembly of the domain structures gives the full structure. The use of the domain-based approach reduces the computational time by a factor of about N² for an N-nt sequence. As applications of the model, we predict structures for a variety of RNA systems, such as regions in human telomerase RNA (hTR), internal ribosome entry site (IRES) and the HIV genome. The lengths of these sequences range from 200 to 400 nt. The results show good agreement with the experiments.
AB - Pseudoknotted structures play important structural and functional roles in RNA cellular functions at the level of transcription, splicing and translation. However, the problem of computational prediction for large pseudoknotted folds remains open. Here we develop a domain-based method for predicting complex and large pseudoknotted structures from RNA sequences. The model is based on the observation that large RNAs can be separated into different structural domains. The basic idea is to first identify the domains and then predict the structure of each domain. Assembly of the domain structures gives the full structure. The use of the domain-based approach reduces the computational time by a factor of about N² for an N-nt sequence. As applications of the model, we predict structures for a variety of RNA systems, such as regions in human telomerase RNA (hTR), internal ribosome entry site (IRES) and the HIV genome. The lengths of these sequences range from 200 to 400 nt. The results show good agreement with the experiments.
KW - Hepatitis delta virus (HDV)
KW - Human immunodeficiency virus (HIV)
KW - Human telomerase RNA (hTR)
KW - Internal ribosome entry site (IRES)
KW - Large RNAs
KW - Pseudoknots
KW - Structural predictions
UR - http://www.scopus.com/inward/record.url?scp=84863229452&partnerID=8YFLogxK
U2 - 10.4161/rna.9.2.18488
DO - 10.4161/rna.9.2.18488
M3 - Article
C2 - 22418848
AN - SCOPUS:84863229452
SN - 1547-6286
VL - 9
SP - 201
EP - 212
JO - RNA Biology
JF - RNA Biology
IS - 2
ER -