TY - JOUR
T1 - Splicing signals in Drosophila
T2 - Intron size, information content, and consensus sequences
AU - Mount, Stephen M.
AU - Burks, Christian
AU - Hertz, Gerald
AU - Stormo, Gary D.
AU - White, Owen
AU - Fields, Chris
N1 - Funding Information:
We are grateful to G. Hartzell for systems support, to R. Farber for providing us with an updated version of the ExtractGenBank software, to Ted Dunning at the Computer Research Laboratory of New Mexico State University for advice on statistical uncertainties, and to Nicole Kawachi for assistance with the tetranucleotide analysis. S.M. was supported by NIH grant GM 37991, by an NSF Presidential Young Investigator award, and by Basil O'Connor Starter Scholar Research award 5-630 from the March of Dimes Birth Defects Foundation. G.S. and G.H. were supported by NIH grants GM 28755 and HG 00249; O.W. and C.F. were supported by U.S. Department of Energy Genome Program Grant 89ER60865; and C.B. was supported by NIH grant GM 37812. This work was done in part under the auspices of the Aspen Center for Physics under a grant from the NSF.
PY - 1992/8/25
Y1 - 1992/8/25
N2 - A database of 209 Drosophila introns was extracted from GenBank (release number 64.0) and examined by a number of methods in order to characterize features that might serve as signals for messenger RNA splicing. A tight distribution of sizes was observed: while the smallest introns in the database are 51 nucleotides, more than half are less than 80 nucleotides in length, and most of these have lengths in the range of 59-67 nucleotides. Drosophila splice sites found in large and small introns differ in only minor ways from each other and from those found in vertebrate introns. However, larger introns have greater pyrimidine-richness in the region between 11 and 21 nucleotides upstream of 3′ splice sites. The Drosophila branchpoint consensus matrix resembles C T A A T (in which branch formation occurs at the underlined A), and differs from the corresponding mammalian signal in the absence of G at the position immediately preceding the branchpoint. The distribution of occurrences of this sequence suggests a minimum distance between 5′ splice sites and branchpoints of about 38 nucleotides, and a minimum distance between 3′ splice sites and branchpoints of 15 nucleotides. The methods we have used detect no information in exon sequences other than in the few nucleotides immediately adjacent to the splice sites. However, Drosophila resembles many other species in that there is a discontinuity in A + T content between exons and introns, which are A + T rich.
AB - A database of 209 Drosophila introns was extracted from GenBank (release number 64.0) and examined by a number of methods in order to characterize features that might serve as signals for messenger RNA splicing. A tight distribution of sizes was observed: while the smallest introns in the database are 51 nucleotides, more than half are less than 80 nucleotides in length, and most of these have lengths in the range of 59-67 nucleotides. Drosophila splice sites found in large and small introns differ in only minor ways from each other and from those found in vertebrate introns. However, larger introns have greater pyrimidine-richness in the region between 11 and 21 nucleotides upstream of 3′ splice sites. The Drosophila branchpoint consensus matrix resembles C T A A T (in which branch formation occurs at the underlined A), and differs from the corresponding mammalian signal in the absence of G at the position immediately preceding the branchpoint. The distribution of occurrences of this sequence suggests a minimum distance between 5′ splice sites and branchpoints of about 38 nucleotides, and a minimum distance between 3′ splice sites and branchpoints of 15 nucleotides. The methods we have used detect no information in exon sequences other than in the few nucleotides immediately adjacent to the splice sites. However, Drosophila resembles many other species in that there is a discontinuity in A + T content between exons and introns, which are A + T rich.
UR - http://www.scopus.com/inward/record.url?scp=0026657738&partnerID=8YFLogxK
U2 - 10.1093/nar/20.16.4255
DO - 10.1093/nar/20.16.4255
M3 - Article
C2 - 1508718
AN - SCOPUS:0026657738
SN - 0305-1048
VL - 20
SP - 4255
EP - 4262
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 16
ER -