Differentiating the protein coding and noncoding RNA segments of DNA using Shannon entropy

P. Mazaheri, A. H. Shirazi, N. Saeedi, G. Reza Jafari, Muhammad Sahimi

Research output: Contribution to journal, Article, peer-review

3 Scopus citations

Abstract

The complexity of DNA sequences is evaluated in order to differentiate between protein-coding and noncoding RNA segments. The method is based on computing the Shannon entropy of the sequences. By comparing the entropy of the original sequence with that of its shuffled counterpart, we identify the source of the difference between the two types of segment and their relative contributions to the sequence. To demonstrate the method, the DNA sequences of the bacteria Clostridium difficile 630 (G + C = 29.1%) and Bdellovibrio bacteriovorus (G + C = 50.6%) are analyzed, as representatives of bacteria with unbalanced and balanced nucleotide content, respectively. It is shown that in both bacteria, regardless of nucleotide content, ΔrS, the relative difference of the two entropies, is significantly greater in protein-coding regions than in noncoding RNA segments.
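The comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions, so the following are assumptions rather than the authors' procedure: entropy is computed over overlapping k-mers (words of length k), the shuffled entropy is averaged over several random permutations, and the relative difference is taken as (S_shuffled - S_original) / S_shuffled. The function names `kmer_entropy` and `relative_entropy_difference` and the parameters `k`, `n_shuffles`, and `seed` are hypothetical.

```python
import math
import random
from collections import Counter


def kmer_entropy(seq: str, k: int = 3) -> float:
    """Shannon entropy (in bits) of the overlapping k-mer distribution of seq."""
    n = len(seq) - k + 1
    if n <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(n))
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def relative_entropy_difference(seq: str, k: int = 3,
                                n_shuffles: int = 20, seed: int = 0) -> float:
    """Assumed definition: (S_shuffled - S_original) / S_shuffled.

    S_shuffled is the mean k-mer entropy over n_shuffles random
    permutations of seq, which keep the nucleotide composition but
    destroy the ordering.
    """
    rng = random.Random(seed)
    s_original = kmer_entropy(seq, k)
    letters = list(seq)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        total += kmer_entropy("".join(letters), k)
    s_shuffled = total / n_shuffles
    if s_shuffled == 0.0:
        # Only one distinct k-mer even after shuffling (e.g. a homopolymer).
        return 0.0
    return (s_shuffled - s_original) / s_shuffled


if __name__ == "__main__":
    ordered = "ACGT" * 50  # toy, strongly ordered sequence (not real data)
    print(relative_entropy_difference(ordered, k=3))
```

Shuffling preserves single-nucleotide counts, so with k = 1 the two entropies coincide and the difference vanishes; any nonzero value for k > 1 therefore reflects the ordering of the nucleotides rather than their composition.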

Original language: English
Pages (from-to): 1-9
Number of pages: 9
Journal: International Journal of Modern Physics C
Volume: 21
Issue number: 1
State: Published - Jan 2010

Keywords

  • DNA Sequence
  • Shannon Entropy
