Novel approaches to the prediction of CpG islands and their methylation status

Christopher Previti, Oscar Harari, Igor Zwir, Coral Del Val

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citation

Abstract

A DNA sequence can be described as a string composed of four symbols, A, T, C and G, each representing a chemically distinct nucleotide. Combinations of two adjacent nucleotides are called dinucleotides, and CpG islands are regions of a DNA sequence (substrings) that are enriched in CpG dinucleotides (C followed by G). CpG islands are an enigmatic feature of vertebrate genomes. They are a critical target for transcriptional control, since methylation of CpG islands leads to structural changes in the DNA that stop the expression of any associated gene. The factors that provoke or impede methylation are currently unknown. In general, the maintenance of a particular pattern of methylated CpG dinucleotides is a critical regulatory system during a host of normal developmental processes, but the erroneous methylation of CpG islands and the resulting gene silencing can lead to the development of cancer. We present here a novel unsupervised machine learning method that is capable of distinguishing biologically significant classes of CpG islands, including the separation of methylated and unmethylated CpG islands. This method represents an important novel approach that will aid in the computational prediction of methylation, which is commonly used in the preselection of worthwhile sequences for methylation experiments.
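
To make the notion of a substring "enriched in CpG dinucleotides" concrete, the sketch below counts CpG dinucleotides in a DNA string and computes two commonly used enrichment statistics: GC content and the observed/expected CpG ratio. It is an illustration only and is not the unsupervised method described in the abstract, whose details are not given here. The function name `cpg_stats` and the choice of statistics are assumptions made for this example.

```python
def cpg_stats(seq: str) -> dict:
    """Return simple CpG enrichment statistics for a DNA string.

    Hypothetical helper for illustration; not the method from the paper.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact
    # number of CpG dinucleotides (C immediately followed by G).
    cpg = s.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Observed/expected ratio: the observed CpG count divided by the count
    # expected if C and G occurred independently (c * g / n).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "cpg": cpg, "gc_content": gc_content, "obs_exp": obs_exp}


if __name__ == "__main__":
    print(cpg_stats("ACGTCGCG"))
```

A ratio well above what is typical for the surrounding genome suggests CpG enrichment; any cutoff applied to these values would be a further assumption and is deliberately left out.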

Original language: English
Title of host publication: Summer Computer Simulation Conference 2007, SCSC'07, Part of the 2007 Summer Simulation Multiconference, SummerSim'07
Pages: 833-840
Number of pages: 8
State: Published - 2007
Event: Summer Computer Simulation Conference 2007, SCSC 2007, Part of the 2007 Summer Simulation Multiconference, SummerSim 2007 - San Diego, CA, United States
Duration: Jul 15 2007 to Jul 18 2007

Publication series

Name: Summer Computer Simulation Conference 2007, SCSC'07, Part of the 2007 Summer Simulation Multiconference, SummerSim'07
Volume: 2

Conference

Conference: Summer Computer Simulation Conference 2007, SCSC 2007, Part of the 2007 Summer Simulation Multiconference, SummerSim 2007
Country/Territory: United States
City: San Diego, CA
Period: 07/15/07 to 07/18/07

Keywords

  • Classification
  • CpG islands
  • Data mining
  • Methylation
  • Unsupervised learning
