Bias Correction in RNA-Seq Short-Read Counts Using Penalized Regression

  • David Dalpiaz
  • Xuming He
  • Ping Ma

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

RNA-Seq produces tens of millions of short reads. When mapped to the genome and/or to the reference transcripts, RNA-Seq data can be summarized by a very large number of short-read counts. Accurate transcript quantification, such as gene expression calculation, relies on proper correction of sequence bias in the RNA-Seq short-read counts. We use a linear model for the sequence bias, which is much more flexible than the popular Poisson model. We fit the model using a penalized regression method, which allows for a significant dimension reduction. The algorithm is scalable for modeling RNA-Seq data. We demonstrate the excellent performance of our proposed method by applying it to real examples. The methods are implemented in open-source code, which is available in the R package lmbc.
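
The idea of fitting a linear sequence-bias model with a sparsity-inducing penalty can be illustrated with a minimal sketch. It is not the lmbc package and not necessarily the authors' exact formulation. The following are all assumptions made for illustration: the L1 (LASSO) penalty, taken from the keyword list; the one-hot encoding of a 6-nucleotide window; the log-count response; the simulated data; and every function name. The code uses only the Python standard library and fits the model by cyclic coordinate descent.

```python
import random

def one_hot(window):
    """Indicator features for each position, with 'T' as the dropped baseline."""
    return [1.0 if base == ref else 0.0 for base in window for ref in "ACG"]

def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_fit(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # The intercept is unpenalised: absorb the mean residual into it.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return intercept, beta

# Simulated data (hypothetical): only two sequence features truly affect the count.
rng = random.Random(0)
windows, log_counts = [], []
for _ in range(400):
    w = "".join(rng.choice("ACGT") for _ in range(6))
    effect = (0.8 if w[0] == "G" else 0.0) - (0.5 if w[2] == "A" else 0.0)
    windows.append(w)
    log_counts.append(2.0 + effect + rng.gauss(0.0, 0.1))

X = [one_hot(w) for w in windows]
intercept, beta = lasso_fit(X, log_counts, lam=0.02)

# Features are indexed as 3*position + index of the base in "ACG".
selected = [(j // 3, "ACG"[j % 3], round(b, 3)) for j, b in enumerate(beta) if b != 0.0]
print("selected (position, base, coefficient):", selected)
```

In this sketch the penalty sets most of the 18 coefficients exactly to zero, which is the kind of dimension reduction the abstract refers to. The surviving coefficients are shrunk toward zero relative to the simulated effects, as expected under an L1 penalty.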

Original language: English
Pages (from-to): 88-99
Number of pages: 12
Journal: Statistics in Biosciences
Volume: 5
Issue number: 1
State: Published - May 2013

Keywords

  • Gene expression
  • LASSO
  • Next-generation sequencing
  • Penalized likelihood
  • Regularization
  • RNA-Seq
