IPM - Institute for Research in Fundamental Sciences

“School of Biological”

Paper IPM / Biological / 14424

School of Biological Sciences

Title:

Missing Value Imputation for RNA-Sequencing Data Using Statistical Models: A Comparative Study

Author(s):

1.	Taban Baghfalaki
2.	Mojtaba Ganjali
3.	Damon Berridge

Status:

inProgress

Journal:

J. Appl. Statist.

Year:

2016

Supported by:

IPM

Abstract:

RNA-seq technology has been widely used as an alternative approach to traditional microarrays in transcript analysis. Gene expression profiling by sequencing, which generates RNA-seq data sets, can sometimes produce missing read counts. These missing values can adversely affect downstream analyses. Most methods for analysing RNA-seq data sets require a complete matrix of RNA-seq data. In the past few years, researchers have put a great deal of effort into evaluating different imputation algorithms for microarray gene expression data sets. However, there are only limited works for RNA-seq data sets, and a comparative study investigating the performance of missing value imputation for RNA-seq data is essential. In this paper, we propose the use of some parametric models, such as regression imputation, a Bayesian generalized linear model, a Poisson mixture model, an EM approach, Bayesian Poisson regression, Bayesian quasi-Poisson regression and the bootstrap versions of the latter two, for single imputation of missing values in RNA-seq count data sets. The approaches are also applied to identifying differentially expressed genes in the presence of missing values. Multiple imputation, proposed by Rubin (1978), is also used for imputing missing RNA-seq counts. This approach allows appropriate assessment of imputation uncertainty for missing values. The performance of the single and multiple imputation approaches is investigated using simulation studies. Also, some real data sets are analyzed using the proposed approaches.
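
The abstract states that multiple imputation allows the imputation uncertainty to be assessed. As a rough illustration of how this is commonly done, the sketch below pools the estimates from m imputed data sets using Rubin's combining rules: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name `pool_rubin` and the input numbers are hypothetical and are not taken from the paper. The imputation models listed in the abstract are not implemented here.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances (Rubin's rules).

    `estimates` holds one point estimate per imputed data set and
    `variances` holds the matching squared standard errors.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total


# Hypothetical values for illustration only (not results from the paper).
estimates = [1.0, 2.0, 3.0]
variances = [0.5, 0.5, 0.5]
q_bar, total = pool_rubin(estimates, variances)
```

The between-imputation term is what single imputation cannot supply: with only one completed data set, the spread across imputations is unobserved, so the reported variance would understate the uncertainty due to the missing counts.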
