ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data

BACKGROUND: Data from discovery proteomic and phosphoproteomic experiments typically include missing values that correspond to proteins that have not been identified in the analyzed sample. Replacing the missing values with random numbers, a process known as “imputation”, avoids apparent infinite fo...

Bibliographic Details
Main Authors: Medo, Matúš, Aebersold, Daniel M., Medová, Michaela
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6842221/
https://www.ncbi.nlm.nih.gov/pubmed/31706265
http://dx.doi.org/10.1186/s12859-019-3144-3
author Medo, Matúš
Aebersold, Daniel M.
Medová, Michaela
collection PubMed
description BACKGROUND: Data from discovery proteomic and phosphoproteomic experiments typically include missing values that correspond to proteins that have not been identified in the analyzed sample. Replacing the missing values with random numbers, a process known as “imputation”, avoids apparent infinite fold-change values. However, the procedure comes at a cost: Imputing a large number of missing values has the potential to significantly impact the results of the subsequent differential expression analysis. RESULTS: We propose a method that identifies differentially expressed proteins by ranking their observed changes with respect to the changes observed for other proteins. Missing values are taken into account by this method directly, without the need to impute them. We illustrate the performance of the new method on two distinct datasets and show that it is robust to missing values and, at the same time, provides results that are otherwise similar to those obtained with edgeR, which is a state-of-the-art differential expression analysis method. CONCLUSIONS: The new method for the differential expression analysis of proteomic data is available as an easy-to-use Python package.
format Online
Article
Text
id pubmed-6842221
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6842221 2019-11-14 ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data. Medo, Matúš; Aebersold, Daniel M.; Medová, Michaela. BMC Bioinformatics. Methodology Article. BioMed Central 2019-11-09 /pmc/articles/PMC6842221/ /pubmed/31706265 http://dx.doi.org/10.1186/s12859-019-3144-3 Text en © Medo et al. 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6842221/
https://www.ncbi.nlm.nih.gov/pubmed/31706265
http://dx.doi.org/10.1186/s12859-019-3144-3