ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data
BACKGROUND: Data from discovery proteomic and phosphoproteomic experiments typically include missing values that correspond to proteins that have not been identified in the analyzed sample. Replacing the missing values with random numbers, a process known as “imputation”, avoids apparent infinite fo...
Main authors: | Medo, Matúš; Aebersold, Daniel M.; Medová, Michaela |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6842221/ https://www.ncbi.nlm.nih.gov/pubmed/31706265 http://dx.doi.org/10.1186/s12859-019-3144-3 |
_version_ | 1783468008173207552 |
---|---|
author | Medo, Matúš Aebersold, Daniel M. Medová, Michaela |
author_facet | Medo, Matúš Aebersold, Daniel M. Medová, Michaela |
author_sort | Medo, Matúš |
collection | PubMed |
description | BACKGROUND: Data from discovery proteomic and phosphoproteomic experiments typically include missing values that correspond to proteins that have not been identified in the analyzed sample. Replacing the missing values with random numbers, a process known as “imputation”, avoids apparent infinite fold-change values. However, the procedure comes at a cost: imputing a large number of missing values has the potential to significantly impact the results of the subsequent differential expression analysis. RESULTS: We propose a method that identifies differentially expressed proteins by ranking their observed changes with respect to the changes observed for other proteins. Missing values are taken into account by this method directly, without the need to impute them. We illustrate the performance of the new method on two distinct datasets and show that it is robust to missing values and, at the same time, provides results that are otherwise similar to those obtained with edgeR, a state-of-the-art differential expression analysis method. CONCLUSIONS: The new method for the differential expression analysis of proteomic data is available as an easy-to-use Python package. |
format | Online Article Text |
id | pubmed-6842221 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-68422212019-11-14 ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data Medo, Matúš Aebersold, Daniel M. Medová, Michaela BMC Bioinformatics Methodology Article BACKGROUND: Data from discovery proteomic and phosphoproteomic experiments typically include missing values that correspond to proteins that have not been identified in the analyzed sample. Replacing the missing values with random numbers, a process known as “imputation”, avoids apparent infinite fold-change values. However, the procedure comes at a cost: Imputing a large number of missing values has the potential to significantly impact the results of the subsequent differential expression analysis. RESULTS: We propose a method that identifies differentially expressed proteins by ranking their observed changes with respect to the changes observed for other proteins. Missing values are taken into account by this method directly, without the need to impute them. We illustrate the performance of the new method on two distinct datasets and show that it is robust to missing values and, at the same time, provides results that are otherwise similar to those obtained with edgeR which is a state-of-art differential expression analysis method. CONCLUSIONS: The new method for the differential expression analysis of proteomic data is available as an easy to use Python package. BioMed Central 2019-11-09 /pmc/articles/PMC6842221/ /pubmed/31706265 http://dx.doi.org/10.1186/s12859-019-3144-3 Text en © Medo et al. 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Medo, Matúš Aebersold, Daniel M. Medová, Michaela ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data |
title | ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data |
title_full | ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data |
title_fullStr | ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data |
title_full_unstemmed | ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data |
title_short | ProtRank: bypassing the imputation of missing values in differential expression analysis of proteomic data |
title_sort | protrank: bypassing the imputation of missing values in differential expression analysis of proteomic data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6842221/ https://www.ncbi.nlm.nih.gov/pubmed/31706265 http://dx.doi.org/10.1186/s12859-019-3144-3 |
work_keys_str_mv | AT medomatus protrankbypassingtheimputationofmissingvaluesindifferentialexpressionanalysisofproteomicdata AT aebersolddanielm protrankbypassingtheimputationofmissingvaluesindifferentialexpressionanalysisofproteomicdata AT medovamichaela protrankbypassingtheimputationofmissingvaluesindifferentialexpressionanalysisofproteomicdata |
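
The description field above says that missing values produce apparent infinite fold changes and that a rank-based comparison can take them into account without imputation. The following is a minimal sketch of that general idea only. It is not the ProtRank package, its API, or the authors' algorithm; the function name `rank_changes`, the sample data, and the rule of treating a one-sided missing value as an infinite change are all assumptions made for illustration.

```python
import math


def rank_changes(before, after):
    """Order proteins from most decreased to most increased.

    Hypothetical helper, not part of ProtRank. `before` and `after` map
    protein names to intensities, with None marking a missing value.
    """
    changes = {}
    for protein, b in before.items():
        a = after.get(protein)
        if b is None and a is None:
            # Missing in both samples: no information, so skip it
            # instead of inventing a value.
            continue
        if b is None:
            changes[protein] = math.inf    # detected only afterwards
        elif a is None:
            changes[protein] = -math.inf   # detected only beforehand
        else:
            changes[protein] = math.log2(a / b)  # ordinary log fold change
    # Sorting needs only the order of the changes, so infinite values
    # are harmless here and no random numbers have to be substituted.
    return sorted(changes, key=lambda p: changes[p])


# Made-up intensities for five proteins in two samples (assumed data).
before = {"A": 100.0, "B": 50.0, "C": None, "D": 80.0, "E": None}
after = {"A": 200.0, "B": 50.0, "C": 30.0, "D": None, "E": None}

ranking = rank_changes(before, after)
print(ranking)
```

In this toy setting, a protein that is missing in only one sample lands at an extreme of the ordering, and a protein missing in both samples is left out. How the published method scores ranks and assesses significance is described in the article itself and is not reproduced here.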