
Outlier Detection using Projection Quantile Regression for Mass Spectrometry Data with Low Replication

BACKGROUND: Mass spectrometry (MS) data are often generated from various biological or chemical experiments and there may exist outlying observations, which are extreme due to technical reasons. The determination of outlying observations is important in the analysis of replicated MS data because ela...


Bibliographic Details
Main authors: Eo, Soo-Heang, Pak, Daewoo, Choi, Jeea, Cho, HyungJun
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3514222/
https://www.ncbi.nlm.nih.gov/pubmed/22587344
http://dx.doi.org/10.1186/1756-0500-5-236
author Eo, Soo-Heang
Pak, Daewoo
Choi, Jeea
Cho, HyungJun
collection PubMed
description BACKGROUND: Mass spectrometry (MS) data are often generated from various biological or chemical experiments and there may exist outlying observations, which are extreme due to technical reasons. The determination of outlying observations is important in the analysis of replicated MS data because elaborate pre-processing is essential for successful analysis with reliable results and manual outlier detection as one of pre-processing steps is time-consuming. The heterogeneity of variability and low replication are often obstacles to successful analysis, including outlier detection. Existing approaches, which assume constant variability, can generate many false positives (outliers) and/or false negatives (non-outliers). Thus, a more powerful and accurate approach is needed to account for the heterogeneity of variability and low replication. FINDINGS: We proposed an outlier detection algorithm using projection and quantile regression in MS data from multiple experiments. The performance of the algorithm and program was demonstrated by using both simulated and real-life data. The projection approach with linear, nonlinear, or nonparametric quantile regression was appropriate in heterogeneous high-throughput data with low replication. CONCLUSION: Various quantile regression approaches combined with projection were proposed for detecting outliers. The choice among linear, nonlinear, and nonparametric regressions is dependent on the degree of heterogeneity of the data. The proposed approach was illustrated with MS data with two or more replicates.
format Online
Article
Text
id pubmed-3514222
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3514222 2012-12-06 BMC Res Notes Technical Note BioMed Central 2012-05-15 /pmc/articles/PMC3514222/ /pubmed/22587344 http://dx.doi.org/10.1186/1756-0500-5-236 Text en
Copyright ©2012 Eo et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Outlier Detection using Projection Quantile Regression for Mass Spectrometry Data with Low Replication
topic Technical Note