Automatic Peak Selection by a Benjamini-Hochberg-Based Algorithm

A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted according to certain confidence scores. A key problem is then determining how many predictions must be selected to include most of the true predictions while maintaining reasonably high...

Bibliographic Details
Main authors: Abbas, Ahmed, Kong, Xin-Bing, Liu, Zhi, Jing, Bing-Yi, Gao, Xin
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3538655/
https://www.ncbi.nlm.nih.gov/pubmed/23308147
http://dx.doi.org/10.1371/journal.pone.0053112
_version_ 1782254986037559296
author Abbas, Ahmed
Kong, Xin-Bing
Liu, Zhi
Jing, Bing-Yi
Gao, Xin
author_facet Abbas, Ahmed
Kong, Xin-Bing
Liu, Zhi
Jing, Bing-Yi
Gao, Xin
author_sort Abbas, Ahmed
collection PubMed
description A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted according to certain confidence scores. A key problem is then determining how many predictions must be selected to include most of the true predictions while maintaining reasonably high precision. In nuclear magnetic resonance (NMR)-based protein structure determination, for instance, computational peak picking methods are becoming more and more common, although expert knowledge remains the method of choice to determine how many peaks among thousands of candidate peaks should be taken into consideration to capture the true peaks. Here, we propose a Benjamini-Hochberg (B-H)-based approach that automatically selects the number of peaks. We formulate the peak selection problem as a multiple testing problem. Given a candidate peak list sorted by either volumes or intensities, we first convert the peaks into p-values and then apply the B-H-based algorithm to automatically select the number of peaks. The proposed approach is tested on the state-of-the-art peak picking methods, including WaVPeak [1] and PICKY [2]. Compared with the traditional fixed number-based approach, our approach returns significantly more true peaks. For instance, by combining WaVPeak or PICKY with the proposed method, the missing peak rates are on average reduced by 20% and 26%, respectively, in a benchmark set of 32 spectra extracted from eight proteins. The consensus of the B-H-selected peaks from both WaVPeak and PICKY achieves 88% recall and 83% precision, which significantly outperforms each individual method and the consensus method without using the B-H algorithm. The proposed method can be used as a standard procedure for any peak picking method and straightforwardly applied to some other prediction selection problems in bioinformatics.
The source code, documentation and example data of the proposed method are available at http://sfb.kaust.edu.sa/pages/software.aspx.
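The selection step described in the abstract can be illustrated with the standard Benjamini-Hochberg step-up rule. The sketch below is not the authors' implementation: the abstract does not say how peak volumes or intensities are converted into p-values, nor how the "B-H-based" variant differs from the textbook procedure, so the code assumes the p-values are already given. The function names `benjamini_hochberg_count` and `select_peaks` and the default `alpha = 0.05` are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def benjamini_hochberg_count(p_values: Sequence[float], alpha: float = 0.05) -> int:
    """Return k, the number of smallest p-values kept by the standard B-H rule.

    k is the largest rank i (1-based, after sorting ascending) such that
    p_(i) <= alpha * i / m, where m is the number of candidates.
    Returns 0 when no rank satisfies the condition.
    """
    m = len(p_values)
    k = 0
    for i, p in enumerate(sorted(p_values), start=1):
        if p <= alpha * i / m:
            k = i  # step-up: keep the LARGEST rank that passes
    return k


def select_peaks(peaks: Sequence[str], p_values: Sequence[float],
                 alpha: float = 0.05) -> List[str]:
    """Return the peaks whose p-values fall within the B-H cutoff.

    Assumption: p_values[j] is the (already computed) p-value of peaks[j].
    """
    k = benjamini_hochberg_count(p_values, alpha)
    order = sorted(range(len(p_values)), key=lambda j: p_values[j])
    return [peaks[j] for j in order[:k]]


if __name__ == "__main__":
    # Made-up labels and p-values, for illustration only.
    labels = ["peak_a", "peak_b", "peak_c", "peak_d"]
    pvals = [0.20, 0.001, 0.60, 0.012]
    print(select_peaks(labels, pvals, alpha=0.05))
```

Because the rule keeps the largest passing rank rather than stopping at the first failure, the number of selected peaks adapts to the p-value distribution of each spectrum instead of being fixed in advance.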
format Online
Article
Text
id pubmed-3538655
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3538655 2013-01-10 Automatic Peak Selection by a Benjamini-Hochberg-Based Algorithm Abbas, Ahmed Kong, Xin-Bing Liu, Zhi Jing, Bing-Yi Gao, Xin PLoS One Research Article A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted according to certain confidence scores. A key problem is then determining how many predictions must be selected to include most of the true predictions while maintaining reasonably high precision. In nuclear magnetic resonance (NMR)-based protein structure determination, for instance, computational peak picking methods are becoming more and more common, although expert knowledge remains the method of choice to determine how many peaks among thousands of candidate peaks should be taken into consideration to capture the true peaks. Here, we propose a Benjamini-Hochberg (B-H)-based approach that automatically selects the number of peaks. We formulate the peak selection problem as a multiple testing problem. Given a candidate peak list sorted by either volumes or intensities, we first convert the peaks into p-values and then apply the B-H-based algorithm to automatically select the number of peaks. The proposed approach is tested on the state-of-the-art peak picking methods, including WaVPeak [1] and PICKY [2]. Compared with the traditional fixed number-based approach, our approach returns significantly more true peaks. For instance, by combining WaVPeak or PICKY with the proposed method, the missing peak rates are on average reduced by 20% and 26%, respectively, in a benchmark set of 32 spectra extracted from eight proteins. The consensus of the B-H-selected peaks from both WaVPeak and PICKY achieves 88% recall and 83% precision, which significantly outperforms each individual method and the consensus method without using the B-H algorithm.
The proposed method can be used as a standard procedure for any peak picking method and straightforwardly applied to some other prediction selection problems in bioinformatics. The source code, documentation and example data of the proposed method are available at http://sfb.kaust.edu.sa/pages/software.aspx. Public Library of Science 2013-01-07 /pmc/articles/PMC3538655/ /pubmed/23308147 http://dx.doi.org/10.1371/journal.pone.0053112 Text en © 2013 Abbas et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Abbas, Ahmed
Kong, Xin-Bing
Liu, Zhi
Jing, Bing-Yi
Gao, Xin
Automatic Peak Selection by a Benjamini-Hochberg-Based Algorithm
title Automatic Peak Selection by a Benjamini-Hochberg-Based Algorithm
title_full Automatic Peak Selection by a Benjamini-Hochberg-Based Algorithm
title_fullStr Automatic Peak Selection by a Benjamini-Hochberg-Based Algorithm
title_full_unstemmed Automatic Peak Selection by a Benjamini-Hochberg-Based Algorithm
title_short Automatic Peak Selection by a Benjamini-Hochberg-Based Algorithm
title_sort automatic peak selection by a benjamini-hochberg-based algorithm
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3538655/
https://www.ncbi.nlm.nih.gov/pubmed/23308147
http://dx.doi.org/10.1371/journal.pone.0053112
work_keys_str_mv AT abbasahmed automaticpeakselectionbyabenjaminihochbergbasedalgorithm
AT kongxinbing automaticpeakselectionbyabenjaminihochbergbasedalgorithm
AT liuzhi automaticpeakselectionbyabenjaminihochbergbasedalgorithm
AT jingbingyi automaticpeakselectionbyabenjaminihochbergbasedalgorithm
AT gaoxin automaticpeakselectionbyabenjaminihochbergbasedalgorithm