Fuzzy-FishNET: a highly reproducible protein complex-based approach for feature selection in comparative proteomics

Bibliographic Details
Main author: Goh, Wilson Wen Bin
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260792/
https://www.ncbi.nlm.nih.gov/pubmed/28117654
http://dx.doi.org/10.1186/s12920-016-0228-z
author Goh, Wilson Wen Bin
collection PubMed
description BACKGROUND: The hypergeometric enrichment analysis approach typically fares poorly in feature-selection stability due to its upstream reliance on the t-test to generate differential protein lists before testing for enrichment on a protein complex, subnetwork or gene group. METHODS: Swapping the t-test in favour of a fuzzy rank-based weight system similar to that used in network-based methods like Quantitative Proteomics Signature Profiling (QPSP), Fuzzy SubNets (FSNET) and paired FSNET (PFSNET) produces dramatic improvements. RESULTS: This approach, Fuzzy-FishNET, exhibits high precision-recall over three sets of simulated data (with simulated protein complexes) while excelling in feature-selection reproducibility on real data (based on evaluation with real protein complexes). Overlap comparisons with PFSNET show that Fuzzy-FishNET selects the most significant complexes, which are also strongly class-discriminative. Cross-validation further demonstrates that Fuzzy-FishNET selects class-relevant protein complexes. CONCLUSIONS: Based on evaluation with simulated and real datasets, Fuzzy-FishNET is a significant upgrade of the traditional hypergeometric enrichment approach and a powerful new entrant amongst comparative proteomics analysis methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12920-016-0228-z) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5260792
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5260792 2017-01-30 Fuzzy-FishNET: a highly reproducible protein complex-based approach for feature selection in comparative proteomics Goh, Wilson Wen Bin BMC Med Genomics Research BioMed Central 2016-12-05 /pmc/articles/PMC5260792/ /pubmed/28117654 http://dx.doi.org/10.1186/s12920-016-0228-z Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Fuzzy-FishNET: a highly reproducible protein complex-based approach for feature selection in comparative proteomics
topic Research