
DeepSec: a deep learning framework for secreted protein discovery in human body fluids

MOTIVATION: Human proteins that are secreted into different body fluids from various cells and tissues can be promising disease indicators. Modern proteomics research empowered by both qualitative and quantitative profiling techniques has made great progress in protein discovery in various human flu...

Bibliographic Details
Main Authors: Shao, Dan; Huang, Lan; Wang, Yan; He, Kai; Cui, Xueteng; Wang, Yao; Ma, Qin; Cui, Juan
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8696095/
https://www.ncbi.nlm.nih.gov/pubmed/34398224
http://dx.doi.org/10.1093/bioinformatics/btab545
author Shao, Dan
Huang, Lan
Wang, Yan
He, Kai
Cui, Xueteng
Wang, Yao
Ma, Qin
Cui, Juan
collection PubMed
description MOTIVATION: Human proteins that are secreted into different body fluids from various cells and tissues can be promising disease indicators. Modern proteomics research, empowered by both qualitative and quantitative profiling techniques, has made great progress in protein discovery in various human fluids. However, due to the large number of proteins and diverse modifications present in the fluids, as well as the existing technical limits of major proteomics platforms (e.g. mass spectrometry), large discrepancies often arise between experimental studies. As a result, a comprehensive proteomics landscape across major human fluids is not well determined. RESULTS: To bridge this gap, we have developed a deep learning framework, named DeepSec, to identify secreted proteins in 12 types of human body fluids. DeepSec adopts an end-to-end sequence-based approach, in which a Convolutional Neural Network is built to learn abstract sequence features, followed by a Bidirectional Gated Recurrent Unit with a fully connected layer for protein classification. DeepSec has demonstrated promising performance, with average areas under the ROC curve of 0.85–0.94 on the testing datasets for each type of fluid, outperforming existing state-of-the-art methods, which are mostly available for blood proteins. As an illustration of how to apply DeepSec in biomarker discovery research, we conducted a case study on kidney cancer using genomics data from The Cancer Genome Atlas and identified 104 possible marker proteins. AVAILABILITY: DeepSec is available at https://bmbl.bmi.osumc.edu/deepsec/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8696095
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8696095 2022-01-04 DeepSec: a deep learning framework for secreted protein discovery in human body fluids Shao, Dan Huang, Lan Wang, Yan He, Kai Cui, Xueteng Wang, Yao Ma, Qin Cui, Juan Bioinformatics Original Papers
Oxford University Press 2021-08-16 /pmc/articles/PMC8696095/ /pubmed/34398224 http://dx.doi.org/10.1093/bioinformatics/btab545 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title DeepSec: a deep learning framework for secreted protein discovery in human body fluids
topic Original Papers