ProteinUnet—An efficient alternative to SPIDER3‐single for sequence‐based prediction of protein secondary structures


Bibliographic Details
Main Authors:	Kotowski, Krzysztof; Smolarczyk, Tomasz; Roterman‐Konieczna, Irena; Stapor, Katarzyna
Format:	Online Article Text
Language:	English
Published:	John Wiley & Sons, Inc. 2020
Subjects:	Full Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7756333/ https://www.ncbi.nlm.nih.gov/pubmed/33058261 http://dx.doi.org/10.1002/jcc.26432

collection	PubMed
description	Predicting protein function and structure from sequence remains an unsolved problem in bioinformatics. The best‐performing methods rely heavily on evolutionary information from multiple sequence alignments, so their accuracy deteriorates for sequences with few homologs and, as sequence databases grow, they require long computation times. Here, a single‐sequence‐based prediction method called ProteinUnet is presented, which leverages a U‐Net convolutional network architecture. It is compared to the SPIDER3‐Single model, which is based on a long short‐term memory bidirectional recurrent neural network architecture. Both methods achieve similar results for the prediction of secondary structures (both three‐ and eight‐state), half‐sphere exposure, and contact number, but ProteinUnet has half as many parameters and a 17 times shorter inference time, and can be trained 11 times faster. Moreover, ProteinUnet tends to perform better for short sequences and for residues with a low number of local contacts. Additionally, loss weighting is presented as an effective way of increasing accuracy for rare secondary structures.
id	pubmed-7756333
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-7756333 2020-12-28 J Comput Chem Full Papers John Wiley & Sons, Inc. 2020-10-15 2021-01-05 /pmc/articles/PMC7756333/ /pubmed/33058261 http://dx.doi.org/10.1002/jcc.26432 Text en © 2020 The Authors. Journal of Computational Chemistry published by Wiley Periodicals LLC. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title	ProteinUnet—An efficient alternative to SPIDER3‐single for sequence‐based prediction of protein secondary structures

Similar Items