
DeepLRR: An Online Webserver for Leucine-Rich-Repeat Containing Protein Characterization Based on Deep Learning

Members of the leucine-rich repeat (LRR) superfamily play critical roles in multiple biological processes. As the LRR unit sequence is highly variable, accurately predicting the number and location of LRR units in proteins is a highly challenging task in the field of bioinformatics. Existing methods...


Bibliographic Details
Main Authors: Liu, Zhenya, Ren, Zirui, Yan, Lunyi, Li, Feng
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8796025/
https://www.ncbi.nlm.nih.gov/pubmed/35009139
http://dx.doi.org/10.3390/plants11010136
author Liu, Zhenya
Ren, Zirui
Yan, Lunyi
Li, Feng
author_sort Liu, Zhenya
collection PubMed
description Members of the leucine-rich repeat (LRR) superfamily play critical roles in multiple biological processes. As the LRR unit sequence is highly variable, accurately predicting the number and location of LRR units in proteins is a highly challenging task in the field of bioinformatics. Existing methods still need to be improved, especially when it comes to similarity-based methods. We introduce our DeepLRR method based on a convolutional neural network (CNN) model and LRR features to predict the number and location of LRR units in proteins. We compared DeepLRR with six existing methods using a dataset containing 572 LRR proteins and it outperformed all of them when it comes to overall F1 score. In addition, DeepLRR has integrated identifying plant disease-resistance proteins (NLR, LRR-RLK, LRR-RLP) and non-canonical domains. With DeepLRR, 223, 191 and 183 LRR-RLK genes in Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa ssp. Japonica) and tomato (Solanum lycopersicum) genomes were re-annotated, respectively. Chromosome mapping and gene cluster analysis revealed that 24.2% (54/223), 29.8% (57/191) and 16.9% (31/183) of LRR-RLK genes formed gene cluster structures in Arabidopsis, rice and tomato, respectively. Finally, we explored the evolutionary relationship and domain composition of LRR-RLK genes in each plant and distributions of known receptor and co-receptor pairs. This provides a new perspective for the identification of potential receptors and co-receptors.
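The cluster percentages quoted in the description follow directly from the gene counts given alongside them. As a minimal sketch (the variable and function names are illustrative assumptions, not part of DeepLRR or of this record), the arithmetic can be checked like this:

```python
# Counts quoted in the description:
# (LRR-RLK genes in gene clusters, total re-annotated LRR-RLK genes)
cluster_counts = {
    "Arabidopsis": (54, 223),
    "rice": (57, 191),
    "tomato": (31, 183),
}


def cluster_percentage(clustered: int, total: int) -> float:
    """Return the clustered share as a percentage rounded to one decimal place."""
    return round(100 * clustered / total, 1)


# Hypothetical helper mapping: species -> percentage of clustered genes
percentages = {
    species: cluster_percentage(clustered, total)
    for species, (clustered, total) in cluster_counts.items()
}

for species, pct in percentages.items():
    print(f"{species}: {pct}%")
```

The rounded values agree with the 24.2%, 29.8% and 16.9% stated in the description.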
format Online
Article
Text
id pubmed-8796025
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8796025 2022-01-29 Liu, Zhenya Ren, Zirui Yan, Lunyi Li, Feng Plants (Basel) Article MDPI 2022-01-04 /pmc/articles/PMC8796025/ /pubmed/35009139 http://dx.doi.org/10.3390/plants11010136 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title DeepLRR: An Online Webserver for Leucine-Rich-Repeat Containing Protein Characterization Based on Deep Learning
topic Article