TransDFL: Identification of Disordered Flexible Linkers in Proteins by Transfer Learning

Disordered flexible linkers (DFLs) are the functional disordered regions in proteins, which are the sub-regions of intrinsically disordered regions (IDRs) and play important roles in connecting domains and maintaining inter-domain interactions. Trained with the limited available DFLs, the existing D...

Bibliographic Details
Main Authors: Pang, Yihe; Liu, Bin
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects: Method
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10626177/
https://www.ncbi.nlm.nih.gov/pubmed/36272675
http://dx.doi.org/10.1016/j.gpb.2022.10.004
author Pang, Yihe
Liu, Bin
collection PubMed
description Disordered flexible linkers (DFLs) are functional disordered regions in proteins. They are sub-regions of intrinsically disordered regions (IDRs) and play important roles in connecting domains and maintaining inter-domain interactions. Because they are trained on the limited number of available DFLs, existing machine learning-based DFL predictors tend to predict ordered residues as DFLs, leading to a high false-positive rate (FPR) and low prediction accuracy. Previous studies have shown that DFLs are extremely flexible disordered regions, which an IDR predictor usually predicts as disordered residues with high confidence [P(D) > 0.9]. Therefore, transferring an IDR predictor into an accurate DFL predictor is of great significance for understanding the functions of IDRs. In this study, we proposed a new predictor called TransDFL, which identifies DFLs by transferring the RFPR-IDP predictor for IDR identification to DFL prediction. RFPR-IDP was pre-trained with IDR sequences to learn the general features shared by IDRs and DFLs, which helps to reduce false positives in ordered regions. RFPR-IDP was then fine-tuned with DFL sequences to capture the specific features of DFLs, yielding TransDFL. Experimental results in two application scenarios (prediction of DFLs only in IDRs, and prediction of DFLs in entire proteins) showed that TransDFL consistently outperformed other existing DFL predictors, with higher accuracy. The corresponding web server of TransDFL can be freely accessed at http://bliulab.net/TransDFL/.
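The two quantities the description relies on, the false-positive rate and the high-confidence disorder threshold P(D) > 0.9, can be illustrated with a minimal sketch. This is not the TransDFL method (which uses transfer learning from RFPR-IDP); it only shows how a per-residue FPR is computed and how discarding DFL calls outside high-confidence disordered residues would lower it. The function names and the toy labels and probabilities are hypothetical, chosen for illustration only.

```python
from typing import List, Sequence


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Per-residue FPR = FP / (FP + TN), with 1 = DFL and 0 = non-DFL."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


def keep_high_confidence_disorder(
    dfl_pred: Sequence[int],
    p_disorder: Sequence[float],
    threshold: float = 0.9,
) -> List[int]:
    """Keep a DFL call only where the disorder probability P(D) exceeds the threshold."""
    return [int(d == 1 and p > threshold) for d, p in zip(dfl_pred, p_disorder)]


# Toy data (hypothetical): true DFL labels, raw DFL calls, and P(D) per residue.
y_true = [0, 0, 0, 0, 1, 1, 0, 0]
raw_pred = [1, 0, 0, 1, 1, 1, 0, 0]
p_dis = [0.20, 0.10, 0.30, 0.95, 0.97, 0.99, 0.40, 0.10]

fpr_raw = false_positive_rate(y_true, raw_pred)
filtered_pred = keep_high_confidence_disorder(raw_pred, p_dis)
fpr_filtered = false_positive_rate(y_true, filtered_pred)
```

In this toy case the raw calls include two false positives among six non-DFL residues; the one in a low-P(D) (ordered) position is discarded by the threshold, while the one in a high-P(D) position remains.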
format Online
Article
Text
id pubmed-10626177
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10626177 2023-11-07 TransDFL: Identification of Disordered Flexible Linkers in Proteins by Transfer Learning Pang, Yihe Liu, Bin Genomics Proteomics Bioinformatics Method
Elsevier 2023-04 2022-10-19 /pmc/articles/PMC10626177/ /pubmed/36272675 http://dx.doi.org/10.1016/j.gpb.2022.10.004 Text en © 2023 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title TransDFL: Identification of Disordered Flexible Linkers in Proteins by Transfer Learning
topic Method