Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations
Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem. We developed a protocol, C-I-TASSER, to integrate interresidue contact maps from deep neural-network learning with the cutting-edge I-TASSER fragment assembly simulat...
Main authors: | Zheng, Wei; Zhang, Chengxin; Li, Yang; Pearce, Robin; Bell, Eric W.; Zhang, Yang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8336924/ https://www.ncbi.nlm.nih.gov/pubmed/34355210 http://dx.doi.org/10.1016/j.crmeth.2021.100014 |
_version_ | 1783733403944747008 |
---|---|
author | Zheng, Wei Zhang, Chengxin Li, Yang Pearce, Robin Bell, Eric W. Zhang, Yang |
author_facet | Zheng, Wei Zhang, Chengxin Li, Yang Pearce, Robin Bell, Eric W. Zhang, Yang |
author_sort | Zheng, Wei |
collection | PubMed |
description | Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem. We developed a protocol, C-I-TASSER, to integrate interresidue contact maps from deep neural-network learning with the cutting-edge I-TASSER fragment assembly simulations. Large-scale benchmark tests showed that C-I-TASSER can fold more than twice the number of non-homologous proteins than the I-TASSER, which does not use contacts. When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB. Furthermore, it created correct folds for 85% of proteins in the SARS-CoV-2 genome, despite the quick mutation rate of the virus and sparse sequence profiles. The results demonstrated the critical importance of coupling whole-genome and metagenome-based evolutionary information with optimal structure assembly simulations for solving the problem of non-homologous protein structure prediction. |
format | Online Article Text |
id | pubmed-8336924 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-8336924 2021-08-04 Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations Zheng, Wei Zhang, Chengxin Li, Yang Pearce, Robin Bell, Eric W. Zhang, Yang Cell Rep Methods Article Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem. We developed a protocol, C-I-TASSER, to integrate interresidue contact maps from deep neural-network learning with the cutting-edge I-TASSER fragment assembly simulations. Large-scale benchmark tests showed that C-I-TASSER can fold more than twice the number of non-homologous proteins than the I-TASSER, which does not use contacts. When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB. Furthermore, it created correct folds for 85% of proteins in the SARS-CoV-2 genome, despite the quick mutation rate of the virus and sparse sequence profiles. The results demonstrated the critical importance of coupling whole-genome and metagenome-based evolutionary information with optimal structure assembly simulations for solving the problem of non-homologous protein structure prediction. Elsevier 2021-06-21 /pmc/articles/PMC8336924/ /pubmed/34355210 http://dx.doi.org/10.1016/j.crmeth.2021.100014 Text en © 2021 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Article Zheng, Wei Zhang, Chengxin Li, Yang Pearce, Robin Bell, Eric W. Zhang, Yang Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations |
title | Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8336924/ https://www.ncbi.nlm.nih.gov/pubmed/34355210 http://dx.doi.org/10.1016/j.crmeth.2021.100014 |