THE-DB: a threading model database for comparative protein structure analysis of the E. coli K12 and human proteomes

Bibliographic Details
Main Authors: Diamond, Justin S; Zhang, Yang
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6146127/
https://www.ncbi.nlm.nih.gov/pubmed/30239678
http://dx.doi.org/10.1093/database/bay090
author Diamond, Justin S
Zhang, Yang
collection PubMed
description New methodology must be developed to characterize the growing number of amino acid sequences, which vastly exceeds the number of experimentally determined protein structures. Homologous proteins can be used as structural templates for modeling proteins that do not have experimentally determined structures. However, in many cases no close homolog with a determined structure is available (the candidates typically share <30% sequence identity with the query), so the query sequence cannot be reliably modeled from it. Protein threading aims to use features such as secondary structure, solvent accessibility and torsion angles, in addition to sequence patterns, to identify structural templates in the Protein Data Bank that can assist full-length, atomic-level structural modeling. Even so, there are still numerous protein sequences for which correct templates cannot be recognized. This raises the question of which attributes allow query sequences to be matched to correct but distantly homologous templates. To aid the investigation of this question and to provide genome-scale protein structure models for the biological community, a database called THE-DB (threading hard and easy protein database) has been developed. It makes it possible to analyze over 15 000 query sequences from the Escherichia coli (E. coli) K12 and human proteomes and to find their three-dimensional templates as identified by state-of-the-art threading algorithms, an analysis that is not feasible with existing protein template databases. The E. coli K12 and human data can be downloaded in bulk from the THE-DB page.
format Online
Article
Text
id pubmed-6146127
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6146127 2018-09-25 THE-DB: a threading model database for comparative protein structure analysis of the E. coli K12 and human proteomes. Diamond, Justin S; Zhang, Yang. Database (Oxford). Original Article. Oxford University Press 2018-09-18 /pmc/articles/PMC6146127/ /pubmed/30239678 http://dx.doi.org/10.1093/database/bay090 Text en © The Author(s) 2018. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title THE-DB: a threading model database for comparative protein structure analysis of the E. coli K12 and human proteomes
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6146127/
https://www.ncbi.nlm.nih.gov/pubmed/30239678
http://dx.doi.org/10.1093/database/bay090