
Coverage of whole proteome by structural genomics observed through protein homology modeling database

We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains model...


Bibliographic Details
Main authors: Yura, Kei, Yamaguchi, Akihiro, Go, Mitiko
Format: Text
Language: English
Published: Kluwer Academic Publishers 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1769342/
https://www.ncbi.nlm.nih.gov/pubmed/17146617
http://dx.doi.org/10.1007/s10969-006-9010-3
_version_ 1782131673242009600
author Yura, Kei
Yamaguchi, Akihiro
Go, Mitiko
author_facet Yura, Kei
Yamaguchi, Akihiro
Go, Mitiko
author_sort Yura, Kei
collection PubMed
description We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from the genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total, and the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers the whole of an ORF product are rare. When the portion of an ORF with a 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structures are calculated, the fraction of modeled 3D structures of soluble proteins has increased by 5% for archaebacteria and by 7% for eubacteria in the last 3 years. Assuming that this rate is maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble protein model structures of prokaryotes, excluding the putative disordered regions, will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structures of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting the spatial arrangements of structural domains in an ORF will then be an upcoming issue of structural genomics.
format Text
id pubmed-1769342
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Kluwer Academic Publishers
record_format MEDLINE/PubMed
spelling pubmed-1769342 2007-01-12 Coverage of whole proteome by structural genomics observed through protein homology modeling database Yura, Kei Yamaguchi, Akihiro Go, Mitiko J Struct Funct Genomics Original Paper We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from the genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total, and the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers the whole of an ORF product are rare. When the portion of an ORF with a 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structures are calculated, the fraction of modeled 3D structures of soluble proteins has increased by 5% for archaebacteria and by 7% for eubacteria in the last 3 years. Assuming that this rate is maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble protein model structures of prokaryotes, excluding the putative disordered regions, will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structures of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting the spatial arrangements of structural domains in an ORF will then be an upcoming issue of structural genomics. Kluwer Academic Publishers 2006-12-05 2006-06 /pmc/articles/PMC1769342/ /pubmed/17146617 http://dx.doi.org/10.1007/s10969-006-9010-3 Text en © Springer Science+Business Media B.V. 2006
spellingShingle Original Paper
Yura, Kei
Yamaguchi, Akihiro
Go, Mitiko
Coverage of whole proteome by structural genomics observed through protein homology modeling database
title Coverage of whole proteome by structural genomics observed through protein homology modeling database
title_full Coverage of whole proteome by structural genomics observed through protein homology modeling database
title_fullStr Coverage of whole proteome by structural genomics observed through protein homology modeling database
title_full_unstemmed Coverage of whole proteome by structural genomics observed through protein homology modeling database
title_short Coverage of whole proteome by structural genomics observed through protein homology modeling database
title_sort coverage of whole proteome by structural genomics observed through protein homology modeling database
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1769342/
https://www.ncbi.nlm.nih.gov/pubmed/17146617
http://dx.doi.org/10.1007/s10969-006-9010-3
work_keys_str_mv AT yurakei coverageofwholeproteomebystructuralgenomicsobservedthroughproteinhomologymodelingdatabase
AT yamaguchiakihiro coverageofwholeproteomebystructuralgenomicsobservedthroughproteinhomologymodelingdatabase
AT gomitiko coverageofwholeproteomebystructuralgenomicsobservedthroughproteinhomologymodelingdatabase
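
The headline figures in the abstract can be checked with simple arithmetic. The sketch below is not part of the record or of FAMSBASE; it only recomputes the genome total and the "approximately 50%" ORF coverage from the numbers quoted in the description. The names `GENOMES`, `TOTAL_ORFS`, `MODELED_ORFS` and `coverage` are illustrative assumptions.

```python
# Figures quoted in the abstract (FAMSBASE update based on structures
# released by November 2003). Variable and function names are hypothetical.
GENOMES = {
    "archaebacterial": 17,
    "eubacterial": 130,
    "eukaryotic": 18,
    "phage": 111,
}
TOTAL_ORFS = 734_193    # ORFs predicted across all genomes
MODELED_ORFS = 368_724  # ORFs with at least one modeled 3D structure


def coverage(modeled: int, total: int) -> float:
    """Return the fraction of ORFs that have a modeled 3D structure."""
    if total <= 0:
        raise ValueError("total must be positive")
    return modeled / total


total_genomes = sum(GENOMES.values())
orf_coverage = coverage(MODELED_ORFS, TOTAL_ORFS)

print(f"genomes: {total_genomes}")
print(f"ORF coverage: {orf_coverage:.1%}")
```

The per-kingdom percentages and the 15-year and 25-year projections are not recomputed here, because the abstract does not give the underlying per-kingdom counts.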