
An assessment of the amount of untapped fold level novelty in under-sampled areas of the tree of life

Previous studies of protein fold space suggest that fold coverage is plateauing. However, sequence sampling has been -and remains to a large extent- heavily biased, focusing on culturable phyla. Sustained technological developments have fuelled the advent of metagenomics and single-cell sequencing,...


Bibliographic Details
Main authors: Barry Roche, Daniel; Brüls, Thomas
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4592975/
https://www.ncbi.nlm.nih.gov/pubmed/26434770
http://dx.doi.org/10.1038/srep14717
author Barry Roche, Daniel
Brüls, Thomas
collection PubMed
description Previous studies of protein fold space suggest that fold coverage is plateauing. However, sequence sampling has been -and remains to a large extent- heavily biased, focusing on culturable phyla. Sustained technological developments have fuelled the advent of metagenomics and single-cell sequencing, which might correct the current sequencing bias. The extent to which these efforts affect structural diversity remains unclear, although preliminary results suggest that uncultured organisms could constitute a source of new folds. We investigate to what extent genomes from uncultured and under-sampled phyla accessed through single cell sequencing, metagenomics and high-throughput culturing efforts have the potential to increase protein fold space, and conclude that i) genomes from under-sampled phyla appear enriched in sequences not covered by current protein family and fold profile libraries, ii) this enrichment is linked to an excess of short (and possibly partly spurious) sequences in some of the datasets, iii) the discovery rate of novel folds among sequences uncovered by current fold and family profile libraries may be as high as 36%, but would ultimately translate into a marginal increase in global discovery of novel folds. Thus, genomes from under-sampled phyla should have a rather limited impact on increasing coarse grained tertiary structure level novelty.
format Online
Article
Text
id pubmed-4592975
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4592975 2015-10-19 An assessment of the amount of untapped fold level novelty in under-sampled areas of the tree of life Barry Roche, Daniel; Brüls, Thomas Sci Rep Article Previous studies of protein fold space suggest that fold coverage is plateauing. However, sequence sampling has been -and remains to a large extent- heavily biased, focusing on culturable phyla. Sustained technological developments have fuelled the advent of metagenomics and single-cell sequencing, which might correct the current sequencing bias. The extent to which these efforts affect structural diversity remains unclear, although preliminary results suggest that uncultured organisms could constitute a source of new folds. We investigate to what extent genomes from uncultured and under-sampled phyla accessed through single cell sequencing, metagenomics and high-throughput culturing efforts have the potential to increase protein fold space, and conclude that i) genomes from under-sampled phyla appear enriched in sequences not covered by current protein family and fold profile libraries, ii) this enrichment is linked to an excess of short (and possibly partly spurious) sequences in some of the datasets, iii) the discovery rate of novel folds among sequences uncovered by current fold and family profile libraries may be as high as 36%, but would ultimately translate into a marginal increase in global discovery of novel folds. Thus, genomes from under-sampled phyla should have a rather limited impact on increasing coarse grained tertiary structure level novelty. Nature Publishing Group 2015-10-05 /pmc/articles/PMC4592975/ /pubmed/26434770 http://dx.doi.org/10.1038/srep14717 Text en Copyright © 2015, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title An assessment of the amount of untapped fold level novelty in under-sampled areas of the tree of life
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4592975/
https://www.ncbi.nlm.nih.gov/pubmed/26434770
http://dx.doi.org/10.1038/srep14717