Using metagenomic data to boost protein structure prediction and discovery

Over the past decade, metagenomic sequencing approaches have been providing an ever-increasing amount of protein sequence data at an astonishing rate. These constitute an invaluable source of information which has been exploited in various research fields such as the study of the role of the gut mic...

Bibliographic Details
Main authors: Hou, Qingzhen, Pucci, Fabrizio, Pan, Fengming, Xue, Fuzhong, Rooman, Marianne, Feng, Qiang
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8760478/
https://www.ncbi.nlm.nih.gov/pubmed/35070166
http://dx.doi.org/10.1016/j.csbj.2021.12.030
_version_ 1784633328764190720
author Hou, Qingzhen
Pucci, Fabrizio
Pan, Fengming
Xue, Fuzhong
Rooman, Marianne
Feng, Qiang
collection PubMed
description Over the past decade, metagenomic sequencing approaches have been providing an ever-increasing amount of protein sequence data at an astonishing rate. These constitute an invaluable source of information which has been exploited in various research fields such as the study of the role of the gut microbiota in human diseases and aging. However, only a small fraction of all metagenomic sequences collected have been functionally or structurally characterized, leaving much of them completely unexplored. Here, we review how this information has been used in protein structure prediction and protein discovery. We begin by presenting some widely used metagenomic databases and analyze in detail how metagenomic data has contributed to the impressive improvement in the accuracy of structure prediction methods in recent years. We then examine how metagenomic information can be exploited to annotate protein sequences. More specifically, we focus on the role of metagenomes in the discovery of enzymes and new CRISPR-Cas systems, and in the identification of antibiotic resistance genes. With this review, we provide an overview of how metagenomic data is currently revolutionizing our understanding of protein science.
format Online
Article
Text
id pubmed-8760478
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-8760478 2022-01-21 Using metagenomic data to boost protein structure prediction and discovery. Hou, Qingzhen; Pucci, Fabrizio; Pan, Fengming; Xue, Fuzhong; Rooman, Marianne; Feng, Qiang. Comput Struct Biotechnol J. Mini Review. Research Network of Computational and Structural Biotechnology, 2022-01-03. /pmc/articles/PMC8760478/ /pubmed/35070166 http://dx.doi.org/10.1016/j.csbj.2021.12.030 Text en. © 2021 The Author(s). https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Using metagenomic data to boost protein structure prediction and discovery
topic Mini Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8760478/
https://www.ncbi.nlm.nih.gov/pubmed/35070166
http://dx.doi.org/10.1016/j.csbj.2021.12.030