
Genomic Fishing and Data Processing for Molecular Evolution Research


Bibliographic Details
Main authors: Lorente-Martínez, Héctor, Agorreta, Ainhoa, San Mauro, Diego
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8938851/
https://www.ncbi.nlm.nih.gov/pubmed/35314663
http://dx.doi.org/10.3390/mps5020026
author Lorente-Martínez, Héctor
Agorreta, Ainhoa
San Mauro, Diego
collection PubMed
description Molecular evolution analyses, such as detection of adaptive/purifying selection or ancestral protein reconstruction, typically require three inputs for a target gene (or gene family) in a particular group of organisms: a sequence alignment, a model of evolution, and a phylogenetic tree. While modern advances in high-throughput sequencing techniques have led to the rapid accumulation of genomic-scale data in public repositories and databases, mining such a vast amount of information often remains challenging. Here, we describe a comprehensive, versatile workflow for preparing genome-extracted datasets that are ready for molecular evolution research. The workflow involves: (1) fishing (searching and capturing) specific gene sequences of interest from taxonomically diverse genomic data available in databases at variable levels of annotation, (2) processing and cleaning (depuration) of the retrieved sequences, (3) production of a multiple sequence alignment, (4) selection of the best-fit model of evolution, and (5) robust reconstruction of a phylogenetic tree.
format Online
Article
Text
id pubmed-8938851
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8938851 2022-03-23
Genomic Fishing and Data Processing for Molecular Evolution Research
Lorente-Martínez, Héctor; Agorreta, Ainhoa; San Mauro, Diego
Methods Protoc
Protocol
MDPI 2022-03-07
/pmc/articles/PMC8938851/ /pubmed/35314663 http://dx.doi.org/10.3390/mps5020026
Text en
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Genomic Fishing and Data Processing for Molecular Evolution Research
topic Protocol