DGINN, an automated and highly-flexible pipeline for the detection of genetic innovations on protein-coding genes

Adaptive evolution has shaped major biological processes. Finding the protein-coding genes and the sites that have been subjected to adaptation during evolutionary time is a major endeavor. However, very few methods fully automate the identification of positively selected genes, and widespread sourc...

Bibliographic Details
Main Authors: Picard, Lea; Ganivet, Quentin; Allatif, Omran; Cimarelli, Andrea; Guéguen, Laurent; Etienne, Lucie
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7544217/
https://www.ncbi.nlm.nih.gov/pubmed/32941639
http://dx.doi.org/10.1093/nar/gkaa680
_version_ 1783591814922502144
author Picard, Lea
Ganivet, Quentin
Allatif, Omran
Cimarelli, Andrea
Guéguen, Laurent
Etienne, Lucie
author_facet Picard, Lea
Ganivet, Quentin
Allatif, Omran
Cimarelli, Andrea
Guéguen, Laurent
Etienne, Lucie
author_sort Picard, Lea
collection PubMed
description Adaptive evolution has shaped major biological processes. Finding the protein-coding genes and the sites that have been subjected to adaptation during evolutionary time is a major endeavor. However, very few methods fully automate the identification of positively selected genes, and widespread sources of genetic innovations such as gene duplication and recombination are absent from most pipelines. Here, we developed DGINN, a highly-flexible and public pipeline to Detect Genetic INNovations and adaptive evolution in protein-coding genes. DGINN automates, from a gene's sequence, all steps of the evolutionary analyses necessary to detect the aforementioned innovations, including the search for homologs in databases, assignation of orthology groups, identification of duplication and recombination events, as well as detection of positive selection using five methods to increase precision and ranking of genes when a large panel is analyzed. DGINN was validated on nineteen genes with previously-characterized evolutionary histories in primates, including some engaged in host-pathogen arms-races. Our results confirm and also expand results from the literature, including novel findings on the Guanylate-binding protein family, GBPs. This establishes DGINN as an efficient tool to automatically detect genetic innovations and adaptive evolution in diverse datasets, from the user's gene of interest to a large gene list in any species range.
format Online
Article
Text
id pubmed-7544217
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7544217
2020-10-15
DGINN, an automated and highly-flexible pipeline for the detection of genetic innovations on protein-coding genes
Picard, Lea
Ganivet, Quentin
Allatif, Omran
Cimarelli, Andrea
Guéguen, Laurent
Etienne, Lucie
Nucleic Acids Res
Methods Online
Adaptive evolution has shaped major biological processes. Finding the protein-coding genes and the sites that have been subjected to adaptation during evolutionary time is a major endeavor. However, very few methods fully automate the identification of positively selected genes, and widespread sources of genetic innovations such as gene duplication and recombination are absent from most pipelines. Here, we developed DGINN, a highly-flexible and public pipeline to Detect Genetic INNovations and adaptive evolution in protein-coding genes. DGINN automates, from a gene's sequence, all steps of the evolutionary analyses necessary to detect the aforementioned innovations, including the search for homologs in databases, assignation of orthology groups, identification of duplication and recombination events, as well as detection of positive selection using five methods to increase precision and ranking of genes when a large panel is analyzed. DGINN was validated on nineteen genes with previously-characterized evolutionary histories in primates, including some engaged in host-pathogen arms-races. Our results confirm and also expand results from the literature, including novel findings on the Guanylate-binding protein family, GBPs. This establishes DGINN as an efficient tool to automatically detect genetic innovations and adaptive evolution in diverse datasets, from the user's gene of interest to a large gene list in any species range.
Oxford University Press
2020-09-17
/pmc/articles/PMC7544217/
/pubmed/32941639
http://dx.doi.org/10.1093/nar/gkaa680
Text
en
© The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Methods Online
Picard, Lea
Ganivet, Quentin
Allatif, Omran
Cimarelli, Andrea
Guéguen, Laurent
Etienne, Lucie
DGINN, an automated and highly-flexible pipeline for the detection of genetic innovations on protein-coding genes
title DGINN, an automated and highly-flexible pipeline for the detection of genetic innovations on protein-coding genes
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7544217/
https://www.ncbi.nlm.nih.gov/pubmed/32941639
http://dx.doi.org/10.1093/nar/gkaa680