GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species

BACKGROUND: Many wild species have suffered drastic population size declines over the past centuries, which have led to ‘genomic erosion’ processes characterized by reduced genetic diversity, increased inbreeding, and accumulation of harmful mutations. Yet, genomic erosion estimates of modern-day po...

Bibliographic Details
Main authors: Kutschera, Verena E., Kierczak, Marcin, van der Valk, Tom, von Seth, Johanna, Dussex, Nicolas, Lord, Edana, Dehasque, Marianne, Stanton, David W. G., Khoonsari, Payam Emami, Nystedt, Björn, Dalén, Love, Díez-del-Molino, David
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9195343/
https://www.ncbi.nlm.nih.gov/pubmed/35698034
http://dx.doi.org/10.1186/s12859-022-04757-0
_version_ 1784726945635762176
author Kutschera, Verena E.
Kierczak, Marcin
van der Valk, Tom
von Seth, Johanna
Dussex, Nicolas
Lord, Edana
Dehasque, Marianne
Stanton, David W. G.
Khoonsari, Payam Emami
Nystedt, Björn
Dalén, Love
Díez-del-Molino, David
author_sort Kutschera, Verena E.
collection PubMed
description BACKGROUND: Many wild species have suffered drastic population size declines over the past centuries, which have led to ‘genomic erosion’ processes characterized by reduced genetic diversity, increased inbreeding, and accumulation of harmful mutations. Yet, genomic erosion estimates of modern-day populations often lack concordance with dwindling population sizes and conservation status of threatened species. One way to directly quantify the genomic consequences of population declines is to compare genome-wide data from pre-decline museum samples and modern samples. However, doing so requires computational data processing and analysis tools specifically adapted to comparative analyses of degraded, ancient or historical, DNA data with modern DNA data as well as personnel trained to perform such analyses. RESULTS: Here, we present a highly flexible, scalable, and modular pipeline to compare patterns of genomic erosion using samples from disparate time periods. The GenErode pipeline uses state-of-the-art bioinformatics tools to simultaneously process whole-genome re-sequencing data from ancient/historical and modern samples, and to produce comparable estimates of several genomic erosion indices. No programming knowledge is required to run the pipeline and all bioinformatic steps are well-documented, making the pipeline accessible to users with different backgrounds. GenErode is written in Snakemake and Python3 and uses Conda and Singularity containers to achieve reproducibility on high-performance compute clusters. The source code is freely available on GitHub (https://github.com/NBISweden/GenErode). CONCLUSIONS: GenErode is a user-friendly and reproducible pipeline that enables the standardization of genomic erosion indices from temporally sampled whole genome re-sequencing data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04757-0.
format Online
Article
Text
id pubmed-9195343
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9195343 2022-06-15 GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species BMC Bioinformatics Software BioMed Central 2022-06-13 /pmc/articles/PMC9195343/ /pubmed/35698034 http://dx.doi.org/10.1186/s12859-022-04757-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9195343/
https://www.ncbi.nlm.nih.gov/pubmed/35698034
http://dx.doi.org/10.1186/s12859-022-04757-0