GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species
BACKGROUND: Many wild species have suffered drastic population size declines over the past centuries, which have led to ‘genomic erosion’ processes characterized by reduced genetic diversity, increased inbreeding, and accumulation of harmful mutations. Yet, genomic erosion estimates of modern-day po...
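Main authors: | Kutschera, Verena E.; Kierczak, Marcin; van der Valk, Tom; von Seth, Johanna; Dussex, Nicolas; Lord, Edana; Dehasque, Marianne; Stanton, David W. G.; Khoonsari, Payam Emami; Nystedt, Björn; Dalén, Love; Díez-del-Molino, David |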
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2022 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9195343/ https://www.ncbi.nlm.nih.gov/pubmed/35698034 http://dx.doi.org/10.1186/s12859-022-04757-0 |
_version_ | 1784726945635762176 |
---|---|
author | Kutschera, Verena E. Kierczak, Marcin van der Valk, Tom von Seth, Johanna Dussex, Nicolas Lord, Edana Dehasque, Marianne Stanton, David W. G. Khoonsari, Payam Emami Nystedt, Björn Dalén, Love Díez-del-Molino, David |
author_facet | Kutschera, Verena E. Kierczak, Marcin van der Valk, Tom von Seth, Johanna Dussex, Nicolas Lord, Edana Dehasque, Marianne Stanton, David W. G. Khoonsari, Payam Emami Nystedt, Björn Dalén, Love Díez-del-Molino, David |
author_sort | Kutschera, Verena E. |
collection | PubMed |
description | BACKGROUND: Many wild species have suffered drastic population size declines over the past centuries, which have led to ‘genomic erosion’ processes characterized by reduced genetic diversity, increased inbreeding, and accumulation of harmful mutations. Yet, genomic erosion estimates of modern-day populations often lack concordance with dwindling population sizes and conservation status of threatened species. One way to directly quantify the genomic consequences of population declines is to compare genome-wide data from pre-decline museum samples and modern samples. However, doing so requires computational data processing and analysis tools specifically adapted to comparative analyses of degraded, ancient or historical, DNA data with modern DNA data as well as personnel trained to perform such analyses. RESULTS: Here, we present a highly flexible, scalable, and modular pipeline to compare patterns of genomic erosion using samples from disparate time periods. The GenErode pipeline uses state-of-the-art bioinformatics tools to simultaneously process whole-genome re-sequencing data from ancient/historical and modern samples, and to produce comparable estimates of several genomic erosion indices. No programming knowledge is required to run the pipeline and all bioinformatic steps are well-documented, making the pipeline accessible to users with different backgrounds. GenErode is written in Snakemake and Python3 and uses Conda and Singularity containers to achieve reproducibility on high-performance compute clusters. The source code is freely available on GitHub (https://github.com/NBISweden/GenErode). CONCLUSIONS: GenErode is a user-friendly and reproducible pipeline that enables the standardization of genomic erosion indices from temporally sampled whole genome re-sequencing data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04757-0. |
format | Online Article Text |
id | pubmed-9195343 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9195343 2022-06-15 GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species Kutschera, Verena E. Kierczak, Marcin van der Valk, Tom von Seth, Johanna Dussex, Nicolas Lord, Edana Dehasque, Marianne Stanton, David W. G. Khoonsari, Payam Emami Nystedt, Björn Dalén, Love Díez-del-Molino, David BMC Bioinformatics Software BACKGROUND: Many wild species have suffered drastic population size declines over the past centuries, which have led to ‘genomic erosion’ processes characterized by reduced genetic diversity, increased inbreeding, and accumulation of harmful mutations. Yet, genomic erosion estimates of modern-day populations often lack concordance with dwindling population sizes and conservation status of threatened species. One way to directly quantify the genomic consequences of population declines is to compare genome-wide data from pre-decline museum samples and modern samples. However, doing so requires computational data processing and analysis tools specifically adapted to comparative analyses of degraded, ancient or historical, DNA data with modern DNA data as well as personnel trained to perform such analyses. RESULTS: Here, we present a highly flexible, scalable, and modular pipeline to compare patterns of genomic erosion using samples from disparate time periods. The GenErode pipeline uses state-of-the-art bioinformatics tools to simultaneously process whole-genome re-sequencing data from ancient/historical and modern samples, and to produce comparable estimates of several genomic erosion indices. No programming knowledge is required to run the pipeline and all bioinformatic steps are well-documented, making the pipeline accessible to users with different backgrounds. GenErode is written in Snakemake and Python3 and uses Conda and Singularity containers to achieve reproducibility on high-performance compute clusters. The source code is freely available on GitHub (https://github.com/NBISweden/GenErode). CONCLUSIONS: GenErode is a user-friendly and reproducible pipeline that enables the standardization of genomic erosion indices from temporally sampled whole genome re-sequencing data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04757-0. BioMed Central 2022-06-13 /pmc/articles/PMC9195343/ /pubmed/35698034 http://dx.doi.org/10.1186/s12859-022-04757-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software Kutschera, Verena E. Kierczak, Marcin van der Valk, Tom von Seth, Johanna Dussex, Nicolas Lord, Edana Dehasque, Marianne Stanton, David W. G. Khoonsari, Payam Emami Nystedt, Björn Dalén, Love Díez-del-Molino, David GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species |
title | GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species |
title_full | GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species |
title_fullStr | GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species |
title_full_unstemmed | GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species |
title_short | GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species |
title_sort | generode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9195343/ https://www.ncbi.nlm.nih.gov/pubmed/35698034 http://dx.doi.org/10.1186/s12859-022-04757-0 |
work_keys_str_mv | AT kutscheraverenae generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT kierczakmarcin generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT vandervalktom generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT vonsethjohanna generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT dussexnicolas generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT lordedana generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT dehasquemarianne generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT stantondavidwg generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT khoonsaripayamemami generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT nystedtbjorn generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT dalenlove generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies AT diezdelmolinodavid generodeabioinformaticspipelinetoinvestigategenomeerosioninendangeredandextinctspecies |