DiscoSnp-RAD: de novo detection of small variants for RAD-Seq population genomics
Restriction site Associated DNA Sequencing (RAD-Seq) is a technique that sequences specific loci along the genome. It is widely employed in evolutionary biology because it makes it possible to exploit variant information (mainly single nucleotide polymorphisms, SNPs) from entire...
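Main authors: | Gauthier, Jérémy; Mouden, Charlotte; Suchan, Tomasz; Alvarez, Nadir; Arrigo, Nils; Riou, Chloé; Lemaitre, Claire; Peterlongo, Pierre |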
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2020 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7293188/ https://www.ncbi.nlm.nih.gov/pubmed/32566401 http://dx.doi.org/10.7717/peerj.9291 |
_version_ | 1783546248524988416 |
---|---|
author | Gauthier, Jérémy; Mouden, Charlotte; Suchan, Tomasz; Alvarez, Nadir; Arrigo, Nils; Riou, Chloé; Lemaitre, Claire; Peterlongo, Pierre |
author_sort | Gauthier, Jérémy |
collection | PubMed |
description | Restriction site Associated DNA Sequencing (RAD-Seq) is a technique that sequences specific loci along the genome. It is widely employed in evolutionary biology because it makes it possible to exploit variant information (mainly single nucleotide polymorphisms, SNPs) from entire populations at a reduced cost. Common RAD-dedicated tools, such as STACKS or IPyRAD, are based on all-vs-all read alignments, which require considerable time and computing resources. We present an original method, DiscoSnp-RAD, that avoids this pitfall: variants are detected by exploiting specific parts of the assembly graph built from the reads, which removes the need for all-vs-all read alignments. We tested the implementation on simulated datasets of increasing size, up to 1,000 samples, and on real RAD-Seq data from 259 specimens of Chiastocheta flies, morphologically assigned to seven species. All individuals were successfully assigned to their species using both STRUCTURE and Maximum Likelihood phylogenetic reconstruction. Moreover, the identified variants revealed a within-species genetic structure linked to geographic distribution. Furthermore, our results show that DiscoSnp-RAD is significantly faster than state-of-the-art tools. Overall, the results show that DiscoSnp-RAD is suitable for identifying variants from RAD-Seq data, that it does not require time-consuming parameterization steps, and that it stands out from other tools because of its completely different principle, which makes it substantially faster, in particular on large datasets. |
format | Online Article Text |
id | pubmed-7293188 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7293188 2020-06-18 DiscoSnp-RAD: de novo detection of small variants for RAD-Seq population genomics Gauthier, Jérémy; Mouden, Charlotte; Suchan, Tomasz; Alvarez, Nadir; Arrigo, Nils; Riou, Chloé; Lemaitre, Claire; Peterlongo, Pierre PeerJ Bioinformatics Restriction site Associated DNA Sequencing (RAD-Seq) is a technique that sequences specific loci along the genome. It is widely employed in evolutionary biology because it makes it possible to exploit variant information (mainly single nucleotide polymorphisms, SNPs) from entire populations at a reduced cost. Common RAD-dedicated tools, such as STACKS or IPyRAD, are based on all-vs-all read alignments, which require considerable time and computing resources. We present an original method, DiscoSnp-RAD, that avoids this pitfall: variants are detected by exploiting specific parts of the assembly graph built from the reads, which removes the need for all-vs-all read alignments. We tested the implementation on simulated datasets of increasing size, up to 1,000 samples, and on real RAD-Seq data from 259 specimens of Chiastocheta flies, morphologically assigned to seven species. All individuals were successfully assigned to their species using both STRUCTURE and Maximum Likelihood phylogenetic reconstruction. Moreover, the identified variants revealed a within-species genetic structure linked to geographic distribution. Furthermore, our results show that DiscoSnp-RAD is significantly faster than state-of-the-art tools. Overall, the results show that DiscoSnp-RAD is suitable for identifying variants from RAD-Seq data, that it does not require time-consuming parameterization steps, and that it stands out from other tools because of its completely different principle, which makes it substantially faster, in particular on large datasets. PeerJ Inc. 2020-06-10 /pmc/articles/PMC7293188/ /pubmed/32566401 http://dx.doi.org/10.7717/peerj.9291 Text en © 2020 Gauthier et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
title | DiscoSnp-RAD: de novo detection of small variants for RAD-Seq population genomics |
title_sort | discosnp-rad: de novo detection of small variants for rad-seq population genomics |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7293188/ https://www.ncbi.nlm.nih.gov/pubmed/32566401 http://dx.doi.org/10.7717/peerj.9291 |