
Reference-free detection of isolated SNPs

Detecting single nucleotide polymorphisms (SNPs) between genomes is becoming a routine task with next-generation sequencing. Generally, SNP detection methods use a reference genome. As non-model organisms are increasingly investigated, the need for reference-free methods has been amplified. Most of...


Bibliographic Details
Main authors: Uricaru, Raluca; Rizk, Guillaume; Lacroix, Vincent; Quillery, Elsa; Plantard, Olivier; Chikhi, Rayan; Lemaitre, Claire; Peterlongo, Pierre
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4333369/
https://www.ncbi.nlm.nih.gov/pubmed/25404127
http://dx.doi.org/10.1093/nar/gku1187
author Uricaru, Raluca
Rizk, Guillaume
Lacroix, Vincent
Quillery, Elsa
Plantard, Olivier
Chikhi, Rayan
Lemaitre, Claire
Peterlongo, Pierre
collection PubMed
description Detecting single nucleotide polymorphisms (SNPs) between genomes is becoming a routine task with next-generation sequencing. Generally, SNP detection methods use a reference genome. As non-model organisms are increasingly investigated, the need for reference-free methods has been amplified. Most of the existing reference-free methods have fundamental limitations: they can only call SNPs between exactly two datasets, and/or they require a prohibitive amount of computational resources. The method we propose, discoSnp, detects both heterozygous and homozygous isolated SNPs from any number of read datasets, without a reference genome, and with very low memory and time footprints (billions of reads can be analyzed with a standard desktop computer). To facilitate downstream genotyping analyses, discoSnp ranks predictions and outputs quality and coverage per allele. Compared to finding isolated SNPs using a state-of-the-art assembly and mapping approach, discoSnp requires significantly less computational resources, shows similar precision/recall values, and highly ranked predictions are less likely to be false positives. An experimental validation was conducted on an arthropod species (the tick Ixodes ricinus) on which de novo sequencing was performed. Among the predicted SNPs that were tested, 96% were successfully genotyped and truly exhibited polymorphism.
format Online
Article
Text
id pubmed-4333369
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4333369 2015-03-18 Reference-free detection of isolated SNPs Uricaru, Raluca; Rizk, Guillaume; Lacroix, Vincent; Quillery, Elsa; Plantard, Olivier; Chikhi, Rayan; Lemaitre, Claire; Peterlongo, Pierre Nucleic Acids Res Methods Online Oxford University Press 2015-01-30 2014-11-17 /pmc/articles/PMC4333369/ /pubmed/25404127 http://dx.doi.org/10.1093/nar/gku1187 Text en © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Reference-free detection of isolated SNPs
topic Methods Online