Dealing with paralogy in RADseq data: in silico detection and single nucleotide polymorphism validation in Robinia pseudoacacia L.

The RADseq technology allows researchers to efficiently develop thousands of polymorphic loci across multiple individuals with little or no prior information on the genome. However, many questions remain about the biases inherent to this technology. Notably, sequence misalignments arising from paral...

Bibliographic Details
Main authors: Verdu, Cindy F., Guichoux, Erwan, Quevauvillers, Samuel, De Thier, Olivier, Laizet, Yec'han, Delcamp, Adline, Gévaudant, Frédéric, Monty, Arnaud, Porté, Annabel J., Lejeune, Philippe, Lassois, Ludivine, Mariette, Stéphanie
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5513258/
https://www.ncbi.nlm.nih.gov/pubmed/28725400
http://dx.doi.org/10.1002/ece3.2466
_version_ 1783250624663519232
author Verdu, Cindy F.
Guichoux, Erwan
Quevauvillers, Samuel
De Thier, Olivier
Laizet, Yec'han
Delcamp, Adline
Gévaudant, Frédéric
Monty, Arnaud
Porté, Annabel J.
Lejeune, Philippe
Lassois, Ludivine
Mariette, Stéphanie
author_sort Verdu, Cindy F.
collection PubMed
description The RADseq technology allows researchers to efficiently develop thousands of polymorphic loci across multiple individuals with little or no prior information on the genome. However, many questions remain about the biases inherent to this technology. Notably, sequence misalignments arising from paralogy may affect the development of single nucleotide polymorphism (SNP) markers and the estimation of genetic diversity. We evaluated the impact of putative paralog loci on genetic diversity estimation during the development of SNPs from a RADseq dataset for the nonmodel tree species Robinia pseudoacacia L. We sequenced nine genotypes and analyzed the frequency of putative paralogous RAD loci as a function of both the depth of coverage and the mismatch threshold allowed between loci. Putative paralogy was detected in a very variable number of loci, from 1% to more than 20%, with the depth of coverage having a major influence on the result. Putative paralogy artificially increased the observed degree of polymorphism and resulting estimates of diversity. The choice of the depth of coverage also affected diversity estimation and SNP validation: A low threshold decreased the chances of detecting minor alleles while a high threshold increased allelic dropout. SNP validation was better for the low threshold (4×) than for the high threshold (18×) we tested. Using the strategy developed here, we were able to validate more than 80% of the SNPs tested by means of individual genotyping, resulting in a readily usable set of 330 SNPs, suitable for use in population genetics applications.
format Online
Article
Text
id pubmed-5513258
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-5513258 2017-07-19 Ecol Evol Original Research
John Wiley and Sons Inc. 2016-09-22 /pmc/articles/PMC5513258/ /pubmed/28725400 http://dx.doi.org/10.1002/ece3.2466 Text en © 2016 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution (http://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Dealing with paralogy in RADseq data: in silico detection and single nucleotide polymorphism validation in Robinia pseudoacacia L.
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5513258/
https://www.ncbi.nlm.nih.gov/pubmed/28725400
http://dx.doi.org/10.1002/ece3.2466