
Next Generation Sequencing of Pooled Samples: Guideline for Variants’ Filtering

Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of Next Generation Sequencing (NGS). Pool-seq is a cost- and time-effective alternative in which DNA from several individuals is pooled for s...

Bibliographic Details
Main authors: Anand, Santosh; Mangano, Eleonora; Barizzone, Nadia; Bordoni, Roberta; Sorosina, Melissa; Clarelli, Ferdinando; Corrado, Lucia; Martinelli Boneschi, Filippo; D’Alfonso, Sandra; De Bellis, Gianluca
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5037392/
https://www.ncbi.nlm.nih.gov/pubmed/27670852
http://dx.doi.org/10.1038/srep33735
author Anand, Santosh
Mangano, Eleonora
Barizzone, Nadia
Bordoni, Roberta
Sorosina, Melissa
Clarelli, Ferdinando
Corrado, Lucia
Martinelli Boneschi, Filippo
D’Alfonso, Sandra
De Bellis, Gianluca
collection PubMed
description Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of Next Generation Sequencing (NGS). Pool-seq is a cost- and time-effective alternative in which DNA from several individuals is pooled for sequencing. However, pooling DNA creates new challenges for accurate variant calling and allele frequency (AF) estimation. In particular, sequencing errors are confounded with alleles present at low frequency in the pools, possibly giving rise to false positive variants. We sequenced 996 individuals in 83 pools (12 individuals/pool) in a targeted re-sequencing experiment. We show that Pool-seq AFs are robust and reliable by comparing them with public variant databases and in-house SNP-genotyping data of the individual subjects in the pools. Furthermore, we propose a simple filtering guideline for the removal of spurious variants based on the Kolmogorov-Smirnov statistical test. We experimentally validated our filters by comparing Pool-seq to individual sequencing data, showing that the filters remove most of the false variants while retaining the majority of true variants. The proposed guideline is fairly generic and could easily be applied in other Pool-seq experiments.
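The two quantitative ideas in the description can be illustrated with a minimal Python sketch; it is not the authors' pipeline. The first function computes the smallest non-zero allele frequency a pool can hold, assuming diploid individuals (the abstract does not state ploidy). The second computes the two-sample Kolmogorov-Smirnov statistic in plain Python. The abstract does not say which quantities the authors compare with this test, so the function names, the per-variant metric, and the example values below are all hypothetical.

```python
from bisect import bisect_right
from fractions import Fraction


def min_pool_af(individuals_per_pool, ploidy=2):
    """Smallest non-zero allele frequency in a pool: one alternate
    allele among all chromosomes (ploidy=2 is an assumption)."""
    return Fraction(1, individuals_per_pool * ploidy)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: the largest absolute
    gap between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both ECDFs are step functions, so the maximum gap occurs at a data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Study design stated in the abstract: 83 pools of 12 individuals each.
total_individuals = 83 * 12
lowest_af = min_pool_af(12)  # Fraction(1, 24), about 4.2% if diploid

# Hypothetical values of some per-variant metric for two groups of calls.
trusted = [0.48, 0.50, 0.51, 0.52, 0.55]
candidate = [0.10, 0.15, 0.20, 0.50, 0.90]
d = ks_statistic(trusted, candidate)

print(total_individuals, lowest_af, round(d, 3))
```

Under the diploid assumption, a single heterozygous carrier contributes 1 of the 24 alleles in a pool. This is why sequencing errors at a few percent per base can be mistaken for true low-frequency variants. A large D suggests the two groups follow different distributions; turning D into a filtering threshold would require the criteria given in the full article.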
format Online
Article
Text
id pubmed-5037392
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5037392 2016-09-30 Sci Rep Article Nature Publishing Group 2016-09-27 /pmc/articles/PMC5037392/ /pubmed/27670852 http://dx.doi.org/10.1038/srep33735 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Next Generation Sequencing of Pooled Samples: Guideline for Variants’ Filtering
topic Article