Next Generation Sequencing of Pooled Samples: Guideline for Variants’ Filtering
Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of Next Generation Sequencing (NGS). Pool-seq is an alternative cost- and time-effective option in which DNA from several individuals is pooled for s...
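Main authors: | Anand, Santosh; Mangano, Eleonora; Barizzone, Nadia; Bordoni, Roberta; Sorosina, Melissa; Clarelli, Ferdinando; Corrado, Lucia; Martinelli Boneschi, Filippo; D’Alfonso, Sandra; De Bellis, Gianluca |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5037392/ https://www.ncbi.nlm.nih.gov/pubmed/27670852 http://dx.doi.org/10.1038/srep33735 |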
_version_ | 1782455728533929984 |
---|---|
author | Anand, Santosh; Mangano, Eleonora; Barizzone, Nadia; Bordoni, Roberta; Sorosina, Melissa; Clarelli, Ferdinando; Corrado, Lucia; Martinelli Boneschi, Filippo; D’Alfonso, Sandra; De Bellis, Gianluca |
author_facet | Anand, Santosh; Mangano, Eleonora; Barizzone, Nadia; Bordoni, Roberta; Sorosina, Melissa; Clarelli, Ferdinando; Corrado, Lucia; Martinelli Boneschi, Filippo; D’Alfonso, Sandra; De Bellis, Gianluca |
author_sort | Anand, Santosh |
collection | PubMed |
description | Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of Next Generation Sequencing (NGS). Pool-seq is an alternative cost- and time-effective option in which DNA from several individuals is pooled for sequencing. However, pooling of DNA creates new problems and challenges for accurate variant calling and allele frequency (AF) estimation. In particular, sequencing errors are confounded with alleles present at low frequency in the pools, possibly giving rise to false-positive variants. We sequenced 996 individuals in 83 pools (12 individuals/pool) in a targeted re-sequencing experiment. We show that Pool-seq AFs are robust and reliable by comparing them with public variant databases and with in-house SNP-genotyping data of the individual subjects in the pools. Furthermore, we propose a simple filtering guideline for the removal of spurious variants based on the Kolmogorov-Smirnov statistical test. We experimentally validated our filters by comparing Pool-seq to individual sequencing data, showing that the filters remove most of the false variants while retaining the majority of true variants. The proposed guideline is fairly generic in nature and could be easily applied in other Pool-seq experiments. |
format | Online Article Text |
id | pubmed-5037392 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-5037392 2016-09-30 Next Generation Sequencing of Pooled Samples: Guideline for Variants’ Filtering Anand, Santosh; Mangano, Eleonora; Barizzone, Nadia; Bordoni, Roberta; Sorosina, Melissa; Clarelli, Ferdinando; Corrado, Lucia; Martinelli Boneschi, Filippo; D’Alfonso, Sandra; De Bellis, Gianluca Sci Rep Article Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of Next Generation Sequencing (NGS). Pool-seq is an alternative cost- and time-effective option in which DNA from several individuals is pooled for sequencing. However, pooling of DNA creates new problems and challenges for accurate variant calling and allele frequency (AF) estimation. In particular, sequencing errors are confounded with alleles present at low frequency in the pools, possibly giving rise to false-positive variants. We sequenced 996 individuals in 83 pools (12 individuals/pool) in a targeted re-sequencing experiment. We show that Pool-seq AFs are robust and reliable by comparing them with public variant databases and with in-house SNP-genotyping data of the individual subjects in the pools. Furthermore, we propose a simple filtering guideline for the removal of spurious variants based on the Kolmogorov-Smirnov statistical test. We experimentally validated our filters by comparing Pool-seq to individual sequencing data, showing that the filters remove most of the false variants while retaining the majority of true variants. The proposed guideline is fairly generic in nature and could be easily applied in other Pool-seq experiments. Nature Publishing Group 2016-09-27 /pmc/articles/PMC5037392/ /pubmed/27670852 http://dx.doi.org/10.1038/srep33735 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Anand, Santosh Mangano, Eleonora Barizzone, Nadia Bordoni, Roberta Sorosina, Melissa Clarelli, Ferdinando Corrado, Lucia Martinelli Boneschi, Filippo D’Alfonso, Sandra De Bellis, Gianluca Next Generation Sequencing of Pooled Samples: Guideline for Variants’ Filtering |
title | Next Generation Sequencing of Pooled Samples: Guideline for Variants’ Filtering |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5037392/ https://www.ncbi.nlm.nih.gov/pubmed/27670852 http://dx.doi.org/10.1038/srep33735 |
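
The description field gives the pool design (996 individuals in 83 pools of 12) and names the Kolmogorov-Smirnov test as the basis of the filtering guideline. The sketch below only illustrates those two ideas and is not the authors' pipeline. The diploid assumption, the function names `pool_design` and `ks_statistic`, and the toy inputs are all hypothetical. The record does not say which quantities the authors compare with the test.

```python
from bisect import bisect_right

PLOIDY = 2  # assumption: diploid loci; not stated in the record


def pool_design(pools, individuals_per_pool, ploidy=PLOIDY):
    """Summarise a pooled design: total individuals, chromosomes per pool,
    and the smallest non-zero allele frequency one pool can contain."""
    chromosomes = individuals_per_pool * ploidy
    return {
        "individuals": pools * individuals_per_pool,
        "chromosomes_per_pool": chromosomes,
        "min_nonzero_af": 1 / chromosomes,
    }


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    difference between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        # Fraction of each sample that is <= x (empirical CDF at x).
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


design = pool_design(pools=83, individuals_per_pool=12)
print(design["individuals"], design["chromosomes_per_pool"])

# Toy values only: two made-up samples of some per-variant metric.
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

Under the diploid assumption, one heterozygous carrier in a pool of 12 corresponds to an allele frequency of 1/24, about 4.2%. That is low enough for sequencing errors to be confused with real alleles, which is the problem the description raises.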