SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes

The advent of modern DNA sequencing technology is the driving force in obtaining complete intra-specific genomes that can be used to detect loci that have been subject to positive selection in the recent past. Based on selective sweep theory, beneficial loci can be detected by examining the single n...

Bibliographic Details
Main Authors:	Pavlidis, Pavlos; Živković, Daniel; Stamatakis, Alexandros; Alachiotis, Nikolaos
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2013
Subjects:	Resources
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3748355/ https://www.ncbi.nlm.nih.gov/pubmed/23777627 http://dx.doi.org/10.1093/molbev/mst112

_version_	1782281057724268544
author	Pavlidis, Pavlos Živković, Daniel Stamatakis, Alexandros Alachiotis, Nikolaos
author_facet	Pavlidis, Pavlos Živković, Daniel Stamatakis, Alexandros Alachiotis, Nikolaos
author_sort	Pavlidis, Pavlos
collection	PubMed
description	The advent of modern DNA sequencing technology is the driving force in obtaining complete intra-specific genomes that can be used to detect loci that have been subject to positive selection in the recent past. Based on selective sweep theory, beneficial loci can be detected by examining the single nucleotide polymorphism patterns in intraspecific genome alignments. In the last decade, a plethora of algorithms for identifying selective sweeps have been developed. However, the majority of these algorithms have not been designed for analyzing whole-genome data. We present SweeD (Sweep Detector), an open-source tool for the rapid detection of selective sweeps in whole genomes. It analyzes site frequency spectra and represents a substantial extension of the widely used SweepFinder program. The sequential version of SweeD is up to 22 times faster than SweepFinder and, more importantly, is able to analyze thousands of sequences. We also provide a parallel implementation of SweeD for multi-core processors. Furthermore, we implemented a checkpointing mechanism that allows to deploy SweeD on cluster systems with queue execution time restrictions, as well as to resume long-running analyses after processor failures. In addition, the user can specify various demographic models via the command-line to calculate their theoretically expected site frequency spectra. Therefore, (in contrast to SweepFinder) the neutral site frequencies can optionally be directly calculated from a given demographic model. We show that an increase of sample size results in more precise detection of positive selection. Thus, the ability to analyze substantially larger sample sizes by using SweeD leads to more accurate sweep detection. We validate SweeD via simulations and by scanning the first chromosome from the 1000 human Genomes project for selective sweeps. We compare SweeD results with results from a linkage-disequilibrium-based approach and identify common outliers.
format	Online Article Text
id	pubmed-3748355
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-3748355 2013-08-21 SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes Pavlidis, Pavlos Živković, Daniel Stamatakis, Alexandros Alachiotis, Nikolaos Mol Biol Evol Resources The advent of modern DNA sequencing technology is the driving force in obtaining complete intra-specific genomes that can be used to detect loci that have been subject to positive selection in the recent past. Based on selective sweep theory, beneficial loci can be detected by examining the single nucleotide polymorphism patterns in intraspecific genome alignments. In the last decade, a plethora of algorithms for identifying selective sweeps have been developed. However, the majority of these algorithms have not been designed for analyzing whole-genome data. We present SweeD (Sweep Detector), an open-source tool for the rapid detection of selective sweeps in whole genomes. It analyzes site frequency spectra and represents a substantial extension of the widely used SweepFinder program. The sequential version of SweeD is up to 22 times faster than SweepFinder and, more importantly, is able to analyze thousands of sequences. We also provide a parallel implementation of SweeD for multi-core processors. Furthermore, we implemented a checkpointing mechanism that allows to deploy SweeD on cluster systems with queue execution time restrictions, as well as to resume long-running analyses after processor failures. In addition, the user can specify various demographic models via the command-line to calculate their theoretically expected site frequency spectra. Therefore, (in contrast to SweepFinder) the neutral site frequencies can optionally be directly calculated from a given demographic model. We show that an increase of sample size results in more precise detection of positive selection. Thus, the ability to analyze substantially larger sample sizes by using SweeD leads to more accurate sweep detection. We validate SweeD via simulations and by scanning the first chromosome from the 1000 human Genomes project for selective sweeps. We compare SweeD results with results from a linkage-disequilibrium-based approach and identify common outliers. Oxford University Press 2013-09 2013-06-18 /pmc/articles/PMC3748355/ /pubmed/23777627 http://dx.doi.org/10.1093/molbev/mst112 Text en © The Author 2013. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle	Resources Pavlidis, Pavlos Živković, Daniel Stamatakis, Alexandros Alachiotis, Nikolaos SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes
title	SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes
title_full	SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes
title_fullStr	SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes
title_full_unstemmed	SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes
title_short	SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes
title_sort	sweed: likelihood-based detection of selective sweeps in thousands of genomes
topic	Resources
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3748355/ https://www.ncbi.nlm.nih.gov/pubmed/23777627 http://dx.doi.org/10.1093/molbev/mst112
work_keys_str_mv	AT pavlidispavlos sweedlikelihoodbaseddetectionofselectivesweepsinthousandsofgenomes AT zivkovicdaniel sweedlikelihoodbaseddetectionofselectivesweepsinthousandsofgenomes AT stamatakisalexandros sweedlikelihoodbaseddetectionofselectivesweepsinthousandsofgenomes AT alachiotisnikolaos sweedlikelihoodbaseddetectionofselectivesweepsinthousandsofgenomes
