A survey of methods and tools to detect recent and strong positive selection

Positive selection occurs when an allele is favored by natural selection. The frequency of the favored allele increases in the population and due to genetic hitchhiking the neighboring linked variation diminishes, creating so-called selective sweeps. Detecting traces of positive selection in genomes...

Bibliographic Details
Main Authors:	Pavlidis, Pavlos; Alachiotis, Nikolaos
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Review
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5385031/ https://www.ncbi.nlm.nih.gov/pubmed/28405579 http://dx.doi.org/10.1186/s40709-017-0064-0

author	Pavlidis, Pavlos; Alachiotis, Nikolaos
author_facet	Pavlidis, Pavlos; Alachiotis, Nikolaos
author_sort	Pavlidis, Pavlos
collection	PubMed
description	Positive selection occurs when an allele is favored by natural selection. The frequency of the favored allele increases in the population and due to genetic hitchhiking the neighboring linked variation diminishes, creating so-called selective sweeps. Detecting traces of positive selection in genomes is achieved by searching for signatures introduced by selective sweeps, such as regions of reduced variation, a specific shift of the site frequency spectrum, and particular LD patterns in the region. A variety of methods and tools can be used for detecting sweeps, ranging from simple implementations that compute summary statistics such as Tajima’s D, to more advanced statistical approaches that use combinations of statistics, maximum likelihood, machine learning etc. In this survey, we present and discuss summary statistics and software tools, and classify them based on the selective sweep signature they detect, i.e., SFS-based vs. LD-based, as well as their capacity to analyze whole genomes or just subgenomic regions. Additionally, we summarize the results of comparisons among four open-source software releases (SweeD, SweepFinder, SweepFinder2, and OmegaPlus) regarding sensitivity, specificity, and execution times. In equilibrium neutral models or mild bottlenecks, both SFS- and LD-based methods are able to detect selective sweeps accurately. Methods and tools that rely on LD exhibit higher true positive rates than SFS-based ones under the model of a single sweep or recurrent hitchhiking. However, their false positive rate is elevated when a misspecified demographic model is used to represent the null hypothesis. When the correct (or similar to the correct) demographic model is used instead, the false positive rates are considerably reduced. The accuracy of detecting the true target of selection is decreased in bottleneck scenarios. In terms of execution time, LD-based methods are typically faster than SFS-based methods, due to the nature of required arithmetic.
format	Online Article Text
id	pubmed-5385031
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5385031 2017-04-12 A survey of methods and tools to detect recent and strong positive selection Pavlidis, Pavlos; Alachiotis, Nikolaos J Biol Res (Thessalon) Review BioMed Central 2017-04-08 /pmc/articles/PMC5385031/ /pubmed/28405579 http://dx.doi.org/10.1186/s40709-017-0064-0 Text en © The Author(s) 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	A survey of methods and tools to detect recent and strong positive selection
topic	Review
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5385031/ https://www.ncbi.nlm.nih.gov/pubmed/28405579 http://dx.doi.org/10.1186/s40709-017-0064-0
