
New Methods for Inferring the Distribution of Fitness Effects for INDELs and SNPs

Small insertions and deletions (INDELs; ≤50 bp) are the most common type of variability after single nucleotide polymorphism (SNP). However, compared with SNPs, we know little about the distribution of fitness effects (DFE) of new INDEL mutations and how prevalent adaptive INDEL substitutions are. S...


Bibliographic Details
Main authors: Barton, Henry J; Zeng, Kai
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5967470/
https://www.ncbi.nlm.nih.gov/pubmed/29635416
http://dx.doi.org/10.1093/molbev/msy054
author Barton, Henry J
Zeng, Kai
collection PubMed
description Small insertions and deletions (INDELs; ≤50 bp) are the most common type of variability after single nucleotide polymorphism (SNP). However, compared with SNPs, we know little about the distribution of fitness effects (DFE) of new INDEL mutations and how prevalent adaptive INDEL substitutions are. Studying INDELs has been difficult partly because identifying ancestral states at these sites is error-prone and misidentification can lead to severely biased estimates of the strength of selection. To solve these problems, we develop new maximum likelihood methods, which use polymorphism data to simultaneously estimate the DFE, the mutation rate, and the misidentification rate. These methods are applicable to both INDELs and SNPs. Simulations show that they can provide highly accurate results. We applied the methods to an INDEL polymorphism data set in Drosophila melanogaster. We found that the DFE for polymorphic INDELs in protein-coding regions is bimodal, with the variants being either nearly neutral or strongly deleterious. Based on the DFE, we estimated that 71.5–83.7% of the INDEL substitutions that took place along the D. melanogaster lineage were fixed by positive selection, which is comparable with the prevalence of adaptive substitutions at nonsynonymous sites. The new methods have been implemented in the software package anavar.
format Online
Article
Text
id pubmed-5967470
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5967470 2018-06-04 New Methods for Inferring the Distribution of Fitness Effects for INDELs and SNPs Barton, Henry J Zeng, Kai Mol Biol Evol Methods Oxford University Press 2018-06 2018-04-04 /pmc/articles/PMC5967470/ /pubmed/29635416 http://dx.doi.org/10.1093/molbev/msy054 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title New Methods for Inferring the Distribution of Fitness Effects for INDELs and SNPs
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5967470/
https://www.ncbi.nlm.nih.gov/pubmed/29635416
http://dx.doi.org/10.1093/molbev/msy054