
A Probabilistic Model for Indel Evolution: Differentiating Insertions from Deletions

Bibliographic Details
Main Authors: Loewenthal, Gil; Rapoport, Dana; Avram, Oren; Moshe, Asher; Wygoda, Elya; Itzkovitch, Alon; Israeli, Omer; Azouri, Dana; Cartwright, Reed A; Mayrose, Itay; Pupko, Tal
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8662616/
https://www.ncbi.nlm.nih.gov/pubmed/34469521
http://dx.doi.org/10.1093/molbev/msab266
Description
Summary: Insertions and deletions (indels) are common molecular evolutionary events. However, probabilistic models for indel evolution are underdeveloped due to their computational complexity. Here, we introduce several improvements to indel modeling: 1) whereas previous models for indel evolution assumed that the rates and length distributions of insertions and deletions are equal, we propose a richer model that explicitly distinguishes between the two; 2) we introduce numerous summary statistics that allow approximate Bayesian computation-based parameter estimation; 3) we develop a method to correct for biases introduced by alignment programs when inferring indel parameters from empirical data sets; and 4) using a model-selection scheme, we test whether the richer model fits biological data better than the simpler model. Our analyses suggest that both our inference scheme and the model-selection procedure achieve high accuracy on simulated data. We further demonstrate that our proposed richer model better fits a large number of empirical data sets and that, for the majority of these data sets, the deletion rate is higher than the insertion rate.
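
To make the idea of approximate Bayesian computation (ABC) with summary statistics more concrete, the following is a minimal toy sketch of ABC rejection sampling for separate insertion and deletion rates. It is not the authors' model or software. The Poisson event model, the uniform prior bounds, the `exposure` parameter (standing in for sequence length times branch length), the use of raw event counts as the only summary statistics, and all function names are assumptions made purely for illustration.

```python
import math
import random


def count_events(rate, exposure, rng):
    """Count events of a Poisson process with `rate` over a total `exposure`."""
    n = 0
    elapsed = rng.expovariate(rate)  # waiting time to the first event
    while elapsed <= exposure:
        n += 1
        elapsed += rng.expovariate(rate)
    return n


def simulate_summary(ins_rate, del_rate, exposure, rng):
    """Toy summary statistics: (number of insertions, number of deletions)."""
    return (
        count_events(ins_rate, exposure, rng),
        count_events(del_rate, exposure, rng),
    )


def abc_rejection(observed, exposure=1000.0, n_draws=5000, n_keep=50, seed=0):
    """Return posterior-mean (insertion rate, deletion rate) by ABC rejection.

    Assumed (hypothetical) priors: both rates are uniform on [0.001, 0.05].
    """
    rng = random.Random(seed)
    scored = []
    for _ in range(n_draws):
        # 1) Draw candidate parameters from the prior.
        ins_rate = rng.uniform(0.001, 0.05)
        del_rate = rng.uniform(0.001, 0.05)
        # 2) Simulate data and reduce it to summary statistics.
        simulated = simulate_summary(ins_rate, del_rate, exposure, rng)
        # 3) Score the draw by its distance to the observed summaries.
        distance = math.dist(simulated, observed)
        scored.append((distance, ins_rate, del_rate))
    # 4) Keep the draws whose simulated summaries are closest to the data.
    scored.sort(key=lambda s: s[0])
    accepted = scored[:n_keep]
    mean_ins = sum(s[1] for s in accepted) / len(accepted)
    mean_del = sum(s[2] for s in accepted) / len(accepted)
    return mean_ins, mean_del


if __name__ == "__main__":
    # Hypothetical observation: 10 insertions and 25 deletions.
    ins_rate, del_rate = abc_rejection(observed=(10, 25))
    print(f"insertion rate ~ {ins_rate:.4f}, deletion rate ~ {del_rate:.4f}")
```

Keeping the closest fraction of draws, rather than using a fixed distance threshold, is one common design choice in ABC rejection; a realistic analysis would simulate full sequence evolution, use many more summary statistics, and also model indel length distributions, as the summary describes.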