
Critical assessment of coalescent simulators in modeling recombination hotspots in genomic sequences

BACKGROUND: Coalescent simulation is pivotal for understanding population evolutionary models and demographic histories, as well as for developing novel analytical methods for genetic association studies of DNA sequence data. A plethora of coalescent simulators have been developed, but selecting the most...


Bibliographic Details
Main Authors: Yang, Tao; Deng, Hong-Wen; Niu, Tianhua
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3890628/
https://www.ncbi.nlm.nih.gov/pubmed/24387001
http://dx.doi.org/10.1186/1471-2105-15-3
collection PubMed
description BACKGROUND: Coalescent simulation is pivotal for understanding population evolutionary models and demographic histories, as well as for developing novel analytical methods for genetic association studies of DNA sequence data. A plethora of coalescent simulators have been developed, but selecting the most appropriate program remains challenging. RESULTS: We extensively compared the performance of five widely used coalescent simulators (Hudson’s ms, msHOT, MaCS, Simcoal2, and fastsimcoal) to provide a practical guide considering three crucial factors: 1) speed, 2) scalability, and 3) accuracy of recombination hotspot position and intensity. Although ms is a popular standard coalescent simulator, it cannot simulate sequences with recombination hotspots. The extended program msHOT compensates for this deficiency by incorporating recombination hotspots and gene conversion events at arbitrarily chosen locations and intensities, but it remains limited in simulating long stretches of DNA sequence. Simcoal2, based on a discrete generation-by-generation approach, can simulate more complex demographic scenarios, but runs comparatively slowly. MaCS and fastsimcoal, both built on fast, modified sequential Markov coalescent algorithms that approximate the standard coalescent, are much more efficient while keeping the salient features of msHOT and Simcoal2, respectively. Our simulations demonstrate that they are more advantageous than the other programs for a spectrum of evolutionary models. To validate recombination hotspots, the LDhat 2.2 rhomap package, sequenceLDhot, and Haploview were compared for hotspot detection, and sequenceLDhot exhibited the best performance on both real and simulated data. CONCLUSIONS: While ms remains an excellent choice for general coalescent simulations of DNA sequences, MaCS and fastsimcoal are much more scalable and flexible in simulating a variety of demographic events under different recombination hotspot models. Furthermore, sequenceLDhot appears to give the best performance in detecting and validating cross-over hotspots.
id pubmed-3890628
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-3890628 2014-01-23 Critical assessment of coalescent simulators in modeling recombination hotspots in genomic sequences. Yang, Tao; Deng, Hong-Wen; Niu, Tianhua. BMC Bioinformatics. Methodology Article. BioMed Central 2014-01-03 /pmc/articles/PMC3890628/ /pubmed/24387001 http://dx.doi.org/10.1186/1471-2105-15-3 Text en Copyright © 2014 Yang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Methodology Article