Critical assessment of coalescent simulators in modeling recombination hotspots in genomic sequences
BACKGROUND: Coalescent simulation is pivotal for understanding population evolutionary models and demographic histories, as well as for developing novel analytical methods for genetic association studies of DNA sequence data. A plethora of coalescent simulators have been developed, but selecting the most...
Main authors: | Yang, Tao; Deng, Hong-Wen; Niu, Tianhua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3890628/ https://www.ncbi.nlm.nih.gov/pubmed/24387001 http://dx.doi.org/10.1186/1471-2105-15-3 |
_version_ | 1782299290744389632 |
---|---|
author | Yang, Tao Deng, Hong-Wen Niu, Tianhua |
author_facet | Yang, Tao Deng, Hong-Wen Niu, Tianhua |
author_sort | Yang, Tao |
collection | PubMed |
description | BACKGROUND: Coalescent simulation is pivotal for understanding population evolutionary models and demographic histories, as well as for developing novel analytical methods for genetic association studies of DNA sequence data. A plethora of coalescent simulators have been developed, but selecting the most appropriate program remains challenging. RESULTS: We extensively compared the performance of five widely used coalescent simulators (Hudson’s ms, msHOT, MaCS, Simcoal2, and fastsimcoal) to provide a practical guide considering three crucial factors: 1) speed, 2) scalability, and 3) accuracy of recombination hotspot position and intensity. Although ms is a popular standard coalescent simulator, it lacks the ability to simulate sequences with recombination hotspots. An extended program, msHOT, compensates for this deficiency by incorporating recombination hotspots and gene conversion events at arbitrarily chosen locations and intensities, but remains limited in simulating long stretches of DNA sequence. Simcoal2, based on a discrete generation-by-generation approach, can simulate more complex demographic scenarios, but runs comparatively slowly. MaCS and fastsimcoal, both built on fast, modified sequential Markov coalescent algorithms that approximate the standard coalescent, are much more efficient while keeping the salient features of msHOT and Simcoal2, respectively. Our simulations demonstrate that they are advantageous over the other programs for a spectrum of evolutionary models. To validate recombination hotspots, the LDhat 2.2 rhomap package, sequenceLDhot, and Haploview were compared for hotspot detection, and sequenceLDhot exhibited the best performance on both real and simulated data. CONCLUSIONS: While ms remains an excellent choice for general coalescent simulations of DNA sequences, MaCS and fastsimcoal are much more scalable and flexible in simulating a variety of demographic events under different recombination hotspot models. Furthermore, sequenceLDhot appears to give the best performance in detecting and validating cross-over hotspots. |
format | Online Article Text |
id | pubmed-3890628 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-38906282014-01-23 Critical assessment of coalescent simulators in modeling recombination hotspots in genomic sequences Yang, Tao Deng, Hong-Wen Niu, Tianhua BMC Bioinformatics Methodology Article BioMed Central 2014-01-03 /pmc/articles/PMC3890628/ /pubmed/24387001 http://dx.doi.org/10.1186/1471-2105-15-3 Text en Copyright © 2014 Yang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Yang, Tao Deng, Hong-Wen Niu, Tianhua Critical assessment of coalescent simulators in modeling recombination hotspots in genomic sequences |
title | Critical assessment of coalescent simulators in modeling recombination hotspots in genomic sequences |
title_sort | critical assessment of coalescent simulators in modeling recombination hotspots in genomic sequences |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3890628/ https://www.ncbi.nlm.nih.gov/pubmed/24387001 http://dx.doi.org/10.1186/1471-2105-15-3 |