REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data

BACKGROUND: Next-generation sequencing technology provides a means to study genetic exchange at a higher resolution than was possible using earlier technologies. However, this improvement presents challenges as the alignments of next generation sequence data to a reference genome cannot be directly...

Bibliographic Details
Main authors: Shaik, Jahangheer S, Khan, Asis, Beverley, Stephen M, Sibley, L David
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4348101/
https://www.ncbi.nlm.nih.gov/pubmed/25766039
http://dx.doi.org/10.1186/s12864-015-1309-7
_version_ 1782359887509979136
author Shaik, Jahangheer S
Khan, Asis
Beverley, Stephen M
Sibley, L David
author_facet Shaik, Jahangheer S
Khan, Asis
Beverley, Stephen M
Sibley, L David
author_sort Shaik, Jahangheer S
collection PubMed
description BACKGROUND: Next-generation sequencing technology provides a means to study genetic exchange at a higher resolution than was possible using earlier technologies. However, this improvement presents challenges as the alignments of next-generation sequence data to a reference genome cannot be directly used as input to existing detection algorithms, which instead typically use multiple sequence alignments as input. We therefore designed a software suite called REDHORSE that uses genomic alignments, extracts genetic markers, and generates multiple sequence alignments that can be used as input to existing recombination detection algorithms. In addition, REDHORSE implements a custom recombination detection algorithm that makes use of sequence information and genomic positions to accurately detect crossovers. REDHORSE is a portable and platform-independent suite that provides efficient analysis of genetic crosses based on next-generation sequencing data. RESULTS: We demonstrated the utility of REDHORSE using simulated data and real next-generation sequencing data. The simulated dataset mimicked recombination between two known haploid parental strains and allowed comparison of detected break points against known true break points to assess performance of recombination detection algorithms. A newly generated NGS dataset from a genetic cross of Toxoplasma gondii allowed us to demonstrate our pipeline. REDHORSE successfully extracted the relevant genetic markers and was able to transform the read alignments from NGS to the genome to generate multiple sequence alignments. The recombination detection algorithm in REDHORSE was able to detect conventional crossovers and double crossovers typically associated with gene conversions whilst filtering out artifacts that might have been introduced during sequencing or alignment. REDHORSE outperformed other commonly used recombination detection algorithms in finding conventional crossovers.
In addition, REDHORSE was the only algorithm that was able to detect double crossovers. CONCLUSION: REDHORSE is an efficient analytical pipeline that serves as a bridge between genomic alignments and existing recombination detection algorithms. Moreover, REDHORSE is equipped with a recombination detection algorithm specifically designed for next-generation sequencing data. REDHORSE is a portable, platform-independent, Java-based utility that provides efficient analysis of genetic crosses based on next-generation sequencing data. REDHORSE is available at http://redhorse.sourceforge.net/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1309-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4348101
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-43481012015-03-05 REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data Shaik, Jahangheer S Khan, Asis Beverley, Stephen M Sibley, L David BMC Genomics Software BACKGROUND: Next-generation sequencing technology provides a means to study genetic exchange at a higher resolution than was possible using earlier technologies. However, this improvement presents challenges as the alignments of next generation sequence data to a reference genome cannot be directly used as input to existing detection algorithms, which instead typically use multiple sequence alignments as input. We therefore designed a software suite called REDHORSE that uses genomic alignments, extracts genetic markers, and generates multiple sequence alignments that can be used as input to existing recombination detection algorithms. In addition, REDHORSE implements a custom recombination detection algorithm that makes use of sequence information and genomic positions to accurately detect crossovers. REDHORSE is a portable and platform independent suite that provides efficient analysis of genetic crosses based on Next-generation sequencing data. RESULTS: We demonstrated the utility of REDHORSE using simulated data and real Next-generation sequencing data. The simulated dataset mimicked recombination between two known haploid parental strains and allowed comparison of detected break points against known true break points to assess performance of recombination detection algorithms. A newly generated NGS dataset from a genetic cross of Toxoplasma gondii allowed us to demonstrate our pipeline. REDHORSE successfully extracted the relevant genetic markers and was able to transform the read alignments from NGS to the genome to generate multiple sequence alignments. 
The recombination detection algorithm in REDHORSE was able to detect conventional crossovers and double crossovers typically associated with gene conversions whilst filtering out artifacts that might have been introduced during sequencing or alignment. REDHORSE outperformed other commonly used recombination detection algorithms in finding conventional crossovers. In addition, REDHORSE was the only algorithm that was able to detect double crossovers. CONCLUSION: REDHORSE is an efficient analytical pipeline that serves as a bridge between genomic alignments and existing recombination detection algorithms. Moreover, REDHORSE is equipped with a recombination detection algorithm specifically designed for next-generation sequencing data. REDHORSE is a portable, platform-independent, Java-based utility that provides efficient analysis of genetic crosses based on next-generation sequencing data. REDHORSE is available at http://redhorse.sourceforge.net/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1309-7) contains supplementary material, which is available to authorized users. BioMed Central 2015-02-26 /pmc/articles/PMC4348101/ /pubmed/25766039 http://dx.doi.org/10.1186/s12864-015-1309-7 Text en © Shaik et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Shaik, Jahangheer S
Khan, Asis
Beverley, Stephen M
Sibley, L David
REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data
title REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data
title_full REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data
title_fullStr REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data
title_full_unstemmed REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data
title_short REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data
title_sort redhorse-recombination and double crossover detection in haploid organisms using next-generation sequencing data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4348101/
https://www.ncbi.nlm.nih.gov/pubmed/25766039
http://dx.doi.org/10.1186/s12864-015-1309-7
work_keys_str_mv AT shaikjahangheers redhorserecombinationanddoublecrossoverdetectioninhaploidorganismsusingnextgenerationsequencingdata
AT khanasis redhorserecombinationanddoublecrossoverdetectioninhaploidorganismsusingnextgenerationsequencingdata
AT beverleystephenm redhorserecombinationanddoublecrossoverdetectioninhaploidorganismsusingnextgenerationsequencingdata
AT sibleyldavid redhorserecombinationanddoublecrossoverdetectioninhaploidorganismsusingnextgenerationsequencingdata