Odintifier - A computational method for identifying insertions of organellar origin from modern and ancient high-throughput sequencing data based on haplotype phasing

Bibliographic Details
Main authors: Samaniego Castruita, Jose Alfredo; Zepeda Mendoza, Marie Lisandra; Barnett, Ross; Wales, Nathan; Gilbert, M Thomas P.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4517485/
https://www.ncbi.nlm.nih.gov/pubmed/26216337
http://dx.doi.org/10.1186/s12859-015-0682-1
author Samaniego Castruita, Jose Alfredo
Zepeda Mendoza, Marie Lisandra
Barnett, Ross
Wales, Nathan
Gilbert, M Thomas P.
collection PubMed
description BACKGROUND: Cellular organelles with genomes of their own (e.g. plastids and mitochondria) can pass genetic sequences to other organellar genomes within the cell in many species across the eukaryote phylogeny. The extent of the occurrence of these organellar-derived inserted sequences (odins) is still unknown, but if not accounted for in genomic and phylogenetic studies, they can be a source of error. However, if correctly identified, these inserted sequences can be used for evolutionary and comparative genomic studies. Although such insertions can be detected using various laboratory and bioinformatic strategies, there is currently no straightforward way to apply them as a standard organellar genome assembly on next-generation sequencing data. Furthermore, most current methods for identification of such insertions are unsuitable for use on non-model organisms or ancient DNA datasets. RESULTS: We present a bioinformatic method that uses phasing algorithms to reconstruct both source and inserted organelle sequences. The method was tested on different shotgun and organellar-enriched DNA high-throughput sequencing (HTS) datasets from ancient and modern samples. Specifically, we used datasets from lions (Panthera leo ssp. and Panthera leo leo) to characterize insertions of mitochondrial origin, and from common grapevine (Vitis vinifera) and bugle (Ajuga reptans) to characterize insertions derived from plastid genomes. Comparison of the results against other available organelle genome assembly methods demonstrated that our new method provides an improvement in the sequence assembly. CONCLUSION: Using datasets from a wide range of species and different levels of complexity, we showed that our novel bioinformatic method based on phasing algorithms can be used to achieve the following two goals: i) reference-guided assembly of chloroplast/mitochondrial genomes from HTS data and ii) identification and simultaneous assembly of odins. This method represents the first application of haplotype phasing for automatic detection of odins and reference-based organellar genome assembly. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0682-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4517485
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4517485 2015-07-29 BMC Bioinformatics Methodology Article BioMed Central 2015-07-28 /pmc/articles/PMC4517485/ /pubmed/26216337 http://dx.doi.org/10.1186/s12859-015-0682-1 Text en © Samaniego Castruita et al. 2015. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Odintifier - A computational method for identifying insertions of organellar origin from modern and ancient high-throughput sequencing data based on haplotype phasing
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4517485/
https://www.ncbi.nlm.nih.gov/pubmed/26216337
http://dx.doi.org/10.1186/s12859-015-0682-1