
Recombination-aware alignment of diploid individuals

BACKGROUND: Traditionally biological similarity search has been studied under the abstraction of a single string to represent each genome. The more realistic representation of diploid genomes, with two strings defining the genome, has so far been largely omitted in this context. With the development...


Bibliographic Details
Main authors: Mäkinen, Veli; Valenzuela, Daniel
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4240689/
https://www.ncbi.nlm.nih.gov/pubmed/25572943
http://dx.doi.org/10.1186/1471-2164-15-S6-S15
_version_ 1782345756780265472
author Mäkinen, Veli
Valenzuela, Daniel
author_facet Mäkinen, Veli
Valenzuela, Daniel
author_sort Mäkinen, Veli
collection PubMed
description BACKGROUND: Traditionally biological similarity search has been studied under the abstraction of a single string to represent each genome. The more realistic representation of diploid genomes, with two strings defining the genome, has so far been largely omitted in this context. With the development of sequencing techniques and better phasing routines through haplotype assembly algorithms, we are not far from the situation when individual diploid genomes could be represented in their full complexity with a pair-wise alignment defining the genome. RESULTS: We propose a generalization of global alignment that is designed to measure similarity between phased predictions of individual diploid genomes. This generalization takes into account that individual diploid genomes evolve through a mutation and recombination process, and that predictions may be erroneous in both dimensions. Even though our model is generic, we focus on the case where one wants to measure only the similarity of genome content allowing free recombination. This results in efficient algorithms for direct application in (i) evaluation of variation calling predictions and (ii) progressive multiple alignments based on labeled directed acyclic graphs (DAGs) to represent profiles. The latter may be of more general interest, in connection to covering alignment of DAGs. Extensions of our model and algorithms can be foreseen to have applications in evaluating phasing algorithms, as well as a more fundamental role in phasing a child genome based on parent genomes.
format Online
Article
Text
id pubmed-4240689
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4240689 2014-11-25 Recombination-aware alignment of diploid individuals Mäkinen, Veli Valenzuela, Daniel BMC Genomics Research BioMed Central 2014-10-17 /pmc/articles/PMC4240689/ /pubmed/25572943 http://dx.doi.org/10.1186/1471-2164-15-S6-S15 Text en Copyright © 2014 Mäkinen and Valenzuela; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Mäkinen, Veli
Valenzuela, Daniel
Recombination-aware alignment of diploid individuals
title Recombination-aware alignment of diploid individuals
title_full Recombination-aware alignment of diploid individuals
title_fullStr Recombination-aware alignment of diploid individuals
title_full_unstemmed Recombination-aware alignment of diploid individuals
title_short Recombination-aware alignment of diploid individuals
title_sort recombination-aware alignment of diploid individuals
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4240689/
https://www.ncbi.nlm.nih.gov/pubmed/25572943
http://dx.doi.org/10.1186/1471-2164-15-S6-S15
work_keys_str_mv AT makinenveli recombinationawarealignmentofdiploidindividuals
AT valenzueladaniel recombinationawarealignmentofdiploidindividuals