Recombination-aware alignment of diploid individuals
BACKGROUND: Traditionally biological similarity search has been studied under the abstraction of a single string to represent each genome. The more realistic representation of diploid genomes, with two strings defining the genome, has so far been largely omitted in this context. With the development...
Main authors: | Mäkinen, Veli; Valenzuela, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4240689/ https://www.ncbi.nlm.nih.gov/pubmed/25572943 http://dx.doi.org/10.1186/1471-2164-15-S6-S15 |
_version_ | 1782345756780265472 |
---|---|
author | Mäkinen, Veli; Valenzuela, Daniel |
author_facet | Mäkinen, Veli; Valenzuela, Daniel |
author_sort | Mäkinen, Veli |
collection | PubMed |
description | BACKGROUND: Traditionally, biological similarity search has been studied under the abstraction of a single string to represent each genome. The more realistic representation of diploid genomes, with two strings defining the genome, has so far been largely omitted in this context. With the development of sequencing techniques and better phasing routines through haplotype assembly algorithms, we are not far from the situation where individual diploid genomes could be represented in their full complexity with a pair-wise alignment defining the genome. RESULTS: We propose a generalization of global alignment that is designed to measure similarity between phased predictions of individual diploid genomes. This generalization takes into account that individual diploid genomes evolve through a mutation and recombination process, and that predictions may be erroneous in both dimensions. Even though our model is generic, we focus on the case where one wants to measure only the similarity of genome content, allowing free recombination. This results in efficient algorithms for direct application in (i) evaluation of variation calling predictions and (ii) progressive multiple alignments based on labeled directed acyclic graphs (DAGs) to represent profiles. The latter may be of more general interest, in connection with covering alignment of DAGs. Extensions of our model and algorithms can be foreseen to have applications in evaluating phasing algorithms, as well as a more fundamental role in phasing a child genome based on parent genomes. |
format | Online Article Text |
id | pubmed-4240689 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-42406892014-11-25 Recombination-aware alignment of diploid individuals Mäkinen, Veli Valenzuela, Daniel BMC Genomics Research BACKGROUND: Traditionally biological similarity search has been studied under the abstraction of a single string to represent each genome. The more realistic representation of diploid genomes, with two strings defining the genome, has so far been largely omitted in this context. With the development of sequencing techniques and better phasing routines through haplotype assembly algorithms, we are not far from the situation when individual diploid genomes could be represented in their full complexity with a pair-wise alignment defining the genome. RESULTS: We propose a generalization of global alignment that is designed to measure similarity between phased predictions of individual diploid genomes. This generalization takes into account that individual diploid genomes evolve through a mutation and recombination process, and that predictions may be erroneous in both dimensions. Even though our model is generic, we focus on the case where one wants to measure only the similarity of genome content allowing free recombination. This results into efficient algorithms for direct application in (i) evaluation of variation calling predictions and (ii) progressive multiple alignments based on labeled directed acyclic graphs (DAGs) to represent profiles. The latter may be of more general interest, in connection to covering alignment of DAGs. Extensions of our model and algorithms can be foreseen to have applications in evaluating phasing algorithms, as well as more fundamental role in phasing child genome based on parent genomes. BioMed Central 2014-10-17 /pmc/articles/PMC4240689/ /pubmed/25572943 http://dx.doi.org/10.1186/1471-2164-15-S6-S15 Text en Copyright © 2014 Mäkinen and Valenzuela; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Mäkinen, Veli Valenzuela, Daniel Recombination-aware alignment of diploid individuals |
title | Recombination-aware alignment of diploid individuals |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4240689/ https://www.ncbi.nlm.nih.gov/pubmed/25572943 http://dx.doi.org/10.1186/1471-2164-15-S6-S15 |
work_keys_str_mv | AT makinenveli recombinationawarealignmentofdiploidindividuals AT valenzueladaniel recombinationawarealignmentofdiploidindividuals |