Cargando…

Linearization of ancestral multichromosomal genomes

BACKGROUND: Recovering the structure of ancestral genomes can be formalized in terms of properties of binary matrices such as the Consecutive-Ones Property (C1P). The Linearization Problem asks to extract, from a given binary matrix, a maximum weight subset of rows that satisfies such a property. Th...

Descripción completa

Detalles Bibliográficos
Autores principales: Maňuch, Ján, Patterson, Murray, Wittler, Roland, Chauve, Cedric, Tannier, Eric
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2012
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3526452/
https://www.ncbi.nlm.nih.gov/pubmed/23281593
http://dx.doi.org/10.1186/1471-2105-13-S19-S11
_version_ 1782253563637923840
author Maňuch, Ján
Patterson, Murray
Wittler, Roland
Chauve, Cedric
Tannier, Eric
author_facet Maňuch, Ján
Patterson, Murray
Wittler, Roland
Chauve, Cedric
Tannier, Eric
author_sort Maňuch, Ján
collection PubMed
description BACKGROUND: Recovering the structure of ancestral genomes can be formalized in terms of properties of binary matrices such as the Consecutive-Ones Property (C1P). The Linearization Problem asks to extract, from a given binary matrix, a maximum weight subset of rows that satisfies such a property. This problem is in general intractable, and in particular if the ancestral genome is expected to contain only linear chromosomes or a unique circular chromosome. In the present work, we consider a relaxation of this problem, which allows ancestral genomes that can contain several chromosomes, each either linear or circular. RESULT: We show that, when restricted to binary matrices of degree two, which correspond to adjacencies, the genomic characters used in most ancestral genome reconstruction methods, this relaxed version of the Linearization Problem is polynomially solvable using a reduction to a matching problem. This result holds in the more general case where columns have bounded multiplicity, which models possibly duplicated ancestral genes. We also prove that for matrices with rows of degrees 2 and 3, without multiplicity and without weights on the rows, the problem is NP-complete, thus tracing sharp tractability boundaries. CONCLUSION: As it happened for the breakpoint median problem, also used in ancestral genome reconstruction, relaxing the definition of a genome turns an intractable problem into a tractable one. The relaxation is adapted to some biological contexts, such as bacterial genomes with several replicons, possibly partially assembled. Algorithms can also be used as heuristics for hard variants. More generally, this work opens a way to better understand linearization results for ancestral genome structure inference.
format Online
Article
Text
id pubmed-3526452
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-35264522013-01-10 Linearization of ancestral multichromosomal genomes Maňuch, Ján Patterson, Murray Wittler, Roland Chauve, Cedric Tannier, Eric BMC Bioinformatics Proceedings BACKGROUND: Recovering the structure of ancestral genomes can be formalized in terms of properties of binary matrices such as the Consecutive-Ones Property (C1P). The Linearization Problem asks to extract, from a given binary matrix, a maximum weight subset of rows that satisfies such a property. This problem is in general intractable, and in particular if the ancestral genome is expected to contain only linear chromosomes or a unique circular chromosome. In the present work, we consider a relaxation of this problem, which allows ancestral genomes that can contain several chromosomes, each either linear or circular. RESULT: We show that, when restricted to binary matrices of degree two, which correspond to adjacencies, the genomic characters used in most ancestral genome reconstruction methods, this relaxed version of the Linearization Problem is polynomially solvable using a reduction to a matching problem. This result holds in the more general case where columns have bounded multiplicity, which models possibly duplicated ancestral genes. We also prove that for matrices with rows of degrees 2 and 3, without multiplicity and without weights on the rows, the problem is NP-complete, thus tracing sharp tractability boundaries. CONCLUSION: As it happened for the breakpoint median problem, also used in ancestral genome reconstruction, relaxing the definition of a genome turns an intractable problem into a tractable one. The relaxation is adapted to some biological contexts, such as bacterial genomes with several replicons, possibly partially assembled. Algorithms can also be used as heuristics for hard variants. More generally, this work opens a way to better understand linearization results for ancestral genome structure inference. BioMed Central 2012-12-19 /pmc/articles/PMC3526452/ /pubmed/23281593 http://dx.doi.org/10.1186/1471-2105-13-S19-S11 Text en Copyright ©2012 Maňuch et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Maňuch, Ján
Patterson, Murray
Wittler, Roland
Chauve, Cedric
Tannier, Eric
Linearization of ancestral multichromosomal genomes
title Linearization of ancestral multichromosomal genomes
title_full Linearization of ancestral multichromosomal genomes
title_fullStr Linearization of ancestral multichromosomal genomes
title_full_unstemmed Linearization of ancestral multichromosomal genomes
title_short Linearization of ancestral multichromosomal genomes
title_sort linearization of ancestral multichromosomal genomes
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3526452/
https://www.ncbi.nlm.nih.gov/pubmed/23281593
http://dx.doi.org/10.1186/1471-2105-13-S19-S11
work_keys_str_mv AT manuchjan linearizationofancestralmultichromosomalgenomes
AT pattersonmurray linearizationofancestralmultichromosomalgenomes
AT wittlerroland linearizationofancestralmultichromosomalgenomes
AT chauvecedric linearizationofancestralmultichromosomalgenomes
AT tanniereric linearizationofancestralmultichromosomalgenomes