Cargando…

Computing the family-free DCJ similarity

BACKGROUND: The genomic similarity is a large-scale measure for comparing two given genomes. In this work we study the (NP-hard) problem of computing the genomic similarity under the DCJ model in a setting that does not assume that the genes of the compared genomes are grouped into gene families. Th...

Descripción completa

Detalles Bibliográficos
Autores principales: Rubert, Diego P., Hoshino, Edna A., Braga, Marília D. V., Stoye, Jens, Martinez, Fábio V.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2018
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998916/
https://www.ncbi.nlm.nih.gov/pubmed/29745861
http://dx.doi.org/10.1186/s12859-018-2130-5
Descripción
Sumario:BACKGROUND: The genomic similarity is a large-scale measure for comparing two given genomes. In this work we study the (NP-hard) problem of computing the genomic similarity under the DCJ model in a setting that does not assume that the genes of the compared genomes are grouped into gene families. This problem is called family-free DCJ similarity. RESULTS: We propose an exact ILP algorithm to solve the family-free DCJ similarity problem, then we show its APX-hardness and present four combinatorial heuristics with computational experiments comparing their results to the ILP. CONCLUSIONS: We show that the family-free DCJ similarity can be computed in reasonable time, although for larger genomes it is necessary to resort to heuristics. This provides a basis for further studies on the applicability and model refinement of family-free whole genome similarity measures. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2130-5) contains supplementary material, which is available to authorized users.