Cargando…

Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison

Genetic maps order genetic markers along chromosomes. They are, for instance, extensively used in marker-assisted selection to accelerate breeding programs. Even for the same species, people often have to deal with several alternative maps obtained using different ordering methods or different datas...

Descripción completa

Detalles Bibliográficos
Autores principales:	De Mattéo, Lisa, Holtz, Yan, Ranwez, Vincent, Bérard, Sèverine
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	Public Library of Science 2018
Materias:	Research Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6320017/ https://www.ncbi.nlm.nih.gov/pubmed/30589848 http://dx.doi.org/10.1371/journal.pone.0208838

_version_	1783385151354437632
author	De Mattéo, Lisa Holtz, Yan Ranwez, Vincent Bérard, Sèverine
author_facet	De Mattéo, Lisa Holtz, Yan Ranwez, Vincent Bérard, Sèverine
author_sort	De Mattéo, Lisa
collection	PubMed
description	Genetic maps order genetic markers along chromosomes. They are, for instance, extensively used in marker-assisted selection to accelerate breeding programs. Even for the same species, people often have to deal with several alternative maps obtained using different ordering methods or different datasets, e.g. resulting from different segregating populations. Having efficient tools to identify the consistency and discrepancy of alternative maps is thus essential to facilitate genetic map comparisons. We propose to encode genetic maps by bucket order, a kind of order, which takes into account the blurred parts of the marker order while being an efficient data structure to achieve low complexity algorithms. The main result of this paper is an O(n log(n)) procedure to identify the largest agreements between two bucket orders of n elements, their Longest Common Subsequence (LCS), providing an efficient solution to highlight discrepancies between two genetic maps. The LCS of two maps, being the largest set of their collinear markers, is used as a building block to compute pairwise map congruence, to visually emphasize maker collinearity and in some scaffolding methods relying on genetic maps to improve genome assembly. As the LCS computation is a key subroutine of all these genetic map related tools, replacing the current LCS subroutine of those methods by ours –to do the exact same work but faster– could significantly speed up those methods without changing their accuracy. To ease such transition we provide all required algorithmic details in this self contained paper as well as an R package implementing them, named LCSLCIS, which is freely available at: https://github.com/holtzy/LCSLCIS.
format	Online Article Text
id	pubmed-6320017
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-63200172019-01-19 Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison De Mattéo, Lisa Holtz, Yan Ranwez, Vincent Bérard, Sèverine PLoS One Research Article Genetic maps order genetic markers along chromosomes. They are, for instance, extensively used in marker-assisted selection to accelerate breeding programs. Even for the same species, people often have to deal with several alternative maps obtained using different ordering methods or different datasets, e.g. resulting from different segregating populations. Having efficient tools to identify the consistency and discrepancy of alternative maps is thus essential to facilitate genetic map comparisons. We propose to encode genetic maps by bucket order, a kind of order, which takes into account the blurred parts of the marker order while being an efficient data structure to achieve low complexity algorithms. The main result of this paper is an O(n log(n)) procedure to identify the largest agreements between two bucket orders of n elements, their Longest Common Subsequence (LCS), providing an efficient solution to highlight discrepancies between two genetic maps. The LCS of two maps, being the largest set of their collinear markers, is used as a building block to compute pairwise map congruence, to visually emphasize maker collinearity and in some scaffolding methods relying on genetic maps to improve genome assembly. As the LCS computation is a key subroutine of all these genetic map related tools, replacing the current LCS subroutine of those methods by ours –to do the exact same work but faster– could significantly speed up those methods without changing their accuracy. To ease such transition we provide all required algorithmic details in this self contained paper as well as an R package implementing them, named LCSLCIS, which is freely available at: https://github.com/holtzy/LCSLCIS. Public Library of Science 2018-12-27 /pmc/articles/PMC6320017/ /pubmed/30589848 http://dx.doi.org/10.1371/journal.pone.0208838 Text en © 2018 De Mattéo et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article De Mattéo, Lisa Holtz, Yan Ranwez, Vincent Bérard, Sèverine Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison
title	Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison
title_full	Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison
title_fullStr	Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison
title_full_unstemmed	Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison
title_short	Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison
title_sort	efficient algorithms for longest common subsequence of two bucket orders to speed up pairwise genetic map comparison
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6320017/ https://www.ncbi.nlm.nih.gov/pubmed/30589848 http://dx.doi.org/10.1371/journal.pone.0208838
work_keys_str_mv	AT dematteolisa efficientalgorithmsforlongestcommonsubsequenceoftwobucketorderstospeeduppairwisegeneticmapcomparison AT holtzyan efficientalgorithmsforlongestcommonsubsequenceoftwobucketorderstospeeduppairwisegeneticmapcomparison AT ranwezvincent efficientalgorithmsforlongestcommonsubsequenceoftwobucketorderstospeeduppairwisegeneticmapcomparison AT berardseverine efficientalgorithmsforlongestcommonsubsequenceoftwobucketorderstospeeduppairwisegeneticmapcomparison

Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison

Ejemplares similares