
Improved Algorithmic Complexity for the 3SEQ Recombination Detection Algorithm

Identifying recombinant sequences in an era of large genomic databases is challenging as it requires an efficient algorithm to identify candidate recombinants and parents, as well as appropriate statistical methods to correct for the large number of comparisons performed. In 2007, a computation was...


Bibliographic Details
Main authors: Lam, Ha Minh; Ratmann, Oliver; Boni, Maciej F
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects: Methods
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5850291/
https://www.ncbi.nlm.nih.gov/pubmed/29029186
http://dx.doi.org/10.1093/molbev/msx263
author Lam, Ha Minh
Ratmann, Oliver
Boni, Maciej F
collection PubMed
description Identifying recombinant sequences in an era of large genomic databases is challenging as it requires an efficient algorithm to identify candidate recombinants and parents, as well as appropriate statistical methods to correct for the large number of comparisons performed. In 2007, a computation was introduced for an exact nonparametric mosaicism statistic that gave high-precision P values for putative recombinants. This exact computation meant that multiple-comparisons corrected P values also had high precision, which is crucial when performing millions or billions of tests in large databases. Here, we introduce an improvement to the algorithmic complexity of this computation from O(mn^3) to O(mn^2), where m and n are the numbers of recombination-informative sites in the candidate recombinant. This new computation allows for recombination analysis to be performed in alignments with thousands of polymorphic sites. Benchmark runs are presented on viral genome sequence alignments, new features are introduced, and applications outside recombination analysis are discussed.
format Online
Article
Text
id pubmed-5850291
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5850291 2018-03-23 Improved Algorithmic Complexity for the 3SEQ Recombination Detection Algorithm Lam, Ha Minh; Ratmann, Oliver; Boni, Maciej F Mol Biol Evol Methods Identifying recombinant sequences in an era of large genomic databases is challenging as it requires an efficient algorithm to identify candidate recombinants and parents, as well as appropriate statistical methods to correct for the large number of comparisons performed. In 2007, a computation was introduced for an exact nonparametric mosaicism statistic that gave high-precision P values for putative recombinants. This exact computation meant that multiple-comparisons corrected P values also had high precision, which is crucial when performing millions or billions of tests in large databases. Here, we introduce an improvement to the algorithmic complexity of this computation from O(mn^3) to O(mn^2), where m and n are the numbers of recombination-informative sites in the candidate recombinant. This new computation allows for recombination analysis to be performed in alignments with thousands of polymorphic sites. Benchmark runs are presented on viral genome sequence alignments, new features are introduced, and applications outside recombination analysis are discussed. Oxford University Press 2018-01 2017-10-03 /pmc/articles/PMC5850291/ /pubmed/29029186 http://dx.doi.org/10.1093/molbev/msx263 Text en © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Improved Algorithmic Complexity for the 3SEQ Recombination Detection Algorithm
topic Methods