SLDMS: A Tool for Calculating the Overlapping Regions of Sequences

In genome assembly, contig assembly is one of the most important steps. It requires computing the overlapping regions of a large number of DNA sequences, and this computation is usually time-consuming. The time consumption of contig assembly algorithms is an important...

Bibliographic Details
Main authors: Chen, Yu; You, DongLiang; Zhang, TianJiao; Wang, GuoHua
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Plant Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8761809/
https://www.ncbi.nlm.nih.gov/pubmed/35046988
http://dx.doi.org/10.3389/fpls.2021.813036
_version_ 1784633614554628096
author Chen, Yu
You, DongLiang
Zhang, TianJiao
Wang, GuoHua
author_facet Chen, Yu
You, DongLiang
Zhang, TianJiao
Wang, GuoHua
author_sort Chen, Yu
collection PubMed
description In genome assembly, contig assembly is one of the most important steps. It requires computing the overlapping regions of a large number of DNA sequences, and this computation is usually time-consuming. The running time of a contig assembly algorithm is therefore an important indicator of its quality. Existing methods for processing overlapping regions of sequences take too long to run. We therefore propose SLDMS, a method for processing sequence overlapping regions based on a suffix array and a monotonic stack, which effectively improves the efficiency of this processing. The running time of SLDMS is much lower than that of Canu and Flye when computing sequence overlap intervals, and on some data in which most sequencing errors occur at both ends of the sequencing data, SLDMS takes only about one-tenth of the running time of the other two methods.
format Online
Article
Text
id pubmed-8761809
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8761809 2022-01-18 SLDMS: A Tool for Calculating the Overlapping Regions of Sequences Chen, Yu You, DongLiang Zhang, TianJiao Wang, GuoHua Front Plant Sci Plant Science In genome assembly, contig assembly is one of the most important steps. It requires computing the overlapping regions of a large number of DNA sequences, and this computation is usually time-consuming. The running time of a contig assembly algorithm is therefore an important indicator of its quality. Existing methods for processing overlapping regions of sequences take too long to run. We therefore propose SLDMS, a method for processing sequence overlapping regions based on a suffix array and a monotonic stack, which effectively improves the efficiency of this processing. The running time of SLDMS is much lower than that of Canu and Flye when computing sequence overlap intervals, and on some data in which most sequencing errors occur at both ends of the sequencing data, SLDMS takes only about one-tenth of the running time of the other two methods. Frontiers Media S.A. 2022-01-03 /pmc/articles/PMC8761809/ /pubmed/35046988 http://dx.doi.org/10.3389/fpls.2021.813036 Text en Copyright © 2022 Chen, You, Zhang and Wang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Plant Science
Chen, Yu
You, DongLiang
Zhang, TianJiao
Wang, GuoHua
SLDMS: A Tool for Calculating the Overlapping Regions of Sequences
title SLDMS: A Tool for Calculating the Overlapping Regions of Sequences
title_full SLDMS: A Tool for Calculating the Overlapping Regions of Sequences
title_fullStr SLDMS: A Tool for Calculating the Overlapping Regions of Sequences
title_full_unstemmed SLDMS: A Tool for Calculating the Overlapping Regions of Sequences
title_short SLDMS: A Tool for Calculating the Overlapping Regions of Sequences
title_sort sldms: a tool for calculating the overlapping regions of sequences
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8761809/
https://www.ncbi.nlm.nih.gov/pubmed/35046988
http://dx.doi.org/10.3389/fpls.2021.813036
work_keys_str_mv AT chenyu sldmsatoolforcalculatingtheoverlappingregionsofsequences
AT youdongliang sldmsatoolforcalculatingtheoverlappingregionsofsequences
AT zhangtianjiao sldmsatoolforcalculatingtheoverlappingregionsofsequences
AT wangguohua sldmsatoolforcalculatingtheoverlappingregionsofsequences