smsMap: mapping single molecule sequencing reads by locating the alignment starting positions
BACKGROUND: Single Molecule Sequencing (SMS) technology can produce longer reads with a higher sequencing error rate. Mapping these reads to a reference genome is often the most fundamental and computing-intensive step for downstream analysis. Most existing mapping tools generally adopt the traditiona...
Main authors: | Wei, Ze-Gang; Zhang, Shao-Wu; Liu, Fei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7430848/ https://www.ncbi.nlm.nih.gov/pubmed/32753028 http://dx.doi.org/10.1186/s12859-020-03698-w |
_version_ | 1783571493111726080 |
---|---|
author | Wei, Ze-Gang Zhang, Shao-Wu Liu, Fei |
author_facet | Wei, Ze-Gang Zhang, Shao-Wu Liu, Fei |
author_sort | Wei, Ze-Gang |
collection | PubMed |
description | BACKGROUND: Single Molecule Sequencing (SMS) technology can produce longer reads with a higher sequencing error rate. Mapping these reads to a reference genome is often the most fundamental and computing-intensive step for downstream analysis. Most existing mapping tools generally adopt the traditional seed-and-extend strategy, and the candidate aligned regions for each query read are selected either by counting the number of matched seeds or chaining a group of seeds. However, for all the existing mapping tools, the coverage ratio of the alignment region to the query read is lower, and the read alignment quality and efficiency need to be improved. Here, we introduce smsMap, a novel mapping tool that is specifically designed to map the long reads of SMS to a reference genome. RESULTS: smsMap was evaluated against seven other existing SMS mapping tools (e.g., BLASR, minimap2, and BWA-MEM) on both simulated and real-life SMS datasets. The experimental results show that smsMap can efficiently achieve a higher aligned read coverage ratio and has higher sensitivity, aligning more sequences and bases to the reference genome. Additionally, smsMap is more robust to sequencing errors. CONCLUSIONS: smsMap is computationally efficient at aligning SMS reads, especially for larger reference genomes (e.g., the H. sapiens genome with over 3 billion base pairs). The source code of smsMap can be freely downloaded from https://github.com/NWPU-903PR/smsMap. |
format | Online Article Text |
id | pubmed-7430848 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7430848 2020-08-18 smsMap: mapping single molecule sequencing reads by locating the alignment starting positions Wei, Ze-Gang Zhang, Shao-Wu Liu, Fei BMC Bioinformatics Methodology Article BACKGROUND: Single Molecule Sequencing (SMS) technology can produce longer reads with a higher sequencing error rate. Mapping these reads to a reference genome is often the most fundamental and computing-intensive step for downstream analysis. Most existing mapping tools generally adopt the traditional seed-and-extend strategy, and the candidate aligned regions for each query read are selected either by counting the number of matched seeds or chaining a group of seeds. However, for all the existing mapping tools, the coverage ratio of the alignment region to the query read is lower, and the read alignment quality and efficiency need to be improved. Here, we introduce smsMap, a novel mapping tool that is specifically designed to map the long reads of SMS to a reference genome. RESULTS: smsMap was evaluated against seven other existing SMS mapping tools (e.g., BLASR, minimap2, and BWA-MEM) on both simulated and real-life SMS datasets. The experimental results show that smsMap can efficiently achieve a higher aligned read coverage ratio and has higher sensitivity, aligning more sequences and bases to the reference genome. Additionally, smsMap is more robust to sequencing errors. CONCLUSIONS: smsMap is computationally efficient at aligning SMS reads, especially for larger reference genomes (e.g., the H. sapiens genome with over 3 billion base pairs). The source code of smsMap can be freely downloaded from https://github.com/NWPU-903PR/smsMap.
BioMed Central 2020-08-04 /pmc/articles/PMC7430848/ /pubmed/32753028 http://dx.doi.org/10.1186/s12859-020-03698-w Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | smsMap: mapping single molecule sequencing reads by locating the alignment starting positions |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7430848/ https://www.ncbi.nlm.nih.gov/pubmed/32753028 http://dx.doi.org/10.1186/s12859-020-03698-w |
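
The description above mentions that seed-and-extend mappers often pick candidate regions by counting matched seeds. The following is a minimal Python sketch of that general idea only; it is not smsMap's algorithm or code. The function names, the k-mer length, the `bin_size` parameter, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_starts(read, index, k, bin_size=8, top_n=1):
    """Rank candidate alignment start positions by counting matched seeds.

    A seed hit at (read_pos, ref_pos) suggests the read starts near
    ref_pos - read_pos on the reference. Offsets are grouped into bins of
    width bin_size (a hypothetical tolerance for small indel drift).
    Returns (bin_start, vote_count) pairs, best first.
    """
    votes = Counter()
    for read_pos in range(len(read) - k + 1):
        seed = read[read_pos:read_pos + k]
        for ref_pos in index.get(seed, ()):
            votes[(ref_pos - read_pos) // bin_size] += 1
    return [(b * bin_size, n) for b, n in votes.most_common(top_n)]


# Toy data (made up): the read is reference[17:37] with one substitution.
reference = "ACGTTGCAAGGCTTAGCCATGGACTTCAGGATCCAGTACGGATTACA"
read = "CATGGACTTCAGGTTCCAGT"
index = build_kmer_index(reference, k=5)
best = candidate_starts(read, index, k=5)
print(best)
```

The top-ranked bin only narrows down where an alignment might begin; a real mapper would follow this with an extension or chaining step and a base-level alignment, which this sketch omits.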