IMOS: improved Meta-aligner and Minimap2 On Spark

BACKGROUND: Long reads provide valuable information regarding the sequence composition of genomes. Long reads are usually very noisy, which makes aligning them to the reference genome a daunting task. Processing a dataset large enough to sequence a human genome may take days on a single node. Hence,...

Bibliographic Details
Main authors: Hadadian Nejad Yousefi, Mostafa; Goudarzi, Maziar; Motahari, Seyed Abolfazl
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6345043/
https://www.ncbi.nlm.nih.gov/pubmed/30678641
http://dx.doi.org/10.1186/s12859-018-2592-5
_version_ 1783389515809816576
author Hadadian Nejad Yousefi, Mostafa
Goudarzi, Maziar
Motahari, Seyed Abolfazl
author_facet Hadadian Nejad Yousefi, Mostafa
Goudarzi, Maziar
Motahari, Seyed Abolfazl
author_sort Hadadian Nejad Yousefi, Mostafa
collection PubMed
description BACKGROUND: Long reads provide valuable information regarding the sequence composition of genomes. Long reads are usually very noisy, which makes aligning them to the reference genome a daunting task. Processing a dataset large enough to sequence a human genome may take days on a single node. Hence, it is of primary importance to have an aligner that can run on distributed clusters of computers with high accuracy and speed. RESULTS: In this paper, we present IMOS, an aligner for mapping noisy long reads to the reference genome. It can be used on a single node as well as on distributed nodes. In its single-node mode, IMOS is an Improved version of Meta-aligner (IM) that enhances both its accuracy and speed. IM is up to 6x faster than the original Meta-aligner. IMOS is also implemented to run IM and Minimap2 on Apache Spark for deployment on a cluster of nodes. Moreover, multi-node IMOS is faster than SparkBWA while executing both IM (1.5x) and Minimap2 (25x). CONCLUSION: In this paper, we proposed an architecture for mapping long reads to a reference. Due to its implementation, IMOS speed can increase almost linearly with the number of nodes in a cluster. Also, it is a multi-platform application able to operate on Linux, Windows, and macOS.
format Online
Article
Text
id pubmed-6345043
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-63450432019-01-29 IMOS: improved Meta-aligner and Minimap2 On Spark Hadadian Nejad Yousefi, Mostafa Goudarzi, Maziar Motahari, Seyed Abolfazl BMC Bioinformatics Software BACKGROUND: Long reads provide valuable information regarding the sequence composition of genomes. Long reads are usually very noisy, which makes aligning them to the reference genome a daunting task. Processing a dataset large enough to sequence a human genome may take days on a single node. Hence, it is of primary importance to have an aligner that can run on distributed clusters of computers with high accuracy and speed. RESULTS: In this paper, we present IMOS, an aligner for mapping noisy long reads to the reference genome. It can be used on a single node as well as on distributed nodes. In its single-node mode, IMOS is an Improved version of Meta-aligner (IM) that enhances both its accuracy and speed. IM is up to 6x faster than the original Meta-aligner. IMOS is also implemented to run IM and Minimap2 on Apache Spark for deployment on a cluster of nodes. Moreover, multi-node IMOS is faster than SparkBWA while executing both IM (1.5x) and Minimap2 (25x). CONCLUSION: In this paper, we proposed an architecture for mapping long reads to a reference. Due to its implementation, IMOS speed can increase almost linearly with the number of nodes in a cluster. Also, it is a multi-platform application able to operate on Linux, Windows, and macOS.
BioMed Central 2019-01-24 /pmc/articles/PMC6345043/ /pubmed/30678641 http://dx.doi.org/10.1186/s12859-018-2592-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Hadadian Nejad Yousefi, Mostafa
Goudarzi, Maziar
Motahari, Seyed Abolfazl
IMOS: improved Meta-aligner and Minimap2 On Spark
title IMOS: improved Meta-aligner and Minimap2 On Spark
title_full IMOS: improved Meta-aligner and Minimap2 On Spark
title_fullStr IMOS: improved Meta-aligner and Minimap2 On Spark
title_full_unstemmed IMOS: improved Meta-aligner and Minimap2 On Spark
title_short IMOS: improved Meta-aligner and Minimap2 On Spark
title_sort imos: improved meta-aligner and minimap2 on spark
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6345043/
https://www.ncbi.nlm.nih.gov/pubmed/30678641
http://dx.doi.org/10.1186/s12859-018-2592-5
work_keys_str_mv AT hadadiannejadyousefimostafa imosimprovedmetaalignerandminimap2onspark
AT goudarzimaziar imosimprovedmetaalignerandminimap2onspark
AT motahariseyedabolfazl imosimprovedmetaalignerandminimap2onspark