pblat: a multithread blat algorithm speeding up aligning sequences to genomes
BACKGROUND: The blat is a widely used sequence alignment tool. It is especially useful for aligning long sequences and gapped mapping, which cannot be performed properly by other fast sequence mappers designed for short reads. However, the blat tool is single threaded and when used to map whole geno...
Main Authors: | Wang, Meng; Kong, Lei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6334396/ https://www.ncbi.nlm.nih.gov/pubmed/30646844 http://dx.doi.org/10.1186/s12859-019-2597-8 |
_version_ | 1783387706138558464 |
---|---|
author | Wang, Meng Kong, Lei |
author_facet | Wang, Meng Kong, Lei |
author_sort | Wang, Meng |
collection | PubMed |
description | BACKGROUND: The blat is a widely used sequence alignment tool. It is especially useful for aligning long sequences and gapped mapping, which cannot be performed properly by other fast sequence mappers designed for short reads. However, the blat tool is single threaded and when used to map whole genome or whole transcriptome sequences to reference genomes this program can take days to finish, making it unsuitable for large scale sequencing projects and iterative analysis. Here, we present pblat (parallel blat), a parallelized blat algorithm with multithread and cluster computing support, which functions to rapidly fine map large scale DNA/RNA sequences against genomes. RESULTS: The pblat algorithm takes advantage of modern multicore processors and significantly reduces the run time with the number of threads used. pblat utilizes almost equal amount of memory as when running blat. The results generated by pblat are identical with those generated by blat. The pblat tool is easy to install and can run on Linux and Mac OS systems. In addition, we provide a cluster version of pblat (pblat-cluster) running on computing clusters with MPI support. CONCLUSION: pblat is open source and free available for non-commercial users. It is easy to install and easy to use. pblat and pblat-cluster would facilitate the high-throughput mapping of large scale genomic and transcript sequences to reference genomes with both high speed and high precision. |
format | Online Article Text |
id | pubmed-6334396 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-63343962019-01-23 pblat: a multithread blat algorithm speeding up aligning sequences to genomes Wang, Meng Kong, Lei BMC Bioinformatics Software BACKGROUND: The blat is a widely used sequence alignment tool. It is especially useful for aligning long sequences and gapped mapping, which cannot be performed properly by other fast sequence mappers designed for short reads. However, the blat tool is single threaded and when used to map whole genome or whole transcriptome sequences to reference genomes this program can take days to finish, making it unsuitable for large scale sequencing projects and iterative analysis. Here, we present pblat (parallel blat), a parallelized blat algorithm with multithread and cluster computing support, which functions to rapidly fine map large scale DNA/RNA sequences against genomes. RESULTS: The pblat algorithm takes advantage of modern multicore processors and significantly reduces the run time with the number of threads used. pblat utilizes almost equal amount of memory as when running blat. The results generated by pblat are identical with those generated by blat. The pblat tool is easy to install and can run on Linux and Mac OS systems. In addition, we provide a cluster version of pblat (pblat-cluster) running on computing clusters with MPI support. CONCLUSION: pblat is open source and free available for non-commercial users. It is easy to install and easy to use. pblat and pblat-cluster would facilitate the high-throughput mapping of large scale genomic and transcript sequences to reference genomes with both high speed and high precision. BioMed Central 2019-01-15 /pmc/articles/PMC6334396/ /pubmed/30646844 http://dx.doi.org/10.1186/s12859-019-2597-8 Text en © The Author(s). 
2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Wang, Meng Kong, Lei pblat: a multithread blat algorithm speeding up aligning sequences to genomes |
title | pblat: a multithread blat algorithm speeding up aligning sequences to genomes |
title_full | pblat: a multithread blat algorithm speeding up aligning sequences to genomes |
title_fullStr | pblat: a multithread blat algorithm speeding up aligning sequences to genomes |
title_full_unstemmed | pblat: a multithread blat algorithm speeding up aligning sequences to genomes |
title_short | pblat: a multithread blat algorithm speeding up aligning sequences to genomes |
title_sort | pblat: a multithread blat algorithm speeding up aligning sequences to genomes |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6334396/ https://www.ncbi.nlm.nih.gov/pubmed/30646844 http://dx.doi.org/10.1186/s12859-019-2597-8 |
work_keys_str_mv | AT wangmeng pblatamultithreadblatalgorithmspeedingupaligningsequencestogenomes AT konglei pblatamultithreadblatalgorithmspeedingupaligningsequencestogenomes |