Parallelization of MAFFT for large-scale multiple sequence alignments

SUMMARY: We report an update for the MAFFT multiple sequence alignment program to enable parallel calculation of large numbers of sequences. The G-INS-1 option of MAFFT was recently reported to have higher accuracy than other methods for large data, but this method has been impractical for most larg...

Bibliographic Details
Main authors: Nakamura, Tsukasa; Yamada, Kazunori D; Tomii, Kentaro; Katoh, Kazutaka
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects: Applications Notes
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6041967/
https://www.ncbi.nlm.nih.gov/pubmed/29506019
http://dx.doi.org/10.1093/bioinformatics/bty121
author Nakamura, Tsukasa
Yamada, Kazunori D
Tomii, Kentaro
Katoh, Kazutaka
collection PubMed
description SUMMARY: We report an update for the MAFFT multiple sequence alignment program to enable parallel calculation of large numbers of sequences. The G-INS-1 option of MAFFT was recently reported to have higher accuracy than other methods for large data, but this method has been impractical for most large-scale analyses, due to the requirement of large computational resources. We introduce a scalable variant, G-large-INS-1, which has equivalent accuracy to G-INS-1 and is applicable to 50 000 or more sequences. AVAILABILITY AND IMPLEMENTATION: This feature is available in MAFFT versions 7.355 or later at https://mafft.cbrc.jp/alignment/software/mpi.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6041967
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-60419672018-07-17 Parallelization of MAFFT for large-scale multiple sequence alignments Nakamura, Tsukasa Yamada, Kazunori D Tomii, Kentaro Katoh, Kazutaka Bioinformatics Applications Notes SUMMARY: We report an update for the MAFFT multiple sequence alignment program to enable parallel calculation of large numbers of sequences. The G-INS-1 option of MAFFT was recently reported to have higher accuracy than other methods for large data, but this method has been impractical for most large-scale analyses, due to the requirement of large computational resources. We introduce a scalable variant, G-large-INS-1, which has equivalent accuracy to G-INS-1 and is applicable to 50 000 or more sequences. AVAILABILITY AND IMPLEMENTATION: This feature is available in MAFFT versions 7.355 or later at https://mafft.cbrc.jp/alignment/software/mpi.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2018-07-15 2018-03-01 /pmc/articles/PMC6041967/ /pubmed/29506019 http://dx.doi.org/10.1093/bioinformatics/bty121 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Parallelization of MAFFT for large-scale multiple sequence alignments
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6041967/
https://www.ncbi.nlm.nih.gov/pubmed/29506019
http://dx.doi.org/10.1093/bioinformatics/bty121