
Parallelization of MAFFT for large-scale multiple sequence alignments


Bibliographic Details
Main authors: Nakamura, Tsukasa; Yamada, Kazunori D; Tomii, Kentaro; Katoh, Kazutaka
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6041967/
https://www.ncbi.nlm.nih.gov/pubmed/29506019
http://dx.doi.org/10.1093/bioinformatics/bty121
Description
Summary: SUMMARY: We report an update for the MAFFT multiple sequence alignment program to enable parallel calculation of large numbers of sequences. The G-INS-1 option of MAFFT was recently reported to have higher accuracy than other methods for large data, but this method has been impractical for most large-scale analyses, due to the requirement of large computational resources. We introduce a scalable variant, G-large-INS-1, which has equivalent accuracy to G-INS-1 and is applicable to 50 000 or more sequences. AVAILABILITY AND IMPLEMENTATION: This feature is available in MAFFT versions 7.355 or later at https://mafft.cbrc.jp/alignment/software/mpi.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.