
Evaluating the Accuracy and Efficiency of Multiple Sequence Alignment Methods

A comparison of the 10 most popular multiple sequence alignment (MSA) tools, namely MUSCLE, MAFFT (L-INS-i), MAFFT (FFT-NS-2), T-Coffee, ProbCons, SATe, Clustal Omega, Kalign, Multalin, and Dialign-TX, is presented. We also focused on the significance of some implementations embedded in the algorithm of each...


Bibliographic Details
Main authors: Pervez, Muhammad Tariq, Babar, Masroor Ellahi, Nadeem, Asif, Aslam, Muhammad, Awan, Ali Raza, Aslam, Naeem, Hussain, Tanveer, Naveed, Nasir, Qadri, Salman, Waheed, Usman, Shoaib, Muhammad
Format: Online Article Text
Language: English
Published: Libertas Academica 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267518/
https://www.ncbi.nlm.nih.gov/pubmed/25574120
http://dx.doi.org/10.4137/EBO.S19199
_version_ 1782349156127342592
author Pervez, Muhammad Tariq
Babar, Masroor Ellahi
Nadeem, Asif
Aslam, Muhammad
Awan, Ali Raza
Aslam, Naeem
Hussain, Tanveer
Naveed, Nasir
Qadri, Salman
Waheed, Usman
Shoaib, Muhammad
author_sort Pervez, Muhammad Tariq
collection PubMed
description A comparison of the 10 most popular multiple sequence alignment (MSA) tools, namely MUSCLE, MAFFT (L-INS-i), MAFFT (FFT-NS-2), T-Coffee, ProbCons, SATe, Clustal Omega, Kalign, Multalin, and Dialign-TX, is presented. We also focused on the significance of some implementations embedded in the algorithm of each tool. Based on 10 simulated trees with different numbers of taxa generated by R, 400 known alignments and sequence files were constructed using indel-Seq-Gen. A total of 4000 test alignments were generated to study the effect of sequence length, indel size, deletion rate, and insertion rate. Results showed that alignment quality was highly dependent on the number of deletions and insertions in the sequences and that sequence length and indel size had a weaker effect. Overall, ProbCons was consistently at the top of the list of evaluated MSA tools. SATe, being slightly less accurate, was 529.10% faster than ProbCons and 236.72% faster than MAFFT (L-INS-i). Among the other tools, Kalign and MUSCLE achieved the highest sum-of-pairs scores. We also considered BAliBASE benchmark datasets, and the results for BAliBASE- and indel-Seq-Gen-generated alignments were consistent in most cases.
format Online
Article
Text
id pubmed-4267518
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-4267518 2015-01-08 Evaluating the Accuracy and Efficiency of Multiple Sequence Alignment Methods Pervez, Muhammad Tariq Babar, Masroor Ellahi Nadeem, Asif Aslam, Muhammad Awan, Ali Raza Aslam, Naeem Hussain, Tanveer Naveed, Nasir Qadri, Salman Waheed, Usman Shoaib, Muhammad Evol Bioinform Online Original Research A comparison of the 10 most popular multiple sequence alignment (MSA) tools, namely MUSCLE, MAFFT (L-INS-i), MAFFT (FFT-NS-2), T-Coffee, ProbCons, SATe, Clustal Omega, Kalign, Multalin, and Dialign-TX, is presented. We also focused on the significance of some implementations embedded in the algorithm of each tool. Based on 10 simulated trees with different numbers of taxa generated by R, 400 known alignments and sequence files were constructed using indel-Seq-Gen. A total of 4000 test alignments were generated to study the effect of sequence length, indel size, deletion rate, and insertion rate. Results showed that alignment quality was highly dependent on the number of deletions and insertions in the sequences and that sequence length and indel size had a weaker effect. Overall, ProbCons was consistently at the top of the list of evaluated MSA tools. SATe, being slightly less accurate, was 529.10% faster than ProbCons and 236.72% faster than MAFFT (L-INS-i). Among the other tools, Kalign and MUSCLE achieved the highest sum-of-pairs scores. We also considered BAliBASE benchmark datasets, and the results for BAliBASE- and indel-Seq-Gen-generated alignments were consistent in most cases. Libertas Academica 2014-12-07 /pmc/articles/PMC4267518/ /pubmed/25574120 http://dx.doi.org/10.4137/EBO.S19199 Text en © 2014 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
title Evaluating the Accuracy and Efficiency of Multiple Sequence Alignment Methods
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267518/
https://www.ncbi.nlm.nih.gov/pubmed/25574120
http://dx.doi.org/10.4137/EBO.S19199