Evaluating the Accuracy and Efficiency of Multiple Sequence Alignment Methods
A comparison of the 10 most popular Multiple Sequence Alignment (MSA) tools, namely MUSCLE, MAFFT (L-INS-i), MAFFT (FFT-NS-2), T-Coffee, ProbCons, SATe, Clustal Omega, Kalign, Multalin, and Dialign-TX, is presented. We also focused on the significance of some implementations embedded in the algorithm of each...
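Main authors: | Pervez, Muhammad Tariq; Babar, Masroor Ellahi; Nadeem, Asif; Aslam, Muhammad; Awan, Ali Raza; Aslam, Naeem; Hussain, Tanveer; Naveed, Nasir; Qadri, Salman; Waheed, Usman; Shoaib, Muhammad |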
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Libertas Academica 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267518/ https://www.ncbi.nlm.nih.gov/pubmed/25574120 http://dx.doi.org/10.4137/EBO.S19199 |
_version_ | 1782349156127342592 |
---|---|
author | Pervez, Muhammad Tariq Babar, Masroor Ellahi Nadeem, Asif Aslam, Muhammad Awan, Ali Raza Aslam, Naeem Hussain, Tanveer Naveed, Nasir Qadri, Salman Waheed, Usman Shoaib, Muhammad |
author_facet | Pervez, Muhammad Tariq Babar, Masroor Ellahi Nadeem, Asif Aslam, Muhammad Awan, Ali Raza Aslam, Naeem Hussain, Tanveer Naveed, Nasir Qadri, Salman Waheed, Usman Shoaib, Muhammad |
author_sort | Pervez, Muhammad Tariq |
collection | PubMed |
description | A comparison of the 10 most popular Multiple Sequence Alignment (MSA) tools, namely MUSCLE, MAFFT (L-INS-i), MAFFT (FFT-NS-2), T-Coffee, ProbCons, SATe, Clustal Omega, Kalign, Multalin, and Dialign-TX, is presented. We also focused on the significance of some implementations embedded in the algorithm of each tool. From 10 simulated trees with different numbers of taxa, generated in R, 400 known alignments and sequence files were constructed using indel-Seq-Gen. A total of 4000 test alignments were generated to study the effect of sequence length, indel size, deletion rate, and insertion rate. Results showed that alignment quality was highly dependent on the number of deletions and insertions in the sequences, and that sequence length and indel size had a weaker effect. Overall, ProbCons was consistently at the top of the list of evaluated MSA tools. SATe, while slightly less accurate, was 529.10% faster than ProbCons and 236.72% faster than MAFFT (L-INS-i). Among the other tools, Kalign and MUSCLE achieved the highest sum-of-pairs scores. We also considered the BAliBASE benchmark datasets, and the results for BAliBASE- and indel-Seq-Gen-generated alignments were consistent in most cases. |
format | Online Article Text |
id | pubmed-4267518 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Libertas Academica |
record_format | MEDLINE/PubMed |
spelling | pubmed-4267518 2015-01-08 Evaluating the Accuracy and Efficiency of Multiple Sequence Alignment Methods Evol Bioinform Online Original Research Libertas Academica 2014-12-07 /pmc/articles/PMC4267518/ /pubmed/25574120 http://dx.doi.org/10.4137/EBO.S19199 Text en © 2014 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License. |
spellingShingle | Original Research Pervez, Muhammad Tariq Babar, Masroor Ellahi Nadeem, Asif Aslam, Muhammad Awan, Ali Raza Aslam, Naeem Hussain, Tanveer Naveed, Nasir Qadri, Salman Waheed, Usman Shoaib, Muhammad Evaluating the Accuracy and Efficiency of Multiple Sequence Alignment Methods |
title | Evaluating the Accuracy and Efficiency of Multiple Sequence Alignment Methods |
title_sort | evaluating the accuracy and efficiency of multiple sequence alignment methods |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267518/ https://www.ncbi.nlm.nih.gov/pubmed/25574120 http://dx.doi.org/10.4137/EBO.S19199 |
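
The description above ranks tools by their sum of pairs. As a rough illustration only, the sketch below computes a sum-of-pairs score in the common benchmark sense: the fraction of residue pairs aligned in a known reference alignment that are also aligned in a test alignment. This definition, the function names `aligned_pairs` and `sum_of_pairs_score`, and the toy sequences are assumptions made for illustration; the record does not state how the study computed its scores.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length strings with '-' for gaps.
    Each residue is identified as (sequence index, residue index), so the
    result does not depend on where gaps were placed.
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows must have the same length")
    counters = [0] * len(alignment)  # residues seen so far in each sequence
    pairs = set()
    for col in range(n_cols):
        present = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != "-":
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        # Every pair of residues in this column counts as aligned.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs also aligned in `test`."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Hypothetical toy data: the same two sequences aligned in two ways.
reference = ["AC-GT", "ACTGT"]
test = ["ACG-T", "ACTGT"]
score = sum_of_pairs_score(test, reference)
```

A score of 1.0 means the test alignment reproduces every aligned residue pair of the reference; lower values indicate misplaced gaps.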