Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq
Next-generation sequencing technologies provide an unparalleled opportunity for the characterization and discovery of known and novel viruses. Because viruses are known to have the highest mutation rates when compared to eukaryotic and bacterial organisms, we assess the extent to which eleven well-...
Main Authors: | Borozan, Ivan; Watt, Stuart N.; Ferretti, Vincent |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3813700/ https://www.ncbi.nlm.nih.gov/pubmed/24204709 http://dx.doi.org/10.1371/journal.pone.0076935 |
_version_ | 1782289144357060608 |
---|---|
author | Borozan, Ivan Watt, Stuart N. Ferretti, Vincent |
author_facet | Borozan, Ivan Watt, Stuart N. Ferretti, Vincent |
author_sort | Borozan, Ivan |
collection | PubMed |
description | Next-generation sequencing technologies provide an unparalleled opportunity for the characterization and discovery of known and novel viruses. Because viruses are known to have the highest mutation rates when compared to eukaryotic and bacterial organisms, we assess the extent to which eleven well-known alignment algorithms (BLAST, BLAT, BWA, BWA-SW, BWA-MEM, BFAST, Bowtie2, Novoalign, GSNAP, SHRiMP2 and STAR) can be used for characterizing mutated and non-mutated viral sequences - including those that exhibit RNA splicing - in transcriptome samples. To evaluate aligners objectively, we developed a realistic RNA-Seq simulation and evaluation framework (RiSER) and propose a new combined score to rank aligners for viral characterization in terms of their precision, sensitivity and alignment accuracy. We used RiSER to simulate both human and viral read sequences and suggest the best set of aligners for viral sequence characterization in human transcriptome samples. Our results show that significant and substantial differences exist between aligners and that a digital-subtraction-based viral identification framework can and should use different aligners for different parts of the process. We determine the extent to which mutated viral sequences can be effectively characterized and show that more sensitive aligners such as BLAST, BFAST, SHRiMP2, BWA-SW and GSNAP can accurately characterize substantially divergent viral sequences with up to 15% overall sequence mutation rate. We believe that the results presented here will be useful to researchers choosing aligners for viral sequence characterization using next-generation sequencing data. |
format | Online Article Text |
id | pubmed-3813700 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3813700 2013-11-07 Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq Borozan, Ivan Watt, Stuart N. Ferretti, Vincent PLoS One Research Article Next-generation sequencing technologies provide an unparalleled opportunity for the characterization and discovery of known and novel viruses. Because viruses are known to have the highest mutation rates when compared to eukaryotic and bacterial organisms, we assess the extent to which eleven well-known alignment algorithms (BLAST, BLAT, BWA, BWA-SW, BWA-MEM, BFAST, Bowtie2, Novoalign, GSNAP, SHRiMP2 and STAR) can be used for characterizing mutated and non-mutated viral sequences - including those that exhibit RNA splicing - in transcriptome samples. To evaluate aligners objectively, we developed a realistic RNA-Seq simulation and evaluation framework (RiSER) and propose a new combined score to rank aligners for viral characterization in terms of their precision, sensitivity and alignment accuracy. We used RiSER to simulate both human and viral read sequences and suggest the best set of aligners for viral sequence characterization in human transcriptome samples. Our results show that significant and substantial differences exist between aligners and that a digital-subtraction-based viral identification framework can and should use different aligners for different parts of the process. We determine the extent to which mutated viral sequences can be effectively characterized and show that more sensitive aligners such as BLAST, BFAST, SHRiMP2, BWA-SW and GSNAP can accurately characterize substantially divergent viral sequences with up to 15% overall sequence mutation rate. We believe that the results presented here will be useful to researchers choosing aligners for viral sequence characterization using next-generation sequencing data. 
Public Library of Science 2013-10-30 /pmc/articles/PMC3813700/ /pubmed/24204709 http://dx.doi.org/10.1371/journal.pone.0076935 Text en © 2013 Borozan et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Borozan, Ivan Watt, Stuart N. Ferretti, Vincent Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq |
title | Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq |
title_full | Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq |
title_fullStr | Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq |
title_full_unstemmed | Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq |
title_short | Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq |
title_sort | evaluation of alignment algorithms for discovery and identification of pathogens using rna-seq |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3813700/ https://www.ncbi.nlm.nih.gov/pubmed/24204709 http://dx.doi.org/10.1371/journal.pone.0076935 |
work_keys_str_mv | AT borozanivan evaluationofalignmentalgorithmsfordiscoveryandidentificationofpathogensusingrnaseq AT wattstuartn evaluationofalignmentalgorithmsfordiscoveryandidentificationofpathogensusingrnaseq AT ferrettivincent evaluationofalignmentalgorithmsfordiscoveryandidentificationofpathogensusingrnaseq |
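
The description field above mentions ranking aligners by precision, sensitivity and alignment accuracy. As a rough illustration only, the Python sketch below computes those three quantities from per-aligner tallies and sorts aligners by a combined value. The record does not give the combined-score formula used in RiSER, so the unweighted mean here is an assumption, and `AlignerCounts`, its field names and the example figures are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlignerCounts:
    """Hypothetical per-aligner tallies from a simulated read set."""
    name: str
    true_positives: int    # viral reads aligned to a viral reference
    false_positives: int   # non-viral reads aligned to a viral reference
    false_negatives: int   # viral reads that were not aligned
    correctly_placed: int  # true positives aligned at the simulated position


def precision(c: AlignerCounts) -> float:
    """Fraction of reads reported as viral that really are viral."""
    called = c.true_positives + c.false_positives
    return c.true_positives / called if called else 0.0


def sensitivity(c: AlignerCounts) -> float:
    """Fraction of the simulated viral reads that were recovered."""
    actual = c.true_positives + c.false_negatives
    return c.true_positives / actual if actual else 0.0


def alignment_accuracy(c: AlignerCounts) -> float:
    """Fraction of recovered viral reads placed at the simulated position."""
    return c.correctly_placed / c.true_positives if c.true_positives else 0.0


def combined_score(c: AlignerCounts) -> float:
    # Assumption: an unweighted mean of the three metrics.
    # This is NOT the formula from the paper, which this record does not state.
    return (precision(c) + sensitivity(c) + alignment_accuracy(c)) / 3


def rank_aligners(counts: List[AlignerCounts]) -> List[AlignerCounts]:
    """Return aligners ordered from highest to lowest combined score."""
    return sorted(counts, key=combined_score, reverse=True)


if __name__ == "__main__":
    # Made-up numbers for two placeholder aligners, not results from the paper.
    tallies = [
        AlignerCounts("aligner_a", 90, 10, 10, 81),
        AlignerCounts("aligner_b", 50, 0, 50, 50),
    ]
    for c in rank_aligners(tallies):
        print(f"{c.name}: {combined_score(c):.3f}")
```

A weighted mean or a harmonic mean would be equally plausible choices for the combined value; which one suits a given study depends on how heavily false positives should be penalized relative to missed viral reads.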