FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications
Correct and bias-free interpretation of the deep sequencing data is inevitably dependent on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BW...
Main Authors: | Xiao, Chuan-Le; Mai, Zhi-Biao; Lian, Xin-Lei; Zhong, Jia-Yong; Jin, Jing-jie; He, Qing-Yu; Zhang, Gong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990525/ https://www.ncbi.nlm.nih.gov/pubmed/24743329 http://dx.doi.org/10.1371/journal.pone.0094250 |
_version_ | 1782312294881951744 |
---|---|
author | Xiao, Chuan-Le Mai, Zhi-Biao Lian, Xin-Lei Zhong, Jia-Yong Jin, Jing-jie He, Qing-Yu Zhang, Gong |
author_facet | Xiao, Chuan-Le Mai, Zhi-Biao Lian, Xin-Lei Zhong, Jia-Yong Jin, Jing-jie He, Qing-Yu Zhang, Gong |
author_sort | Xiao, Chuan-Le |
collection | PubMed |
description | Correct and bias-free interpretation of deep sequencing data is inevitably dependent on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BWT) based algorithms are fast but less robust. To have both advantages, we developed an algorithm, FANSe2, with an iterative mapping strategy based on the statistics of real-world sequencing error distribution to substantially accelerate the mapping without compromising the accuracy. Its sensitivity and accuracy are higher than those of the BWT-based algorithms in the tests using both prokaryotic and eukaryotic sequencing datasets. The gene identification results of FANSe2 are experimentally validated, while the previous algorithms have false positives and false negatives. FANSe2 showed remarkably better consistency with the microarray than most other algorithms in terms of gene expression quantification. We implemented a scalable and almost maintenance-free parallelization method that can utilize the computational power of multiple office computers, a novel feature not present in any other mainstream algorithm. With three normal office computers, we demonstrated that FANSe2 mapped an RNA-seq dataset generated from an entire Illumina HiSeq 2000 flowcell (8 lanes, 608 M reads) to the masked human genome within 4.1 hours with higher sensitivity than Bowtie/Bowtie2. FANSe2 thus provides robust accuracy, full indel sensitivity, fast speed, versatile compatibility and economical computational utilization, making it a useful and practical tool for deep sequencing applications. FANSe2 is freely available at http://bioinformatics.jnu.edu.cn/software/fanse2/. |
format | Online Article Text |
id | pubmed-3990525 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3990525 2014-04-21 FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications Xiao, Chuan-Le Mai, Zhi-Biao Lian, Xin-Lei Zhong, Jia-Yong Jin, Jing-jie He, Qing-Yu Zhang, Gong PLoS One Research Article Correct and bias-free interpretation of deep sequencing data is inevitably dependent on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BWT) based algorithms are fast but less robust. To have both advantages, we developed an algorithm, FANSe2, with an iterative mapping strategy based on the statistics of real-world sequencing error distribution to substantially accelerate the mapping without compromising the accuracy. Its sensitivity and accuracy are higher than those of the BWT-based algorithms in the tests using both prokaryotic and eukaryotic sequencing datasets. The gene identification results of FANSe2 are experimentally validated, while the previous algorithms have false positives and false negatives. FANSe2 showed remarkably better consistency with the microarray than most other algorithms in terms of gene expression quantification. We implemented a scalable and almost maintenance-free parallelization method that can utilize the computational power of multiple office computers, a novel feature not present in any other mainstream algorithm. With three normal office computers, we demonstrated that FANSe2 mapped an RNA-seq dataset generated from an entire Illumina HiSeq 2000 flowcell (8 lanes, 608 M reads) to the masked human genome within 4.1 hours with higher sensitivity than Bowtie/Bowtie2. FANSe2 thus provides robust accuracy, full indel sensitivity, fast speed, versatile compatibility and economical computational utilization, making it a useful and practical tool for deep sequencing applications. FANSe2 is freely available at http://bioinformatics.jnu.edu.cn/software/fanse2/. Public Library of Science 2014-04-17 /pmc/articles/PMC3990525/ /pubmed/24743329 http://dx.doi.org/10.1371/journal.pone.0094250 Text en © 2014 Xiao et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Xiao, Chuan-Le Mai, Zhi-Biao Lian, Xin-Lei Zhong, Jia-Yong Jin, Jing-jie He, Qing-Yu Zhang, Gong FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications |
title | FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications |
title_full | FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications |
title_fullStr | FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications |
title_full_unstemmed | FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications |
title_short | FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications |
title_sort | fanse2: a robust and cost-efficient alignment tool for quantitative next-generation sequencing applications |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990525/ https://www.ncbi.nlm.nih.gov/pubmed/24743329 http://dx.doi.org/10.1371/journal.pone.0094250 |
work_keys_str_mv | AT xiaochuanle fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications AT maizhibiao fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications AT lianxinlei fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications AT zhongjiayong fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications AT jinjingjie fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications AT heqingyu fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications AT zhanggong fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications |