
FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications

Correct and bias-free interpretation of the deep sequencing data is inevitably dependent on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BW...


Bibliographic Details
Main Authors: Xiao, Chuan-Le, Mai, Zhi-Biao, Lian, Xin-Lei, Zhong, Jia-Yong, Jin, Jing-jie, He, Qing-Yu, Zhang, Gong
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990525/
https://www.ncbi.nlm.nih.gov/pubmed/24743329
http://dx.doi.org/10.1371/journal.pone.0094250
_version_ 1782312294881951744
author Xiao, Chuan-Le
Mai, Zhi-Biao
Lian, Xin-Lei
Zhong, Jia-Yong
Jin, Jing-jie
He, Qing-Yu
Zhang, Gong
author_facet Xiao, Chuan-Le
Mai, Zhi-Biao
Lian, Xin-Lei
Zhong, Jia-Yong
Jin, Jing-jie
He, Qing-Yu
Zhang, Gong
author_sort Xiao, Chuan-Le
collection PubMed
description Correct and bias-free interpretation of the deep sequencing data is inevitably dependent on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BWT) based algorithms are fast but less robust. To combine both advantages, we developed an algorithm, FANSe2, with an iterative mapping strategy based on the statistics of real-world sequencing error distribution to substantially accelerate the mapping without compromising accuracy. Its sensitivity and accuracy are higher than those of the BWT-based algorithms in tests using both prokaryotic and eukaryotic sequencing datasets. The gene identification results of FANSe2 are experimentally validated, while the previous algorithms have false positives and false negatives. FANSe2 showed remarkably better consistency with the microarray than most other algorithms in terms of gene expression quantification. We implemented a scalable and almost maintenance-free parallelization method that can utilize the computational power of multiple office computers, a novel feature not present in any other mainstream algorithm. With three normal office computers, we demonstrated that FANSe2 mapped an RNA-seq dataset generated from an entire Illumina HiSeq 2000 flowcell (8 lanes, 608 M reads) to the masked human genome within 4.1 hours with higher sensitivity than Bowtie/Bowtie2. FANSe2 thus provides robust accuracy, full indel sensitivity, fast speed, versatile compatibility and economical computational utilization, making it a useful and practical tool for deep sequencing applications. FANSe2 is freely available at http://bioinformatics.jnu.edu.cn/software/fanse2/.
format Online
Article
Text
id pubmed-3990525
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3990525 2014-04-21 FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications Xiao, Chuan-Le Mai, Zhi-Biao Lian, Xin-Lei Zhong, Jia-Yong Jin, Jing-jie He, Qing-Yu Zhang, Gong PLoS One Research Article Correct and bias-free interpretation of the deep sequencing data is inevitably dependent on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BWT) based algorithms are fast but less robust. To combine both advantages, we developed an algorithm, FANSe2, with an iterative mapping strategy based on the statistics of real-world sequencing error distribution to substantially accelerate the mapping without compromising accuracy. Its sensitivity and accuracy are higher than those of the BWT-based algorithms in tests using both prokaryotic and eukaryotic sequencing datasets. The gene identification results of FANSe2 are experimentally validated, while the previous algorithms have false positives and false negatives. FANSe2 showed remarkably better consistency with the microarray than most other algorithms in terms of gene expression quantification. We implemented a scalable and almost maintenance-free parallelization method that can utilize the computational power of multiple office computers, a novel feature not present in any other mainstream algorithm. With three normal office computers, we demonstrated that FANSe2 mapped an RNA-seq dataset generated from an entire Illumina HiSeq 2000 flowcell (8 lanes, 608 M reads) to the masked human genome within 4.1 hours with higher sensitivity than Bowtie/Bowtie2. FANSe2 thus provides robust accuracy, full indel sensitivity, fast speed, versatile compatibility and economical computational utilization, making it a useful and practical tool for deep sequencing applications.
FANSe2 is freely available at http://bioinformatics.jnu.edu.cn/software/fanse2/. Public Library of Science 2014-04-17 /pmc/articles/PMC3990525/ /pubmed/24743329 http://dx.doi.org/10.1371/journal.pone.0094250 Text en © 2014 Xiao et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Xiao, Chuan-Le
Mai, Zhi-Biao
Lian, Xin-Lei
Zhong, Jia-Yong
Jin, Jing-jie
He, Qing-Yu
Zhang, Gong
FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications
title FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications
title_full FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications
title_fullStr FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications
title_full_unstemmed FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications
title_short FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications
title_sort fanse2: a robust and cost-efficient alignment tool for quantitative next-generation sequencing applications
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990525/
https://www.ncbi.nlm.nih.gov/pubmed/24743329
http://dx.doi.org/10.1371/journal.pone.0094250
work_keys_str_mv AT xiaochuanle fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications
AT maizhibiao fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications
AT lianxinlei fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications
AT zhongjiayong fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications
AT jinjingjie fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications
AT heqingyu fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications
AT zhanggong fanse2arobustandcostefficientalignmenttoolforquantitativenextgenerationsequencingapplications