
Efficient assembly of nanopore reads via highly accurate and intact error correction

Long nanopore reads are advantageous in de novo genome assembly. However, nanopore reads usually have broad error distribution and high-error-rate subsequences. Existing error correction tools cannot correct nanopore reads efficiently and effectively. Most methods trim high-error-rate subsequences d...

Bibliographic Details
Main authors: Chen, Ying; Nie, Fan; Xie, Shang-Qian; Zheng, Ying-Feng; Dai, Qi; Bray, Thomas; Wang, Yao-Xin; Xing, Jian-Feng; Huang, Zhi-Jian; Wang, De-Peng; He, Li-Juan; Luo, Feng; Wang, Jian-Xin; Liu, Yi-Zhi; Xiao, Chuan-Le
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7782737/
https://www.ncbi.nlm.nih.gov/pubmed/33397900
http://dx.doi.org/10.1038/s41467-020-20236-7
author Chen, Ying
Nie, Fan
Xie, Shang-Qian
Zheng, Ying-Feng
Dai, Qi
Bray, Thomas
Wang, Yao-Xin
Xing, Jian-Feng
Huang, Zhi-Jian
Wang, De-Peng
He, Li-Juan
Luo, Feng
Wang, Jian-Xin
Liu, Yi-Zhi
Xiao, Chuan-Le
author_sort Chen, Ying
collection PubMed
description Long nanopore reads are advantageous in de novo genome assembly. However, nanopore reads usually have a broad error distribution and high-error-rate subsequences. Existing error correction tools cannot correct nanopore reads efficiently and effectively. Most methods trim high-error-rate subsequences during error correction, which reduces both the length of the reads and the contiguity of the final assembly. Here, we develop an error correction and de novo assembly tool designed to overcome complex errors in nanopore reads. We propose an adaptive read selection and two-step progressive method to quickly correct nanopore reads to high accuracy. We introduce a two-stage assembler to utilize the full length of nanopore reads. Our tool achieves superior performance in both error correction and de novo assembly of nanopore reads. It requires only 8122 hours to assemble a 35X coverage human genome and achieves a 2.47-fold improvement in NG50. Furthermore, our assembly of the human WERI cell line shows an NG50 of 22 Mbp. The high-quality assembly of nanopore reads can significantly reduce false positives in structural variation detection.
format Online
Article
Text
id pubmed-7782737
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7782737 2021-01-11 Efficient assembly of nanopore reads via highly accurate and intact error correction Chen, Ying; Nie, Fan; Xie, Shang-Qian; Zheng, Ying-Feng; Dai, Qi; Bray, Thomas; Wang, Yao-Xin; Xing, Jian-Feng; Huang, Zhi-Jian; Wang, De-Peng; He, Li-Juan; Luo, Feng; Wang, Jian-Xin; Liu, Yi-Zhi; Xiao, Chuan-Le Nat Commun Article Long nanopore reads are advantageous in de novo genome assembly. However, nanopore reads usually have a broad error distribution and high-error-rate subsequences. Existing error correction tools cannot correct nanopore reads efficiently and effectively. Most methods trim high-error-rate subsequences during error correction, which reduces both the length of the reads and the contiguity of the final assembly. Here, we develop an error correction and de novo assembly tool designed to overcome complex errors in nanopore reads. We propose an adaptive read selection and two-step progressive method to quickly correct nanopore reads to high accuracy. We introduce a two-stage assembler to utilize the full length of nanopore reads. Our tool achieves superior performance in both error correction and de novo assembly of nanopore reads. It requires only 8122 hours to assemble a 35X coverage human genome and achieves a 2.47-fold improvement in NG50. Furthermore, our assembly of the human WERI cell line shows an NG50 of 22 Mbp. The high-quality assembly of nanopore reads can significantly reduce false positives in structural variation detection. Nature Publishing Group UK 2021-01-04 /pmc/articles/PMC7782737/ /pubmed/33397900 http://dx.doi.org/10.1038/s41467-020-20236-7 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Efficient assembly of nanopore reads via highly accurate and intact error correction
topic Article