Efficient and unique cobarcoding of second-generation sequencing reads from long DNA molecules enabling cost-effective and accurate sequencing, haplotyping, and de novo assembly

Here, we describe single-tube long fragment read (stLFR), a technology that enables sequencing of data from long DNA molecules using economical second-generation sequencing technology. It is based on adding the same barcode sequence to subfragments of the original long DNA molecule (DNA cobarcoding)...

Bibliographic Details
Main authors: Wang, Ou, Chin, Robert, Cheng, Xiaofang, Wu, Michelle Ka Yan, Mao, Qing, Tang, Jingbo, Sun, Yuhui, Anderson, Ellis, Lam, Han K., Chen, Dan, Zhou, Yujun, Wang, Linying, Fan, Fei, Zou, Yan, Xie, Yinlong, Zhang, Rebecca Yu, Drmanac, Snezana, Nguyen, Darlene, Xu, Chongjun, Villarosa, Christian, Gablenz, Scott, Barua, Nina, Nguyen, Staci, Tian, Wenlan, Liu, Jia Sophie, Wang, Jingwan, Liu, Xiao, Qi, Xiaojuan, Chen, Ao, Wang, He, Dong, Yuliang, Zhang, Wenwei, Alexeev, Andrei, Yang, Huanming, Wang, Jian, Kristiansen, Karsten, Xu, Xun, Drmanac, Radoje, Peters, Brock A.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6499310/
https://www.ncbi.nlm.nih.gov/pubmed/30940689
http://dx.doi.org/10.1101/gr.245126.118
author Wang, Ou
Chin, Robert
Cheng, Xiaofang
Wu, Michelle Ka Yan
Mao, Qing
Tang, Jingbo
Sun, Yuhui
Anderson, Ellis
Lam, Han K.
Chen, Dan
Zhou, Yujun
Wang, Linying
Fan, Fei
Zou, Yan
Xie, Yinlong
Zhang, Rebecca Yu
Drmanac, Snezana
Nguyen, Darlene
Xu, Chongjun
Villarosa, Christian
Gablenz, Scott
Barua, Nina
Nguyen, Staci
Tian, Wenlan
Liu, Jia Sophie
Wang, Jingwan
Liu, Xiao
Qi, Xiaojuan
Chen, Ao
Wang, He
Dong, Yuliang
Zhang, Wenwei
Alexeev, Andrei
Yang, Huanming
Wang, Jian
Kristiansen, Karsten
Xu, Xun
Drmanac, Radoje
Peters, Brock A.
author_sort Wang, Ou
collection PubMed
description Here, we describe single-tube long fragment read (stLFR), a technology that enables sequencing of data from long DNA molecules using economical second-generation sequencing technology. It is based on adding the same barcode sequence to subfragments of the original long DNA molecule (DNA cobarcoding). To achieve this efficiently, stLFR uses the surface of microbeads to create millions of miniaturized barcoding reactions in a single tube. Using a combinatorial process, up to 3.6 billion unique barcode sequences were generated on beads, enabling practically nonredundant cobarcoding with 50 million barcodes per sample. Using stLFR, we demonstrate efficient unique cobarcoding of more than 8 million 20- to 300-kb genomic DNA fragments. Analysis of the human genome NA12878 with stLFR demonstrated high-quality variant calling and phase block lengths up to N50 34 Mb. We also demonstrate detection of complex structural variants and complete diploid de novo assembly of NA12878. These analyses were all performed using single stLFR libraries, and their construction did not significantly add to the time or cost of whole-genome sequencing (WGS) library preparation. stLFR represents an easily automatable solution that enables high-quality sequencing, phasing, SV detection, scaffolding, cost-effective diploid de novo genome assembly, and other long DNA sequencing applications.
format Online
Article
Text
id pubmed-6499310
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6499310 2019-05-17 Genome Res Method Cold Spring Harbor Laboratory Press 2019-05 /pmc/articles/PMC6499310/ /pubmed/30940689 http://dx.doi.org/10.1101/gr.245126.118 Text en © 2019 Wang et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Efficient and unique cobarcoding of second-generation sequencing reads from long DNA molecules enabling cost-effective and accurate sequencing, haplotyping, and de novo assembly
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6499310/
https://www.ncbi.nlm.nih.gov/pubmed/30940689
http://dx.doi.org/10.1101/gr.245126.118