Long-read sequencing and de novo assembly of a Chinese genome

Bibliographic Details
Main authors: Shi, Lingling; Guo, Yunfei; Dong, Chengliang; Huddleston, John; Yang, Hui; Han, Xiaolu; Fu, Aisi; Li, Quan; Li, Na; Gong, Siyi; Lintner, Katherine E.; Ding, Qiong; Wang, Zou; Hu, Jiang; Wang, Depeng; Wang, Feng; Wang, Lin; Lyon, Gholson J.; Guan, Yongtao; Shen, Yufeng; Evgrafov, Oleg V.; Knowles, James A.; Thibaud-Nissen, Francoise; Schneider, Valerie; Yu, Chack-Yung; Zhou, Libing; Eichler, Evan E.; So, Kwok-Fai; Wang, Kai
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4931320/
https://www.ncbi.nlm.nih.gov/pubmed/27356984
http://dx.doi.org/10.1038/ncomms12065
collection PubMed
description Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.
id pubmed-4931320
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-4931320, 2016-07-12. Nat Commun. Article. Nature Publishing Group, 2016-06-30. Text, en.
Copyright © 2016, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved. This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/