The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual

We used long-read DNA sequencing to assemble the genome of a Southern Han Chinese male. We organized the sequence into chromosomes and filled in gaps using the recently completed T2T-CHM13 genome as a guide, yielding a gap-free genome, Han1, containing 3,099,707,698 bases. Using the T2T-CHM13 annota...

Bibliographic Details
Main authors: Chao, Kuan-Hao, Zimin, Aleksey V, Pertea, Mihaela, Salzberg, Steven L
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9997556/
https://www.ncbi.nlm.nih.gov/pubmed/36630290
http://dx.doi.org/10.1093/g3journal/jkac321
_version_ 1784903278417412096
author Chao, Kuan-Hao
Zimin, Aleksey V
Pertea, Mihaela
Salzberg, Steven L
collection PubMed
description We used long-read DNA sequencing to assemble the genome of a Southern Han Chinese male. We organized the sequence into chromosomes and filled in gaps using the recently completed T2T-CHM13 genome as a guide, yielding a gap-free genome, Han1, containing 3,099,707,698 bases. Using the T2T-CHM13 annotation as a reference, we mapped all genes onto the Han1 genome and identified additional gene copies, generating a total of 60,708 putative genes, of which 20,003 are protein-coding. A comprehensive comparison between the genes revealed that 235 protein-coding genes were substantially different between the individuals, with frameshifts or truncations affecting the protein-coding sequence. Most of these were heterozygous variants in which one gene copy was unaffected. This represents the first gene-level comparison between two finished, annotated individual human genomes.
format Online
Article
Text
id pubmed-9997556
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9997556 2023-03-10 The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual Chao, Kuan-Hao Zimin, Aleksey V Pertea, Mihaela Salzberg, Steven L G3 (Bethesda) Genomic Prediction We used long-read DNA sequencing to assemble the genome of a Southern Han Chinese male. We organized the sequence into chromosomes and filled in gaps using the recently completed T2T-CHM13 genome as a guide, yielding a gap-free genome, Han1, containing 3,099,707,698 bases. Using the T2T-CHM13 annotation as a reference, we mapped all genes onto the Han1 genome and identified additional gene copies, generating a total of 60,708 putative genes, of which 20,003 are protein-coding. A comprehensive comparison between the genes revealed that 235 protein-coding genes were substantially different between the individuals, with frameshifts or truncations affecting the protein-coding sequence. Most of these were heterozygous variants in which one gene copy was unaffected. This represents the first gene-level comparison between two finished, annotated individual human genomes. Oxford University Press 2023-01-11 /pmc/articles/PMC9997556/ /pubmed/36630290 http://dx.doi.org/10.1093/g3journal/jkac321 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Genomic Prediction
Chao, Kuan-Hao
Zimin, Aleksey V
Pertea, Mihaela
Salzberg, Steven L
The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual
title The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual
topic Genomic Prediction
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9997556/
https://www.ncbi.nlm.nih.gov/pubmed/36630290
http://dx.doi.org/10.1093/g3journal/jkac321