LT1, an ONT long-read-based assembly scaffolded with Hi-C data and polished with short reads
We present LT1, the first high-quality human reference genome from the Baltic States. LT1 is a female de novo human reference genome assembly, constructed using 57× nanopore long reads and polished using 47× short paired-end reads. We utilized 72 GB of Hi-C chromosomal mapping data for scaffolding,...
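Main Authors: | Kim, Hui-Su; Blazyte, Asta; Jeon, Sungwon; Yoon, Changhan; Kim, Yeonkyung; Kim, Changjae; Bolser, Dan; Ahn, Ji-Hye; Edwards, Jeremy S.; Bhak, Jong |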
---|---|
Format: | Online Article Text |
Language: | English |
Published: | GigaScience Press, 2022 |
Subjects: | Data Release |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9650228/ https://www.ncbi.nlm.nih.gov/pubmed/36824523 http://dx.doi.org/10.46471/gigabyte.51 |
_version_ | 1784827965491642368 |
---|---|
author | Kim, Hui-Su; Blazyte, Asta; Jeon, Sungwon; Yoon, Changhan; Kim, Yeonkyung; Kim, Changjae; Bolser, Dan; Ahn, Ji-Hye; Edwards, Jeremy S.; Bhak, Jong |
author_sort | Kim, Hui-Su |
collection | PubMed |
description | We present LT1, the first high-quality human reference genome from the Baltic States. LT1 is a female de novo human reference genome assembly, constructed using 57× nanopore long reads and polished using 47× short paired-end reads. We utilized 72 GB of Hi-C chromosomal mapping data for scaffolding, to maximize assembly contiguity and accuracy. The contig assembly of LT1 was 2.73 Gbp in length, comprising 4490 contigs with an NG50 value of 12.0 Mbp. After scaffolding with Hi-C data and manual curation, the final assembly has an NG50 value of 137 Mbp and 4699 scaffolds. Assessment of gene prediction quality using Benchmarking Universal Single-Copy Orthologs (BUSCO) identified 89.3% of the single-copy orthologous genes included in the benchmark. Detailed characterization of LT1 suggests it has 73,744 predicted transcripts, 4.2 million autosomal SNPs, 974,616 short indels, and 12,079 large structural variants. These data may be used as a benchmark for further in-depth genomic analyses of Baltic populations. |
format | Online Article Text |
id | pubmed-9650228 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | GigaScience Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9650228 2023-02-22 LT1, an ONT long-read-based assembly scaffolded with Hi-C data and polished with short reads Kim, Hui-Su; Blazyte, Asta; Jeon, Sungwon; Yoon, Changhan; Kim, Yeonkyung; Kim, Changjae; Bolser, Dan; Ahn, Ji-Hye; Edwards, Jeremy S.; Bhak, Jong GigaByte Data Release We present LT1, the first high-quality human reference genome from the Baltic States. LT1 is a female de novo human reference genome assembly, constructed using 57× nanopore long reads and polished using 47× short paired-end reads. We utilized 72 GB of Hi-C chromosomal mapping data for scaffolding, to maximize assembly contiguity and accuracy. The contig assembly of LT1 was 2.73 Gbp in length, comprising 4490 contigs with an NG50 value of 12.0 Mbp. After scaffolding with Hi-C data and manual curation, the final assembly has an NG50 value of 137 Mbp and 4699 scaffolds. Assessment of gene prediction quality using Benchmarking Universal Single-Copy Orthologs (BUSCO) identified 89.3% of the single-copy orthologous genes included in the benchmark. Detailed characterization of LT1 suggests it has 73,744 predicted transcripts, 4.2 million autosomal SNPs, 974,616 short indels, and 12,079 large structural variants. These data may be used as a benchmark for further in-depth genomic analyses of Baltic populations. GigaScience Press 2022-05-04 /pmc/articles/PMC9650228/ /pubmed/36824523 http://dx.doi.org/10.46471/gigabyte.51 Text en © The Author(s) 2022. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | LT1, an ONT long-read-based assembly scaffolded with Hi-C data and polished with short reads |
topic | Data Release |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9650228/ https://www.ncbi.nlm.nih.gov/pubmed/36824523 http://dx.doi.org/10.46471/gigabyte.51 |
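
The description field reports NG50 values for the contig and scaffold assemblies. As a reading aid, the following is a minimal sketch of how NG50 is commonly computed: sequences are sorted from longest to shortest, and NG50 is the length of the sequence at which the running total first reaches half of an assumed genome size (N50 uses the assembly's own total length instead). The function name `ng50`, the toy lengths, and the genome size are illustrative assumptions, not values from the record, which does not state the genome size the authors used.

```python
from typing import Iterable, Optional


def ng50(lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return the NG50 of a set of sequence lengths.

    genome_size is an assumed (estimated) genome size supplied by the caller.
    Returns None when the sequences cover less than half of genome_size.
    """
    half = genome_size / 2
    running_total = 0
    # Walk from the longest sequence to the shortest.
    for length in sorted(lengths, reverse=True):
        running_total += length
        if running_total >= half:
            return length
    return None


# Toy example (hypothetical lengths, not LT1 data):
# half of 120 is 60; 40 + 30 = 70 reaches it, so NG50 is 30.
toy_contigs = [40, 30, 20, 10]
print(ng50(toy_contigs, genome_size=120))
```

Because the threshold depends on the assumed genome size rather than on the assembly length, NG50 values are comparable across assemblies of the same species only when the same genome size is used.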