Pan-conserved segment tags identify ultra-conserved sequences across assemblies in the human pangenome

The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemb...

Bibliographic Details
Main Authors: Lee, HoJoon, Greer, Stephanie U., Pavlichin, Dmitri S., Zhou, Bo, Urban, Alexander E., Weissman, Tsachy, Ji, Hanlee P.
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10475782/
https://www.ncbi.nlm.nih.gov/pubmed/37671027
http://dx.doi.org/10.1016/j.crmeth.2023.100543
author Lee, HoJoon
Greer, Stephanie U.
Pavlichin, Dmitri S.
Zhou, Bo
Urban, Alexander E.
Weissman, Tsachy
Ji, Hanlee P.
collection PubMed
description The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly. Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as “pan-conserved segment tags” (PSTs). By examining intervals between these segments, we discerned highly conserved genomic segments and those with structurally related polymorphisms. We found 60,764 polymorphic intervals with unique geo-ethnic features in the pangenome reference. In this study, we utilized ultra-conserved sequences (PSTs) to forge a link between human pangenome assemblies and reference genomes. This methodology enables the examination of any sequence of interest within the pangenome, using the reference genome as a comparative framework.
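The abstract names a k-mer indexing strategy but gives no implementation details here. The following is a minimal sketch of the general idea, assuming that a conserved tag can be modeled as a k-mer occurring exactly once in every assembly; that definition, the value of `k`, the function names, and the toy sequences are all illustrative assumptions, not the authors' published method.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every k-mer (length-k substring) in one sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def conserved_unique_kmers(assemblies, k):
    """Return k-mers that occur exactly once in every assembly.

    `assemblies` maps a (hypothetical) assembly name to its sequence.
    """
    shared = None
    for seq in assemblies.values():
        unique = {kmer for kmer, n in kmer_counts(seq, k).items() if n == 1}
        shared = unique if shared is None else shared & unique
    return shared or set()


def tag_positions(seq, tags, k):
    """Map each tag to its start position in `seq` (tags are unique per sequence)."""
    return {seq[i:i + k]: i
            for i in range(len(seq) - k + 1)
            if seq[i:i + k] in tags}


# Toy data: "asm1" carries a 3-base insertion relative to "ref".
assemblies = {
    "ref": "ACGTTGCAGGCT",
    "asm1": "ACGTTGCATTAGGCT",
}
k = 4
tags = conserved_unique_kmers(assemblies, k)

# Distance between the same pair of flanking tags in each assembly.
# A differing distance marks the interval between them as polymorphic.
gaps = {}
for name, seq in assemblies.items():
    pos = tag_positions(seq, tags, k)
    gaps[name] = pos["AGGC"] - pos["TGCA"]
```

In this toy case the tags flanking the insertion are shared by both sequences, while the distance between them differs, which is the kind of signal the abstract describes when it examines intervals between conserved segments. A real analysis would need a disk-backed or compressed k-mer index and handling of reverse complements, neither of which is shown.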
format Online
Article
Text
id pubmed-10475782
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10475782 2023-09-05
Cell Rep Methods
Article
Elsevier 2023-08-02
/pmc/articles/PMC10475782/ /pubmed/37671027 http://dx.doi.org/10.1016/j.crmeth.2023.100543
Text en
© 2023 The Authors
https://creativecommons.org/licenses/by-nc-nd/4.0/
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Pan-conserved segment tags identify ultra-conserved sequences across assemblies in the human pangenome
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10475782/
https://www.ncbi.nlm.nih.gov/pubmed/37671027
http://dx.doi.org/10.1016/j.crmeth.2023.100543