Automated annotation of human centromeres with HORmon

Recent advances in long-read sequencing opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. They also emphasized the need for centromere annotation (partitioning human centromeres into monomers and higher-order repeats [HORs]). Altho...

Bibliographic Details
Main authors: Kunyavskaya, Olga; Dvorkina, Tatiana; Bzikadze, Andrey V.; Alexandrov, Ivan A.; Pevzner, Pavel A.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2022
Subjects: Method
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9248890/
https://www.ncbi.nlm.nih.gov/pubmed/35545449
http://dx.doi.org/10.1101/gr.276362.121
author Kunyavskaya, Olga
Dvorkina, Tatiana
Bzikadze, Andrey V.
Alexandrov, Ivan A.
Pevzner, Pavel A.
collection PubMed
description Recent advances in long-read sequencing opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. They also emphasized the need for centromere annotation (partitioning human centromeres into monomers and higher-order repeats [HORs]). Although there was a half-century-long series of semi-manual studies of centromere architecture, a rigorous centromere annotation algorithm is still lacking. Moreover, an automated centromere annotation is a prerequisite for studies of genetic diseases associated with centromeres and evolutionary studies of centromeres across multiple species. Although the monomer decomposition (transforming a centromere into a monocentromere written in the monomer alphabet) and the HOR decomposition (representing a monocentromere in the alphabet of HORs) are currently viewed as two separate problems, we show that they should be integrated into a single framework in such a way that HOR (monomer) inference affects monomer (HOR) inference. We thus developed the HORmon algorithm that integrates the monomer/HOR inference and automatically generates the human monomers/HORs that are largely consistent with the previous semi-manual inference.
format Online
Article
Text
id pubmed-9248890
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-9248890 2022-12-01 Automated annotation of human centromeres with HORmon Kunyavskaya, Olga Dvorkina, Tatiana Bzikadze, Andrey V. Alexandrov, Ivan A. Pevzner, Pavel A. Genome Res Method Recent advances in long-read sequencing opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. They also emphasized the need for centromere annotation (partitioning human centromeres into monomers and higher-order repeats [HORs]). Although there was a half-century-long series of semi-manual studies of centromere architecture, a rigorous centromere annotation algorithm is still lacking. Moreover, an automated centromere annotation is a prerequisite for studies of genetic diseases associated with centromeres and evolutionary studies of centromeres across multiple species. Although the monomer decomposition (transforming a centromere into a monocentromere written in the monomer alphabet) and the HOR decomposition (representing a monocentromere in the alphabet of HORs) are currently viewed as two separate problems, we show that they should be integrated into a single framework in such a way that HOR (monomer) inference affects monomer (HOR) inference. We thus developed the HORmon algorithm that integrates the monomer/HOR inference and automatically generates the human monomers/HORs that are largely consistent with the previous semi-manual inference. Cold Spring Harbor Laboratory Press 2022-06 /pmc/articles/PMC9248890/ /pubmed/35545449 http://dx.doi.org/10.1101/gr.276362.121 Text en © 2022 Kunyavskaya et al.; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml).
After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Automated annotation of human centromeres with HORmon
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9248890/
https://www.ncbi.nlm.nih.gov/pubmed/35545449
http://dx.doi.org/10.1101/gr.276362.121