Centromere reference models for human chromosomes X and Y satellite arrays
The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary t...
Main Authors: | Miga, Karen H.; Newton, Yulia; Jain, Miten; Altemose, Nicolas; Willard, Huntington F.; Kent, W. James |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press 2014 |
Subjects: | Method |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3975068/ https://www.ncbi.nlm.nih.gov/pubmed/24501022 http://dx.doi.org/10.1101/gr.159624.113 |
_version_ | 1782310079288049664 |
---|---|
author | Miga, Karen H. Newton, Yulia Jain, Miten Altemose, Nicolas Willard, Huntington F. Kent, W. James |
author_sort | Miga, Karen H. |
collection | PubMed |
description | The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes. |
format | Online Article Text |
id | pubmed-3975068 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3975068 2014-10-01 Centromere reference models for human chromosomes X and Y satellite arrays Miga, Karen H. Newton, Yulia Jain, Miten Altemose, Nicolas Willard, Huntington F. Kent, W. James Genome Res Method Cold Spring Harbor Laboratory Press 2014-04 /pmc/articles/PMC3975068/ /pubmed/24501022 http://dx.doi.org/10.1101/gr.159624.113 Text en © 2014 Miga et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported), as described at http://creativecommons.org/licenses/by-nc/3.0/. |
title | Centromere reference models for human chromosomes X and Y satellite arrays |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3975068/ https://www.ncbi.nlm.nih.gov/pubmed/24501022 http://dx.doi.org/10.1101/gr.159624.113 |
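
The abstract describes modeling a satellite array from monomer sequence and ordering information found in whole-genome shotgun reads. As a rough illustration of that idea only, the sketch below counts which monomer variant follows which within each read and then walks those observed adjacencies to produce a linear array model. It is not the authors' implementation: the first-order transition model, the function names `count_transitions` and `generate_array`, and the monomer labels `"A"`, `"B"`, `"C"` are all assumptions introduced here for illustration.

```python
import random
from collections import Counter, defaultdict


def count_transitions(reads):
    """Count how often monomer variant b directly follows variant a in a read.

    Each read is assumed (hypothetically) to be already decomposed into an
    ordered list of monomer-variant labels.
    """
    counts = defaultdict(Counter)
    for read in reads:
        for a, b in zip(read, read[1:]):
            counts[a][b] += 1
    return counts


def generate_array(counts, start, length, seed=0):
    """Build a linear array model by a count-weighted walk over adjacencies."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    array = [start]
    while len(array) < length:
        successors = counts.get(array[-1])
        if not successors:
            break  # no observed successor: stop rather than invent one
        labels = list(successors)
        weights = [successors[label] for label in labels]
        array.append(rng.choices(labels, weights=weights, k=1)[0])
    return array


# Toy reads: hypothetical monomer labels, not real data.
reads = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"], ["A", "B", "B"]]
counts = count_transitions(reads)
model = generate_array(counts, "A", 12)
```

Every adjacent pair in `model` is one that was seen in at least one read, so the generated array preserves local monomer ordering while its long-range order remains a statistical model rather than a true assembly.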