Identical repeated backbone of the human genome

BACKGROUND: Identical sequences with a minimal length of about 300 base pairs (bp) have been involved in the generation of various meiotic/mitotic genomic rearrangements through non-allelic homologous recombination (NAHR) events. Genomic disorders and structural variation, together with gene remodel...

Bibliographic Details
Main Authors: Zepeda-Mendoza, Cinthya J, Lemus, Tzitziki, Yáñez, Omar, García, Delfino, Valle-García, David, Meza-Sosa, Karla F, Gutiérrez-Arcelus, María, Márquez-Ortiz, Yamile, Domínguez-Vidaña, Rocío, Gonzaga-Jauregui, Claudia, Flores, Margarita, Palacios, Rafael
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845111/
https://www.ncbi.nlm.nih.gov/pubmed/20096123
http://dx.doi.org/10.1186/1471-2164-11-60
_version_ 1782179378585665536
author Zepeda-Mendoza, Cinthya J
Lemus, Tzitziki
Yáñez, Omar
García, Delfino
Valle-García, David
Meza-Sosa, Karla F
Gutiérrez-Arcelus, María
Márquez-Ortiz, Yamile
Domínguez-Vidaña, Rocío
Gonzaga-Jauregui, Claudia
Flores, Margarita
Palacios, Rafael
author_facet Zepeda-Mendoza, Cinthya J
Lemus, Tzitziki
Yáñez, Omar
García, Delfino
Valle-García, David
Meza-Sosa, Karla F
Gutiérrez-Arcelus, María
Márquez-Ortiz, Yamile
Domínguez-Vidaña, Rocío
Gonzaga-Jauregui, Claudia
Flores, Margarita
Palacios, Rafael
author_sort Zepeda-Mendoza, Cinthya J
collection PubMed
description BACKGROUND: Identical sequences with a minimal length of about 300 base pairs (bp) have been involved in the generation of various meiotic/mitotic genomic rearrangements through non-allelic homologous recombination (NAHR) events. Genomic disorders and structural variation, together with gene remodelling processes, have been associated with many of these rearrangements. Based on these observations, we identified and integrated all the 100% identical repeats of at least 300 bp in the NCBI version 36.2 human genome reference assembly into non-overlapping regions, thus defining the Identical Repeated Backbone (IRB) of the reference human genome. RESULTS: The IRB sequences are distributed all over the genome in 66,600 regions, which correspond to ~2% of the total NCBI human genome reference assembly. Important structural and functional elements such as common repeats, segmental duplications, and genes are contained in the IRB. About 80% of the IRB bp overlap with known copy-number variants (CNVs). By analyzing the genes embedded in the IRB, we were able to detect some identical genes not previously included in the Ensembl release 50 annotation of human genes. In addition, we found evidence of IRB gene copy-number polymorphisms in raw sequence reads of two diploid sequenced genomes. CONCLUSIONS: In general, the IRB offers new insight into the complex organization of the identical repeated sequences of the human genome. It provides an accurate map of potential NAHR sites which could be used in targeting the study of novel CNVs, predicting DNA copy-number variation in newly sequenced genomes, and improving genome annotation.
format Text
id pubmed-2845111
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-28451112010-03-26 Identical repeated backbone of the human genome Zepeda-Mendoza, Cinthya J Lemus, Tzitziki Yáñez, Omar García, Delfino Valle-García, David Meza-Sosa, Karla F Gutiérrez-Arcelus, María Márquez-Ortiz, Yamile Domínguez-Vidaña, Rocío Gonzaga-Jauregui, Claudia Flores, Margarita Palacios, Rafael BMC Genomics Research Article BACKGROUND: Identical sequences with a minimal length of about 300 base pairs (bp) have been involved in the generation of various meiotic/mitotic genomic rearrangements through non-allelic homologous recombination (NAHR) events. Genomic disorders and structural variation, together with gene remodelling processes, have been associated with many of these rearrangements. Based on these observations, we identified and integrated all the 100% identical repeats of at least 300 bp in the NCBI version 36.2 human genome reference assembly into non-overlapping regions, thus defining the Identical Repeated Backbone (IRB) of the reference human genome. RESULTS: The IRB sequences are distributed all over the genome in 66,600 regions, which correspond to ~2% of the total NCBI human genome reference assembly. Important structural and functional elements such as common repeats, segmental duplications, and genes are contained in the IRB. About 80% of the IRB bp overlap with known copy-number variants (CNVs). By analyzing the genes embedded in the IRB, we were able to detect some identical genes not previously included in the Ensembl release 50 annotation of human genes. In addition, we found evidence of IRB gene copy-number polymorphisms in raw sequence reads of two diploid sequenced genomes. CONCLUSIONS: In general, the IRB offers new insight into the complex organization of the identical repeated sequences of the human genome. It provides an accurate map of potential NAHR sites which could be used in targeting the study of novel CNVs, predicting DNA copy-number variation in newly sequenced genomes, and improving genome annotation.
BioMed Central 2010-01-23 /pmc/articles/PMC2845111/ /pubmed/20096123 http://dx.doi.org/10.1186/1471-2164-11-60 Text en Copyright ©2010 Zepeda-Mendoza et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Zepeda-Mendoza, Cinthya J
Lemus, Tzitziki
Yáñez, Omar
García, Delfino
Valle-García, David
Meza-Sosa, Karla F
Gutiérrez-Arcelus, María
Márquez-Ortiz, Yamile
Domínguez-Vidaña, Rocío
Gonzaga-Jauregui, Claudia
Flores, Margarita
Palacios, Rafael
Identical repeated backbone of the human genome
title Identical repeated backbone of the human genome
title_full Identical repeated backbone of the human genome
title_fullStr Identical repeated backbone of the human genome
title_full_unstemmed Identical repeated backbone of the human genome
title_short Identical repeated backbone of the human genome
title_sort identical repeated backbone of the human genome
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845111/
https://www.ncbi.nlm.nih.gov/pubmed/20096123
http://dx.doi.org/10.1186/1471-2164-11-60
work_keys_str_mv AT zepedamendozacinthyaj identicalrepeatedbackboneofthehumangenome
AT lemustzitziki identicalrepeatedbackboneofthehumangenome
AT yanezomar identicalrepeatedbackboneofthehumangenome
AT garciadelfino identicalrepeatedbackboneofthehumangenome
AT vallegarciadavid identicalrepeatedbackboneofthehumangenome
AT mezasosakarlaf identicalrepeatedbackboneofthehumangenome
AT gutierrezarcelusmaria identicalrepeatedbackboneofthehumangenome
AT marquezortizyamile identicalrepeatedbackboneofthehumangenome
AT dominguezvidanarocio identicalrepeatedbackboneofthehumangenome
AT gonzagajaureguiclaudia identicalrepeatedbackboneofthehumangenome
AT floresmargarita identicalrepeatedbackboneofthehumangenome
AT palaciosrafael identicalrepeatedbackboneofthehumangenome