
Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome

The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate it...


Bibliographic Details
Main authors: Fonville, Natalie C., Velmurugan, Karthik Raja, Tae, Hongseok, Vaksman, Zalman, McIver, Lauren J., Garner, Harold R.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4899811/
https://www.ncbi.nlm.nih.gov/pubmed/27278669
http://dx.doi.org/10.1038/srep27722
author Fonville, Natalie C.
Velmurugan, Karthik Raja
Tae, Hongseok
Vaksman, Zalman
McIver, Lauren J.
Garner, Harold R.
collection PubMed
description The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate its utility in identifying contigs from reads that do not map to the genome reference. The analysis of these samples yielded 790 novel extra-referential concordant contigs that are observed in more than one sample. We searched for evidence of functional elements in the concordant contigs in two ways: (1) BLASTing each contig against normal RNA-Seq samples and (2) checking for predicted functional elements using GlimmerHMM. Of the 790 concordant contigs, 37 had an exact match to at least one RNA-Seq read; 15 aligned to more than 100 RNA-Seq reads. Of the 249 concordant contigs predicted by GlimmerHMM to have functional elements, 6 had at least one exact RNA-Seq match. BLASTing these novel contigs against all publicly available sequences confirmed that they were found in human and chimpanzee BAC and fosmid clones sequenced as part of the original Human Genome Project. These extra-referential contigs predominantly contained pentameric repeats, especially two motifs: AATGG and GTGGA.
format Online
Article
Text
id pubmed-4899811
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4899811 2016-06-13 Sci Rep Article Nature Publishing Group 2016-06-09 /pmc/articles/PMC4899811/ /pubmed/27278669 http://dx.doi.org/10.1038/srep27722 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4899811/
https://www.ncbi.nlm.nih.gov/pubmed/27278669
http://dx.doi.org/10.1038/srep27722