CAGm: a repository of germline microsatellite variations in the 1000 genomes project

The human genome harbors an abundance of repetitive DNA; however, its function continues to be debated. Microsatellites—a class of short tandem repeat—are established as an important source of genetic variation. Array length variants are common among microsatellites and affect gene expression; but,...

Bibliographic Details
Main authors: Kinney, Nicholas; Titus-Glover, Kyle; Wren, Jonathan D; Varghese, Robin T; Michalak, Pawel; Liao, Han; Anandakrishnan, Ramu; Pulenthiran, Arichanah; Kang, Lin; Garner, Harold R
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323891/
https://www.ncbi.nlm.nih.gov/pubmed/30329086
http://dx.doi.org/10.1093/nar/gky969
_version_ 1783385863127826432
author Kinney, Nicholas
Titus-Glover, Kyle
Wren, Jonathan D
Varghese, Robin T
Michalak, Pawel
Liao, Han
Anandakrishnan, Ramu
Pulenthiran, Arichanah
Kang, Lin
Garner, Harold R
author_sort Kinney, Nicholas
collection PubMed
description The human genome harbors an abundance of repetitive DNA; however, its function continues to be debated. Microsatellites, a class of short tandem repeat, are established as an important source of genetic variation. Array length variants are common among microsatellites and affect gene expression, but efforts to understand the role and diversity of microsatellite variation have been hampered by several challenges. Without adequate depth, both long-read and short-read sequencing may not detect the variants present in a sample; additionally, large sample sizes are needed to reveal the degree of population-level polymorphism. To address these challenges we present the Comparative Analysis of Germline Microsatellites (CAGm): a database of germline microsatellites from 2529 individuals in the 1000 genomes project. A key novelty of CAGm is the ability to aggregate microsatellite variation by population, ethnicity (super population) and gender. The database provides advanced searching for microsatellites embedded in genes and functional elements. All data can be downloaded as Microsoft Excel spreadsheets. Two use-case scenarios are presented to demonstrate its utility: a mononucleotide (A) microsatellite at the BAT-26 locus and a dinucleotide (CA) microsatellite in the coding region of FGFRL1. CAGm is freely available at http://www.cagmdb.org/.
format Online
Article
Text
id pubmed-6323891
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6323891 2019-01-10 CAGm: a repository of germline microsatellite variations in the 1000 genomes project Kinney, Nicholas Titus-Glover, Kyle Wren, Jonathan D Varghese, Robin T Michalak, Pawel Liao, Han Anandakrishnan, Ramu Pulenthiran, Arichanah Kang, Lin Garner, Harold R Nucleic Acids Res Database Issue The human genome harbors an abundance of repetitive DNA; however, its function continues to be debated. Microsatellites, a class of short tandem repeat, are established as an important source of genetic variation. Array length variants are common among microsatellites and affect gene expression, but efforts to understand the role and diversity of microsatellite variation have been hampered by several challenges. Without adequate depth, both long-read and short-read sequencing may not detect the variants present in a sample; additionally, large sample sizes are needed to reveal the degree of population-level polymorphism. To address these challenges we present the Comparative Analysis of Germline Microsatellites (CAGm): a database of germline microsatellites from 2529 individuals in the 1000 genomes project. A key novelty of CAGm is the ability to aggregate microsatellite variation by population, ethnicity (super population) and gender. The database provides advanced searching for microsatellites embedded in genes and functional elements. All data can be downloaded as Microsoft Excel spreadsheets. Two use-case scenarios are presented to demonstrate its utility: a mononucleotide (A) microsatellite at the BAT-26 locus and a dinucleotide (CA) microsatellite in the coding region of FGFRL1. CAGm is freely available at http://www.cagmdb.org/. Oxford University Press 2019-01-08 2018-10-17 /pmc/articles/PMC6323891/ /pubmed/30329086 http://dx.doi.org/10.1093/nar/gky969 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Database Issue
Kinney, Nicholas
Titus-Glover, Kyle
Wren, Jonathan D
Varghese, Robin T
Michalak, Pawel
Liao, Han
Anandakrishnan, Ramu
Pulenthiran, Arichanah
Kang, Lin
Garner, Harold R
CAGm: a repository of germline microsatellite variations in the 1000 genomes project
title CAGm: a repository of germline microsatellite variations in the 1000 genomes project
title_sort cagm: a repository of germline microsatellite variations in the 1000 genomes project
topic Database Issue
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323891/
https://www.ncbi.nlm.nih.gov/pubmed/30329086
http://dx.doi.org/10.1093/nar/gky969