mGAP: the macaque genotype and phenotype resource, a framework for accessing and interpreting macaque variant data, and identifying new models of human disease

BACKGROUND: Non-human primates (NHPs), particularly macaques, serve as critical and highly relevant pre-clinical models of human disease. The similarity in human and macaque natural disease susceptibility, along with parallel genetic risk alleles, underscores the value of macaques in the development...

Bibliographic Details
Main authors: Bimber, Benjamin N., Yan, Melissa Y., Peterson, Samuel M., Ferguson, Betsy
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6402181/
https://www.ncbi.nlm.nih.gov/pubmed/30841849
http://dx.doi.org/10.1186/s12864-019-5559-7
author Bimber, Benjamin N.
Yan, Melissa Y.
Peterson, Samuel M.
Ferguson, Betsy
collection PubMed
description BACKGROUND: Non-human primates (NHPs), particularly macaques, serve as critical and highly relevant pre-clinical models of human disease. The similarity in human and macaque natural disease susceptibility, along with parallel genetic risk alleles, underscores the value of macaques in the development of effective treatment strategies. Nonetheless, there are limited genomic resources available to support the exploration and discovery of macaque models of inherited disease. Notably, there are few public databases tailored to searching NHP sequence variants, and no other database making use of centralized variant calling, or providing genotype-level data and predicted pathogenic effects for each variant.
RESULTS: The macaque Genotype And Phenotype (mGAP) resource is the first public website providing searchable, annotated macaque variant data. The mGAP resource includes a catalog of high-confidence variants derived from whole-genome sequence (WGS) data. The current mGAP release at the time of publication (1.7) contains 17,087,212 variants based on the sequence analysis of 293 rhesus macaques. A custom pipeline was developed to enable annotation of the macaque variants, leveraging human data sources that include regulatory elements (ENCODE, RegulomeDB), known disease- or phenotype-associated variants (GRASP), predicted impact (SIFT, PolyPhen2), and sequence conservation (PhyloP, PhastCons). Currently, mGAP includes 2767 variants that are identical to alleles listed in the human ClinVar database, of which 276 variants, spanning 258 genes, are identified as pathogenic. An additional 12,472 variants are predicted as high impact (SnpEff) and 13,129 are predicted as damaging (PolyPhen2). In total, these variants are predicted to be associated with more than 2000 human disease or phenotype entries reported in OMIM (Online Mendelian Inheritance in Man). Importantly, mGAP also provides genotype-level data for all subjects, allowing identification of specific individuals harboring alleles of interest.
CONCLUSIONS: The mGAP resource provides variant and genotype data from hundreds of rhesus macaques, processed in a consistent manner across all subjects (https://mgap.ohsu.edu). Together with the extensive variant annotations, mGAP presents an unprecedented opportunity to investigate potential genetic associations with currently characterized disease models, and to uncover new macaque models based on parallels with human risk alleles.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5559-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6402181
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6402181 2019-03-14 mGAP: the macaque genotype and phenotype resource, a framework for accessing and interpreting macaque variant data, and identifying new models of human disease Bimber, Benjamin N. Yan, Melissa Y. Peterson, Samuel M. Ferguson, Betsy BMC Genomics Database Article BioMed Central 2019-03-06 /pmc/articles/PMC6402181/ /pubmed/30841849 http://dx.doi.org/10.1186/s12864-019-5559-7 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title mGAP: the macaque genotype and phenotype resource, a framework for accessing and interpreting macaque variant data, and identifying new models of human disease
topic Database Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6402181/
https://www.ncbi.nlm.nih.gov/pubmed/30841849
http://dx.doi.org/10.1186/s12864-019-5559-7