
TAGOOS: genome-wide supervised learning of non-coding loci associated to complex phenotypes

Bibliographic Details
Main Authors: González, Aitor; Artufel, Marie; Rihet, Pascal
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects: Methods Online
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6698643/
https://www.ncbi.nlm.nih.gov/pubmed/31045203
http://dx.doi.org/10.1093/nar/gkz320
collection PubMed
description Genome-wide association studies (GWAS) associate single nucleotide polymorphisms (SNPs) to complex phenotypes. Most human SNPs fall in non-coding regions and are likely regulatory SNPs, but linkage disequilibrium (LD) blocks make it difficult to distinguish functional SNPs. Therefore, putative functional SNPs are usually annotated with molecular markers of gene regulatory regions and prioritized with dedicated prediction tools. We integrated associated SNPs, LD blocks and regulatory features into a supervised model called TAGOOS (TAG SNP bOOSting) and computed scores genome-wide. The TAGOOS scores enriched and prioritized unseen associated SNPs with an odds ratio of 4.3 and 3.5 and an area under the curve (AUC) of 0.65 and 0.6 for intronic and intergenic regions, respectively. The TAGOOS score was correlated with the maximal significance of associated SNPs and expression quantitative trait loci (eQTLs) and with the number of biological samples annotated for key regulatory features. Analysis of loci and regions associated to cleft lip and human adult height phenotypes recovered known functional loci and predicted new functional loci enriched in transcription factors related to the phenotypes. In conclusion, we trained a supervised model based on associated SNPs to prioritize putative functional regions. The TAGOOS scores, annotations and UCSC genome tracks are available here: https://tagoos.readthedocs.io.
id pubmed-6698643
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6698643 2019-08-22 Nucleic Acids Res Methods Online Oxford University Press 2019-08-22 2019-05-02 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Methods Online