TAGOOS: genome-wide supervised learning of non-coding loci associated to complex phenotypes
Genome-wide association studies (GWAS) associate single nucleotide polymorphisms (SNPs) to complex phenotypes. Most human SNPs fall in non-coding regions and are likely regulatory SNPs, but linkage disequilibrium (LD) blocks make it difficult to distinguish functional SNPs. Therefore, putative funct...
Main authors: | González, Aitor; Artufel, Marie; Rihet, Pascal |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6698643/ https://www.ncbi.nlm.nih.gov/pubmed/31045203 http://dx.doi.org/10.1093/nar/gkz320 |
_version_ | 1783444585807085568 |
---|---|
author | González, Aitor Artufel, Marie Rihet, Pascal |
author_facet | González, Aitor Artufel, Marie Rihet, Pascal |
author_sort | González, Aitor |
collection | PubMed |
description | Genome-wide association studies (GWAS) associate single nucleotide polymorphisms (SNPs) to complex phenotypes. Most human SNPs fall in non-coding regions and are likely regulatory SNPs, but linkage disequilibrium (LD) blocks make it difficult to distinguish functional SNPs. Therefore, putative functional SNPs are usually annotated with molecular markers of gene regulatory regions and prioritized with dedicated prediction tools. We integrated associated SNPs, LD blocks and regulatory features into a supervised model called TAGOOS (TAG SNP bOOSting) and computed scores genome-wide. The TAGOOS scores enriched and prioritized unseen associated SNPs with an odds ratio of 4.3 and 3.5 and an area under the curve (AUC) of 0.65 and 0.6 for intronic and intergenic regions, respectively. The TAGOOS score was correlated with the maximal significance of associated SNPs and expression quantitative trait loci (eQTLs) and with the number of biological samples annotated for key regulatory features. Analysis of loci and regions associated to cleft lip and human adult height phenotypes recovered known functional loci and predicted new functional loci enriched in transcription factors related to the phenotypes. In conclusion, we trained a supervised model based on associated SNPs to prioritize putative functional regions. The TAGOOS scores, annotations and UCSC genome tracks are available here: https://tagoos.readthedocs.io. |
format | Online Article Text |
id | pubmed-6698643 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6698643 2019-08-22 TAGOOS: genome-wide supervised learning of non-coding loci associated to complex phenotypes González, Aitor Artufel, Marie Rihet, Pascal Nucleic Acids Res Methods Online Genome-wide association studies (GWAS) associate single nucleotide polymorphisms (SNPs) to complex phenotypes. Most human SNPs fall in non-coding regions and are likely regulatory SNPs, but linkage disequilibrium (LD) blocks make it difficult to distinguish functional SNPs. Therefore, putative functional SNPs are usually annotated with molecular markers of gene regulatory regions and prioritized with dedicated prediction tools. We integrated associated SNPs, LD blocks and regulatory features into a supervised model called TAGOOS (TAG SNP bOOSting) and computed scores genome-wide. The TAGOOS scores enriched and prioritized unseen associated SNPs with an odds ratio of 4.3 and 3.5 and an area under the curve (AUC) of 0.65 and 0.6 for intronic and intergenic regions, respectively. The TAGOOS score was correlated with the maximal significance of associated SNPs and expression quantitative trait loci (eQTLs) and with the number of biological samples annotated for key regulatory features. Analysis of loci and regions associated to cleft lip and human adult height phenotypes recovered known functional loci and predicted new functional loci enriched in transcription factors related to the phenotypes. In conclusion, we trained a supervised model based on associated SNPs to prioritize putative functional regions. The TAGOOS scores, annotations and UCSC genome tracks are available here: https://tagoos.readthedocs.io. Oxford University Press 2019-08-22 2019-05-02 /pmc/articles/PMC6698643/ /pubmed/31045203 http://dx.doi.org/10.1093/nar/gkz320 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online González, Aitor Artufel, Marie Rihet, Pascal TAGOOS: genome-wide supervised learning of non-coding loci associated to complex phenotypes |
title | TAGOOS: genome-wide supervised learning of non-coding loci associated to complex phenotypes |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6698643/ https://www.ncbi.nlm.nih.gov/pubmed/31045203 http://dx.doi.org/10.1093/nar/gkz320 |