
Inferring Crohn’s disease association from exome sequences by integrating biological knowledge

BACKGROUND: Exome sequencing has emerged as a primary method for identifying sequence variants associated with complex diseases, including Crohn’s disease, in the protein-coding regions of the human genome. However, constructing an interpretable model for exome sequencing data is challenging bec...


Bibliographic Details
Main Authors: Jeong, Chan-Seok, Kim, Dongsup
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4989895/
https://www.ncbi.nlm.nih.gov/pubmed/27535358
http://dx.doi.org/10.1186/s12920-016-0189-2
_version_ 1782448623458451456
author Jeong, Chan-Seok
Kim, Dongsup
author_facet Jeong, Chan-Seok
Kim, Dongsup
author_sort Jeong, Chan-Seok
collection PubMed
description BACKGROUND: Exome sequencing has emerged as a primary method for identifying sequence variants associated with complex diseases, including Crohn’s disease, in the protein-coding regions of the human genome. However, constructing an interpretable model for exome sequencing data is challenging because of the huge diversity of genomic variation. In addition, it is known that utilizing biologically relevant information in a rigorous manner is essential for effectively extracting disease-associated information. RESULTS: In this paper, we incorporate three types of biological knowledge (predicted pathogenicity, disease gene annotation, and a functional interaction network of human genes) and integrate them with exome sequence data in a non-negative matrix tri-factorization framework. Using the proposed method, we successfully identified Crohn’s disease patients from exome sequencing data with an area under the receiver operating characteristic curve (AUC) of 0.816, whereas other clustering methods that do not use biological information achieved an AUC of 0.786. Moreover, the disease association score derived from our method showed a higher correlation with Crohn’s disease genes than with other, unrelated genes. CONCLUSIONS: By integrating biological information across multiple levels (variant, gene, and system), our method could be useful for identifying disease susceptibility and its associated genes from exome sequencing data.
format Online
Article
Text
id pubmed-4989895
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4989895 2016-08-30 Inferring Crohn’s disease association from exome sequences by integrating biological knowledge Jeong, Chan-Seok Kim, Dongsup BMC Med Genomics Research
BioMed Central 2016-08-12 /pmc/articles/PMC4989895/ /pubmed/27535358 http://dx.doi.org/10.1186/s12920-016-0189-2 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Jeong, Chan-Seok
Kim, Dongsup
Inferring Crohn’s disease association from exome sequences by integrating biological knowledge
title Inferring Crohn’s disease association from exome sequences by integrating biological knowledge
title_sort inferring crohn’s disease association from exome sequences by integrating biological knowledge
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4989895/
https://www.ncbi.nlm.nih.gov/pubmed/27535358
http://dx.doi.org/10.1186/s12920-016-0189-2
work_keys_str_mv AT jeongchanseok inferringcrohnsdiseaseassociationfromexomesequencesbyintegratingbiologicalknowledge
AT kimdongsup inferringcrohnsdiseaseassociationfromexomesequencesbyintegratingbiologicalknowledge