
The Importance of Biologic Knowledge and Gene Expression Context for Genomic Data Interpretation

Background: Genomic sequencing, including whole exome sequencing (WES), is enabling higher resolution in defining diseases, understanding mechanisms, and improving the practice of clinical care. However, WES routinely identifies genomic variants with uncertain functional effects. Furthering uncertai...


Bibliographic Details
Main Author: Zimmermann, Michael T.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6305277/
https://www.ncbi.nlm.nih.gov/pubmed/30619486
http://dx.doi.org/10.3389/fgene.2018.00670
author Zimmermann, Michael T.
collection PubMed
description Background: Genomic sequencing, including whole exome sequencing (WES), is enabling higher resolution in defining diseases, understanding mechanisms, and improving the practice of clinical care. However, WES routinely identifies genomic variants with uncertain functional effects. Adding to the uncertainty in WES data interpretation, many genes can express multiple transcripts, and their relative expression may differ by body tissue. To interpret WES data, we need to understand not only which transcript is most relevant, but also which tissue is most relevant. Methods: In this work, we quantify how frequently differences in transcript and tissue expression affect WES data interpretation at the gene, pathway, disease, and biologic network levels. We combined and analyzed multiple large, publicly available datasets to inform genomic data interpretation. Results: Across well-established biologic pathways and genes with pathogenic disease variants, 54% and 40% have a different protein-coding effect depending on transcript selection for, respectively, 25% and 50% of the genes contained. Additionally, strong differences in human tissue expression levels affect 33% and 19% of the same set of pathways and diseases for, respectively, 25% and 50% of the genes contained. Conclusion: Whole exome sequencing identifies genomic variants, but to interpret the functional effects of those variants at high resolution, we recommend building transcript selection and cross-tissue gene expression levels into hypotheses and analyses. Using current large-scale data, we show how extensively the interpretation of genomic variants may differ according to transcript and tissue across most pathways and diseases. Thus, their inclusion is necessary for WES data interpretation.
format Online
Article
Text
id pubmed-6305277
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6305277 2019-01-07 The Importance of Biologic Knowledge and Gene Expression Context for Genomic Data Interpretation Zimmermann, Michael T. Front Genet Genetics Frontiers Media S.A.
2018-12-18 /pmc/articles/PMC6305277/ /pubmed/30619486 http://dx.doi.org/10.3389/fgene.2018.00670 Text en Copyright © 2018 Zimmermann. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title The Importance of Biologic Knowledge and Gene Expression Context for Genomic Data Interpretation
title_sort importance of biologic knowledge and gene expression context for genomic data interpretation
topic Genetics