A Qualitative Modeling Approach for Whole Genome Prediction Using High-Throughput Toxicogenomics Data and Pathway-Based Validation

Efficient high-throughput transcriptomics (HTT) tools promise inexpensive, rapid assessment of possible biological consequences of human and environmental exposures to tens of thousands of chemicals in commerce. HTT systems have used relatively small sets of gene expression measurements coupled with...

Bibliographic Details
Main authors: Haider, Saad, Black, Michael B., Parks, Bethany B., Foley, Briana, Wetmore, Barbara A., Andersen, Melvin E., Clewell, Rebecca A., Mansouri, Kamel, McMullen, Patrick D.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Pharmacology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6176017/
https://www.ncbi.nlm.nih.gov/pubmed/30333746
http://dx.doi.org/10.3389/fphar.2018.01072
author Haider, Saad
Black, Michael B.
Parks, Bethany B.
Foley, Briana
Wetmore, Barbara A.
Andersen, Melvin E.
Clewell, Rebecca A.
Mansouri, Kamel
McMullen, Patrick D.
author_sort Haider, Saad
collection PubMed
description Efficient high-throughput transcriptomics (HTT) tools promise inexpensive, rapid assessment of possible biological consequences of human and environmental exposures to tens of thousands of chemicals in commerce. HTT systems have used relatively small sets of gene expression measurements coupled with mathematical prediction methods to estimate genome-wide gene expression and are often trained and validated using pharmaceutical compounds. It is unclear whether these training sets are suitable for general toxicity testing applications and the more diverse chemical space represented by commercial chemicals and environmental contaminants. In this work, we built predictive computational models that inferred whole genome transcriptional profiles from a smaller sample of surrogate genes. The model was trained and validated using a large-scale toxicogenomics database with gene expression data from exposure to heterogeneous chemicals from a wide range of classes (the Open TG-GATEs database). The method of predictor selection was designed to allow high-fidelity gene prediction from any pre-existing gene expression data set, regardless of animal species or data measurement platform. Predictive qualitative models were developed from the TG-GATEs data, which contained gene expression data from human primary hepatocytes, with over 941 samples covering 158 compounds. A sequential forward search-based greedy algorithm, combining different fitting approaches and machine learning techniques, was used to find an optimal set of surrogate genes that predicted differential expression changes of the remaining genome. We then used pathway enrichment of up-regulated and down-regulated genes to assess the ability of a limited gene set to determine relevant patterns of tissue response. In addition, we compared prediction performance using the surrogate genes found from our greedy algorithm (referred to as the SV2000) with the landmark genes provided by existing technologies such as L1000 (Genometry) and S1500 (Tox21), finding better predictive performance for the SV2000. The ability of these predictive algorithms to predict pathway-level responses is a positive step toward incorporating mode of action (MOA) analysis into the high-throughput prioritization and testing of the large number of chemicals in need of safety evaluation.
format Online
Article
Text
id pubmed-6176017
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6176017 2018-10-17 A Qualitative Modeling Approach for Whole Genome Prediction Using High-Throughput Toxicogenomics Data and Pathway-Based Validation Haider, Saad Black, Michael B. Parks, Bethany B. Foley, Briana Wetmore, Barbara A. Andersen, Melvin E. Clewell, Rebecca A. Mansouri, Kamel McMullen, Patrick D. Front Pharmacol Pharmacology Frontiers Media S.A. 2018-10-02 /pmc/articles/PMC6176017/ /pubmed/30333746 http://dx.doi.org/10.3389/fphar.2018.01072 Text en Copyright © 2018 Haider, Black, Parks, Foley, Wetmore, Andersen, Clewell, Mansouri and McMullen. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Qualitative Modeling Approach for Whole Genome Prediction Using High-Throughput Toxicogenomics Data and Pathway-Based Validation
topic Pharmacology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6176017/
https://www.ncbi.nlm.nih.gov/pubmed/30333746
http://dx.doi.org/10.3389/fphar.2018.01072