A hybrid gene selection approach to create the S1500+ targeted gene sets for use in high-throughput transcriptomics

Changes in gene expression can help reveal the mechanisms of disease processes and the mode of action for toxicities and adverse effects on cellular responses induced by exposures to chemicals, drugs and environmental agents. The U.S. Tox21 Federal collaboration, which currently quantifies the biologi...

Bibliographic Details
Main authors: Mav, Deepak, Shah, Ruchir R., Howard, Brian E., Auerbach, Scott S., Bushel, Pierre R., Collins, Jennifer B., Gerhold, David L., Judson, Richard S., Karmaus, Agnes L., Maull, Elizabeth A., Mendrick, Donna L., Merrick, B. Alex, Sipes, Nisha S., Svoboda, Daniel, Paules, Richard S.
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5819766/
https://www.ncbi.nlm.nih.gov/pubmed/29462216
http://dx.doi.org/10.1371/journal.pone.0191105
author Mav, Deepak
Shah, Ruchir R.
Howard, Brian E.
Auerbach, Scott S.
Bushel, Pierre R.
Collins, Jennifer B.
Gerhold, David L.
Judson, Richard S.
Karmaus, Agnes L.
Maull, Elizabeth A.
Mendrick, Donna L.
Merrick, B. Alex
Sipes, Nisha S.
Svoboda, Daniel
Paules, Richard S.
author_sort Mav, Deepak
collection PubMed
description Changes in gene expression can help reveal the mechanisms of disease processes and the mode of action for toxicities and adverse effects on cellular responses induced by exposures to chemicals, drugs and environmental agents. The U.S. Tox21 Federal collaboration, which currently quantifies the biological effects of nearly 10,000 chemicals via quantitative high-throughput screening (qHTS) in in vitro model systems, is now making an effort to incorporate gene expression profiling into the existing battery of assays. Whole-transcriptome analysis of large numbers of samples using microarrays or RNA-Seq is currently cost-prohibitive. Accordingly, the Tox21 Program is pursuing a high-throughput transcriptomics (HTT) method that focuses on the targeted detection of gene expression for a carefully selected subset of the transcriptome, which could potentially reduce the cost 10-fold and allow the analysis of larger numbers of samples. To identify the optimal transcriptome subset, genes were sought that are (1) representative of the highly diverse biological space, (2) capable of serving as a proxy for expression changes in unmeasured genes, and (3) sufficient to provide coverage of well-described biological pathways. A hybrid method for gene selection is presented herein that combines data-driven and knowledge-driven concepts into one cohesive method. Our approach is modular, applicable to any species, and facilitates a robust, quantitative evaluation of performance. In particular, we were able to perform gene selection such that the resulting set of “sentinel genes” adequately represents all known canonical pathways from the Molecular Signatures Database (MSigDB v4.0) and can be used to infer expression changes for the remainder of the transcriptome. The resulting computational model allowed us to choose a purely data-driven subset of 1500 sentinel genes, referred to as the S1500 set, which was then augmented using a knowledge-driven selection of additional genes to create the final S1500+ gene set. Our results indicate that the selected sentinel genes can be used to accurately predict pathway perturbations and biological relationships for samples under study.
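The two-stage idea in the abstract (a data-driven core set, then a knowledge-driven augmentation) can be illustrated with a toy sketch. This is not the authors' algorithm: the record does not describe their computational model, so a simple greedy pathway-coverage heuristic is swapped in for the data-driven step. The function names `select_data_driven` and `augment_with_curated`, the `min_hits` parameter, and the pathway and gene labels are all hypothetical.

```python
from typing import Dict, List, Set


def select_data_driven(pathways: Dict[str, Set[str]], budget: int,
                       min_hits: int = 1) -> List[str]:
    """Greedy stand-in for a data-driven step (an assumption, not the paper's model).

    Repeatedly picks the gene that belongs to the most pathways still
    lacking `min_hits` selected members, until the budget is used up or
    every pathway is covered. Ties go to the alphabetically first gene.
    """
    # Remaining number of genes each pathway still needs.
    need = {name: min(min_hits, len(genes)) for name, genes in pathways.items()}
    candidates = sorted(set().union(*pathways.values()))
    selected: List[str] = []
    chosen: Set[str] = set()
    while len(selected) < budget:
        best_gene, best_gain = None, 0
        for gene in candidates:
            if gene in chosen:
                continue
            gain = sum(1 for name, genes in pathways.items()
                       if need[name] > 0 and gene in genes)
            if gain > best_gain:
                best_gene, best_gain = gene, gain
        if best_gene is None:  # nothing left that improves coverage
            break
        selected.append(best_gene)
        chosen.add(best_gene)
        for name, genes in pathways.items():
            if best_gene in genes and need[name] > 0:
                need[name] -= 1
    return selected


def augment_with_curated(selected: List[str], curated: List[str]) -> List[str]:
    """Knowledge-driven step: append curated genes not already selected."""
    result = list(selected)
    seen = set(selected)
    for gene in curated:
        if gene not in seen:
            result.append(gene)
            seen.add(gene)
    return result


# Hypothetical toy input: four pathways over five genes.
toy_pathways = {
    "P1": {"A", "B"},
    "P2": {"B", "C"},
    "P3": {"C", "D"},
    "P4": {"E"},
}
core = select_data_driven(toy_pathways, budget=3)
final = augment_with_curated(core, curated=["D", "B"])
print(core)   # ['B', 'C', 'E']
print(final)  # ['B', 'C', 'E', 'D']
```

In this toy case three genes are enough to touch every pathway, and the curated list contributes only the one gene not already chosen, which mirrors how an "S1500" style core could be extended to an "S1500+" style set.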
format Online
Article
Text
id pubmed-5819766
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5819766 2018-03-15 A hybrid gene selection approach to create the S1500+ targeted gene sets for use in high-throughput transcriptomics PLoS One Research Article Public Library of Science 2018-02-20 /pmc/articles/PMC5819766/ /pubmed/29462216 http://dx.doi.org/10.1371/journal.pone.0191105 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title A hybrid gene selection approach to create the S1500+ targeted gene sets for use in high-throughput transcriptomics
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5819766/
https://www.ncbi.nlm.nih.gov/pubmed/29462216
http://dx.doi.org/10.1371/journal.pone.0191105