
Classification of estrogenic compounds by coupling high content analysis and machine learning algorithms

Environmental toxicants affect human health in various ways. Of the thousands of chemicals present in the environment, those with adverse effects on the endocrine system are referred to as endocrine-disrupting chemicals (EDCs). Here, we focused on a subclass of EDCs that impacts the estrogen recepto...

Bibliographic Details
Main Authors: Mukherjee, Rajib, Beykal, Burcu, Szafran, Adam T., Onel, Melis, Stossi, Fabio, Mancini, Maureen G., Lloyd, Dillon, Wright, Fred A., Zhou, Lan, Mancini, Michael A., Pistikopoulos, Efstratios N.
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7538107/
https://www.ncbi.nlm.nih.gov/pubmed/32970665
http://dx.doi.org/10.1371/journal.pcbi.1008191
author Mukherjee, Rajib
Beykal, Burcu
Szafran, Adam T.
Onel, Melis
Stossi, Fabio
Mancini, Maureen G.
Lloyd, Dillon
Wright, Fred A.
Zhou, Lan
Mancini, Michael A.
Pistikopoulos, Efstratios N.
author_sort Mukherjee, Rajib
collection PubMed
description Environmental toxicants affect human health in various ways. Of the thousands of chemicals present in the environment, those with adverse effects on the endocrine system are referred to as endocrine-disrupting chemicals (EDCs). Here, we focused on a subclass of EDCs that impacts the estrogen receptor (ER), a pivotal transcriptional regulator in health and disease. Estrogenic activity of compounds can be measured by many in vitro or cell-based high throughput assays that record various endpoints from large pools of cells, and increasingly at the single-cell level. To simultaneously capture multiple mechanistic ER endpoints in individual cells that are affected by EDCs, we previously developed a sensitive high throughput/high content imaging assay that is based upon a stable cell line harboring a visible multicopy ER responsive transcription unit and expressing a green fluorescent protein (GFP) fusion of ER. High content analysis generates voluminous multiplex data comprised of minable features that describe numerous mechanistic endpoints. In this study, we present a machine learning pipeline for rapid, accurate, and sensitive assessment of the endocrine-disrupting potential of benchmark chemicals based on data generated from high content analysis. The multidimensional imaging data was used to train a classification model to ultimately predict the impact of unknown compounds on the ER, either as agonists or antagonists. To this end, both linear logistic regression and nonlinear Random Forest classifiers were benchmarked and evaluated for predicting the estrogenic activity of unknown compounds. Furthermore, through feature selection, data visualization, and model discrimination, the most informative features were identified for the classification of ER agonists/antagonists. The results of this data-driven study showed that highly accurate and generalized classification models with a minimum number of features can be constructed without loss of generality, where these machine learning models serve as a means for rapid mechanistic/phenotypic evaluation of the estrogenic potential of many chemicals.
format Online
Article
Text
id pubmed-7538107
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7538107 2020-10-19 Classification of estrogenic compounds by coupling high content analysis and machine learning algorithms Mukherjee, Rajib Beykal, Burcu Szafran, Adam T. Onel, Melis Stossi, Fabio Mancini, Maureen G. Lloyd, Dillon Wright, Fred A. Zhou, Lan Mancini, Michael A. Pistikopoulos, Efstratios N. PLoS Comput Biol Research Article Public Library of Science 2020-09-24 /pmc/articles/PMC7538107/ /pubmed/32970665 http://dx.doi.org/10.1371/journal.pcbi.1008191 Text en © 2020 Mukherjee et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Classification of estrogenic compounds by coupling high content analysis and machine learning algorithms
topic Research Article