Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes
In silico (quantitative) structure–activity relationship modeling is an approach that provides a fast and cost-effective alternative to assess the genotoxic potential of chemicals. However, one of the limiting factors for model development is the availability of consolidated experimental datasets. I...
Main authors: | Khondkaryan, Lusine; Tevosyan, Ani; Navasardyan, Hayk; Khachatrian, Hrant; Tadevosyan, Gohar; Apresyan, Lilit; Chilingaryan, Gayane; Navoyan, Zaven; Stopper, Helga; Babayan, Nelly |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10537630/ https://www.ncbi.nlm.nih.gov/pubmed/37755795 http://dx.doi.org/10.3390/toxics11090785 |
_version_ | 1785113145724895232 |
---|---|
author | Khondkaryan, Lusine Tevosyan, Ani Navasardyan, Hayk Khachatrian, Hrant Tadevosyan, Gohar Apresyan, Lilit Chilingaryan, Gayane Navoyan, Zaven Stopper, Helga Babayan, Nelly |
author_facet | Khondkaryan, Lusine Tevosyan, Ani Navasardyan, Hayk Khachatrian, Hrant Tadevosyan, Gohar Apresyan, Lilit Chilingaryan, Gayane Navoyan, Zaven Stopper, Helga Babayan, Nelly |
author_sort | Khondkaryan, Lusine |
collection | PubMed |
description | In silico (quantitative) structure–activity relationship modeling is an approach that provides a fast and cost-effective alternative to assess the genotoxic potential of chemicals. However, one of the limiting factors for model development is the availability of consolidated experimental datasets. In the present study, we collected experimental data on micronuclei in vitro and in vivo, utilizing databases and conducting a PubMed search, aided by text mining using the BioBERT large language model. Chemotype enrichment analysis on the updated datasets was performed to identify enriched substructures. Additionally, chemotypes common for both endpoints were found. Five machine learning models in combination with molecular descriptors, twelve fingerprints and two data balancing techniques were applied to construct individual models. The best-performing individual models were selected for the ensemble construction. The curated final dataset consists of 981 chemicals for micronuclei in vitro and 1309 for mouse micronuclei in vivo, respectively. Out of 18 chemotypes enriched in micronuclei in vitro, only 7 were found to be relevant for in vivo prediction. The ensemble model exhibited high accuracy and sensitivity when applied to an external test set of in vitro data. A good balanced predictive performance was also achieved for the micronucleus in vivo endpoint. |
format | Online Article Text |
id | pubmed-10537630 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10537630 2023-09-29 Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes Khondkaryan, Lusine Tevosyan, Ani Navasardyan, Hayk Khachatrian, Hrant Tadevosyan, Gohar Apresyan, Lilit Chilingaryan, Gayane Navoyan, Zaven Stopper, Helga Babayan, Nelly Toxics Article In silico (quantitative) structure–activity relationship modeling is an approach that provides a fast and cost-effective alternative to assess the genotoxic potential of chemicals. However, one of the limiting factors for model development is the availability of consolidated experimental datasets. In the present study, we collected experimental data on micronuclei in vitro and in vivo, utilizing databases and conducting a PubMed search, aided by text mining using the BioBERT large language model. Chemotype enrichment analysis on the updated datasets was performed to identify enriched substructures. Additionally, chemotypes common for both endpoints were found. Five machine learning models in combination with molecular descriptors, twelve fingerprints and two data balancing techniques were applied to construct individual models. The best-performing individual models were selected for the ensemble construction. The curated final dataset consists of 981 chemicals for micronuclei in vitro and 1309 for mouse micronuclei in vivo, respectively. Out of 18 chemotypes enriched in micronuclei in vitro, only 7 were found to be relevant for in vivo prediction. The ensemble model exhibited high accuracy and sensitivity when applied to an external test set of in vitro data. A good balanced predictive performance was also achieved for the micronucleus in vivo endpoint. MDPI 2023-09-15 /pmc/articles/PMC10537630/ /pubmed/37755795 http://dx.doi.org/10.3390/toxics11090785 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Khondkaryan, Lusine Tevosyan, Ani Navasardyan, Hayk Khachatrian, Hrant Tadevosyan, Gohar Apresyan, Lilit Chilingaryan, Gayane Navoyan, Zaven Stopper, Helga Babayan, Nelly Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes |
title | Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes |
title_full | Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes |
title_fullStr | Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes |
title_full_unstemmed | Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes |
title_short | Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes |
title_sort | datasets construction and development of qsar models for predicting micronucleus in vitro and in vivo assay outcomes |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10537630/ https://www.ncbi.nlm.nih.gov/pubmed/37755795 http://dx.doi.org/10.3390/toxics11090785 |