Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes

In silico (quantitative) structure–activity relationship modeling is an approach that provides a fast and cost-effective alternative to assess the genotoxic potential of chemicals. However, one of the limiting factors for model development is the availability of consolidated experimental datasets. I...

Bibliographic Details
Main authors: Khondkaryan, Lusine; Tevosyan, Ani; Navasardyan, Hayk; Khachatrian, Hrant; Tadevosyan, Gohar; Apresyan, Lilit; Chilingaryan, Gayane; Navoyan, Zaven; Stopper, Helga; Babayan, Nelly
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10537630/
https://www.ncbi.nlm.nih.gov/pubmed/37755795
http://dx.doi.org/10.3390/toxics11090785
_version_ 1785113145724895232
author Khondkaryan, Lusine
Tevosyan, Ani
Navasardyan, Hayk
Khachatrian, Hrant
Tadevosyan, Gohar
Apresyan, Lilit
Chilingaryan, Gayane
Navoyan, Zaven
Stopper, Helga
Babayan, Nelly
collection PubMed
description In silico (quantitative) structure–activity relationship modeling is an approach that provides a fast and cost-effective alternative to assess the genotoxic potential of chemicals. However, one of the limiting factors for model development is the availability of consolidated experimental datasets. In the present study, we collected experimental data on micronuclei in vitro and in vivo, utilizing databases and conducting a PubMed search, aided by text mining using the BioBERT large language model. Chemotype enrichment analysis on the updated datasets was performed to identify enriched substructures. Additionally, chemotypes common for both endpoints were found. Five machine learning models in combination with molecular descriptors, twelve fingerprints and two data balancing techniques were applied to construct individual models. The best-performing individual models were selected for the ensemble construction. The curated final dataset consists of 981 chemicals for micronuclei in vitro and 1309 for mouse micronuclei in vivo, respectively. Out of 18 chemotypes enriched in micronuclei in vitro, only 7 were found to be relevant for in vivo prediction. The ensemble model exhibited high accuracy and sensitivity when applied to an external test set of in vitro data. A good balanced predictive performance was also achieved for the micronucleus in vivo endpoint.
format Online
Article
Text
id pubmed-10537630
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10537630 2023-09-29 Toxics Article MDPI 2023-09-15 /pmc/articles/PMC10537630/ /pubmed/37755795 http://dx.doi.org/10.3390/toxics11090785 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. https://creativecommons.org/licenses/by/4.0/
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes
topic Article