
enviRule: an end-to-end system for automatic extraction of reaction patterns from environmental contaminant biotransformation pathways

MOTIVATION: Transformation products (TPs) of man-made chemicals, formed through microbially mediated transformation in the environment, can have serious adverse environmental effects, yet the analytical identification of TPs is challenging. Rule-based prediction tools are successful in predicting TP...


Bibliographic Details
Main authors: Zhang, Kunyang; Fenner, Kathrin
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Original Paper
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10322654/
https://www.ncbi.nlm.nih.gov/pubmed/37354527
http://dx.doi.org/10.1093/bioinformatics/btad407
author Zhang, Kunyang
Fenner, Kathrin
collection PubMed
description MOTIVATION: Transformation products (TPs) of man-made chemicals, formed through microbially mediated transformation in the environment, can have serious adverse environmental effects, yet the analytical identification of TPs is challenging. Rule-based prediction tools are successful in predicting TPs, especially in environmental chemistry applications that typically have to rely on small datasets, because they encode existing knowledge on enzyme-mediated biotransformation reactions. However, the rules extracted from biotransformation reaction databases are often over- or under-generalized and are not flexible enough to be updated with new reactions. RESULTS: We developed an automatic rule extraction tool called enviRule. It clusters biotransformation reactions into different groups based on the similarities of reaction fingerprints, and then automatically extracts and generalizes rules for each reaction group in SMARTS format. It optimizes the genericity of automatic rules against the downstream TP prediction task. Models trained with automatic rules outperformed the models trained with manually curated rules by 30% in area under the curve (AUC) scores. Moreover, automatic rules can be easily updated with new reactions, highlighting enviRule’s strengths for both automatic extraction of optimized reaction rules and automated updating thereof. AVAILABILITY AND IMPLEMENTATION: enviRule code is freely available at https://github.com/zhangky12/enviRule.
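The clustering step named in the abstract (grouping reactions by the similarity of their reaction fingerprints) can be illustrated with a minimal sketch. This is not enviRule's implementation: the record does not state which fingerprint, similarity measure, clustering method, or threshold the tool uses. The sketch assumes fingerprints are plain sets of feature labels, uses Tanimoto similarity, and merges reactions by single linkage. The names `tanimoto`, `cluster_reactions`, `TOY_FINGERPRINTS`, the feature labels, and the threshold of 0.6 are all hypothetical.

```python
from typing import Dict, FrozenSet, List

# Assumption: a reaction fingerprint is modelled as a set of feature labels.
# A real pipeline would derive fingerprints from reaction structures with a
# cheminformatics toolkit; that part is omitted here.
Fingerprint = FrozenSet[str]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_reactions(
    fingerprints: Dict[str, Fingerprint], threshold: float = 0.6
) -> List[List[str]]:
    """Single-linkage grouping: two reactions end up in the same group when a
    chain of pairs with similarity >= threshold connects them."""
    ids = list(fingerprints)
    parent = {rid: rid for rid in ids}

    def find(x: str) -> str:
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if tanimoto(fingerprints[a], fingerprints[b]) >= threshold:
                parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for rid in ids:
        groups.setdefault(find(rid), []).append(rid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy data: two hydrolysis-like and two demethylation-like reactions.
TOY_FINGERPRINTS: Dict[str, Fingerprint] = {
    "r1": frozenset({"ester_C-O_break", "O-H_form", "C=O_env"}),
    "r2": frozenset({"ester_C-O_break", "O-H_form", "C=O_env", "aryl_env"}),
    "r3": frozenset({"N-CH3_break", "N-H_form"}),
    "r4": frozenset({"N-CH3_break", "N-H_form", "aryl_env"}),
}

if __name__ == "__main__":
    print(cluster_reactions(TOY_FINGERPRINTS, threshold=0.6))
```

With the toy data, r1 and r2 share three of four features (similarity 0.75) and r3 and r4 share two of three (about 0.67), so a threshold of 0.6 gives two groups. In a pipeline like the one the abstract describes, each group would then be passed to a rule extraction and generalization step, which this sketch does not attempt.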
format Online
Article
Text
id pubmed-10322654
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10322654 2023-07-07 Zhang, Kunyang Fenner, Kathrin Bioinformatics Original Paper Oxford University Press 2023-06-24 /pmc/articles/PMC10322654/ /pubmed/37354527 http://dx.doi.org/10.1093/bioinformatics/btad407 Text en © The Author(s) 2023. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title enviRule: an end-to-end system for automatic extraction of reaction patterns from environmental contaminant biotransformation pathways
topic Original Paper