
Analysis of Primary Liquid Chromatography Mass Spectrometry Data by Neural Networks for Plant Samples Classification


Bibliographic Details
Main Authors: Turova, Polina; Stavrianidi, Andrey; Svekolkin, Viktor; Lyskov, Dmitry; Podolskiy, Ilya; Rodin, Igor; Shpigun, Oleg; Buryak, Aleksey
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611620/
https://www.ncbi.nlm.nih.gov/pubmed/36295895
http://dx.doi.org/10.3390/metabo12100993
author Turova, Polina
Stavrianidi, Andrey
Svekolkin, Viktor
Lyskov, Dmitry
Podolskiy, Ilya
Rodin, Igor
Shpigun, Oleg
Buryak, Aleksey
author_sort Turova, Polina
collection PubMed
description Plant samples are potential sources of physiologically active secondary metabolites, and their classification is an important task in traditional medicine and other fields of research. In the production of herbal drugs, different plant parts of the same or related species can serve as adulterants for primary plant material. Highly informative and relatively accessible tools, such as liquid chromatography and low-resolution mass spectrometry, help to solve these tasks by means of fingerprint analysis. In this study, two approaches are suggested to reveal specific plant part features for 20 species from one family (Apiaceae) while preserving the maximum information content. In both cases, minimal raw data pretreatment was applied, including rescaling of the time and m/z axes and cutting off some uninformative regions. The support vector machine (SVM) method required tensor unfolding, while neural networks (NNs) were able to work directly with square heatmaps as input data. Moreover, five data augmentation variants are proposed to overcome the typical problem of a lack of data. As a result, a comparable F1-score close to 0.75 was achieved by SVM and the two employed NN architectures. Eight marker compounds belonging to chlorophylls, lipids, and coumarin apio-glucosides were tentatively identified as characteristic of their corresponding sample groups: roots, stems, leaves, and fruits. The proposed approaches are simple and information-preserving, and can be applied to a broad range of tasks in metabolomics.
format Online
Article
Text
id pubmed-9611620
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9611620 2022-10-28 Analysis of Primary Liquid Chromatography Mass Spectrometry Data by Neural Networks for Plant Samples Classification. Turova, Polina; Stavrianidi, Andrey; Svekolkin, Viktor; Lyskov, Dmitry; Podolskiy, Ilya; Rodin, Igor; Shpigun, Oleg; Buryak, Aleksey. Metabolites. Article. MDPI 2022-10-19 /pmc/articles/PMC9611620/ /pubmed/36295895 http://dx.doi.org/10.3390/metabo12100993 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Analysis of Primary Liquid Chromatography Mass Spectrometry Data by Neural Networks for Plant Samples Classification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611620/
https://www.ncbi.nlm.nih.gov/pubmed/36295895
http://dx.doi.org/10.3390/metabo12100993
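The description field contrasts two input forms: square heatmaps fed directly to neural networks, and an unfolded (flattened) tensor for the SVM. The sketch below illustrates that distinction only. It is not the authors' code: the function names `to_heatmap` and `unfold`, the binning scheme, the grid size, and the retention-time and m/z windows are all assumptions made for this example.

```python
def to_heatmap(points, rt_range, mz_range, size):
    """Bin (retention_time, mz, intensity) triples into a size x size grid.

    Hypothetical helper: the record does not specify how the axes were rescaled.
    """
    rt_min, rt_max = rt_range
    mz_min, mz_max = mz_range
    grid = [[0.0] * size for _ in range(size)]
    for rt, mz, intensity in points:
        # Skip signals outside the retained window (stand-in for
        # "cutting off some uninformative regions").
        if not (rt_min <= rt <= rt_max and mz_min <= mz <= mz_max):
            continue
        # Rescale both axes to bin indices in 0..size-1.
        row = min(int((rt - rt_min) / (rt_max - rt_min) * size), size - 1)
        col = min(int((mz - mz_min) / (mz_max - mz_min) * size), size - 1)
        grid[row][col] += intensity
    return grid


def unfold(grid):
    """Flatten the 2D heatmap row by row into one feature vector."""
    return [value for row in grid for value in row]


# Made-up signals: (retention time in min, m/z, intensity).
points = [
    (1.0, 150.0, 10.0),
    (1.2, 160.0, 5.0),
    (9.0, 900.0, 7.0),
    (20.0, 500.0, 99.0),  # outside the retention-time window, dropped
]

heatmap = to_heatmap(points, rt_range=(0.0, 10.0), mz_range=(100.0, 1000.0), size=4)
vector = unfold(heatmap)
```

Here `heatmap` keeps the two-dimensional layout that an image-style network could consume, while `vector` is the flat form that a classifier expecting one feature vector per sample, such as an SVM, would need.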