Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task

A dataset of liquid chromatography-mass spectrometry measurements of medicinal plant extracts from 74 species was generated and used for training and validating plant species identification algorithms. Various strategies for data handling and feature space extraction were tested. Constrained Tucker...

Bibliographic Details
Main authors: Kharyuk, Pavel, Nazarenko, Dmitry, Oseledets, Ivan, Rodin, Igor, Shpigun, Oleg, Tsitsilin, Andrey, Lavrentyev, Mikhail
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6243014/
https://www.ncbi.nlm.nih.gov/pubmed/30451976
http://dx.doi.org/10.1038/s41598-018-35399-z
_version_ 1783371890180489216
author Kharyuk, Pavel
Nazarenko, Dmitry
Oseledets, Ivan
Rodin, Igor
Shpigun, Oleg
Tsitsilin, Andrey
Lavrentyev, Mikhail
author_facet Kharyuk, Pavel
Nazarenko, Dmitry
Oseledets, Ivan
Rodin, Igor
Shpigun, Oleg
Tsitsilin, Andrey
Lavrentyev, Mikhail
author_sort Kharyuk, Pavel
collection PubMed
description A dataset of liquid chromatography-mass spectrometry measurements of medicinal plant extracts from 74 species was generated and used for training and validating plant species identification algorithms. Various strategies for data handling and feature space extraction were tested. Constrained Tucker decomposition, large-scale (more than 1500 variables) discrete Bayesian networks and autoencoder-based dimensionality reduction coupled with a continuous Bayes classifier and logistic regression were optimized to achieve the best accuracy. Even with all retention time values eliminated, accuracies of up to 96% and 92% were achieved on the validation set for plant species and plant organ identification, respectively. Benefits and drawbacks of the algorithms used were discussed. A preliminary test showed that the developed approaches tolerate changes in the data caused by different extraction methods and/or equipment. The dataset, with more than 2200 chromatograms, was published in an open repository.
format Online
Article
Text
id pubmed-6243014
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-62430142018-11-27 Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task Kharyuk, Pavel Nazarenko, Dmitry Oseledets, Ivan Rodin, Igor Shpigun, Oleg Tsitsilin, Andrey Lavrentyev, Mikhail Sci Rep Article A dataset of liquid chromatography-mass spectrometry measurements of medicinal plant extracts from 74 species was generated and used for training and validating plant species identification algorithms. Various strategies for data handling and feature space extraction were tested. Constrained Tucker decomposition, large-scale (more than 1500 variables) discrete Bayesian networks and autoencoder-based dimensionality reduction coupled with a continuous Bayes classifier and logistic regression were optimized to achieve the best accuracy. Even with all retention time values eliminated, accuracies of up to 96% and 92% were achieved on the validation set for plant species and plant organ identification, respectively. Benefits and drawbacks of the algorithms used were discussed. A preliminary test showed that the developed approaches tolerate changes in the data caused by different extraction methods and/or equipment. The dataset, with more than 2200 chromatograms, was published in an open repository. Nature Publishing Group UK 2018-11-19 /pmc/articles/PMC6243014/ /pubmed/30451976 http://dx.doi.org/10.1038/s41598-018-35399-z Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Kharyuk, Pavel
Nazarenko, Dmitry
Oseledets, Ivan
Rodin, Igor
Shpigun, Oleg
Tsitsilin, Andrey
Lavrentyev, Mikhail
Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task
title Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task
title_full Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task
title_fullStr Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task
title_full_unstemmed Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task
title_short Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task
title_sort employing fingerprinting of medicinal plants by means of lc-ms and machine learning for species identification task
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6243014/
https://www.ncbi.nlm.nih.gov/pubmed/30451976
http://dx.doi.org/10.1038/s41598-018-35399-z
work_keys_str_mv AT kharyukpavel employingfingerprintingofmedicinalplantsbymeansoflcmsandmachinelearningforspeciesidentificationtask
AT nazarenkodmitry employingfingerprintingofmedicinalplantsbymeansoflcmsandmachinelearningforspeciesidentificationtask
AT oseledetsivan employingfingerprintingofmedicinalplantsbymeansoflcmsandmachinelearningforspeciesidentificationtask
AT rodinigor employingfingerprintingofmedicinalplantsbymeansoflcmsandmachinelearningforspeciesidentificationtask
AT shpigunoleg employingfingerprintingofmedicinalplantsbymeansoflcmsandmachinelearningforspeciesidentificationtask
AT tsitsilinandrey employingfingerprintingofmedicinalplantsbymeansoflcmsandmachinelearningforspeciesidentificationtask
AT lavrentyevmikhail employingfingerprintingofmedicinalplantsbymeansoflcmsandmachinelearningforspeciesidentificationtask