Filter Feature Selection for Unsupervised Clustering of Designer Drugs Using DFT Simulated IR Spectra Data

The rapid emergence of novel psychoactive substances (NPS) poses new challenges and requirements for forensic testing/analysis techniques. This paper aims to explore the application of unsupervised clustering of NPS compounds’ infrared spectra. Two statistical measures, Pearson and...

Bibliographic Details
Main author: He, Kedan
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8638022/
https://www.ncbi.nlm.nih.gov/pubmed/34870036
http://dx.doi.org/10.1021/acsomega.1c04945
_version_ 1784608866750693376
author He, Kedan
collection PubMed
description The rapid emergence of novel psychoactive substances (NPS) poses new challenges and requirements for forensic testing/analysis techniques. This paper aims to explore the application of unsupervised clustering of NPS compounds’ infrared spectra. Two statistical measures, Pearson and Spearman, were used to quantify the spectral similarity and to generate similarity matrices for hierarchical clustering. The correspondence of spectral similarity clustering trees to the commonly used structural/pharmacological categorization was evaluated and compared to the clustering generated using 2D/3D molecular fingerprints. Hybrid model feature selections were applied using different filter-based feature ranking algorithms developed for unsupervised clustering tasks. Since Spearman tends to overestimate the spectral similarity based on the overall pattern of the full spectrum, the clustering result shows the highest degree of improvement from having the nondiscriminative features removed. The loading plots of the first two principal components of the optimal feature subsets confirmed that the most important vibrational bands contributing to the clustering of NPS compounds were selected using non-negative discriminative feature selection (NDFS) algorithms.
format Online
Article
Text
id pubmed-8638022
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8638022 2021-12-03 Filter Feature Selection for Unsupervised Clustering of Designer Drugs Using DFT Simulated IR Spectra Data He, Kedan ACS Omega American Chemical Society 2021-11-16 /pmc/articles/PMC8638022/ /pubmed/34870036 http://dx.doi.org/10.1021/acsomega.1c04945 Text en © 2021 The Author. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Filter Feature Selection for Unsupervised Clustering of Designer Drugs Using DFT Simulated IR Spectra Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8638022/
https://www.ncbi.nlm.nih.gov/pubmed/34870036
http://dx.doi.org/10.1021/acsomega.1c04945
work_keys_str_mv AT hekedan filterfeatureselectionforunsupervisedclusteringofdesignerdrugsusingdftsimulatedirspectradata
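
The description in this record mentions building Pearson and Spearman similarity matrices from IR spectra and feeding them to hierarchical clustering. The record does not include the article's code, so the following is only a minimal sketch of that general idea under stated assumptions: the toy spectra are synthetic Gaussian bands (not DFT output), the helper names `similarity_matrix`, `cluster_spectra`, and `toy_spectrum` are hypothetical, and the choices of `1 - r` as the distance and average linkage are illustrative rather than taken from the article.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import rankdata


def similarity_matrix(spectra, method="pearson"):
    """Pairwise correlation between rows of `spectra` (one spectrum per row)."""
    data = np.asarray(spectra, dtype=float)
    if method == "spearman":
        # Spearman correlation equals the Pearson correlation of the ranks.
        data = np.apply_along_axis(rankdata, 1, data)
    elif method != "pearson":
        raise ValueError(f"unknown method: {method}")
    return np.corrcoef(data)


def cluster_spectra(spectra, n_clusters, method="pearson"):
    """Hierarchical clustering on a correlation-derived distance (assumed: 1 - r)."""
    dist = 1.0 - similarity_matrix(spectra, method)
    dist = (dist + dist.T) / 2.0        # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)         # remove floating-point residue
    dist = np.clip(dist, 0.0, None)
    # Average linkage is an illustrative choice, not the article's stated one.
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def toy_spectrum(grid, centers, width=30.0):
    """Hypothetical stand-in for a simulated spectrum: a sum of Gaussian bands."""
    return sum(np.exp(-(((grid - c) / width) ** 2)) for c in centers)


grid = np.linspace(400, 4000, 600)      # wavenumber axis in cm^-1 (assumed range)
rng = np.random.default_rng(0)
group_a = [toy_spectrum(grid, [1700, 2900]) for _ in range(3)]
group_b = [toy_spectrum(grid, [1100, 3300]) for _ in range(3)]
spectra = np.array(group_a + group_b) + rng.normal(0.0, 0.01, (6, grid.size))

pearson_sim = similarity_matrix(spectra, "pearson")
spearman_sim = similarity_matrix(spectra, "spearman")
pearson_labels = cluster_spectra(spectra, n_clusters=2, method="pearson")
```

Working from a precomputed similarity matrix keeps the clustering step independent of the similarity measure, so Pearson and Spearman can be swapped without touching the linkage code.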