Feature analysis for classification of trace fluorescent labeled protein crystallization images

BACKGROUND: A large number of features are extracted from protein crystallization trial images to improve the accuracy of classifiers for predicting the presence of crystals or phases of the crystallization process. The excessive number of features and computationally intensive image processing method...

Bibliographic Details
Main authors: Sigdel, Madhav; Dinc, Imren; Sigdel, Madhu S.; Dinc, Semih; Pusey, Marc L.; Aygun, Ramazan S.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408444/
https://www.ncbi.nlm.nih.gov/pubmed/28465724
http://dx.doi.org/10.1186/s13040-017-0133-9
author Sigdel, Madhav
Dinc, Imren
Sigdel, Madhu S.
Dinc, Semih
Pusey, Marc L.
Aygun, Ramazan S.
author_sort Sigdel, Madhav
collection PubMed
description BACKGROUND: A large number of features are extracted from protein crystallization trial images to improve the accuracy of classifiers for predicting the presence of crystals or phases of the crystallization process. The excessive number of features and the computationally intensive image processing methods needed to extract them make automated classification tools inconvenient to use on stand-alone computing systems, because of the time required to complete the classification tasks. Combinations of image feature sets, feature reduction and classification techniques for crystallization images benefiting from trace fluorescence labeling are investigated. RESULTS: Features are categorized into intensity, graph, histogram, texture, shape adaptive, and region features (using binarized images generated by Otsu’s, green percentile, and morphological thresholding). The effects of normalization, feature reduction with principal component analysis (PCA), and feature selection using a random forest classifier are also analyzed. The time required to extract feature categories is computed and an estimated time of extraction is provided for feature category combinations. We have conducted around 8624 experiments (different combinations of feature categories, binarization methods, feature reduction/selection, normalization, and crystal categories). The best experimental results are obtained using combinations of intensity features, region features using Otsu’s thresholding, region features using green percentile G (90) thresholding, region features using green percentile G (99) thresholding, graph features, and histogram features. Using this feature set combination, 96% accuracy (without misclassifying crystals as non-crystals) was achieved for the first level of classification to determine the presence of crystals. Since missing a crystal is not desired, our algorithm is adjusted to achieve a high sensitivity rate. In the second-level classification, 74.2% accuracy was achieved for (5-class) crystal sub-category classification. The best classification rates were achieved using the random forest classifier. CONTRIBUTIONS: The feature extraction and classification could be completed in about 2 s per image on a stand-alone computing system, which is suitable for real-time analysis. These results enable research groups to select features according to their hardware setups for real-time analysis.
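The two global binarization methods named in the abstract (Otsu’s thresholding and green percentile thresholding G (90) / G (99)) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the methods in detail, so the sketch assumes 8-bit intensities, a nearest-rank percentile, and that a pixel is foreground when its value is strictly greater than the threshold. The function names `otsu_threshold`, `percentile_threshold`, and `green_percentile_mask` are hypothetical.

```python
import math
from typing import List, Sequence


def otsu_threshold(values: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes between-class variance.

    Class 0 holds intensities <= t, class 1 holds intensities > t.
    Assumes integer intensities in the range [0, levels).
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0 so far
    sum0 = 0    # sum of intensities in class 0 so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue    # class 0 is still empty
        if w1 == 0:
            break       # class 1 is empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def percentile_threshold(values: Sequence[int], p: float) -> int:
    """Nearest-rank p-th percentile of values (assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def green_percentile_mask(green: List[List[int]], p: float) -> List[List[bool]]:
    """Mark pixels whose green intensity exceeds the p-th percentile."""
    flat = [v for row in green for v in row]
    t = percentile_threshold(flat, p)
    return [[v > t for v in row] for row in green]


if __name__ == "__main__":
    # Toy green channel: a dark background with a few bright pixels.
    green = [[10, 12, 11, 10], [10, 200, 210, 11], [12, 10, 11, 10]]
    flat = [v for row in green for v in row]
    print("Otsu threshold:", otsu_threshold(flat))
    print("G(90) mask:", green_percentile_mask(green, 90))
```

Under these assumptions, a higher percentile such as G (99) keeps only the very brightest pixels, while Otsu’s method adapts the threshold to the intensity histogram of each image. Region features would then be computed on the resulting binary masks.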
format Online
Article
Text
id pubmed-5408444
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5408444 2017-05-02 Feature analysis for classification of trace fluorescent labeled protein crystallization images Sigdel, Madhav Dinc, Imren Sigdel, Madhu S. Dinc, Semih Pusey, Marc L. Aygun, Ramazan S. BioData Min Research BioMed Central 2017-04-27 /pmc/articles/PMC5408444/ /pubmed/28465724 http://dx.doi.org/10.1186/s13040-017-0133-9 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Feature analysis for classification of trace fluorescent labeled protein crystallization images
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408444/
https://www.ncbi.nlm.nih.gov/pubmed/28465724
http://dx.doi.org/10.1186/s13040-017-0133-9