Feature analysis for classification of trace fluorescent labeled protein crystallization images
BACKGROUND: A large number of features are extracted from protein crystallization trial images to improve the accuracy of classifiers for predicting the presence of crystals or phases of the crystallization process. The excessive number of features and computationally intensive image processing method...
Main authors: | Sigdel, Madhav; Dinc, Imren; Sigdel, Madhu S.; Dinc, Semih; Pusey, Marc L.; Aygun, Ramazan S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408444/ https://www.ncbi.nlm.nih.gov/pubmed/28465724 http://dx.doi.org/10.1186/s13040-017-0133-9 |
_version_ | 1783232309232664576 |
---|---|
author | Sigdel, Madhav Dinc, Imren Sigdel, Madhu S. Dinc, Semih Pusey, Marc L. Aygun, Ramazan S. |
author_facet | Sigdel, Madhav Dinc, Imren Sigdel, Madhu S. Dinc, Semih Pusey, Marc L. Aygun, Ramazan S. |
author_sort | Sigdel, Madhav |
collection | PubMed |
description | BACKGROUND: A large number of features are extracted from protein crystallization trial images to improve the accuracy of classifiers for predicting the presence of crystals or phases of the crystallization process. The excessive number of features and the computationally intensive image processing methods needed to extract them make automated classification tools inconvenient to use on stand-alone computing systems because of the time required to complete the classification tasks. Combinations of image feature sets, feature reduction, and classification techniques for crystallization images benefiting from trace fluorescence labeling are investigated. RESULTS: Features are categorized into intensity, graph, histogram, texture, shape adaptive, and region features (using binarized images generated by Otsu’s, green percentile, and morphological thresholding). The effects of normalization, feature reduction with principal component analysis (PCA), and feature selection using a random forest classifier are also analyzed. The time required to extract each feature category is computed, and an estimated extraction time is provided for feature category combinations. We conducted around 8624 experiments (different combinations of feature categories, binarization methods, feature reduction/selection, normalization, and crystal categories). The best experimental results are obtained using combinations of intensity features, region features using Otsu’s thresholding, region features using green percentile G (90) thresholding, region features using green percentile G (99) thresholding, graph features, and histogram features. Using this feature set combination, 96% accuracy (without misclassifying crystals as non-crystals) was achieved for the first level of classification, which determines the presence of crystals. Since missing a crystal is not desired, our algorithm is adjusted to achieve a high sensitivity rate. In the second level of classification, 74.2% accuracy was achieved for (5-class) crystal sub-category classification. The best classification rates were achieved using a random forest classifier. CONTRIBUTIONS: The feature extraction and classification could be completed in about 2 s per image on a stand-alone computing system, which is suitable for real-time analysis. These results enable research groups to select features according to their hardware setups for real-time analysis. |
format | Online Article Text |
id | pubmed-5408444 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5408444 2017-05-02 Feature analysis for classification of trace fluorescent labeled protein crystallization images Sigdel, Madhav Dinc, Imren Sigdel, Madhu S. Dinc, Semih Pusey, Marc L. Aygun, Ramazan S. BioData Min Research BioMed Central 2017-04-27 /pmc/articles/PMC5408444/ /pubmed/28465724 http://dx.doi.org/10.1186/s13040-017-0133-9 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Sigdel, Madhav Dinc, Imren Sigdel, Madhu S. Dinc, Semih Pusey, Marc L. Aygun, Ramazan S. Feature analysis for classification of trace fluorescent labeled protein crystallization images |
title | Feature analysis for classification of trace fluorescent labeled protein crystallization images |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408444/ https://www.ncbi.nlm.nih.gov/pubmed/28465724 http://dx.doi.org/10.1186/s13040-017-0133-9 |
work_keys_str_mv | AT sigdelmadhav featureanalysisforclassificationoftracefluorescentlabeledproteincrystallizationimages AT dincimren featureanalysisforclassificationoftracefluorescentlabeledproteincrystallizationimages AT sigdelmadhus featureanalysisforclassificationoftracefluorescentlabeledproteincrystallizationimages AT dincsemih featureanalysisforclassificationoftracefluorescentlabeledproteincrystallizationimages AT puseymarcl featureanalysisforclassificationoftracefluorescentlabeledproteincrystallizationimages AT aygunramazans featureanalysisforclassificationoftracefluorescentlabeledproteincrystallizationimages |
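
The abstract names Otsu’s thresholding and green percentile thresholding (G (90), G (99)) as ways to binarize images before region features are extracted. The following is a minimal sketch of those two ideas, not the authors' implementation. It assumes 8-bit intensities supplied as a flat list, and it assumes that G (p) means "use the p-th percentile of the green channel as the threshold", computed here with the nearest-rank method; this record does not define G (p), so that reading is a guess. The function names `otsu_threshold`, `green_percentile_threshold`, and `binarize` are hypothetical.

```python
import math
from typing import List, Sequence


def otsu_threshold(values: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the low class; pixels > t form the high class.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # sum of intensities in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # low class still empty
        w1 = total - w0
        if w1 == 0:
            break     # high class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def green_percentile_threshold(green: Sequence[int], p: float) -> int:
    """Assumed reading of G(p): nearest-rank p-th percentile of green values."""
    ordered = sorted(green)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def binarize(values: Sequence[int], threshold: int) -> List[int]:
    """Mark pixels strictly above the threshold as foreground (1)."""
    return [1 if v > threshold else 0 for v in values]
```

Under this reading, a higher percentile keeps fewer and brighter pixels as foreground, so G (99) would give a much sparser mask than G (90). That may explain why the abstract lists region features from both thresholds separately.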