Protein subcellular location pattern classification in cellular images using latent discriminative models

Bibliographic Details
Main authors: Li, Jieyue; Xiong, Liang; Schneider, Jeff; Murphy, Robert F.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371862/
https://www.ncbi.nlm.nih.gov/pubmed/22689776
http://dx.doi.org/10.1093/bioinformatics/bts230
author Li, Jieyue
Xiong, Liang
Schneider, Jeff
Murphy, Robert F.
collection PubMed
description Motivation: Knowledge of the subcellular location of a protein is crucial for understanding its functions. The subcellular pattern of a protein is typically represented as the set of cellular components in which it is located, and an important task is to determine this set from microscope images. In this article, we address this classification problem using confocal immunofluorescence images from the Human Protein Atlas (HPA) project. The HPA contains images of cells stained for many proteins; each is also stained for three reference components, but there are many other components that are invisible. Given one such cell, the task is to classify the pattern type of the stained protein. We first randomly select local image regions within the cells, and then extract various carefully designed features from these regions. This region-based approach enables us to explicitly study the relationship between proteins and different cell components, as well as the interactions between these components. To achieve these two goals, we propose two discriminative models that extend logistic regression with structured latent variables. The first model allows the same protein pattern class to be expressed differently according to the underlying components in different regions. The second model further captures the spatial dependencies between the components within the same cell so that we can better infer these components. To learn these models, we propose a fast approximate algorithm for inference, and then use gradient-based methods to maximize the data likelihood.
Results: In the experiments, we show that the proposed models help improve the classification accuracies on synthetic data and real cellular images. The best overall accuracy we report in this article for classifying 942 proteins into 13 classes of patterns is about 84.6%, which to our knowledge is the best so far. In addition, the dependencies learned are consistent with prior knowledge of cell organization.
Availability: http://murphylab.web.cmu.edu/software/
Contact: Jeff.Schneider@cs.cmu.edu, murphy@cmu.edu
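The idea of extending logistic regression with a latent variable per image region can be illustrated with a small sketch. This is not the authors' model or code: it is a generic mixture-style construction in which each region's class posterior is marginalized over a hidden component, p(y | x) = sum_z p(z | x) p(y | x, z). The weight names `U` and `W`, the feature dimension, the number of components, and the rule that combines regions by summing log posteriors (which treats regions as independent) are all assumptions made for illustration. The spatial dependencies between components described for the second model are not represented here.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def region_class_posterior(x, U, W):
    """Return p(y | x) = sum_z p(z | x) * p(y | x, z) for one region.

    U[z] is a hypothetical weight vector scoring latent component z.
    W[z][y] is a hypothetical weight vector scoring class y given z.
    """
    p_z = softmax([dot(u, x) for u in U])
    n_classes = len(W[0])
    p_y = [0.0] * n_classes
    for z, pz in enumerate(p_z):
        p_y_given_z = softmax([dot(w, x) for w in W[z]])
        for y in range(n_classes):
            p_y[y] += pz * p_y_given_z[y]
    return p_y


def cell_class_posterior(regions, U, W):
    """Combine regions of one cell by summing per-region log posteriors.

    Assumption: regions are treated as independent, a simplification
    that ignores dependencies between components in the same cell.
    """
    n_classes = len(W[0])
    log_scores = [0.0] * n_classes
    for x in regions:
        p_y = region_class_posterior(x, U, W)
        for y in range(n_classes):
            log_scores[y] += math.log(p_y[y])
    return softmax(log_scores)


# Toy data: random weights and random region features (purely illustrative).
rng = random.Random(0)
n_features, n_components, n_classes = 4, 3, 13
U = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_components)]
W = [[[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_classes)]
     for _ in range(n_components)]
regions = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(5)]

posterior = cell_class_posterior(regions, U, W)
predicted = max(range(n_classes), key=lambda y: posterior[y])
print("predicted class index:", predicted)
```

In a trained model the weights would be fitted by maximizing the data likelihood with a gradient-based method, as the abstract describes; the random weights above only exercise the forward computation.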
format Online
Article
Text
id pubmed-3371862
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3371862 2012-06-11 Protein subcellular location pattern classification in cellular images using latent discriminative models. Li, Jieyue; Xiong, Liang; Schneider, Jeff; Murphy, Robert F. Bioinformatics. ISMB 2012 Proceedings Papers Committee, July 15 to July 19, 2012, Long Beach, CA, USA. Oxford University Press 2012-06-15 2012-06-09 /pmc/articles/PMC3371862/ /pubmed/22689776 http://dx.doi.org/10.1093/bioinformatics/bts230 Text en © The Author(s) 2012. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Protein subcellular location pattern classification in cellular images using latent discriminative models
topic ISMB 2012 Proceedings Papers Committee, July 15 to July 19, 2012, Long Beach, CA, USA