Automatic image analysis for gene expression patterns of fly embryos

BACKGROUND: Staining the mRNA of a gene via in situ hybridization (ISH) during the development of a D. melanogaster embryo delivers the detailed spatio-temporal pattern of expression of the gene. Many biological problems such as the detection of co-expressed genes, co-regulated genes, and transcript...

Bibliographic Details
Main authors: Peng, Hanchuan; Long, Fuhui; Zhou, Jie; Leung, Garmay; Eisen, Michael B; Myers, Eugene W
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1924512/
https://www.ncbi.nlm.nih.gov/pubmed/17634097
http://dx.doi.org/10.1186/1471-2121-8-S1-S7
_version_ 1782134210762375168
author Peng, Hanchuan
Long, Fuhui
Zhou, Jie
Leung, Garmay
Eisen, Michael B
Myers, Eugene W
collection PubMed
description BACKGROUND: Staining the mRNA of a gene via in situ hybridization (ISH) during the development of a D. melanogaster embryo delivers the detailed spatio-temporal pattern of expression of the gene. Many biological problems, such as the detection of co-expressed genes, co-regulated genes, and transcription factor binding motifs, rely heavily on the analysis of these image patterns. The increasing availability of ISH image data motivates the development of automated computational approaches to the analysis of gene expression patterns. RESULTS: We have developed algorithms and associated software that extract a feature representation of a gene expression pattern from an ISH image, cluster genes sharing the same spatio-temporal pattern of expression, suggest transcription factor binding (TFB) site motifs for genes that appear to be co-regulated (based on the clustering), and automatically identify the anatomical regions that express a gene given a training set of annotations. We developed three different feature representations, based on Gaussian Mixture Models (GMM), Principal Component Analysis (PCA), and wavelet functions, each having different merits with respect to the tasks above. For clustering image patterns, we developed a minimum spanning tree method (MSTCUT), and for proposing TFB sites we used standard motif finders on clustered/co-expressed genes with the added requirement of conservation across the genomes of 8 related fly species. Lastly, we trained a suite of binary classifiers, one for each anatomical annotation term in a controlled vocabulary or ontology, each operating on the wavelet feature representation. We report the results of applying these methods to the Berkeley Drosophila Genome Project (BDGP) gene expression database.
CONCLUSION: Our automatic image analysis methods recapitulate known co-regulated genes and give correct developmental-stage classifications with 99+% accuracy despite variations in morphology, orientation, and focal plane, suggesting that these techniques form a set of useful tools for the large-scale computational analysis of fly embryonic gene expression patterns.
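The description names a minimum spanning tree clustering method (MSTCUT) but does not give its details. The sketch below is only a generic illustration of the MST idea, not the authors' algorithm: it builds a Euclidean minimum spanning tree over feature vectors and removes the k-1 longest edges, so that the remaining connected components become the clusters. The function names `mst_edges` and `mst_cut_clusters`, the cut rule, and the toy feature vectors are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[float, int, int]  # (length, node_a, node_b)


def mst_edges(points: Sequence[Sequence[float]]) -> List[Edge]:
    """Return the edges of a Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known distance from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges: List[Edge] = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Update the cheapest connection for every node still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_cut_clusters(points: Sequence[Sequence[float]], k: int) -> List[int]:
    """Assumed cut rule: drop the k-1 longest MST edges, label the components."""
    n = len(points)
    edges = sorted(mst_edges(points))
    kept = edges[: max(0, len(edges) - (k - 1))]

    root = list(range(n))  # union-find forest over the kept edges

    def find(x: int) -> int:
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, a, b in kept:
        root[find(a)] = find(b)

    # Number the components in order of first appearance.
    labels: Dict[int, int] = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors forming two well-separated groups.
    features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(mst_cut_clusters(features, k=2))
```

In practice the input vectors would be one of the feature representations mentioned in the description (GMM, PCA, or wavelet based); the quadratic-time Prim loop here is chosen for readability rather than scale.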
format Text
id pubmed-1924512
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1924512 2007-07-18 Automatic image analysis for gene expression patterns of fly embryos Peng, Hanchuan Long, Fuhui Zhou, Jie Leung, Garmay Eisen, Michael B Myers, Eugene W BMC Cell Biol Research BioMed Central 2007-07-10 /pmc/articles/PMC1924512/ /pubmed/17634097 http://dx.doi.org/10.1186/1471-2121-8-S1-S7 Text en Copyright © 2007 Peng et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automatic image analysis for gene expression patterns of fly embryos
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1924512/
https://www.ncbi.nlm.nih.gov/pubmed/17634097
http://dx.doi.org/10.1186/1471-2121-8-S1-S7