Deep learning predicts gene expression as an intermediate data modality to identify susceptibility patterns in Mycobacterium tuberculosis infected Diversity Outbred mice
BACKGROUND: Machine learning sustains successful application to many diagnostic and prognostic problems in computational histopathology. Yet, few efforts have been made to model gene expression from histopathology. This study proposes a methodology which predicts selected gene expression values (mic...
Main authors: | Tavolara, Thomas E.; Niazi, M.K.K.; Gower, Adam C.; Ginese, Melanie; Beamer, Gillian; Gurcan, Metin N. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2021 |
Subjects: | Research Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8138606/ https://www.ncbi.nlm.nih.gov/pubmed/34000621 http://dx.doi.org/10.1016/j.ebiom.2021.103388 |
_version_ | 1783695844555358208 |
---|---|
author | Tavolara, Thomas E.; Niazi, M.K.K.; Gower, Adam C.; Ginese, Melanie; Beamer, Gillian; Gurcan, Metin N. |
author_sort | Tavolara, Thomas E. |
collection | PubMed |
description | BACKGROUND: Machine learning sustains successful application to many diagnostic and prognostic problems in computational histopathology. Yet, few efforts have been made to model gene expression from histopathology. This study proposes a methodology which predicts selected gene expression values (microarray) from haematoxylin and eosin whole-slide images as an intermediate data modality to identify fulminant-like pulmonary tuberculosis ('supersusceptible') in an experimentally infected cohort of Diversity Outbred mice (n=77). METHODS: Gradient-boosted trees were utilized as a novel feature selector to identify gene transcripts predictive of fulminant-like pulmonary tuberculosis. A novel attention-based multiple instance learning model for regression was used to predict selected genes' expression from whole-slide images. Gene expression predictions were shown to be sufficiently replicated to identify supersusceptible mice using gradient-boosted trees trained on ground truth gene expression data. FINDINGS: The model was accurate, showing high positive correlations with ground truth gene expression on both cross-validation (n = 77, 0.63 ≤ ρ ≤ 0.84) and external testing sets (n = 33, 0.65 ≤ ρ ≤ 0.84). The sensitivity and specificity for gene expression predictions to identify supersusceptible mice (n=77) were 0.88 and 0.95, respectively, and for an external set of mice (n=33) 0.88 and 0.93, respectively. IMPLICATIONS: Our methodology maps histopathology to gene expression with sufficient accuracy to predict a clinical outcome. The proposed methodology exemplifies a computational template for gene expression panels, in which relatively inexpensive and widely available tissue histopathology may be mapped to specific genes' expression to serve as a diagnostic or prognostic tool. FUNDING: National Institutes of Health and American Lung Association. |
format | Online Article Text |
id | pubmed-8138606 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-8138606 2021-05-24 EBioMedicine Research Paper Elsevier 2021-05-14 /pmc/articles/PMC8138606/ /pubmed/34000621 http://dx.doi.org/10.1016/j.ebiom.2021.103388 Text en © 2021 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
title | Deep learning predicts gene expression as an intermediate data modality to identify susceptibility patterns in Mycobacterium tuberculosis infected Diversity Outbred mice |
topic | Research Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8138606/ https://www.ncbi.nlm.nih.gov/pubmed/34000621 http://dx.doi.org/10.1016/j.ebiom.2021.103388 |
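
The METHODS summary in the description field mentions an attention-based multiple instance learning model for regression over whole-slide images. The record gives no architectural details, so the following is only a minimal sketch of the general attention-pooling idea, not the authors' model: each slide is treated as a bag of tile feature vectors, a small scoring function assigns each tile an attention weight, and a linear head maps the weighted average to one predicted expression value. The function name `attention_mil_regress` and the parameters `V`, `w`, and `head` are hypothetical, and the toy numbers are illustrative rather than data from the study.

```python
import math


def attention_mil_regress(tiles, V, w, head):
    """Attention-pooled regression over one bag of tile features.

    tiles: list of n feature vectors (each of length d), one per image tile
    V:     h x d matrix (list of rows) for the hidden attention layer
    w:     length-h vector turning the hidden layer into one score per tile
    head:  length-d vector mapping the pooled bag feature to a scalar
    All parameters are illustrative assumptions, not values from the paper.
    """
    # Unnormalised attention score per tile: w . tanh(V h_k)
    scores = []
    for h_k in tiles:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h_k))) for row in V]
        scores.append(sum(w_i * z_i for w_i, z_i in zip(w, hidden)))

    # Softmax over tiles (shifted by the max score for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Bag representation: attention-weighted average of the tile features
    d = len(tiles[0])
    bag = [sum(a * h_k[j] for a, h_k in zip(attn, tiles)) for j in range(d)]

    # Linear regression head produces the predicted expression value
    pred = sum(b * c for b, c in zip(bag, head))
    return pred, attn


# Toy bag of three two-dimensional tile features (made-up numbers)
tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, -1.0]]
head = [3.0, 0.0]

uniform_pred, uniform_attn = attention_mil_regress(tiles, V, [0.0], head)
focused_pred, focused_attn = attention_mil_regress(tiles, V, [2.0], head)
```

With `w` set to zero every tile receives the same weight, so the pooled feature is a plain average; a non-zero `w` shifts weight toward tiles whose hidden activation is larger. In a trained model these parameters would be learned from slide-level gene expression labels rather than set by hand.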