Deep Learning With Sampling in Colon Cancer Histology
This study applied a deep-learning cell identification algorithm to diagnostic images from the colon cancer repository at The Cancer Genome Atlas (TCGA). Within-image sampling improved performance without loss of accuracy. The features thus derived were associated with various clinical variables inc...
Main authors: | Shapcott, Mary; Hewitt, Katherine J.; Rajpoot, Nasir |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Bioengineering and Biotechnology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6445856/ https://www.ncbi.nlm.nih.gov/pubmed/30972333 http://dx.doi.org/10.3389/fbioe.2019.00052 |
_version_ | 1783408255922339840 |
---|---|
author | Shapcott, Mary; Hewitt, Katherine J.; Rajpoot, Nasir |
author_facet | Shapcott, Mary; Hewitt, Katherine J.; Rajpoot, Nasir |
author_sort | Shapcott, Mary |
collection | PubMed |
description | This study applied a deep-learning cell identification algorithm to diagnostic images from the colon cancer repository at The Cancer Genome Atlas (TCGA). Within-image sampling improved performance without loss of accuracy. The features thus derived were associated with various clinical variables including metastasis, residual tumor, venous invasion, and lymphatic invasion. The deep-learning algorithm was trained using images from a locally available data set, then applied to the TCGA images by tiling them, and identifying cells in each patch defined by the tiling. In this application the average number of patches containing tissue in an image was ~900. Processing a random sample of patches greatly reduced computation costs. The cell identification algorithm was applied directly to each sampled patch, resulting in a list of cells. Each cell was labeled with its location and classification (“epithelial,” “inflammatory,” “fibroblast,” or “other”). The number of cells of a given type in the patch was calculated, resulting in a patch profile containing four features. A morphological profile that applied to the entire image was obtained by averaging profiles over all patches. Two sampling policies were examined. The first policy was random sampling which samples patches with uniform weighting. The second policy was systematic random sampling which takes spatial dependencies into account. Compared with the processing of complete whole slide images there was a seven-fold improvement in performance when systematic random spatial sampling was used to select 100 tiles from the whole-slide image for processing, with very little loss of accuracy (~4% on average). We found links between the predicted features and clinical variables in the TCGA colon cancer data set. Several significant associations were found: increased fibroblast numbers were associated with the presence of metastasis, venous invasion, lymphatic invasion and residual tumor while decreased numbers of inflammatory cells were associated with mucinous carcinomas. Regarding the four different types of cell, deep learning has generated morphological features that are indicators of cell density. The features are related to cellularity, the numbers, degree, or quality of cells present in a tumor. Cellularity has been reported to be related to patient survival and other diagnostic and prognostic indicators, indicating that the features calculated here may be of general usefulness. |
format | Online Article Text |
id | pubmed-6445856 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6445856 2019-04-10 Deep Learning With Sampling in Colon Cancer Histology Shapcott, Mary Hewitt, Katherine J. Rajpoot, Nasir Front Bioeng Biotechnol Bioengineering and Biotechnology Frontiers Media S.A. 2019-03-27 /pmc/articles/PMC6445856/ /pubmed/30972333 http://dx.doi.org/10.3389/fbioe.2019.00052 Text en Copyright © 2019 Shapcott, Hewitt and Rajpoot. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Bioengineering and Biotechnology Shapcott, Mary Hewitt, Katherine J. Rajpoot, Nasir Deep Learning With Sampling in Colon Cancer Histology |
title | Deep Learning With Sampling in Colon Cancer Histology |
title_full | Deep Learning With Sampling in Colon Cancer Histology |
title_fullStr | Deep Learning With Sampling in Colon Cancer Histology |
title_full_unstemmed | Deep Learning With Sampling in Colon Cancer Histology |
title_short | Deep Learning With Sampling in Colon Cancer Histology |
title_sort | deep learning with sampling in colon cancer histology |
topic | Bioengineering and Biotechnology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6445856/ https://www.ncbi.nlm.nih.gov/pubmed/30972333 http://dx.doi.org/10.3389/fbioe.2019.00052 |
work_keys_str_mv | AT shapcottmary deeplearningwithsamplingincoloncancerhistology AT hewittkatherinej deeplearningwithsamplingincoloncancerhistology AT rajpootnasir deeplearningwithsamplingincoloncancerhistology |
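
The sampling and profiling steps summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(x, y, label)` cell tuples, and the treatment of systematic random sampling as a random start followed by a fixed step through patches listed in spatial (raster) order are all assumptions made for illustration. The record does not say how the study handled spatial dependencies in two dimensions, so this one-dimensional version is only a simplification.

```python
import random
from collections import Counter

# The four cell classes named in the description.
CELL_TYPES = ("epithelial", "inflammatory", "fibroblast", "other")


def patch_profile(cells):
    """Count cells of each type in one patch.

    `cells` is a hypothetical list of (x, y, label) tuples, standing in
    for the output of a cell identification step.
    """
    counts = Counter(label for _, _, label in cells)
    return [counts.get(cell_type, 0) for cell_type in CELL_TYPES]


def image_profile(profiles):
    """Average the patch profiles feature by feature."""
    n = len(profiles)
    return [sum(p[i] for p in profiles) / n for i in range(len(CELL_TYPES))]


def simple_random_sample(patches, k, rng):
    """Uniform random sampling of k patches without replacement."""
    return rng.sample(patches, k)


def systematic_random_sample(patches, k, rng):
    """Systematic random sampling: random start, then a fixed step.

    Assumes `patches` is already ordered spatially (for example in
    raster order), so evenly spaced picks are spread across the slide.
    """
    if k >= len(patches):
        return list(patches)
    step = len(patches) / k
    start = rng.random() * step
    last = len(patches) - 1
    return [patches[min(int(start + i * step), last)] for i in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    tissue_patches = list(range(900))  # stand-in for ~900 tissue patches
    chosen = systematic_random_sample(tissue_patches, 100, rng)
    print(len(chosen), chosen[:5])
```

With roughly 900 tissue patches per image and 100 sampled, the step is about 9, so one patch is taken from each consecutive run of nine. The image-level profile is then the average of the sampled patch profiles rather than of all patches.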