
Deep Learning With Sampling in Colon Cancer Histology

This study applied a deep-learning cell identification algorithm to diagnostic images from the colon cancer repository at The Cancer Genome Atlas (TCGA). Within-image sampling improved performance with very little loss of accuracy. The features thus derived were associated with various clinical variables inc...


Bibliographic Details
Main Authors: Shapcott, Mary; Hewitt, Katherine J.; Rajpoot, Nasir
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6445856/
https://www.ncbi.nlm.nih.gov/pubmed/30972333
http://dx.doi.org/10.3389/fbioe.2019.00052
author Shapcott, Mary
Hewitt, Katherine J.
Rajpoot, Nasir
collection PubMed
description This study applied a deep-learning cell identification algorithm to diagnostic images from the colon cancer repository at The Cancer Genome Atlas (TCGA). Within-image sampling improved performance with very little loss of accuracy. The features thus derived were associated with various clinical variables including metastasis, residual tumor, venous invasion, and lymphatic invasion. The deep-learning algorithm was trained using images from a locally available data set, then applied to the TCGA images by tiling them and identifying cells in each patch defined by the tiling. In this application the average number of patches containing tissue in an image was ~900. Processing a random sample of patches greatly reduced computation costs. The cell identification algorithm was applied directly to each sampled patch, resulting in a list of cells. Each cell was labeled with its location and classification (“epithelial,” “inflammatory,” “fibroblast,” or “other”). The number of cells of each type in the patch was calculated, resulting in a patch profile containing four features. A morphological profile for the entire image was obtained by averaging the profiles of the sampled patches. Two sampling policies were examined. The first policy was random sampling, which samples patches with uniform weighting. The second policy was systematic random sampling, which takes spatial dependencies into account. Compared with the processing of complete whole-slide images, there was a seven-fold improvement in performance when systematic random spatial sampling was used to select 100 tiles from the whole-slide image for processing, with very little loss of accuracy (~4% on average). We found links between the predicted features and clinical variables in the TCGA colon cancer data set. Several significant associations were found: increased fibroblast numbers were associated with the presence of metastasis, venous invasion, lymphatic invasion, and residual tumor, while decreased numbers of inflammatory cells were associated with mucinous carcinomas. For the four cell types, deep learning has generated morphological features that are indicators of cell density. The features are related to cellularity: the number, degree, or quality of cells present in a tumor. Cellularity has been reported to be related to patient survival and other diagnostic and prognostic indicators, indicating that the features calculated here may be of general usefulness.
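The sampling and profiling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of patches as (row, col) grid positions, and the one-dimensional systematic scheme (random start, fixed stride over row-major order) are all assumptions made for illustration, and the deep-learning cell classifier is replaced by plain lists of cell-type labels.

```python
import random
from collections import Counter

# The four cell classes named in the abstract.
CELL_TYPES = ("epithelial", "inflammatory", "fibroblast", "other")


def uniform_sample(patches, n, rng):
    """Random sampling: every tissue patch is equally likely to be chosen."""
    return rng.sample(patches, min(n, len(patches)))


def systematic_random_sample(patches, n, rng):
    """Systematic random sampling (assumed 1-D variant).

    Patches are sorted into row-major order, a random start is drawn inside
    the first stride, and every following pick is one stride further on, so
    the sample is spread evenly across the tissue area.
    """
    ordered = sorted(patches)
    if n >= len(ordered):
        return ordered
    step = len(ordered) / n
    start = rng.random() * step
    last = len(ordered) - 1
    return [ordered[min(int(start + i * step), last)] for i in range(n)]


def patch_profile(cell_labels):
    """Count the cells of each type in one patch (four features)."""
    counts = Counter(cell_labels)
    return [counts.get(cell_type, 0) for cell_type in CELL_TYPES]


def image_profile(profiles):
    """Average the patch profiles to get one profile for the whole image."""
    n = len(profiles)
    return [sum(p[i] for p in profiles) / n for i in range(len(CELL_TYPES))]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 30 x 30 grid, i.e. ~900 tissue patches as in the abstract.
    grid = [(row, col) for row in range(30) for col in range(30)]
    sampled = systematic_random_sample(grid, 100, rng)
    # Stand-in for the classifier output: random labels, 50 cells per patch.
    profiles = [patch_profile(rng.choices(CELL_TYPES, k=50)) for _ in sampled]
    print(image_profile(profiles))
```

With 100 of roughly 900 patches processed, only about one ninth of the classifier calls are made, which is the source of the reduced computation cost the abstract reports.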
format Online
Article
Text
id pubmed-6445856
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6445856 2019-04-10 Deep Learning With Sampling in Colon Cancer Histology Shapcott, Mary Hewitt, Katherine J. Rajpoot, Nasir Front Bioeng Biotechnol Bioengineering and Biotechnology Frontiers Media S.A. 2019-03-27 /pmc/articles/PMC6445856/ /pubmed/30972333 http://dx.doi.org/10.3389/fbioe.2019.00052 Text en Copyright © 2019 Shapcott, Hewitt and Rajpoot. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Deep Learning With Sampling in Colon Cancer Histology
topic Bioengineering and Biotechnology