
Tens of images can suffice to train neural networks for malignant leukocyte detection

Convolutional neural networks (CNNs) excel as powerful tools for biomedical image classification. It is commonly assumed that training CNNs requires large amounts of annotated data. This is a bottleneck in many medical applications where annotation relies on expert knowledge. Here, we analyze the bi...


Bibliographic Details
Main authors:	Schouten, Jens P. E., Matek, Christian, Jacobs, Luuk F. P., Buck, Michèle C., Bošnački, Dragan, Marr, Carsten
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8042012/ https://www.ncbi.nlm.nih.gov/pubmed/33846442 http://dx.doi.org/10.1038/s41598-021-86995-5

_version_	1783678043403845632
author	Schouten, Jens P. E. Matek, Christian Jacobs, Luuk F. P. Buck, Michèle C. Bošnački, Dragan Marr, Carsten
author_facet	Schouten, Jens P. E. Matek, Christian Jacobs, Luuk F. P. Buck, Michèle C. Bošnački, Dragan Marr, Carsten
author_sort	Schouten, Jens P. E.
collection	PubMed
description	Convolutional neural networks (CNNs) excel as powerful tools for biomedical image classification. It is commonly assumed that training CNNs requires large amounts of annotated data. This is a bottleneck in many medical applications where annotation relies on expert knowledge. Here, we analyze the binary classification performance of a CNN on two independent cytomorphology datasets as a function of training set size. Specifically, we train a sequential model to discriminate non-malignant leukocytes from blast cells, whose appearance in the peripheral blood is a hallmark of leukemia. We systematically vary training set size, finding that tens of training images suffice for a binary classification with an ROC-AUC over 90%. Saliency maps and layer-wise relevance propagation visualizations suggest that the network learns to increasingly focus on nuclear structures of leukocytes as the number of training images is increased. A low-dimensional t-SNE representation reveals that while the two classes are already separated with only a few training images, the distinction between the classes becomes clearer when more training images are used. To evaluate the performance in a multi-class problem, we annotated single-cell images from an acute lymphoblastic leukemia dataset into six different hematopoietic classes. Multi-class prediction suggests that here, too, few single-cell images suffice if differences between morphological classes are large enough. The incorporation of deep learning algorithms into clinical practice has the potential to reduce variability and cost, democratize usage of expertise, and allow for early detection of disease onset and relapse. Our approach evaluates the performance of a deep-learning-based cytology classifier with respect to size and complexity of the training data and the classification task.
format	Online Article Text
id	pubmed-8042012
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-8042012 2021-04-14 Tens of images can suffice to train neural networks for malignant leukocyte detection Schouten, Jens P. E. Matek, Christian Jacobs, Luuk F. P. Buck, Michèle C. Bošnački, Dragan Marr, Carsten Sci Rep Article Nature Publishing Group UK 2021-04-12 /pmc/articles/PMC8042012/ /pubmed/33846442 http://dx.doi.org/10.1038/s41598-021-86995-5 Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title	Tens of images can suffice to train neural networks for malignant leukocyte detection
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8042012/ https://www.ncbi.nlm.nih.gov/pubmed/33846442 http://dx.doi.org/10.1038/s41598-021-86995-5
