Effective and efficient active learning for deep learning-based tissue image analysis

MOTIVATION: Deep learning attained excellent results in digital pathology recently. A challenge with its use is that high quality, representative training datasets are required to build robust models. Data annotation in the domain is labor intensive and demands substantial time commitment from exper...

Bibliographic Details
Main authors: Meirelles, André L S; Kurc, Tahsin; Kong, Jun; Ferreira, Renato; Saltz, Joel; Teodoro, George
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10079352/
https://www.ncbi.nlm.nih.gov/pubmed/36943380
http://dx.doi.org/10.1093/bioinformatics/btad138
author Meirelles, André L S
Kurc, Tahsin
Kong, Jun
Ferreira, Renato
Saltz, Joel
Teodoro, George
author_facet Meirelles, André L S
Kurc, Tahsin
Kong, Jun
Ferreira, Renato
Saltz, Joel
Teodoro, George
author_sort Meirelles, André L S
collection PubMed
description MOTIVATION: Deep learning attained excellent results in digital pathology recently. A challenge with its use is that high quality, representative training datasets are required to build robust models. Data annotation in the domain is labor intensive and demands substantial time commitment from expert pathologists. Active learning (AL) is a strategy to minimize annotation. The goal is to select samples from the pool of unlabeled data for annotation that improves model accuracy. However, AL is a very compute demanding approach. The benefits for model learning may vary according to the strategy used, and it may be hard for a domain specialist to fine tune the solution without an integrated interface. RESULTS: We developed a framework that includes a friendly user interface along with run-time optimizations to reduce annotation and execution time in AL in digital pathology. Our solution implements several AL strategies along with our diversity-aware data acquisition (DADA) acquisition function, which enforces data diversity to improve the prediction performance of a model. In this work, we employed a model simplification strategy [Network Auto-Reduction (NAR)] that significantly improves AL execution time when coupled with DADA. NAR produces less compute demanding models, which replace the target models during the AL process to reduce processing demands. An evaluation with a tumor-infiltrating lymphocytes classification application shows that: (i) DADA attains superior performance compared to state-of-the-art AL strategies for different convolutional neural networks (CNNs), (ii) NAR improves the AL execution time by up to 4.3×, and (iii) target models trained with patches/data selected by the NAR reduced versions achieve similar or superior classification quality to using target CNNs for data selection. AVAILABILITY AND IMPLEMENTATION: Source code: https://github.com/alsmeirelles/DADA.
format Online
Article
Text
id pubmed-10079352
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-100793522023-04-07 Effective and efficient active learning for deep learning-based tissue image analysis Meirelles, André L S Kurc, Tahsin Kong, Jun Ferreira, Renato Saltz, Joel Teodoro, George Bioinformatics Original Paper MOTIVATION: Deep learning attained excellent results in digital pathology recently. A challenge with its use is that high quality, representative training datasets are required to build robust models. Data annotation in the domain is labor intensive and demands substantial time commitment from expert pathologists. Active learning (AL) is a strategy to minimize annotation. The goal is to select samples from the pool of unlabeled data for annotation that improves model accuracy. However, AL is a very compute demanding approach. The benefits for model learning may vary according to the strategy used, and it may be hard for a domain specialist to fine tune the solution without an integrated interface. RESULTS: We developed a framework that includes a friendly user interface along with run-time optimizations to reduce annotation and execution time in AL in digital pathology. Our solution implements several AL strategies along with our diversity-aware data acquisition (DADA) acquisition function, which enforces data diversity to improve the prediction performance of a model. In this work, we employed a model simplification strategy [Network Auto-Reduction (NAR)] that significantly improves AL execution time when coupled with DADA. NAR produces less compute demanding models, which replace the target models during the AL process to reduce processing demands. 
An evaluation with a tumor-infiltrating lymphocytes classification application shows that: (i) DADA attains superior performance compared to state-of-the-art AL strategies for different convolutional neural networks (CNNs), (ii) NAR improves the AL execution time by up to 4.3×, and (iii) target models trained with patches/data selected by the NAR reduced versions achieve similar or superior classification quality to using target CNNs for data selection. AVAILABILITY AND IMPLEMENTATION: Source code: https://github.com/alsmeirelles/DADA. Oxford University Press 2023-03-21 /pmc/articles/PMC10079352/ /pubmed/36943380 http://dx.doi.org/10.1093/bioinformatics/btad138 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Paper
Meirelles, André L S
Kurc, Tahsin
Kong, Jun
Ferreira, Renato
Saltz, Joel
Teodoro, George
Effective and efficient active learning for deep learning-based tissue image analysis
title Effective and efficient active learning for deep learning-based tissue image analysis
title_full Effective and efficient active learning for deep learning-based tissue image analysis
title_fullStr Effective and efficient active learning for deep learning-based tissue image analysis
title_full_unstemmed Effective and efficient active learning for deep learning-based tissue image analysis
title_short Effective and efficient active learning for deep learning-based tissue image analysis
title_sort effective and efficient active learning for deep learning-based tissue image analysis
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10079352/
https://www.ncbi.nlm.nih.gov/pubmed/36943380
http://dx.doi.org/10.1093/bioinformatics/btad138