
Biological data annotation via a human-augmenting AI-based labeling system

Bibliographic Details
Main authors: van der Wal, Douwe, Jhun, Iny, Laklouk, Israa, Nirschl, Jeff, Richer, Lara, Rojansky, Rebecca, Theparee, Talent, Wheeler, Joshua, Sander, Jörg, Feng, Felix, Mohamad, Osama, Savarese, Silvio, Socher, Richard, Esteva, Andre
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8497580/
https://www.ncbi.nlm.nih.gov/pubmed/34620993
http://dx.doi.org/10.1038/s41746-021-00520-6
author van der Wal, Douwe
Jhun, Iny
Laklouk, Israa
Nirschl, Jeff
Richer, Lara
Rojansky, Rebecca
Theparee, Talent
Wheeler, Joshua
Sander, Jörg
Feng, Felix
Mohamad, Osama
Savarese, Silvio
Socher, Richard
Esteva, Andre
collection PubMed
description Biology has become a prime area for the deployment of deep learning and artificial intelligence (AI), enabled largely by the massive data sets that the field can generate. Key to most AI tasks is the availability of a sufficiently large, labeled data set with which to train AI models. In the context of microscopy, it is easy to generate image data sets containing millions of cells and structures. However, it is challenging to obtain large-scale high-quality annotations for AI models. Here, we present HALS (Human-Augmenting Labeling System), a human-in-the-loop data labeling AI, which begins uninitialized and learns annotations from a human, in real-time. Using a multi-part AI composed of three deep learning models, HALS learns from just a few examples and immediately decreases the workload of the annotator, while increasing the quality of their annotations. Using a highly repetitive use-case—annotating cell types—and running experiments with seven pathologists—experts at the microscopic analysis of biological specimens—we demonstrate a manual work reduction of 90.60%, and an average data-quality boost of 4.34%, measured across four use-cases and two tissue stain types.
format Online
Article
Text
id pubmed-8497580
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8497580 2021-10-08 Biological data annotation via a human-augmenting AI-based labeling system NPJ Digit Med Article
Nature Publishing Group UK 2021-10-07 /pmc/articles/PMC8497580/ /pubmed/34620993 http://dx.doi.org/10.1038/s41746-021-00520-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Biological data annotation via a human-augmenting AI-based labeling system
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8497580/
https://www.ncbi.nlm.nih.gov/pubmed/34620993
http://dx.doi.org/10.1038/s41746-021-00520-6