
Development and Application of a Standardized Testset for an Artificial Intelligence Medical Device Intended for the Computer-Aided Diagnosis of Diabetic Retinopathy

OBJECTIVE: To explore a centralized approach to build test sets and assess the performance of an artificial intelligence medical device (AIMD) which is intended for computer-aided diagnosis of diabetic retinopathy (DR). METHOD: A framework was proposed to conduct data collection, data curation, and...


Bibliographic Details
Main authors: Wang, Hao, Meng, Xiangfeng, Tang, Qiaohong, Hao, Ye, Luo, Yan, Li, Jiage
Format: Online Article Text
Language: English
Published: Hindawi 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9931476/
https://www.ncbi.nlm.nih.gov/pubmed/36818382
http://dx.doi.org/10.1155/2023/7139560
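The full abstract of this record reports a Cohen's kappa between raw labels and final labels and compares algorithms by accuracy, sensitivity, and specificity. As a reading aid, the following is a minimal sketch of how such figures are conventionally computed. It is not the authors' code: the function names `cohen_kappa` and `binary_metrics`, the two-class simplification, and the toy labels are all assumptions made for illustration, and the toy data is not taken from the article.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both sources.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def binary_metrics(reference, predicted, positive):
    """Accuracy, sensitivity, and specificity against a reference standard.

    Treats `positive` as the positive class and every other label as negative
    (an illustrative simplification of a multi-class grading task).
    """
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy labels, invented for illustration only (not data from the article).
raw_labels = ["DR", "DR", "nonDR", "nonDR", "DR", "nonDR", "nonDR", "DR"]
final_labels = ["DR", "nonDR", "nonDR", "nonDR", "DR", "DR", "nonDR", "DR"]

kappa = cohen_kappa(raw_labels, final_labels)
metrics = binary_metrics(final_labels, raw_labels, positive="DR")
print(kappa, metrics)
```

Here the final labels play the role of the reference standard and the raw labels stand in for an algorithm's output. A multi-class evaluation like the one the abstract describes would typically repeat the per-class computation for each label.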
_version_ 1784889258640670720
author Wang, Hao
Meng, Xiangfeng
Tang, Qiaohong
Hao, Ye
Luo, Yan
Li, Jiage
author_sort Wang, Hao
collection PubMed
description OBJECTIVE: To explore a centralized approach to building test sets and assessing the performance of an artificial intelligence medical device (AIMD) intended for computer-aided diagnosis of diabetic retinopathy (DR). METHOD: A framework was proposed for data collection, data curation, and annotation. Deidentified colour fundus photographs with raw labels were collected from 11 partner hospitals. Photographs with sensitive information or authenticity issues were excluded during vetting. A team of annotators was recruited through qualification examinations and trained. The annotation process comprised three steps: initial annotation, review, and arbitration. The annotated data formed a standardized test set, which was then supplied to algorithms under test (AUTs) from different developers. The algorithm outputs were compared with the final annotation results (the reference standard). RESULTS: The test set consists of 6327 digital colour fundus photographs. The final labels include 5 stages of DR and non-DR, as well as other ocular diseases and photographs of unacceptable quality. Fleiss' kappa among the annotators was 0.75. Cohen's kappa between raw labels and final labels was 0.5. Using this test set, five AUTs were tested and compared quantitatively. The metrics included accuracy, sensitivity, and specificity. The AUTs differed in their ability to classify different types of fundus photographs. CONCLUSIONS: This article demonstrated a workflow for building standardized test sets and conducting algorithm testing of AIMDs for computer-aided diagnosis of diabetic retinopathy. It may provide a reference for developing technical standards that promote product verification and quality control, improving the comparability of products.
format Online
Article
Text
id pubmed-9931476
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9931476 2023-02-16 Development and Application of a Standardized Testset for an Artificial Intelligence Medical Device Intended for the Computer-Aided Diagnosis of Diabetic Retinopathy Wang, Hao Meng, Xiangfeng Tang, Qiaohong Hao, Ye Luo, Yan Li, Jiage J Healthc Eng Research Article Hindawi 2023-02-08 /pmc/articles/PMC9931476/ /pubmed/36818382 http://dx.doi.org/10.1155/2023/7139560 Text en Copyright © 2023 Hao Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Development and Application of a Standardized Testset for an Artificial Intelligence Medical Device Intended for the Computer-Aided Diagnosis of Diabetic Retinopathy
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9931476/
https://www.ncbi.nlm.nih.gov/pubmed/36818382
http://dx.doi.org/10.1155/2023/7139560