Development and Application of a Standardized Testset for an Artificial Intelligence Medical Device Intended for the Computer-Aided Diagnosis of Diabetic Retinopathy
OBJECTIVE: To explore a centralized approach to build test sets and assess the performance of an artificial intelligence medical device (AIMD) which is intended for computer-aided diagnosis of diabetic retinopathy (DR). METHOD: A framework was proposed to conduct data collection, data curation, and...
Main authors: | Wang, Hao; Meng, Xiangfeng; Tang, Qiaohong; Hao, Ye; Luo, Yan; Li, Jiage |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2023 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9931476/ https://www.ncbi.nlm.nih.gov/pubmed/36818382 http://dx.doi.org/10.1155/2023/7139560 |
_version_ | 1784889258640670720 |
---|---|
author | Wang, Hao Meng, Xiangfeng Tang, Qiaohong Hao, Ye Luo, Yan Li, Jiage |
author_facet | Wang, Hao Meng, Xiangfeng Tang, Qiaohong Hao, Ye Luo, Yan Li, Jiage |
author_sort | Wang, Hao |
collection | PubMed |
description | OBJECTIVE: To explore a centralized approach to building test sets and assessing the performance of an artificial intelligence medical device (AIMD) intended for computer-aided diagnosis of diabetic retinopathy (DR). METHODS: A framework was proposed for data collection, data curation, and annotation. Deidentified colour fundus photographs with raw labels were collected from 11 partner hospitals. Photographs with sensitive information or authenticity issues were excluded during vetting. A team of annotators was recruited through qualification examinations and then trained. The annotation process comprised three steps: initial annotation, review, and arbitration. The annotated data formed a standardized test set, which was then supplied as input to algorithms under test (AUTs) from different developers. The algorithm outputs were compared with the final annotation results (the reference standard). RESULTS: The test set consists of 6327 digital colour fundus photographs. The final labels cover 5 stages of DR and non-DR, as well as other ocular diseases and photographs of unacceptable quality. Fleiss' kappa among the annotators was 0.75. Cohen's kappa between the raw labels and the final labels was 0.5. Using this test set, five AUTs were tested and compared quantitatively on accuracy, sensitivity, and specificity. The AUTs differed in their ability to classify different types of fundus photographs. CONCLUSIONS: This article demonstrates a workflow for building standardized test sets and conducting algorithm testing of AIMDs for computer-aided diagnosis of DR. It may serve as a reference for developing technical standards that promote product verification and quality control and improve the comparability of products. |
format | Online Article Text |
id | pubmed-9931476 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9931476 2023-02-16 Development and Application of a Standardized Testset for an Artificial Intelligence Medical Device Intended for the Computer-Aided Diagnosis of Diabetic Retinopathy Wang, Hao Meng, Xiangfeng Tang, Qiaohong Hao, Ye Luo, Yan Li, Jiage J Healthc Eng Research Article Hindawi 2023-02-08 /pmc/articles/PMC9931476/ /pubmed/36818382 http://dx.doi.org/10.1155/2023/7139560 Text en Copyright © 2023 Hao Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Wang, Hao Meng, Xiangfeng Tang, Qiaohong Hao, Ye Luo, Yan Li, Jiage Development and Application of a Standardized Testset for an Artificial Intelligence Medical Device Intended for the Computer-Aided Diagnosis of Diabetic Retinopathy |
title | Development and Application of a Standardized Testset for an Artificial Intelligence Medical Device Intended for the Computer-Aided Diagnosis of Diabetic Retinopathy |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9931476/ https://www.ncbi.nlm.nih.gov/pubmed/36818382 http://dx.doi.org/10.1155/2023/7139560 |
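The description reports Cohen's kappa between raw and final labels, and accuracy, sensitivity, and specificity for each algorithm under test. The following is a minimal Python sketch of how such figures are conventionally computed from two label lists. It is not the authors' code: the function names `cohen_kappa` and `binary_metrics`, the one-vs-rest treatment of a "positive" class, and the toy labels are all assumptions made for illustration, and the toy values are unrelated to the results in the record.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both sources.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two marginal frequencies, summed over classes.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources used a single identical class throughout
    return (observed - expected) / (1.0 - expected)


def binary_metrics(reference, predicted, positive):
    """Accuracy, sensitivity, specificity, treating `positive` as one-vs-rest."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(reference),
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy labels, invented for illustration only.
reference = ["DR", "DR", "DR", "non-DR", "non-DR", "non-DR"]
predicted = ["DR", "DR", "non-DR", "non-DR", "non-DR", "DR"]
kappa = cohen_kappa(reference, predicted)
metrics = binary_metrics(reference, predicted, positive="DR")
print(round(kappa, 3), {k: round(v, 3) for k, v in metrics.items()})
```

For a multi-class label set such as the one described (DR stages, non-DR, other diseases, unacceptable quality), `binary_metrics` could be called once per class to obtain per-class sensitivity and specificity. Fleiss' kappa, which handles more than two annotators, is not covered by this sketch.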