
Key concepts, common pitfalls, and best practices in artificial intelligence and machine learning: focus on radiomics


Bibliographic Details
Main author: Koçak, Burak
Format: Online Article Text
Language: English
Published: Turkish Society of Radiology 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9682557/
https://www.ncbi.nlm.nih.gov/pubmed/36218149
http://dx.doi.org/10.5152/dir.2022.211297
Description
Summary: Artificial intelligence (AI) and machine learning (ML) are increasingly used in radiology research to deal with large and complex imaging data sets. ML tools are now easily accessible to anyone. This low barrier to entry can lead to unintentional misuse and misinterpretation. Ensuring methodological rigor is therefore of paramount importance. As AI moves closer to real-world clinical implementation, a basic understanding of the main concepts should be a must for every radiology professional. In this respect, simplified explanations of the key concepts, along with pitfalls and recommendations, would help the general radiology community develop and improve its AI mindset. In this work, 22 key issues are reviewed within 3 categories: pre-modeling, modeling, and post-modeling. For each issue, the concept is first briefly defined; related common pitfalls and best practices are then provided. Specifically, the issues included in this article are validity of the scientific question, unrepresentative samples, sample size, missing data, quality of reference standard, batch effect, reliability of features, feature scaling, multi-collinearity, class imbalance, data and target leakage, high-dimensional data, optimization, overfitting, generalization, performance metrics, clinical utility, comparison with conventional statistical and clinical methods, interpretability and explainability, randomness, transparent reporting, and sharing data.