Use of automatic SQL generation interface to enhance transparency and validity of health-data analysis

Bibliographic Details
Main authors: Wagholikar, Kavishwar B., Zelle, David, Ainsworth, Layne, Chaney, Kira, Blood, Alexander J., Miller, Angela, Chulyadyo, Rupendra, Oates, Michael, Gordon, William J., Aronson, Samuel J., Scirica, Benjamin M., Murphy, Shawn N.
Format: Online Article Text
Language: English
Published: 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9306316/
https://www.ncbi.nlm.nih.gov/pubmed/35874460
http://dx.doi.org/10.1016/j.imu.2022.100996
Description
Summary: Analysis of health data typically requires a data analyst to develop queries in structured query language (SQL). Because the SQL queries are created manually, they are prone to errors. In addition, accurate implementation of the queries depends on effective communication with clinical experts, which makes the analysis even more error-prone. As a potential resolution, we explore an alternative approach in which the analysis is performed through a graphical interface that automatically generates the SQL queries. This interface allows clinical experts to perform complex queries on the data directly, despite their unfamiliarity with SQL syntax. It also provides an intuitive understanding of the query logic, which makes the analysis comprehensible to the clinical study staff and thereby enhances its transparency and validity. This study demonstrates the feasibility of using a user-friendly interface that automatically generates SQL for analysis of health data. It outlines challenges that will be useful for designing user-friendly tools to improve the transparency and reproducibility of data analysis.