Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis

Bibliographic Details
Main authors: O’Connor, Karen; Weissenbacher, Davy; Elyaderani, Amir; Scotch, Matthew; Gonzalez-Hernandez, Graciela
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10371180/
https://www.ncbi.nlm.nih.gov/pubmed/37503241
http://dx.doi.org/10.1101/2023.07.14.23292681
Description
Summary:
BACKGROUND: Since the onset of the COVID-19 pandemic, there has been an unprecedented effort in genomic epidemiology to sequence the SARS-CoV-2 virus and examine its molecular evolution. This has been facilitated by the availability of publicly accessible databases, GISAID and GenBank, which collectively hold millions of SARS-CoV-2 sequence records. However, genomic epidemiology seeks to go beyond phylogenetic analysis by linking genetic information to patient demographics and disease outcomes, enabling a comprehensive understanding of transmission dynamics and disease impact. While these repositories include some patient-related information, such as the location of the infected host, the granularity of this data and the inclusion of demographic and clinical details are inconsistent. Additionally, the extent to which patient-related metadata is reported in published sequencing studies remains largely unexplored. Therefore, it is essential to assess the extent and quality of patient-related metadata reported in SARS-CoV-2 sequencing studies. Moreover, there is limited linkage between published articles and sequence repositories, hindering the identification of relevant studies. Traditional search strategies based on keywords may miss relevant articles. To overcome these challenges, this study proposes the use of an automated classifier to identify relevant articles.

OBJECTIVE: This study aims to conduct a systematic and comprehensive scoping review, along with a bibliometric analysis, to assess the reporting of patient-related metadata in SARS-CoV-2 sequencing studies.

METHODS: The NIH’s LitCovid collection will be used for the machine learning classification, while an independent search will be conducted in PubMed. Data extraction will be conducted using Covidence, and the extracted data will be synthesized and summarized to quantify the availability of patient metadata in the published literature of SARS-CoV-2 sequencing studies. For the bibliometric analysis, relevant data points, such as author affiliations, journal information, and citation metrics, will be extracted.

RESULTS: The study will report findings on the extent and types of patient-related metadata reported in genomic viral sequencing studies of SARS-CoV-2. The scoping review will identify gaps in the reporting of patient metadata and make recommendations for improving the quality and consistency of reporting in this area. The bibliometric analysis will uncover trends and patterns in the reporting of patient-related metadata, such as differences in reporting based on study types or geographic regions. Co-occurrence networks of author keywords will also be presented to highlight frequent themes and their associations with patient metadata reporting.

CONCLUSION: This study will contribute to advancing knowledge in the field of genomic epidemiology by providing a comprehensive overview of the reporting of patient-related metadata in SARS-CoV-2 sequencing studies. The insights gained from this study may help improve the quality and consistency of reporting patient metadata, enhancing the utility of sequence metadata and facilitating future research on infectious diseases. The findings may also inform the development of machine learning methods to automatically extract patient-related information from sequencing studies.