Cargando…

Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis

BACKGROUND: Since the onset of the COVID-19 pandemic, there has been an unprecedented effort in genomic epidemiology to sequence the SARS-CoV-2 virus and examine its molecular evolution. This has been facilitated by the availability of publicly accessible databases, GISAID and GenBank, which collect...

Descripción completa

Detalles Bibliográficos
Autores principales: O’Connor, Karen, Weissenbacher, Davy, Elyaderani, Amir, Scotch, Matthew, Gonzalez-Hernandez, Graciela
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Cold Spring Harbor Laboratory 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10371180/
https://www.ncbi.nlm.nih.gov/pubmed/37503241
http://dx.doi.org/10.1101/2023.07.14.23292681
_version_ 1785078100056342528
author O’Connor, Karen
Weissenbacher, Davy
Elyaderani, Amir
Scotch, Matthew
Gonzalez-Hernandez, Graciela
author_facet O’Connor, Karen
Weissenbacher, Davy
Elyaderani, Amir
Scotch, Matthew
Gonzalez-Hernandez, Graciela
author_sort O’Connor, Karen
collection PubMed
description BACKGROUND: Since the onset of the COVID-19 pandemic, there has been an unprecedented effort in genomic epidemiology to sequence the SARS-CoV-2 virus and examine its molecular evolution. This has been facilitated by the availability of publicly accessible databases, GISAID and GenBank, which collectively hold millions of SARS-CoV-2 sequence records. However, genomic epidemiology seeks to go beyond phylogenetic analysis by linking genetic information to patient demographics and disease outcomes, enabling a comprehensive understanding of transmission dynamics and disease impact. While these repositories include some patient-related information, such as the location of the infected host, the granularity of this data and the inclusion of demographic and clinical details are inconsistent. Additionally, the extent to which patient-related metadata is reported in published sequencing studies remains largely unexplored. Therefore, it is essential to assess the extent and quality of patient-related metadata reported in SARS-CoV-2 sequencing studies. Moreover, there is limited linkage between published articles and sequence repositories, hindering the identification of relevant studies. Traditional search strategies based on keywords may miss relevant articles. To overcome these challenges, this study proposes the use of an automated classifier to identify relevant articles. OBJECTIVE: This study aims to conduct a systematic and comprehensive scoping review, along with a bibliometric analysis, to assess the reporting of patient-related metadata in SARS-CoV-2 sequencing studies. METHODS: The NIH’s LitCovid collection will be used for the machine learning classification, while an independent search will be conducted in PubMed. Data extraction will be conducted using Covidence, and the extracted data will be synthesized and summarized to quantify the availability of patient metadata in the published literature of SARS-CoV-2 sequencing studies. For the bibliometric analysis, relevant data points, such as author affiliations, journal information, and citation metrics, will be extracted. RESULTS: The study will report findings on the extent and types of patient-related metadata reported in genomic viral sequencing studies of SARS-CoV-2. The scoping review will identify gaps in the reporting of patient metadata and make recommendations for improving the quality and consistency of reporting in this area. The bibliometric analysis will uncover trends and patterns in the reporting of patient-related metadata, such as differences in reporting based on study types or geographic regions. Co-occurrence networks of author keywords will also be presented to highlight frequent themes and their associations with patient metadata reporting. CONCLUSION: This study will contribute to advancing knowledge in the field of genomic epidemiology by providing a comprehensive overview of the reporting of patient-related metadata in SARS-CoV-2 sequencing studies. The insights gained from this study may help improve the quality and consistency of reporting patient metadata, enhancing the utility of sequence metadata and facilitating future research on infectious diseases. The findings may also inform the development of machine learning methods to automatically extract patient-related information from sequencing studies.
format Online
Article
Text
id pubmed-10371180
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-103711802023-07-27 Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis O’Connor, Karen Weissenbacher, Davy Elyaderani, Amir Scotch, Matthew Gonzalez-Hernandez, Graciela medRxiv Article BACKGROUND: Since the onset of the COVID-19 pandemic, there has been an unprecedented effort in genomic epidemiology to sequence the SARS-CoV-2 virus and examine its molecular evolution. This has been facilitated by the availability of publicly accessible databases, GISAID and GenBank, which collectively hold millions of SARS-CoV-2 sequence records. However, genomic epidemiology seeks to go beyond phylogenetic analysis by linking genetic information to patient demographics and disease outcomes, enabling a comprehensive understanding of transmission dynamics and disease impact. While these repositories include some patient-related information, such as the location of the infected host, the granularity of this data and the inclusion of demographic and clinical details are inconsistent. Additionally, the extent to which patient-related metadata is reported in published sequencing studies remains largely unexplored. Therefore, it is essential to assess the extent and quality of patient-related metadata reported in SARS-CoV-2 sequencing studies. Moreover, there is limited linkage between published articles and sequence repositories, hindering the identification of relevant studies. Traditional search strategies based on keywords may miss relevant articles. To overcome these challenges, this study proposes the use of an automated classifier to identify relevant articles. OBJECTIVE: This study aims to conduct a systematic and comprehensive scoping review, along with a bibliometric analysis, to assess the reporting of patient-related metadata in SARS-CoV-2 sequencing studies. METHODS: The NIH’s LitCovid collection will be used for the machine learning classification, while an independent search will be conducted in PubMed. Data extraction will be conducted using Covidence, and the extracted data will be synthesized and summarized to quantify the availability of patient metadata in the published literature of SARS-CoV-2 sequencing studies. For the bibliometric analysis, relevant data points, such as author affiliations, journal information, and citation metrics, will be extracted. RESULTS: The study will report findings on the extent and types of patient-related metadata reported in genomic viral sequencing studies of SARS-CoV-2. The scoping review will identify gaps in the reporting of patient metadata and make recommendations for improving the quality and consistency of reporting in this area. The bibliometric analysis will uncover trends and patterns in the reporting of patient-related metadata, such as differences in reporting based on study types or geographic regions. Co-occurrence networks of author keywords will also be presented to highlight frequent themes and their associations with patient metadata reporting. CONCLUSION: This study will contribute to advancing knowledge in the field of genomic epidemiology by providing a comprehensive overview of the reporting of patient-related metadata in SARS-CoV-2 sequencing studies. The insights gained from this study may help improve the quality and consistency of reporting patient metadata, enhancing the utility of sequence metadata and facilitating future research on infectious diseases. The findings may also inform the development of machine learning methods to automatically extract patient-related information from sequencing studies. Cold Spring Harbor Laboratory 2023-07-16 /pmc/articles/PMC10371180/ /pubmed/37503241 http://dx.doi.org/10.1101/2023.07.14.23292681 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/) , which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
spellingShingle Article
O’Connor, Karen
Weissenbacher, Davy
Elyaderani, Amir
Scotch, Matthew
Gonzalez-Hernandez, Graciela
Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis
title Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis
title_full Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis
title_fullStr Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis
title_full_unstemmed Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis
title_short Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis
title_sort patient-related metadata reported in sequencing studies of sars-cov-2: protocol for a scoping review and bibliometric analysis
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10371180/
https://www.ncbi.nlm.nih.gov/pubmed/37503241
http://dx.doi.org/10.1101/2023.07.14.23292681
work_keys_str_mv AT oconnorkaren patientrelatedmetadatareportedinsequencingstudiesofsarscov2protocolforascopingreviewandbibliometricanalysis
AT weissenbacherdavy patientrelatedmetadatareportedinsequencingstudiesofsarscov2protocolforascopingreviewandbibliometricanalysis
AT elyaderaniamir patientrelatedmetadatareportedinsequencingstudiesofsarscov2protocolforascopingreviewandbibliometricanalysis
AT scotchmatthew patientrelatedmetadatareportedinsequencingstudiesofsarscov2protocolforascopingreviewandbibliometricanalysis
AT gonzalezhernandezgraciela patientrelatedmetadatareportedinsequencingstudiesofsarscov2protocolforascopingreviewandbibliometricanalysis