CT-152: Application of Web-Scraping Techniques for Autonomous Massive Retrieval of Hematologic Patients' Information During SARS-CoV2 Pandemic

Bibliographic Details
Main authors: Melchor, Raul Azibeiro; Fonseca, Marta; Rey, Beatriz; Hernandez, Alberto; Puertas, Borja; Gomez, Sandra; Palomino, Danylo; Román, Luz Gema; Peña, Andres Felipe; Mateos, Maria Victoria
Format: Online Article Text
Language: English
Published: Elsevier Inc. 2020
Subjects: Submitted Abstracts
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7834180/
http://dx.doi.org/10.1016/S2152-2650(20)30778-3
author Melchor, Raul Azibeiro
Fonseca, Marta
Rey, Beatriz
Hernandez, Alberto
Puertas, Borja
Gomez, Sandra
Palomino, Danylo
Román, Luz Gema
Peña, Andres Felipe
Mateos, Maria Victoria
collection PubMed
description CONTEXT: Healthcare professionals generally regard collecting data on large numbers of patients as tedious and time-consuming. The current patient load makes collecting clinical data almost impossible, even though that information is needed more than ever.
OBJECTIVE: We wanted to deploy a system that automatically and autonomously retrieves clinical data from patients with SARS-CoV-2 infection arriving at hospital admission, so that the information can be collected for further analysis.
DESIGN: We designed a daemon in the PHP programming language, connected to a MySQL/MariaDB database, that continuously searches for new patients presenting to the hospital. We collected medical history, disease records, regular medication, physical examination findings, vital signs, blood chemistry and blood count, and microbiological testing for SARS-CoV-2 (both PCR and ELISA antibody testing). Because we did not have access to any API service (an out-of-the-box connection to the data mainframe), we used web scraping (brute-force data extraction from web pages over HTTP) applied to our hospital web interface.
SETTING: Monitoring took place between 1 March 2020 and 15 April 2020 (the worst phase of the coronavirus outbreak in the country), using a single computer connected to the hospital network. A total of 259 patients were identified, each with 344 clinical and testing variables.
RESULTS: Using this technique, we collected data on 259 hematologic patients without human intervention, and more than 300 variables have been analyzed. At present, certain aspects of the database (e.g., comorbidities) still require manual review, and some data must be entered manually because they are not properly coded. In the future, with the development of semantic-matching technologies, fully autonomous database building will be possible. In the meantime, our technique can capture a very large amount of clinical information with minimal manual effort. With that information, observational studies, and even a prognostic score based on machine learning, have been developed in our center.
CONCLUSIONS: Data collection for further analysis is usually a vital but time-consuming task in answering clinical questions. We developed a technique that helped our center retrieve patients' clinical information autonomously during the SARS-CoV-2 pandemic.
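The DESIGN section describes a polling daemon that scrapes a web interface and writes each variable to a database. The following is only a minimal sketch of that general pattern, not the authors' implementation: it is written in Python rather than PHP, uses an in-memory SQLite database as a stand-in for MySQL/MariaDB, and assumes a hypothetical page layout in which each variable appears as a `<th>label</th><td>value</td>` row. The names `parse_patient_page`, `store_patient`, `poll_once`, `observation`, and the sample page are all illustrative assumptions.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page layout; the real hospital interface is not described in the abstract.
SAMPLE_PAGE = (
    "<table>"
    "<tr><th>Temperature</th><td>38.2</td></tr>"
    "<tr><th>SARS-CoV-2 PCR</th><td>positive</td></tr>"
    "</table>"
)


class LabelValueParser(HTMLParser):
    """Collects <th>label</th><td>value</td> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # cell type currently open: "th", "td", or None
        self._label = None  # most recent label still waiting for its value

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "th":
            self._label = text
        elif self._tag == "td" and self._label is not None:
            self.fields[self._label] = text
            self._label = None


def parse_patient_page(html):
    """Return {variable: value} extracted from one scraped page."""
    parser = LabelValueParser()
    parser.feed(html)
    parser.close()
    return parser.fields


def open_store():
    """Create the stand-in database (SQLite here, not MySQL/MariaDB)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation ("
        "patient_id TEXT, variable TEXT, value TEXT, "
        "PRIMARY KEY (patient_id, variable))"
    )
    return conn


def store_patient(conn, patient_id, fields):
    """Insert a patient's variables; return how many rows were newly added."""
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO observation VALUES (?, ?, ?)",
        [(patient_id, name, value) for name, value in fields.items()],
    )
    conn.commit()
    return conn.total_changes - before


def poll_once(conn, fetch_page, patient_ids):
    """One daemon cycle: fetch, parse, and store each patient's page.

    fetch_page is any callable mapping a patient id to HTML text; in a real
    deployment it would issue an authenticated HTTP request.
    """
    new_rows = 0
    for patient_id in patient_ids:
        fields = parse_patient_page(fetch_page(patient_id))
        new_rows += store_patient(conn, patient_id, fields)
    return new_rows
```

A daemon would call `poll_once` in a loop with a delay between cycles. Because already stored (patient, variable) pairs are ignored on insert, repeated cycles add rows only for newly seen patients or variables.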
format Online
Article
Text
id pubmed-7834180
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier Inc.
record_format MEDLINE/PubMed
spelling pubmed-7834180 2021-01-26 CT-152: Application of Web-Scraping Techniques for Autonomous Massive Retrieval of Hematologic Patients' Information During SARS-CoV2 Pandemic Melchor, Raul Azibeiro; Fonseca, Marta; Rey, Beatriz; Hernandez, Alberto; Puertas, Borja; Gomez, Sandra; Palomino, Danylo; Román, Luz Gema; Peña, Andres Felipe; Mateos, Maria Victoria Clin Lymphoma Myeloma Leuk Submitted Abstracts Elsevier Inc. 2020-09 2020-12-09 /pmc/articles/PMC7834180/ http://dx.doi.org/10.1016/S2152-2650(20)30778-3 Text en Copyright © 2020 Elsevier Inc. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title CT-152: Application of Web-Scraping Techniques for Autonomous Massive Retrieval of Hematologic Patients' Information During SARS-CoV2 Pandemic
topic Submitted Abstracts
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7834180/
http://dx.doi.org/10.1016/S2152-2650(20)30778-3