CT-152: Application of Web-Scraping Techniques for Autonomous Massive Retrieval of Hematologic Patients' Information During SARS-CoV2 Pandemic
CONTEXT: Healthcare professionals generally regard data collection involving a large number of patients as a tedious and time-consuming task. The current patient load makes collecting clinical data almost impossible, even though that information is needed more than ever. OBJECTIVE: We wanted to deploy a...
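Main Authors: | Melchor, Raul Azibeiro; Fonseca, Marta; Rey, Beatriz; Hernandez, Alberto; Puertas, Borja; Gomez, Sandra; Palomino, Danylo; Román, Luz Gema; Peña, Andres Felipe; Mateos, Maria Victoria |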
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Inc. 2020 |
Subjects: | Submitted Abstracts |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7834180/ http://dx.doi.org/10.1016/S2152-2650(20)30778-3 |
_version_ | 1783642223165833216 |
---|---|
author | Melchor, Raul Azibeiro; Fonseca, Marta; Rey, Beatriz; Hernandez, Alberto; Puertas, Borja; Gomez, Sandra; Palomino, Danylo; Román, Luz Gema; Peña, Andres Felipe; Mateos, Maria Victoria |
author_facet | Melchor, Raul Azibeiro; Fonseca, Marta; Rey, Beatriz; Hernandez, Alberto; Puertas, Borja; Gomez, Sandra; Palomino, Danylo; Román, Luz Gema; Peña, Andres Felipe; Mateos, Maria Victoria |
author_sort | Melchor, Raul Azibeiro |
collection | PubMed |
description | CONTEXT: Healthcare professionals generally regard data collection involving a large number of patients as a tedious and time-consuming task. The current patient load makes collecting clinical data almost impossible, even though that information is needed more than ever. OBJECTIVE: We wanted to deploy a system that automatically and autonomously retrieves clinical data from our patients with SARS-CoV-2 infection at hospital admission, so that the information can be collected for further analysis. DESIGN: We designed a daemon in the PHP programming language, connected to a MySQL/MariaDB database, that continuously searches for new patients presenting to the hospital. We collected medical history, disease records, regular medication, physical examination findings, vital signs, blood chemistry and blood count, and microbiology testing for SARS-CoV-2 (both PCR and ELISA antibody testing). As we do not have access to any API service (an out-of-the-box connection to the data mainframe), we used web scraping (brute-force extraction of data from web pages over the HTTP protocol) applied to our hospital web interface. SETTING: Monitoring took place between 1 March 2020 and 15 April 2020 (during the worst phase of the coronavirus outbreak in the country), using a single computer connected to the hospital network. The number of patients identified was 259, each with 344 clinical and testing variables. RESULTS: Using this technique, we collected data on 259 hematologic patients without human intervention, and more than 300 variables have been analyzed. Currently, manual review of certain aspects of the database (e.g., comorbidities) is still needed, and some data must be entered manually because of the lack of proper coding. In the future, with the development of semantic-matching technologies, fully autonomous building of such databases will be possible. In the meantime, our technique allows an enormous amount of clinical information to be captured with little effort. With that information, observational studies, and even a prognostic score based on machine learning, have been developed in our center. CONCLUSIONS: Data collection for further analysis is a vital but time-consuming task when answering clinical questions. We developed a technique that helped our center retrieve patients' clinical information autonomously during the SARS-CoV-2 pandemic. |
format | Online Article Text |
id | pubmed-7834180 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Elsevier Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7834180 2021-01-26 CT-152: Application of Web-Scraping Techniques for Autonomous Massive Retrieval of Hematologic Patients' Information During SARS-CoV2 Pandemic. Melchor, Raul Azibeiro; Fonseca, Marta; Rey, Beatriz; Hernandez, Alberto; Puertas, Borja; Gomez, Sandra; Palomino, Danylo; Román, Luz Gema; Peña, Andres Felipe; Mateos, Maria Victoria. Clin Lymphoma Myeloma Leuk. Submitted Abstracts. Elsevier Inc. 2020-09 2020-12-09 /pmc/articles/PMC7834180/ http://dx.doi.org/10.1016/S2152-2650(20)30778-3 Text en Copyright © 2020 Elsevier Inc. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre, including this research content, immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database, with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
title | CT-152: Application of Web-Scraping Techniques for Autonomous Massive Retrieval of Hematologic Patients' Information During SARS-CoV2 Pandemic |
topic | Submitted Abstracts |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7834180/ http://dx.doi.org/10.1016/S2152-2650(20)30778-3 |
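The DESIGN paragraph of the abstract describes a daemon that repeatedly looks for new patients, scrapes their clinical variables from the hospital web interface, and stores them in a database. The sketch below illustrates that general pattern only; it is not the authors' implementation. It is written in Python rather than PHP, uses an in-process SQLite database as a stand-in for MySQL/MariaDB, and assumes a hypothetical page layout in which each variable is a two-cell table row (label, value). All function, class, and table names (`LabValueParser`, `parse_variables`, `init_db`, `poll_once`, `patients`, `variables`) are invented for illustration.

```python
import sqlite3
from html.parser import HTMLParser


class LabValueParser(HTMLParser):
    """Collect (label, value) pairs from two-cell rows: <tr><td>label</td><td>value</td></tr>.

    The two-cell layout is an assumption; a real hospital page will differ.
    """

    def __init__(self):
        super().__init__()
        self.rows = []        # completed (label, value) pairs
        self._cells = None    # cells of the row being read, None outside a row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag == "td" and self._cells is not None:
            self._cells.append("")
            self._in_cell = True

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            if len(self._cells) == 2:
                label, value = (cell.strip() for cell in self._cells)
                self.rows.append((label, value))
            self._cells = None
            self._in_cell = False


def parse_variables(html):
    """Return a dict mapping variable name to value for one patient page."""
    parser = LabValueParser()
    parser.feed(html)
    parser.close()
    return dict(parser.rows)


def init_db(conn):
    """Create the two hypothetical tables used by this sketch."""
    conn.execute("CREATE TABLE IF NOT EXISTS patients (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS variables (patient_id TEXT, name TEXT, value TEXT)"
    )


def poll_once(conn, list_patient_ids, fetch_page):
    """Scrape and store every patient not yet in the database.

    list_patient_ids and fetch_page are caller-supplied callables (for example,
    thin wrappers around HTTP requests), so no network access happens here.
    Returns the number of newly stored patients.
    """
    stored = 0
    for pid in list_patient_ids():
        seen = conn.execute("SELECT 1 FROM patients WHERE id = ?", (pid,)).fetchone()
        if seen:
            continue  # already collected on an earlier pass
        variables = parse_variables(fetch_page(pid))
        conn.execute("INSERT INTO patients (id) VALUES (?)", (pid,))
        conn.executemany(
            "INSERT INTO variables (patient_id, name, value) VALUES (?, ?, ?)",
            [(pid, name, value) for name, value in variables.items()],
        )
        conn.commit()
        stored += 1
    return stored
```

A daemon in this style would call `poll_once` in a loop with a pause between passes (for example `time.sleep`), passing callables that fetch the patient list and each patient page over HTTP. Checking the `patients` table before fetching keeps repeated passes from re-downloading pages that were already collected.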