New improved Aggregator: predicting which clinical trial articles derive from the same registered clinical trial

Bibliographic Details
Main Authors: Smalheiser, Neil R, Holt, Arthur W
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660960/
https://www.ncbi.nlm.nih.gov/pubmed/33215068
http://dx.doi.org/10.1093/jamiaopen/ooaa042
Description
Summary: OBJECTIVES: To identify separate publications that report outcomes from the same underlying clinical trial, in order to avoid over-counting these as independent pieces of evidence. MATERIALS AND METHODS: We updated our previous model by creating larger, more recent, and more diverse positive and negative training sets consisting of article pairs that were (or were not) linked to the same ClinicalTrials.gov trial registry number. Features were extracted from PubMed metadata; pairwise similarity scores were modeled using logistic regression and used to form clusters of articles that are likely to arise from the same registered clinical trial. RESULTS: Articles from the same trial were identified with high accuracy (F1 = 0.859), nominally better than the previous model (F1 = 0.843). Predicted clusters showed a low splitting error rate of 8–11% (i.e., cases in which 2 articles belonged to the same trial but were assigned to different clusters). Performance was similar whether only randomized controlled trial articles or a more diverse set of clinical trial articles were processed. DISCUSSION: Metadata are surprisingly accurate in predicting when 2 articles derive from the same underlying clinical trial. CONCLUSION: We have continued confidence in the Aggregator tool, which can be accessed publicly at http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/RCT_Tagger.cgi.
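
The pipeline described in the abstract (metadata features for each article pair, a logistic model that turns them into a similarity score, then clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the three features, the weights and bias, the 0.5 threshold, the toy articles, and the use of connected components for clustering are all assumptions chosen for illustration, since the abstract does not specify them. The `splitting_rate` function follows the abstract's definition of a splitting error (two articles from the same trial placed in different clusters).

```python
import math
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two collections of tokens."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pair_features(x, y):
    """Hypothetical metadata features for one article pair."""
    return [
        jaccard(x["authors"], y["authors"]),
        jaccard(x["title"].lower().split(), y["title"].lower().split()),
        1.0 if x["journal"] == y["journal"] else 0.0,
    ]


# Assumed, hand-picked coefficients; a real model would learn these
# from positive and negative training pairs.
WEIGHTS = [4.0, 3.0, 0.5]
BIAS = -2.5


def pair_probability(x, y):
    """Logistic score: estimated probability that x and y share a trial."""
    z = BIAS + sum(w * f for w, f in zip(WEIGHTS, pair_features(x, y)))
    return 1.0 / (1.0 + math.exp(-z))


def cluster(articles, threshold=0.5):
    """Group articles by linking every pair scored at or above threshold
    (connected components via union-find; an assumed clustering rule)."""
    parent = {a["pmid"]: a["pmid"] for a in articles}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for x, y in combinations(articles, 2):
        if pair_probability(x, y) >= threshold:
            parent[find(x["pmid"])] = find(y["pmid"])

    groups = {}
    for a in articles:
        groups.setdefault(find(a["pmid"]), []).append(a["pmid"])
    return sorted(sorted(g) for g in groups.values())


def splitting_rate(clusters, true_trial):
    """Fraction of same-trial article pairs assigned to different clusters."""
    cluster_of = {pmid: i for i, c in enumerate(clusters) for pmid in c}
    same_trial = split = 0
    for p, q in combinations(sorted(cluster_of), 2):
        if true_trial[p] == true_trial[q]:
            same_trial += 1
            if cluster_of[p] != cluster_of[q]:
                split += 1
    return split / same_trial if same_trial else 0.0


# Toy records with made-up identifiers (not real PMIDs or registry numbers).
articles = [
    {"pmid": 1, "authors": ["Smith J", "Lee K", "Chen W"],
     "title": "Drug X versus placebo in hypertension",
     "journal": "J Hypertens"},
    {"pmid": 2, "authors": ["Smith J", "Lee K", "Patel R"],
     "title": "Long-term follow-up of drug X versus placebo in hypertension",
     "journal": "Hypertension"},
    {"pmid": 3, "authors": ["Garcia M"],
     "title": "Exercise therapy for low back pain",
     "journal": "Spine"},
]
true_trial = {1: "trial-A", 2: "trial-A", 3: "trial-B"}

clusters = cluster(articles)
print(clusters)  # [[1, 2], [3]]
print(splitting_rate(clusters, true_trial))  # 0.0
```

In this toy run, articles 1 and 2 share two of four distinct authors and most title words, so their score clears the threshold and they are clustered together; article 3 shares nothing with either and stays alone. Raising the threshold would trade fewer false merges for more splitting errors.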