New improved Aggregator: predicting which clinical trial articles derive from the same registered clinical trial
OBJECTIVES: To identify separate publications that report outcomes from the same underlying clinical trial, in order to avoid over-counting these as independent pieces of evidence. MATERIALS AND METHODS: We updated our previous model by creating larger, more recent, and more diverse positive and neg...
Main Authors: | Smalheiser, Neil R, Holt, Arthur W |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Application Notes |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660960/ https://www.ncbi.nlm.nih.gov/pubmed/33215068 http://dx.doi.org/10.1093/jamiaopen/ooaa042 |
_version_ | 1783609121556135936 |
---|---|
author | Smalheiser, Neil R Holt, Arthur W |
author_facet | Smalheiser, Neil R Holt, Arthur W |
author_sort | Smalheiser, Neil R |
collection | PubMed |
description | OBJECTIVES: To identify separate publications that report outcomes from the same underlying clinical trial, in order to avoid over-counting these as independent pieces of evidence. MATERIALS AND METHODS: We updated our previous model by creating larger, more recent, and more diverse positive and negative training sets consisting of article pairs that were (or not) linked to the same ClinicalTrials.gov trial registry number. Features were extracted from PubMed metadata; pairwise similarity scores were modeled using logistic regression and used to form clusters of articles that are likely to arise from the same registered clinical trial. RESULTS: Articles from the same trial were identified with high accuracy (F1 = 0.859), nominally better than the previous model (F1 = 0.843). Predicted clusters showed a low error rate of splitting of 8–11% (ie, when 2 articles belonged to the same trial but were assigned to different clusters). Performance was similar whether only randomized controlled trial articles or a more diverse set of clinical trial articles were processed. DISCUSSION: Metadata are surprisingly accurate in predicting when 2 articles derive from the same underlying clinical trial. CONCLUSION: We have continued confidence in the Aggregator tool which can be accessed publicly at http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/RCT_Tagger.cgi. |
format | Online Article Text |
id | pubmed-7660960 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-76609602020-11-18 New improved Aggregator: predicting which clinical trial articles derive from the same registered clinical trial Smalheiser, Neil R Holt, Arthur W JAMIA Open Application Notes OBJECTIVES: To identify separate publications that report outcomes from the same underlying clinical trial, in order to avoid over-counting these as independent pieces of evidence. MATERIALS AND METHODS: We updated our previous model by creating larger, more recent, and more diverse positive and negative training sets consisting of article pairs that were (or not) linked to the same ClinicalTrials.gov trial registry number. Features were extracted from PubMed metadata; pairwise similarity scores were modeled using logistic regression and used to form clusters of articles that are likely to arise from the same registered clinical trial. RESULTS: Articles from the same trial were identified with high accuracy (F1 = 0.859), nominally better than the previous model (F1 = 0.843). Predicted clusters showed a low error rate of splitting of 8–11% (ie, when 2 articles belonged to the same trial but were assigned to different clusters). Performance was similar whether only randomized controlled trial articles or a more diverse set of clinical trial articles were processed. DISCUSSION: Metadata are surprisingly accurate in predicting when 2 articles derive from the same underlying clinical trial. CONCLUSION: We have continued confidence in the Aggregator tool which can be accessed publicly at http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/RCT_Tagger.cgi. Oxford University Press 2020-10-28 /pmc/articles/PMC7660960/ /pubmed/33215068 http://dx.doi.org/10.1093/jamiaopen/ooaa042 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Application Notes Smalheiser, Neil R Holt, Arthur W New improved Aggregator: predicting which clinical trial articles derive from the same registered clinical trial |
title | New improved Aggregator: predicting which clinical trial articles derive from the same registered clinical trial |
title_sort | new improved aggregator: predicting which clinical trial articles derive from the same registered clinical trial |
topic | Application Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660960/ https://www.ncbi.nlm.nih.gov/pubmed/33215068 http://dx.doi.org/10.1093/jamiaopen/ooaa042 |
work_keys_str_mv | AT smalheiserneilr newimprovedaggregatorpredictingwhichclinicaltrialarticlesderivefromthesameregisteredclinicaltrial AT holtarthurw newimprovedaggregatorpredictingwhichclinicaltrialarticlesderivefromthesameregisteredclinicaltrial |
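
The description field above outlines a pipeline: score article pairs with a logistic model over metadata similarity, then group articles into clusters and measure how often same-trial pairs are split. The Python sketch below illustrates that general shape only. It is not the authors' implementation. The two features (title-word and author-name Jaccard overlap), the hand-picked weights and bias, the 0.5 threshold, the union-find (single-linkage) clustering, and the toy records with made-up identifiers are all assumptions for illustration; the record does not specify the actual features, fitted coefficients, or clustering procedure.

```python
import math
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def pair_features(x, y):
    """Hypothetical metadata features for one article pair."""
    return [
        jaccard(x["title"].lower().split(), y["title"].lower().split()),
        jaccard(x["authors"], y["authors"]),
    ]


# Assumed, hand-picked coefficients. A real model would fit these to
# labeled positive and negative article pairs.
WEIGHTS = [4.0, 6.0]
BIAS = -4.0


def same_trial_probability(x, y):
    """Logistic (sigmoid) score that two articles share a trial."""
    z = BIAS + sum(w * f for w, f in zip(WEIGHTS, pair_features(x, y)))
    return 1.0 / (1.0 + math.exp(-z))


def cluster(articles, threshold=0.5):
    """Merge articles whose pairwise score meets the threshold (union-find)."""
    parent = {a["id"]: a["id"] for a in articles}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for x, y in combinations(articles, 2):
        if same_trial_probability(x, y) >= threshold:
            parent[find(x["id"])] = find(y["id"])

    groups = {}
    for a in articles:
        groups.setdefault(find(a["id"]), set()).add(a["id"])
    return list(groups.values())


def splitting_rate(clusters, trial_of):
    """Fraction of same-trial pairs that ended up in different clusters."""
    label = {p: i for i, c in enumerate(clusters) for p in c}
    same = [(p, q) for p, q in combinations(sorted(trial_of), 2)
            if trial_of[p] == trial_of[q]]
    if not same:
        return 0.0
    return sum(label[p] != label[q] for p, q in same) / len(same)


# Toy records with made-up identifiers (not real PMIDs or registry numbers).
articles = [
    {"id": "A1", "title": "Drug X versus placebo in asthma: primary results",
     "authors": ["Lee", "Kim", "Park"]},
    {"id": "A2", "title": "Drug X versus placebo in asthma: two-year follow-up",
     "authors": ["Lee", "Kim", "Choi"]},
    {"id": "B1", "title": "Exercise therapy for chronic low back pain",
     "authors": ["Garcia", "Lopez"]},
]
trial_of = {"A1": "T1", "A2": "T1", "B1": "T2"}

clusters = cluster(articles)
print(clusters, splitting_rate(clusters, trial_of))
```

Single-linkage merging is the simplest way to turn pairwise scores into clusters, but it can chain unrelated articles together through one high-scoring pair; the published tool may use a different strategy.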