
PEDL+: protein-centered relation extraction from PubMed at your fingertip

Bibliographic Details
Main authors: Weber, Leon; Barth, Fabio; Lorenz, Leonie; Konrath, Fabian; Huska, Kirsten; Wolf, Jana; Leser, Ulf
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10660277/
https://www.ncbi.nlm.nih.gov/pubmed/37950510
http://dx.doi.org/10.1093/bioinformatics/btad603
Description
Summary: Relation extraction (RE) from large text collections is an important tool for database curation, pathway reconstruction, or functional omics data analysis. In practice, RE is often part of a complex data analysis pipeline requiring specific adaptations, such as restricting the types of relations or the set of proteins to be considered. However, current systems are either non-programmable websites or research code with fixed functionality. We present PEDL+, a user-friendly tool for extracting protein–protein and protein–chemical associations from PubMed articles. PEDL+ combines state-of-the-art NLP technology with adaptable ranking and filtering options and can easily be integrated into analysis pipelines. We evaluated PEDL+ in two pathway curation projects and found that 59% to 80% of its extractions were helpful. Availability and implementation: PEDL+ is freely available at https://github.com/leonweber/pedl.