A large dataset of scientific text reuse in Open-Access publications

We present the Webis-STEREO-21 dataset, a massive collection of Scientific Text Reuse in Open-access publications. It contains 91 million cases of reused text passages found in 4.2 million unique open-access publications. Cases range from overlap of as few as eight words to near-duplicate publicatio...

Bibliographic Details
Main authors: Gienapp, Lukas; Kircheis, Wolfgang; Sievers, Bjarne; Stein, Benno; Potthast, Martin
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879940/
https://www.ncbi.nlm.nih.gov/pubmed/36702840
http://dx.doi.org/10.1038/s41597-022-01908-z