Supervised Learning for Detection of Duplicates in Genomic Sequence Databases

MOTIVATION: First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicat...

Bibliographic Details
Main authors: Chen, Qingyu; Zobel, Justin; Zhang, Xiuzhen; Verspoor, Karin
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4973881/
https://www.ncbi.nlm.nih.gov/pubmed/27489953
http://dx.doi.org/10.1371/journal.pone.0159644