
Incremental Knowledge Base Construction Using DeepDive

Populating a database with unstructured information is a long-standing problem in industry and research that encompasses problems of extraction, cleaning, and integration. Recent names used for this problem include dealing with dark data and knowledge base construction (KBC). In this work, we descri...


Bibliographic Details
Main authors: Shin, Jaeho; Wu, Sen; Wang, Feiran; De Sa, Christopher; Zhang, Ce; Ré, Christopher
Format: Online Article Text
Language: English
Published: 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4852149/
https://www.ncbi.nlm.nih.gov/pubmed/27144081
http://dx.doi.org/10.14778/2809974.2809991
_version_ 1782429893721587712
author Shin, Jaeho
Wu, Sen
Wang, Feiran
De Sa, Christopher
Zhang, Ce
Ré, Christopher
author_sort Shin, Jaeho
collection PubMed
description Populating a database with unstructured information is a long-standing problem in industry and research that encompasses problems of extraction, cleaning, and integration. Recent names used for this problem include dealing with dark data and knowledge base construction (KBC). In this work, we describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems, and we present techniques to make the KBC process more efficient. We observe that the KBC process is iterative, and we develop techniques to incrementally produce inference results for KBC systems. We propose two methods for incremental inference, based respectively on sampling and variational techniques. We also study the tradeoff space of these methods and develop a simple rule-based optimizer. DeepDive includes all of these contributions, and we evaluate DeepDive on five KBC systems, showing that it can speed up KBC inference tasks by up to two orders of magnitude with negligible impact on quality.
format Online
Article
Text
id pubmed-4852149
institution National Center for Biotechnology Information
language English
publishDate 2015
record_format MEDLINE/PubMed
spelling pubmed-4852149 2016-05-01 Incremental Knowledge Base Construction Using DeepDive Shin, Jaeho Wu, Sen Wang, Feiran De Sa, Christopher Zhang, Ce Ré, Christopher Proceedings VLDB Endowment Article Populating a database with unstructured information is a long-standing problem in industry and research that encompasses problems of extraction, cleaning, and integration. Recent names used for this problem include dealing with dark data and knowledge base construction (KBC). In this work, we describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems, and we present techniques to make the KBC process more efficient. We observe that the KBC process is iterative, and we develop techniques to incrementally produce inference results for KBC systems. We propose two methods for incremental inference, based respectively on sampling and variational techniques. We also study the tradeoff space of these methods and develop a simple rule-based optimizer. DeepDive includes all of these contributions, and we evaluate DeepDive on five KBC systems, showing that it can speed up KBC inference tasks by up to two orders of magnitude with negligible impact on quality. 2015-07 /pmc/articles/PMC4852149/ /pubmed/27144081 http://dx.doi.org/10.14778/2809974.2809991 Text en http://creativecommons.org/licenses/by-nc-nd/3.0/ This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/. Obtain permission prior to any use beyond those covered by the license.
title Incremental Knowledge Base Construction Using DeepDive
topic Article