Self-Supervised Chinese Ontology Learning from Online Encyclopedias

Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised-learning-based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for o...

Bibliographic Details
Main Authors: Hu, Fanghuai, Shao, Zhiqing, Ruan, Tong
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3970055/
https://www.ncbi.nlm.nih.gov/pubmed/24715819
http://dx.doi.org/10.1155/2014/848631
author Hu, Fanghuai
Shao, Zhiqing
Ruan, Tong
collection PubMed
description Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised-learning-based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. To avoid the errors in encyclopedias and to enrich the learnt ontology, we also apply some machine-learning-based methods. First, we show, both statistically and experimentally, that self-supervised machine learning is practicable for Chinese relation extraction (at least for synonymy and hyponymy), and we train self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantage of our methods is that all training examples are generated automatically from the structural information of the encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision: manual evaluation shows that the ontology has excellent precision, a comparison of SSCO with other well-known ontologies and knowledge bases indicates high coverage, and the experimental results also indicate that the self-supervised models clearly enrich SSCO.
format Online
Article
Text
id pubmed-3970055
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-3970055 2014-04-08 Self-Supervised Chinese Ontology Learning from Online Encyclopedias Hu, Fanghuai Shao, Zhiqing Ruan, Tong ScientificWorldJournal Research Article Hindawi Publishing Corporation 2014-03-13 /pmc/articles/PMC3970055/ /pubmed/24715819 http://dx.doi.org/10.1155/2014/848631 Text en Copyright © 2014 Fanghuai Hu et al.
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Self-Supervised Chinese Ontology Learning from Online Encyclopedias
topic Research Article