Self-Supervised Chinese Ontology Learning from Online Encyclopedias
Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised learning based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for o...
Main authors: | Hu, Fanghuai; Shao, Zhiqing; Ruan, Tong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation, 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3970055/ https://www.ncbi.nlm.nih.gov/pubmed/24715819 http://dx.doi.org/10.1155/2014/848631 |
_version_ | 1782309328808574976 |
---|---|
author | Hu, Fanghuai Shao, Zhiqing Ruan, Tong |
author_facet | Hu, Fanghuai Shao, Zhiqing Ruan, Tong |
author_sort | Hu, Fanghuai |
collection | PubMed |
description | Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised learning based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. In order to avoid the errors in encyclopedias and enrich the learnt ontology, we also apply some machine learning based methods. First, we prove that the self-supervised machine learning method is practicable in Chinese relation extraction (at least for synonymy and hyponymy) statistically and experimentally and train some self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantages of our methods are that all training examples are automatically generated from the structural information of encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision; manual evaluation results show that the ontology has excellent precision, and high coverage is concluded by comparing SSCO with other famous ontologies and knowledge bases; the experiment results also indicate that the self-supervised models obviously enrich SSCO. |
format | Online Article Text |
id | pubmed-3970055 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-39700552014-04-08 Self-Supervised Chinese Ontology Learning from Online Encyclopedias Hu, Fanghuai Shao, Zhiqing Ruan, Tong ScientificWorldJournal Research Article Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised learning based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. In order to avoid the errors in encyclopedias and enrich the learnt ontology, we also apply some machine learning based methods. First, we prove that the self-supervised machine learning method is practicable in Chinese relation extraction (at least for synonymy and hyponymy) statistically and experimentally and train some self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantages of our methods are that all training examples are automatically generated from the structural information of encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision; manual evaluation results show that the ontology has excellent precision, and high coverage is concluded by comparing SSCO with other famous ontologies and knowledge bases; the experiment results also indicate that the self-supervised models obviously enrich SSCO. Hindawi Publishing Corporation 2014-03-13 /pmc/articles/PMC3970055/ /pubmed/24715819 http://dx.doi.org/10.1155/2014/848631 Text en Copyright © 2014 Fanghuai Hu et al.
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Hu, Fanghuai Shao, Zhiqing Ruan, Tong Self-Supervised Chinese Ontology Learning from Online Encyclopedias |
title | Self-Supervised Chinese Ontology Learning from Online Encyclopedias |
title_full | Self-Supervised Chinese Ontology Learning from Online Encyclopedias |
title_fullStr | Self-Supervised Chinese Ontology Learning from Online Encyclopedias |
title_full_unstemmed | Self-Supervised Chinese Ontology Learning from Online Encyclopedias |
title_short | Self-Supervised Chinese Ontology Learning from Online Encyclopedias |
title_sort | self-supervised chinese ontology learning from online encyclopedias |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3970055/ https://www.ncbi.nlm.nih.gov/pubmed/24715819 http://dx.doi.org/10.1155/2014/848631 |
work_keys_str_mv | AT hufanghuai selfsupervisedchineseontologylearningfromonlineencyclopedias AT shaozhiqing selfsupervisedchineseontologylearningfromonlineencyclopedias AT ruantong selfsupervisedchineseontologylearningfromonlineencyclopedias |