O-JMeSH: creating a bilingual English-Japanese controlled vocabulary of MeSH UIDs through machine translation and mutual information
Previous approaches to create a controlled vocabulary for Japanese have resorted to existing bilingual dictionary and transformation rules to allow such mappings. However, given the possible new terms introduced due to coronavirus disease 2019 (COVID-19) and the emphasis on respiratory and infection...
Main Authors: | Soares, Felipe; Tateisi, Yuka; Takatsuki, Terue; Yamaguchi, Atsuko |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Korea Genome Organization, 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8510863/ https://www.ncbi.nlm.nih.gov/pubmed/34638173 http://dx.doi.org/10.5808/gi.21014 |
_version_ | 1784582662937116672 |
---|---|
author | Soares, Felipe; Tateisi, Yuka; Takatsuki, Terue; Yamaguchi, Atsuko |
author_sort | Soares, Felipe |
collection | PubMed |
description | Previous approaches to create a controlled vocabulary for Japanese have resorted to existing bilingual dictionary and transformation rules to allow such mappings. However, given the possible new terms introduced due to coronavirus disease 2019 (COVID-19) and the emphasis on respiratory and infection-related terms, coverage might not be guaranteed. We propose creating a Japanese bilingual controlled vocabulary based on MeSH terms assigned to COVID-19 related publications in this work. For such, we resorted to manual curation of several bilingual dictionaries and a computational approach based on machine translation of sentences containing such terms and the ranking of possible translations for the individual terms by mutual information. Our results show that we achieved nearly 99% occurrence coverage in LitCovid, while our computational approach presented average accuracy of 63.33% for all terms, and 84.51% for drugs and chemicals. |
format | Online Article Text |
id | pubmed-8510863 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Korea Genome Organization |
record_format | MEDLINE/PubMed |
spelling | pubmed-85108632021-10-22 O-JMeSH: creating a bilingual English-Japanese controlled vocabulary of MeSH UIDs through machine translation and mutual information Soares, Felipe Tateisi, Yuka Takatsuki, Terue Yamaguchi, Atsuko Genomics Inform Blah7 Previous approaches to create a controlled vocabulary for Japanese have resorted to existing bilingual dictionary and transformation rules to allow such mappings. However, given the possible new terms introduced due to coronavirus disease 2019 (COVID-19) and the emphasis on respiratory and infection-related terms, coverage might not be guaranteed. We propose creating a Japanese bilingual controlled vocabulary based on MeSH terms assigned to COVID-19 related publications in this work. For such, we resorted to manual curation of several bilingual dictionaries and a computational approach based on machine translation of sentences containing such terms and the ranking of possible translations for the individual terms by mutual information. Our results show that we achieved nearly 99% occurrence coverage in LitCovid, while our computational approach presented average accuracy of 63.33% for all terms, and 84.51% for drugs and chemicals. Korea Genome Organization 2021-09-30 /pmc/articles/PMC8510863/ /pubmed/34638173 http://dx.doi.org/10.5808/gi.21014 Text en (c) 2021, Korea Genome Organization https://creativecommons.org/licenses/by/4.0/(CC) This is an open-access article distributed under the terms of the Creative Commons Attribution license(https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | O-JMeSH: creating a bilingual English-Japanese controlled vocabulary of MeSH UIDs through machine translation and mutual information |
topic | Blah7 |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8510863/ https://www.ncbi.nlm.nih.gov/pubmed/34638173 http://dx.doi.org/10.5808/gi.21014 |
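
The description above mentions ranking candidate translations of individual terms by mutual information, computed over machine-translated sentences. The record does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: it uses pointwise mutual information (PMI) as a stand-in, treats a term or candidate as present when it occurs as a substring of the sentence, and runs on a tiny invented set of sentence pairs. The function name `rank_by_pmi` and the sample data are hypothetical and do not come from the article.

```python
import math
from collections import Counter


def rank_by_pmi(pairs, term, candidates):
    """Rank candidate translations of `term` by pointwise mutual information.

    pairs      -- list of (english_sentence, japanese_sentence) tuples
    term       -- English term to translate (matched case-insensitively)
    candidates -- Japanese strings proposed as translations

    Assumption: presence is tested by plain substring matching, which is
    a simplification and not necessarily what the article does.
    """
    n = len(pairs)
    term_count = 0
    cand_count = Counter()
    joint_count = Counter()

    for en, ja in pairs:
        has_term = term.lower() in en.lower()
        term_count += has_term
        for cand in candidates:
            if cand in ja:
                cand_count[cand] += 1
                if has_term:
                    joint_count[cand] += 1

    scores = {}
    for cand in candidates:
        joint = joint_count[cand]
        if joint == 0 or term_count == 0:
            continue  # PMI is undefined without any co-occurrence
        # PMI = log2( P(term, cand) / (P(term) * P(cand)) )
        scores[cand] = math.log2(joint * n / (term_count * cand_count[cand]))

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented sentence pairs, for illustration only.
pairs = [
    ("Pneumonia was observed in patients.", "患者に肺炎が認められた。"),
    ("Viral pneumonia is common.", "ウイルス性肺炎は一般的である。"),
    ("Patients received oxygen.", "患者は酸素を投与された。"),
    ("Fever was reported in patients.", "患者で発熱が報告された。"),
]

ranked = rank_by_pmi(pairs, "pneumonia", ["肺炎", "患者"])
for candidate, score in ranked:
    print(candidate, round(score, 3))
```

On this toy data the candidate 肺炎 ranks above 患者, because 患者 also appears in sentences that do not mention pneumonia. A real pipeline would probably need tokenization or n-gram extraction to generate candidates and some smoothing for rare terms; those steps are left out here.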