Creating a medical English-Swedish dictionary using interactive word alignment

BACKGROUND: This paper reports on a parallel collection of rubrics from the medical terminology systems ICD-10, ICF, MeSH, NCSP and KSH97-P and its use for semi-automatic creation of an English-Swedish dictionary of medical terminology. The methods presented are relevant for many other West European...

Bibliographic Details
Main authors: Nyström, Mikael, Merkel, Magnus, Ahrenberg, Lars, Zweigenbaum, Pierre, Petersson, Håkan, Åhlfeldt, Hans
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1624822/
https://www.ncbi.nlm.nih.gov/pubmed/17034649
http://dx.doi.org/10.1186/1472-6947-6-35
collection PubMed
description BACKGROUND: This paper reports on a parallel collection of rubrics from the medical terminology systems ICD-10, ICF, MeSH, NCSP and KSH97-P and its use for semi-automatic creation of an English-Swedish dictionary of medical terminology. The methods presented are relevant for many other West European language pairs than English-Swedish. METHODS: The medical terminology systems were collected in electronic format in both English and Swedish and the rubrics were extracted in parallel language pairs. Initially, interactive word alignment was used to create training data from a sample. Then the training data were utilised in automatic word alignment in order to generate candidate term pairs. The last step was manual verification of the term pair candidates. RESULTS: A dictionary of 31,000 verified entries has been created in less than three man weeks, thus with considerably less time and effort needed compared to a manual approach, and without compromising quality. As a side effect of our work we found 40 different translation problems in the terminology systems and these results indicate the power of the method for finding inconsistencies in terminology translations. We also report on some factors that may contribute to making the process of dictionary creation with similar tools even more expedient. Finally, the contribution is discussed in relation to other ongoing efforts in constructing medical lexicons for non-English languages. CONCLUSION: In three man weeks we were able to produce a medical English-Swedish dictionary consisting of 31,000 entries and also found hidden translation errors in the utilized medical terminology systems.
format Text
id pubmed-1624822
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1624822 2006-10-26 BMC Med Inform Decis Mak Research Article
BioMed Central 2006-10-12 /pmc/articles/PMC1624822/ /pubmed/17034649 http://dx.doi.org/10.1186/1472-6947-6-35 Text en Copyright © 2006 Nyström et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Creating a medical English-Swedish dictionary using interactive word alignment
topic Research Article