
Accuracy of Cloud-Based Speech Recognition Open Application Programming Interface for Medical Terms of Korean

BACKGROUND: There are limited data on the accuracy of cloud-based speech recognition (SR) open application programming interfaces (APIs) for medical terminology. This study aimed to evaluate the medical term recognition accuracy of currently available cloud-based SR open APIs in Korean. METHODS: We an...


Bibliographic Details
Main authors: Lee, Seung-Hwa, Park, Jungchan, Yang, Kwangmo, Min, Jeongwon, Choi, Jinwook
Format: Online Article Text
Language: English
Published: The Korean Academy of Medical Sciences 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9091429/
https://www.ncbi.nlm.nih.gov/pubmed/35535371
http://dx.doi.org/10.3346/jkms.2022.37.e144
author Lee, Seung-Hwa
Park, Jungchan
Yang, Kwangmo
Min, Jeongwon
Choi, Jinwook
collection PubMed
description BACKGROUND: There are limited data on the accuracy of cloud-based speech recognition (SR) open application programming interfaces (APIs) for medical terminology. This study aimed to evaluate the medical term recognition accuracy of currently available cloud-based SR open APIs in Korean. METHODS: We analyzed the SR accuracy of currently available cloud-based SR open APIs using real doctor–patient conversation recordings collected from an outpatient clinic at a large tertiary medical center in Korea. For each pair of original and SR transcriptions, we analyzed the accuracy rate of each cloud-based SR open API (i.e., the number of medical terms in the SR transcription per number of medical terms in the original transcription). RESULTS: A total of 112 doctor–patient conversation recordings were converted with three cloud-based SR open APIs (Naver Clova SR from Naver Corporation; Google Speech-to-Text from Alphabet Inc.; and Amazon Transcribe from Amazon), and the transcriptions were compared. Naver Clova SR (75.1%) showed the highest accuracy in recognizing medical terms compared to the other open APIs (Google Speech-to-Text, 50.9%, P < 0.001; Amazon Transcribe, 57.9%, P < 0.001), and Amazon Transcribe demonstrated higher recognition accuracy than Google Speech-to-Text (P < 0.001). In the sub-analysis, Naver Clova SR showed the highest accuracy in all areas according to word classes, but the accuracy for words longer than five characters showed no statistically significant differences (Naver Clova SR, 52.6%; Google Speech-to-Text, 56.3%; Amazon Transcribe, 36.6%). CONCLUSION: Among three current cloud-based SR open APIs, Naver Clova SR, which is made by a Korean company, showed the highest accuracy for medical terms in Korean compared to Google Speech-to-Text and Amazon Transcribe. Although limitations exist in the recognition of medical terminology, there is considerable room for improving this promising technology by combining the strengths of each SR engine.
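The accuracy rate defined in METHODS (medical terms found in the SR transcription divided by medical terms in the original transcription) can be illustrated with a minimal Python sketch. The abstract does not say how medical terms were extracted or matched, so the function name `term_recognition_accuracy`, the multiset matching rule, and the sample terms below are all assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter


def term_recognition_accuracy(reference_terms, recognized_terms):
    """Share of reference medical terms that also appear in the SR output.

    Assumption: terms are matched as a multiset, so a term spoken twice
    must be recognized twice to count twice. The study's actual matching
    rule is not given in the abstract.
    """
    reference = Counter(reference_terms)
    recognized = Counter(recognized_terms)
    total = sum(reference.values())
    if total == 0:
        raise ValueError("reference transcription contains no medical terms")
    # For each reference term, credit at most as many hits as the SR output has.
    matched = sum(min(count, recognized[term]) for term, count in reference.items())
    return matched / total


# Hypothetical example: four medical terms spoken, two recognized.
reference = ["혈압", "당뇨", "혈압", "콜레스테롤"]
recognized = ["혈압", "당뇨"]
accuracy = term_recognition_accuracy(reference, recognized)
print(f"{accuracy:.1%}")
```

Under these assumptions, a per-engine figure such as those reported in RESULTS would be obtained by applying the same ratio to each engine's transcriptions of the same recordings.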
format Online
Article
Text
id pubmed-9091429
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher The Korean Academy of Medical Sciences
record_format MEDLINE/PubMed
spelling pubmed-9091429 2022-05-17 Accuracy of Cloud-Based Speech Recognition Open Application Programming Interface for Medical Terms of Korean Lee, Seung-Hwa Park, Jungchan Yang, Kwangmo Min, Jeongwon Choi, Jinwook J Korean Med Sci Original Article The Korean Academy of Medical Sciences 2022-05-04 /pmc/articles/PMC9091429/ /pubmed/35535371 http://dx.doi.org/10.3346/jkms.2022.37.e144 Text en © 2022 The Korean Academy of Medical Sciences. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Accuracy of Cloud-Based Speech Recognition Open Application Programming Interface for Medical Terms of Korean
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9091429/
https://www.ncbi.nlm.nih.gov/pubmed/35535371
http://dx.doi.org/10.3346/jkms.2022.37.e144