
R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation

Accurate and efficient text detection in natural scenes is a fundamental yet challenging task in computer vision, especially when dealing with arbitrarily-oriented texts. Most contemporary text detection methods are designed to identify horizontal or approximately horizontal text, which cannot satis...


Bibliographic Details
Main authors: Wang, Xiqi; Zheng, Shunyi; Zhang, Ce; Li, Rui; Gui, Li
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7865800/
https://www.ncbi.nlm.nih.gov/pubmed/33525619
http://dx.doi.org/10.3390/s21030888
_version_ 1783647931944927232
author Wang, Xiqi
Zheng, Shunyi
Zhang, Ce
Li, Rui
Gui, Li
author_sort Wang, Xiqi
collection PubMed
description Accurate and efficient text detection in natural scenes is a fundamental yet challenging task in computer vision, especially when dealing with arbitrarily-oriented texts. Most contemporary text detection methods are designed to identify horizontal or approximately horizontal text, which cannot satisfy practical detection requirements for various real-world images such as image streams or videos. To address this lacuna, we propose a novel method called Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model to detect arbitrarily-oriented texts in natural image scenes. First, a rotated anchor box with angle information is used as the text bounding box over various orientations. Second, features of various scales are extracted from the input image to determine the probability, confidence, and inclined bounding boxes of the text. Finally, Rotational Distance Intersection over Union Non-Maximum Suppression is used to eliminate redundancy and acquire detection results with the highest accuracy. Experiments on benchmark comparison are conducted upon four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and ICDAR2017-MLT. The results indicate that the proposed R-YOLO method significantly outperforms state-of-the-art methods in terms of detection efficiency while maintaining high accuracy; for example, the proposed R-YOLO method achieves an F-measure of 82.3% at 62.5 fps with 720 p resolution on the ICDAR2015 dataset.
format Online
Article
Text
id pubmed-7865800
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7865800 2021-02-07 R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation Wang, Xiqi; Zheng, Shunyi; Zhang, Ce; Li, Rui; Gui, Li Sensors (Basel) Article MDPI 2021-01-28 /pmc/articles/PMC7865800/ /pubmed/33525619 http://dx.doi.org/10.3390/s21030888 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7865800/
https://www.ncbi.nlm.nih.gov/pubmed/33525619
http://dx.doi.org/10.3390/s21030888
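
The abstract's last step, Rotational Distance Intersection over Union Non-Maximum Suppression, can be illustrated with a minimal sketch. This record does not give the paper's exact formulation, so everything below is an assumption made for illustration: the box layout (cx, cy, w, h, angle in radians), the use of an axis-aligned enclosing rectangle for the DIoU diagonal, the 0.5 threshold, and all function names. It is not the authors' implementation.

```python
import math

def box_corners(cx, cy, w, h, angle):
    """Corners of a rotated box, counter-clockwise (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def polygon_area(poly):
    """Shoelace formula; positive for counter-clockwise polygons."""
    pairs = zip(poly, poly[1:] + poly[:1])
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)

def clip(subject, clipper):
    """Sutherland-Hodgman: clip a convex polygon by a convex CCW polygon."""
    output = subject
    for (ax, ay), (bx, by) in zip(clipper, clipper[1:] + clipper[:1]):
        if not output:
            break
        current, output = output, []
        for p, q in zip(current, current[1:] + current[:1]):
            # Signed side test: >= 0 means on or to the left of edge a->b.
            sp = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            sq = (bx - ax) * (q[1] - ay) - (by - ay) * (q[0] - ax)
            if sp >= 0:
                output.append(p)
            if sp * sq < 0:  # segment p->q crosses the clipping edge
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output

def rotated_diou(a, b):
    """DIoU = IoU - d^2 / c^2 for two rotated boxes (cx, cy, w, h, angle)."""
    pa, pb = box_corners(*a), box_corners(*b)
    overlap = clip(pa, pb)
    inter = max(0.0, polygon_area(overlap)) if len(overlap) >= 3 else 0.0
    union = a[2] * a[3] + b[2] * b[3] - inter
    # Assumption: c is the diagonal of the axis-aligned rectangle that
    # encloses both boxes; the paper may define the enclosing region differently.
    xs = [x for x, _ in pa + pb]
    ys = [y for _, y in pa + pb]
    c2 = (max(xs) - min(xs)) ** 2 + (max(ys) - min(ys)) ** 2
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return inter / union - d2 / c2

def rotated_diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS: keep a box only if its DIoU with every kept box is below threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(rotated_diou(boxes[i], boxes[k]) < threshold for k in keep):
            keep.append(i)
    return keep

# Hypothetical detections: two near-duplicates and one distant box.
boxes = [(0.0, 0.0, 4.0, 2.0, 0.0), (0.2, 0.0, 4.0, 2.0, 0.05), (10.0, 10.0, 4.0, 2.0, 1.0)]
scores = [0.9, 0.8, 0.7]
kept = rotated_diou_nms(boxes, scores)
print(kept)
```

Subtracting the normalized center distance from the IoU means two boxes with the same overlap are less likely to suppress each other when their centers are far apart, which is the usual motivation for DIoU-based suppression over plain IoU.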