Scene text detection via extremal region based double threshold convolutional network classification

In this paper, we present a robust approach to text detection in natural images based on a region proposal mechanism. A powerful low-level detector named saliency-enhanced MSER, extended from the widely used MSER, is proposed by incorporating saliency detection methods, which ensures a high recall...

Bibliographic Details
Main authors: Zhu, Wei; Lou, Jing; Chen, Longtao; Xia, Qingyuan; Ren, Mingwu
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5562312/
https://www.ncbi.nlm.nih.gov/pubmed/28820891
http://dx.doi.org/10.1371/journal.pone.0182227
author Zhu, Wei
Lou, Jing
Chen, Longtao
Xia, Qingyuan
Ren, Mingwu
author_sort Zhu, Wei
collection PubMed
description In this paper, we present a robust approach to text detection in natural images based on a region proposal mechanism. A powerful low-level detector named saliency-enhanced MSER, extended from the widely used MSER, is proposed by incorporating saliency detection methods, which ensures a high recall rate. Given a natural image, character candidates are extracted from three channels in a perception-based illumination-invariant color space by the saliency-enhanced MSER algorithm. A discriminative convolutional neural network (CNN) is jointly trained with multi-level information, including pixel-level and character-level information, as the character candidate classifier. Each image patch is classified as strong text, weak text, or non-text by double threshold filtering instead of conventional one-step classification, leveraging the confidence scores obtained via the CNN. To further prune non-text regions, we develop a recursive neighborhood search algorithm to track credible texts from the weak text set. Finally, characters are grouped into text lines using heuristic features such as spatial location, size, color, and stroke width. We compare our approach with several state-of-the-art methods, and experiments show that our method achieves competitive performance on the public datasets ICDAR 2011 and ICDAR 2013.
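The double threshold step in the description can be illustrated with a minimal sketch. It assumes each character candidate already carries a CNN confidence score; the `Candidate` type, the `are_neighbors` adjacency rule, and the threshold values `low` and `high` are hypothetical choices made for illustration, not values taken from the paper. The record describes a recursive neighborhood search; the sketch uses an equivalent iterative queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A character candidate: bounding box plus a classifier confidence score."""
    x: float
    y: float
    w: float
    h: float
    score: float  # hypothetical text confidence in [0, 1]


def are_neighbors(a, b, max_gap_ratio=1.0, max_height_ratio=2.0):
    """Hypothetical adjacency rule: similar height, small horizontal gap, vertical overlap."""
    tall, short = max(a.h, b.h), min(a.h, b.h)
    if short <= 0 or tall / short > max_height_ratio:
        return False
    # Horizontal gap between the boxes (negative when they overlap horizontally).
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    # Vertical overlap between the boxes (positive when they share rows).
    overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return gap <= max_gap_ratio * tall and overlap > 0


def double_threshold_filter(candidates, low=0.3, high=0.8):
    """Keep strong candidates, plus weak ones reachable from a strong one via neighbors.

    Assumed thresholds: score >= high is strong text, low <= score < high is
    weak text, and anything below low is discarded as non-text.
    """
    strong = [c for c in candidates if c.score >= high]
    remaining = [c for c in candidates if low <= c.score < high]
    kept = list(strong)
    queue = deque(strong)
    while queue:
        current = queue.popleft()
        still_weak = []
        for cand in remaining:
            if are_neighbors(current, cand):
                kept.append(cand)   # promoted: adjacent to credible text
                queue.append(cand)  # its own neighbors are searched in turn
            else:
                still_weak.append(cand)
        remaining = still_weak
    return kept


candidates = [
    Candidate(0, 0, 10, 10, 0.9),    # strong
    Candidate(12, 0, 10, 10, 0.5),   # weak, next to the strong candidate
    Candidate(24, 0, 10, 10, 0.4),   # weak, reachable through the previous one
    Candidate(200, 0, 10, 10, 0.5),  # weak but isolated
    Candidate(36, 0, 10, 10, 0.1),   # below the low threshold
]
kept = double_threshold_filter(candidates)
```

In this example the isolated weak candidate and the low-scoring candidate are dropped, while the chain of weak candidates connected to the strong one is retained. The design is similar in spirit to hysteresis thresholding in edge detection.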
format Online
Article
Text
id pubmed-5562312
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5562312 2017-08-25 Scene text detection via extremal region based double threshold convolutional network classification. Zhu, Wei; Lou, Jing; Chen, Longtao; Xia, Qingyuan; Ren, Mingwu. PLoS One. Research Article.
Public Library of Science 2017-08-18 /pmc/articles/PMC5562312/ /pubmed/28820891 http://dx.doi.org/10.1371/journal.pone.0182227 Text en © 2017 Zhu et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Scene text detection via extremal region based double threshold convolutional network classification
topic Research Article