Artificial Intelligence and Robotics Research 2024

A Review on the Application of Segmentation-Based Text Detection Techniques for Natural Scenes

DOI: 10.12677/airr.2024.132041, PP. 399-407

陈伟杰, 夏易行, 杜世杰

Keywords: Text Detection, Segmentation, Overview


Abstract:

Scene text detection aims to accurately locate the text present in natural scenes. Segmentation-based scene text detection currently faces challenges such as diverse text types, complex backgrounds, and irregular text shapes, yet a comprehensive overview of the relevant techniques is lacking; this paper therefore reviews them. The main contents are as follows: 1) an explanation of segmentation-based detection algorithms in the field of scene text detection, covering both semantic segmentation and instance segmentation; 2) an introduction to several classical models and to innovative models proposed in recent years, with an analysis and synthesis of their approaches; 3) an introduction to commonly used natural scene text datasets, together with a comparison of the strengths, weaknesses, and performance of different algorithms; 4) an outlook on future development trends for segmentation-based natural scene text detection algorithms.

