Building Facade Element Detection Based on Spatial Structure Weight Optimization Attention Mechanism

DOI: 10.12677/GST.2023.112014, PP. 122-134

Keywords: Facade Parsing, Building Facade, Facade Elements Detection

Abstract:

This paper addresses facade element detection in street-view images and proposes an object detection network that integrates an attention mechanism optimized with spatial structure weights. First, the backbone uses C3 modules embedded with a coordinate attention mechanism optimized for spatial structure; added weight branches for the horizontal and vertical coordinates make effective use of spatially encoded structural information and improve the localization accuracy of facade elements. Second, because windows and balconies, the main components of a facade, are typically small objects, an improved recursive gated convolution module replaces the original convolution module to fuse rich multi-scale context, and a small-object detection branch is added to improve detection accuracy. Finally, an ECIOU loss is designed to supervise both the aspect ratio and the center location of the predicted box, which strengthens the network's perception of facade elements and speeds up convergence. Experiments on the FacadeWHU dataset show that, compared with the baseline YOLOv5s network, the proposed model improves overall average precision by 16.4%, average precision for windows by 22.4%, and average precision for balconies by 25.5%. The model detects facade elements effectively and can better serve downstream tasks such as defect detection and energy consumption analysis.
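The abstract does not give the ECIOU formula. As a point of reference only, the sketch below computes the standard CIoU loss, which penalizes center distance and aspect-ratio mismatch, the two quantities the abstract says ECIOU supervises. It is not the authors' ECIOU; the function name `ciou_loss` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
import math


def ciou_loss(box_pred, box_gt, eps=1e-9):
    """Standard CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative reference only; this is NOT the paper's ECIOU loss.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Squared distance between box centers.
    rho2 = ((px1 + px2) - (gx1 + gx2)) ** 2 / 4 + ((py1 + py2) - (gy1 + gy2)) ** 2 / 4

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

For identical boxes the loss is approximately zero; for boxes that do not overlap, the center-distance term keeps the loss above 1, so the gradient still carries localization information even when the IoU is zero.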
