OALib Journal
ISSN: 2333-9721

Implementation of YOLO Target Detection Model Based on Feature Reconstruction

DOI: 10.12677/AIRR.2024.131013, pp. 112-120

Keywords: Object Detection, YOLOv5s, Feature Reconstruction

Abstract:

Object detection is one of the four core tasks in computer vision; it covers both the classification and the localization of objects and has been studied for nearly two decades. YOLOv5 is the most widely used algorithm in the YOLO family of object detectors, owing to its stable engineering practice and its good balance between detection accuracy and speed, but its accuracy still falls short of two-stage detectors. This paper therefore takes YOLOv5s as its research object and improves it, aiming to raise accuracy as far as possible while keeping detection speed essentially unchanged. To address the problem that YOLOv5s does not make full use of feature information, an improved object detection model based on a feature reconstruction module (F-YOLOv5s) is proposed. The feature reconstruction module transfers the feature information lying in the w-h plane to the spatial dimension, reducing the information loss caused by downsampling and thereby improving detection accuracy. Experiments on the PASCAL VOC2007 and VOC2012 datasets show that the proposed feature reconstruction module effectively improves the utilization of feature information and yields a substantial gain in detection accuracy.
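The abstract does not spell out the module's internals. One common way to move w-h plane information into another dimension without discarding activations is the space-to-depth rearrangement used by YOLOv5's own Focus layer, so the minimal PyTorch sketch below is offered purely as an illustration of that idea; the class name FeatureReconstruction, the 1x1 projection, and the channel widths are assumptions, not the paper's actual F-YOLOv5s design.

import torch
import torch.nn as nn

class FeatureReconstruction(nn.Module):
    """Space-to-depth downsampling sketch (assumed interpretation): each
    2x2 spatial neighborhood is rearranged into the channel axis instead
    of being discarded by strided convolution or pooling."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # The rearrangement quadruples the channel count; a 1x1 convolution
        # (an assumed design choice) projects it to the desired width.
        self.proj = nn.Conv2d(4 * in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, 4C, H/2, W/2): the four interleaved sub-grids
        # of each 2x2 patch are stacked along the channel dimension, so no
        # activation is lost during downsampling.
        x = torch.cat(
            [x[..., ::2, ::2], x[..., 1::2, ::2],
             x[..., ::2, 1::2], x[..., 1::2, 1::2]],
            dim=1,
        )
        return self.proj(x)

# Example: an 80x80 feature map is halved in resolution, widened in channels.
feat = torch.randn(1, 64, 80, 80)
print(FeatureReconstruction(64, 128)(feat).shape)  # torch.Size([1, 128, 40, 40])

Unlike a stride-2 convolution or pooling, this rearrangement keeps every input activation, which is consistent with the abstract's claim that the module reduces the information loss caused by downsampling.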

