Research Progress of Data Augmentation Methods for 3D Object Detection

DOI: 10.12677/airr.2024.132023, PP. 213-226

Keywords: 3D Point Cloud, Data Augmentation, Object Detection


Abstract:

Deep learning-based 3D point cloud object detection underpins rapidly developing fields such as autonomous driving and smart industry. Because 3D point clouds cover large spatial extents yet are sparse, data augmentation of the raw point clouds is needed to reach higher detection accuracy. Data augmentation for 2D images has been studied extensively, whereas augmentation for 3D point cloud data is still at an early stage. This paper therefore reviews research progress in data augmentation methods for 3D object detection. It first introduces the basic techniques and pipeline of 3D object detection, then presents and analyzes augmentation methods for the task in three categories: 3D point cloud augmentation methods derived from 2D image augmentation, methods designed specifically for 3D point clouds, and hybrid and innovative methods. Finally, it discusses open challenges and future directions, as a reference for researchers in the field.
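
As an illustration of the first category (augmentations carried over from 2D images to 3D point clouds), the following is a minimal sketch and not a method taken from the paper. It applies a random global flip, a rotation about the vertical axis, and a uniform scaling to a list of points. The function name `augment_point_cloud`, the parameter names, and the default ranges are all hypothetical choices made for this example.

```python
import math
import random


def augment_point_cloud(points, rng, max_angle=math.pi / 4,
                        scale_range=(0.95, 1.05), flip_prob=0.5):
    """Apply one random global flip, z-axis rotation, and uniform scaling.

    points: list of (x, y, z) tuples (illustrative layout, an assumption).
    rng:    a random.Random instance, so results are reproducible.
    Returns (augmented_points, params), where params records the sampled
    transform so the same transform can be applied to box labels.
    """
    angle = rng.uniform(-max_angle, max_angle)
    scale = rng.uniform(scale_range[0], scale_range[1])
    flip = rng.random() < flip_prob
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    augmented = []
    for x, y, z in points:
        if flip:
            y = -y  # mirror across the x-z plane (3D analogue of a horizontal flip)
        # Rotate about the z axis (3D analogue of in-plane image rotation).
        x_rot = cos_a * x - sin_a * y
        y_rot = sin_a * x + cos_a * y
        # Scale all three coordinates uniformly (analogue of image resizing).
        augmented.append((scale * x_rot, scale * y_rot, scale * z))

    params = {"angle": angle, "scale": scale, "flip": flip}
    return augmented, params


if __name__ == "__main__":
    cloud = [(1.0, 2.0, 0.5), (-3.0, 0.0, 1.0), (0.2, -0.7, 1.8)]
    new_cloud, used = augment_point_cloud(cloud, random.Random(0))
    print(used)
    print(new_cloud)
```

In a detection setting, the sampled `params` would also have to be applied to the ground-truth 3D boxes (centers, sizes, and heading angles) so that labels stay aligned with the transformed points; that step is omitted here for brevity.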

