Real-Time Object Detection Using a Lightweight Two-Stage Detection Network with Efficient Data Representation

Shaohuang Wang

doi:10.62762/TETAI.2024.320179

Volume 1, Issue 1, IECE Transactions on Emerging Topics in Artificial Intelligence

Academic Editor

Teerath Kumar

National College of Ireland, Ireland

IECE Transactions on Emerging Topics in Artificial Intelligence, Volume 1, Issue 1, 2024: 17-30

Open Access | Research Article | 20 April 2024

Shaohuang Wang 1 *

1 Cardiff University, Cardiff CF10 3AT, United Kingdom

* Corresponding Author: Shaohuang Wang, [email protected]

Received: 05 December 2023, Accepted: 16 April 2024, Published: 20 April 2024

Abstract

This paper introduces a fast object detection framework designed for real-time applications such as autonomous driving and robot navigation. Conventional processing methods typically trade accuracy against processing speed. To address this trade-off, a hybrid data representation is proposed that combines the computational efficiency of voxelization with the fine-detail capture of direct data processing. The detection framework has two main components: a Rapid Region Proposal Network (RPN) and a Refinement Detection Network (RefinerNet). The RPN generates high-quality candidate regions, and the RefinerNet analyzes these regions in detail to improve detection accuracy. Several network optimization techniques, including lightweight network layers, network pruning, and model quantization, are applied to increase processing speed and reduce computational resource consumption. Extensive experiments on the KITTI and NEXET datasets demonstrate that the method improves both detection accuracy and real-time processing speed. Compared with existing methods, it performs well across multiple evaluation metrics and, in particular, meets the stringent processing-speed requirements of real-time applications.
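The abstract does not specify how the hybrid representation is built, so the following is only a minimal sketch of the general idea, under stated assumptions: points are grouped into voxels for a cheap coarse pass, and the raw points of selected voxels are kept available for a finer second pass. The function names (`voxelize`, `voxel_centroids`, `points_in_region`), the voxel size, and the point-count rule used to pick candidate voxels are all hypothetical stand-ins and are not taken from the paper; in particular, the count threshold merely takes the place of a learned proposal stage.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group 3D points by the integer index of the voxel that contains them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        voxels[key].append((x, y, z))
    return dict(voxels)


def voxel_centroids(voxels):
    """Coarse representation: one centroid per occupied voxel."""
    centroids = {}
    for key, pts in voxels.items():
        n = len(pts)
        centroids[key] = tuple(sum(p[i] for p in pts) / n for i in range(3))
    return centroids


def points_in_region(voxels, keys):
    """Fine representation: recover the raw points of the selected voxels."""
    return [p for key in keys for p in voxels.get(key, [])]


# Toy point cloud (hypothetical data, for illustration only).
points = [(0.1, 0.2, 0.0), (0.3, 0.1, 0.1), (1.2, 0.1, 0.0), (-0.4, 0.0, 0.0)]

voxels = voxelize(points, voxel_size=0.5)
centroids = voxel_centroids(voxels)

# Stand-in for a proposal stage: keep voxels holding at least two points.
candidates = [key for key, pts in voxels.items() if len(pts) >= 2]

# Stand-in for a refinement stage input: raw points inside the candidates.
raw = points_in_region(voxels, candidates)
```

In this sketch the coarse pass touches one centroid per occupied voxel, while the fine pass only sees the raw points of the voxels that were selected, which is one plausible way to balance speed against detail.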

Keywords

object detection

real-time

refinement

network optimization

pruning

Data Availability Statement

Data will be made available on request.

Funding

This work received no external funding.

Conflicts of Interest

The author declares no conflicts of interest.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style

Wang, S. (2024). Real-Time Object Detection Using a Lightweight Two-Stage Detection Network with Efficient Data Representation. IECE Transactions on Emerging Topics in Artificial Intelligence, 1(1), 17–30. https://doi.org/10.62762/TETAI.2024.320179

Publisher's Note

IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Copyright © 2024 by the Author(s). Published by Institute of Emerging and Computer Engineers. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

IECE Transactions on Emerging Topics in Artificial Intelligence

ISSN: 3066-1676 (Online) | ISSN: 3066-1668 (Print)
