Shuttlecock Detection Using Residual Learning in U-Net Architecture

Muhammad Haq - Tokyo Metropolitan University
Shuhei Tarashima - Tokyo Metropolitan University
Norio Tagawa - Tokyo Metropolitan University


DOI: http://dx.doi.org/10.62527/joiv.8.3.2132

Abstract


This paper introduces an enhanced approach to shuttlecock detection. Detecting fast-moving objects such as a shuttlecock is crucial in various applications, including badminton video analysis and object tracking. Many deep-learning techniques have been proposed in the literature to address this challenge; however, low image quality, motion blur, afterimages, and short-term occlusion can hinder accurate detection. To overcome these limitations, this research focuses on improving an existing method, TrackNetV2, which utilizes the U-Net architecture. The primary enhancement proposed in this study is the incorporation of residual learning into the U-Net architecture, with an emphasis on processing speed, prediction accuracy, and precision. Specifically, each U-Net layer is augmented with a residual layer, improving the network's overall performance. The results demonstrate that our proposed method outperforms the existing technique in detection accuracy and reliability.
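To make the architectural idea concrete, the sketch below shows one way a U-Net stage can be augmented with a residual (identity-shortcut) block in the style of He et al.'s residual learning, written in PyTorch since the reference list cites it. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the class name ResidualDoubleConv, the channel sizes, and the 9-channel 512x288 input (three stacked RGB frames, as in TrackNetV2) are assumptions made for the example.

import torch
import torch.nn as nn

class ResidualDoubleConv(nn.Module):
    """Hypothetical U-Net stage: two 3x3 conv-BN layers plus an identity shortcut.

    When the channel count changes, a 1x1 convolution projects the input so
    the skip addition stays shape-compatible (the ResNet projection shortcut).
    """

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.shortcut = (
            nn.Identity()
            if in_ch == out_ch
            else nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual learning: the block fits F(x) and the stage outputs F(x) + x.
        return self.act(self.body(x) + self.shortcut(x))

# Example: an assumed first encoder stage consuming three stacked RGB frames
# (9 channels) at 512x288, matching TrackNetV2's multi-frame input design.
stage = ResidualDoubleConv(in_ch=9, out_ch=64)
out = stage(torch.randn(1, 9, 288, 512))  # -> torch.Size([1, 64, 288, 512])

The rationale follows the standard residual formulation: because the shortcut passes the input through (nearly) unchanged, each stage only has to learn a correction F(x) on top of it, which typically eases optimization of deeper encoder-decoder stacks while leaving U-Net's encoder-to-decoder skip connections untouched.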

Keywords


shuttlecock detection; residual learning; object tracking; U-Net architecture

References


M. Hirano, K. Iwakuma, and Y. Yamakawa, “Multiple High-Speed Vision for Identical Objects Tracking,” Journal of Robotics and Mechatronics, vol. 34, no. 5, pp. 1073–1084, Oct. 2022, doi: 10.20965/jrm.2022.p1073.

M. Fujitake, M. Inoue, and T. Yoshimi, “Development of an Automatic Tracking Camera System Integrating Image Processing and Machine Learning,” Journal of Robotics and Mechatronics, vol. 33, no. 6, pp. 1303–1314, Dec. 2021, doi: 10.20965/jrm.2021.p1303.

P. Parisot and C. De Vleeschouwer, “Scene-specific classifier for effective and efficient team sport players detection from a single calibrated camera,” Computer Vision and Image Understanding, vol. 159, pp. 74–88, Jun. 2017, doi: 10.1016/J.CVIU.2017.01.001.

A. Cioppa, A. Deliège, and M. Van Droogenbroeck, “A Bottom-Up Approach Based on Semantics for the Interpretation of the Main Camera Stream in Soccer Games,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 1846–184609. doi: 10.1109/CVPRW.2018.00229.

A. Cioppa et al., “A Context-Aware Loss Function for Action Spotting in Soccer Videos,” Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 13123–13133, 2020, doi: 10.1109/CVPR42600.2020.01314.

S. Pu, “Development and application of sports video analysis platform in sports training,” Journal of Advanced Computational Intelligence and Intelligent Informatics, pp. 139–145, Jan. 2019, doi: 10.20965/jaciii.2019.p0139.

J.-Y. Lin et al., “Design and implement a mobile badminton stroke classification system,” 2017 19th Asia-Pacific Network Operations and Management Symposium (APNOMS), pp. 235–238, 2017.

I. Ghosh, S. Ramasamy Ramamurthy, A. Chakma, and N. Roy, “DeCoach: Deep Learning-based Coaching for Badminton Player Assessment,” Pervasive and Mobile Computing, vol. 83, p. 101608, Jul. 2022, doi: 10.1016/J.PMCJ.2022.101608.

N. Homayounfar, S. Fidler, and R. Urtasun, “Sports Field Localization via Deep Structured Models,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 4012–4020. doi: 10.1109/CVPR.2017.427.

M. A. Haq, S. Tarashima, and N. Tagawa, “Heatmap Visualization and Badminton Player Detection using Convolutional Neural Network,” in Proceedings of the 2022 International Electronics Symposium (IES): Energy Development for Climate Change Solution and Clean Energy Transition, 2022, pp. 627–631, doi: 10.1109/IES55876.2022.9888717.

M. A. Haq and N. Tagawa, “Improving Badminton Player Detection Using YOLOv3 with Different Training Heuristic,” JOIV: International Journal on Informatics Visualization, vol. 7, no. 2, pp. 548–554, Jun. 2023, doi: 10.30630/joiv.7.2.1166.

R. Theagarajan, F. Pala, X. Zhang, and B. Bhanu, “Soccer: Who Has the Ball? Generating Visual Analytics and Player Statistics,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 1830–18308. doi: 10.1109/CVPRW.2018.00227.

R. Zhang, L. Wu, Y. Yang, W. Wu, Y. Chen, and M. Xu, “Multi-camera multi-player tracking with deep player identification in sports video,” Pattern Recognition, vol. 102, Jun. 2020, doi: 10.1016/J.PATCOG.2020.107260.

S. Yang, F. Ding, P. Li, and S. Hu, “Distributed multi-camera multi-target association for real-time tracking,” Scientific Reports, vol. 12, no. 1, pp. 1–13, Jun. 2022, doi: 10.1038/s41598-022-15000-4.

H.-C. Shih, “A Survey of Content-Aware Video Analysis for Sports,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 5, pp. 1212–1231, 2018, doi: 10.1109/TCSVT.2017.2655624.

S. Sarkar, A. Chakrabarti, and D. P. Mukherjee, “Generation of Ball Possession Statistics in Soccer Using Minimum-Cost Flow Network,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019, pp. 2515–2523. doi: 10.1109/CVPRW.2019.00307.

S. Ye et al., “ShuttleSpace: Exploring and Analyzing Movement Trajectory in Immersive Visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, pp. 860–869, 2020.

S. R. Vrajesh, A. N. Amudhan, A. Lijiya, and A. P. Sudheer, “Shuttlecock detection and fall point prediction using neural networks,” 2020 International Conference for Emerging Technology, INCET 2020, Jun. 2020, doi: 10.1109/INCET49848.2020.9154136.

X. Chu et al., “TIVEE: Visual Exploration and Explanation of Badminton Tactics in Immersive Visualizations,” IEEE Transactions on Visualization and Computer Graphics, vol. 28, no. 1, pp. 118–128, 2022, doi: 10.1109/TVCG.2021.3114861.

Y. C. Huang, I. N. Liao, C. H. Chen, T. U. Ik, and W. C. Peng, “TrackNet: A deep learning network for tracking high-speed and tiny objects in sports applications,” 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2019, Sep. 2019, doi: 10.1109/AVSS.2019.8909871.

P. Liu and J. H. Wang, “MonoTrack: Shuttle trajectory reconstruction from monocular badminton video,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2022, pp. 3512–3521, doi: 10.1109/CVPRW56347.2022.00395.

W. Y. Wang, H. H. Shuai, K. S. Chang, and W. C. Peng, “ShuttleNet: Position-Aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 4, pp. 4219–4227, Jun. 2022, doi: 10.1609/AAAI.V36I4.20341.

M. A. Haq, S. Tarashima, and N. Tagawa, “Bag of Tricks Toward Enhanced Shuttlecock Detection from Badminton Videos,” in Proceedings of the 12th International Electrical Engineering Congress: Smart Factory and Intelligent Technology for Tomorrow (iEECON 2024), 2024, doi: 10.1109/IEECON60677.2024.10537809.

S. Tarashima, M. A. Haq, Y. Wang, and N. Tagawa, “Widely Applicable Strong Baseline for Sports Ball Detection and Tracking,” Nov. 2023, Accessed: Sep. 22, 2024. [Online]. Available: https://arxiv.org/abs/2311.05237v2

S. Liu and W. Deng, “Very deep convolutional neural network based image classification using small training sample size,” Proceedings - 3rd IAPR Asian Conference on Pattern Recognition, ACPR 2015, pp. 730–734, Jun. 2016, doi: 10.1109/ACPR.2015.7486599.

H. Noh, S. Hong, and B. Han, “Learning Deconvolution Network for Semantic Segmentation,” 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1520–1528, 2015.

C. Szegedy et al., “Going deeper with convolutions,” Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1–9, Oct. 2015, doi: 10.1109/CVPR.2015.7298594.

K. He, X. Zhang, S. Ren, and J. Sun, “Identity Mappings in Deep Residual Networks,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9908 LNCS, pp. 630–645, Mar. 2016, doi: 10.48550/arxiv.1603.05027.

O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9351, pp. 234–241, 2015, doi: 10.1007/978-3-319-24574-4_28.

J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 431–440, Oct. 2015, doi: 10.1109/CVPR.2015.7298965.

A. Shah, E. Kadam, H. Shah, S. Shinde, and S. Shingade, “Deep residual networks with exponential linear unit,” ACM International Conference Proceeding Series, pp. 59–65, Sep. 2016, doi: 10.1145/2983402.2983406.

K. Fujii, “Data-driven analysis for understanding team sports behaviors,” Journal of Robotics and Mechatronics, 2021, doi: 10.20965/jrm.2021.p0505.

N. E. Sun et al., “TrackNetV2: Efficient Shuttlecock Tracking Network,” in Proceedings of the 2020 International Conference on Pervasive Artificial Intelligence (ICPAI), pp. 86–91, Dec. 2020, doi: 10.1109/ICPAI51961.2020.00023.

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 770–778, Dec. 2016, doi: 10.1109/CVPR.2016.90.

K. Hara, D. Saito, and H. Shouno, “Analysis of function of rectified linear unit used in deep learning,” Proceedings of the International Joint Conference on Neural Networks, Sep. 2015, doi: 10.1109/IJCNN.2015.7280578.

S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” Proceedings of the 32nd International Conference on Machine Learning, pp. 448–456, Feb. 2015, doi: 10.48550/arxiv.1502.03167.

A. Paszke et al., “PyTorch: An Imperative Style, High-Performance Deep Learning Library,” in Proceedings of the 33rd International Conference on Neural Information Processing Systems, Curran Associates, Inc., 2019, pp. 8026–8037. [Online]. Available: http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf

J. Chen et al., “TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation,” Feb. 2021, doi: 10.48550/arxiv.2102.04306.