Automatic individual tracking method of Amur tiger based on attention mechanism fusion of multiple features

doi:10.17520/biods.2023409

Abstract

Background: Tracking wild animals sheds light on their ecology, behavior, and population dynamics, and an automated, accurate target tracking system is crucial for their conservation. Monitoring the Amur tiger (Panthera tigris altaica) population requires tracking individual tigers automatically and correctly, which is challenging because the animals are camouflaged and move quickly through complex habitats. In real-world footage, similarity interference, occlusion, and illumination variation often cause individual Amur tigers to be tracked incorrectly.
Methods: To track Amur tigers accurately in complex real-world conditions, we propose a Siamese network tracking framework based on attentional feature fusion. We improve the conventional Siamese tracking architecture by incorporating attentional feature fusion into the ResNet50 backbone network. A multi-scale channel attention module lets the network better capture global contextual information and adapt to the varying environmental conditions and individual states of Amur tigers.
Results: We compared the proposed approach with five advanced algorithms, SiamFC, SiamRPN++, SiamCAR, SiamBAN, and SiamGAT, on the Amur tiger target tracking dataset. The proposed algorithm achieved a tracking success rate of 72.5% and a precision of 93.9%, outperforming all five existing algorithms. Relative to the baseline tracking algorithm, success rate and precision improved by 4.1 and 2.3 percentage points, respectively. The proposed approach also tracked better under the six types of tracking challenge that arise in the Amur tiger’s complex environment.
Conclusion: The proposed strategy significantly improves the success rate and precision of tracking individual Amur tigers. For wildlife monitoring based on computer vision, the method is better suited to real scenes, and it offers an effective scheme for automatic, efficient monitoring of Amur tigers in complex real-world settings.

Key words: Amur tiger, single object tracking, Siamese network, feature fusion, attention mechanism

Qun Xu, Yonghua Xie. Automatic individual tracking method of Amur tiger based on attention mechanism fusion of multiple features[J]. Biodiv Sci, 2024, 32(3): 23409.

URL: https://www.biodiversity-science.net/EN/10.17520/biods.2023409

https://www.biodiversity-science.net/EN/Y2024/V32/I3/23409

Figures/Tables 13

Fig. 1 Samples of Amur tiger target tracking training dataset. (a) Image sample taken by unmanned aerial vehicle; (b) Image sample in a benchmark for Amur tiger re-identification in the wild.

Fig. 2 Thumbnails of video sequences from Amur tiger target tracking test dataset

Fig. 3 The challenges of tracking Amur tigers

Table 1 The challenges of each video sequence in Fig. 2

Video	Illumination variation	Object rotation	Partial occlusion	Similarity interference	Object deformation	Scale variation
Tiger_01	√	√	-	√	-	-
Tiger_02	-	√	-	-	√	√
Tiger_03	-	-	-	√	-	-
Tiger_04	-	√	-	-	-	-
Tiger_05	-	√	-	-	-	-
Tiger_06	√	√	-	-	-	-
Tiger_07	-	√	√	√	-	-
Tiger_08	-	-	-	√	-	-
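
The annotations in Table 1 can be inverted into a list of sequences per challenge, which is the grouping a per-challenge evaluation would presumably need. The sketch below is only an illustration: the two-letter abbreviations, the dictionary layout, and the function name `sequences_by_challenge` are assumptions, while the data are transcribed from Table 1.

```python
from collections import defaultdict

# Challenge annotations transcribed from Table 1 (abbreviations are assumed).
# IV = illumination variation, OR = object rotation, PO = partial occlusion,
# SI = similarity interference, OD = object deformation, SV = scale variation.
TABLE_1 = {
    "Tiger_01": {"IV", "OR", "SI"},
    "Tiger_02": {"OR", "OD", "SV"},
    "Tiger_03": {"SI"},
    "Tiger_04": {"OR"},
    "Tiger_05": {"OR"},
    "Tiger_06": {"IV", "OR"},
    "Tiger_07": {"OR", "PO", "SI"},
    "Tiger_08": {"SI"},
}


def sequences_by_challenge(table):
    """Invert {sequence: challenges} into {challenge: sorted sequence names}."""
    grouped = defaultdict(list)
    for sequence, challenges in table.items():
        for challenge in challenges:
            grouped[challenge].append(sequence)
    return {challenge: sorted(names) for challenge, names in grouped.items()}


groups = sequences_by_challenge(TABLE_1)
for challenge in ("IV", "OR", "PO", "SI", "OD", "SV"):
    print(challenge, groups[challenge])
```

Grouped this way, object rotation covers six of the eight sequences, whereas partial occlusion, object deformation, and scale variation each appear in a single sequence.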

Fig. 4 SiamBAN structure diagram. Adapted from Chen et al (2020).

Fig. 5 Overall structure of AFF-ResNet50 network based on attention feature fusion. C, H, W represent the channels, height, and width of the input feature map respectively; X, The original input directly passed from the previous layer in the ResNet block; Y, The output of the residual function in the ResNet block; Z, The fused output feature.
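
In the notation of Fig. 5, attentional feature fusion as defined by Dai et al. (2021) weights the two inputs with an attention map produced by the multi-scale channel attention module (MS-CAM). The formulas below restate that published definition; that AFF-ResNet50 applies it unchanged inside each residual block is an assumption. Here $\mathbf{M}$ denotes the MS-CAM attention weights, $\mathbf{L}$ and $\mathbf{g}$ its local and global channel-context branches, $\sigma$ the sigmoid function, $\oplus$ broadcasting addition, and $\otimes$ element-wise multiplication.

```latex
\begin{aligned}
\mathbf{M}(\mathbf{U}) &= \sigma\bigl(\mathbf{L}(\mathbf{U}) \oplus \mathbf{g}(\mathbf{U})\bigr), \\
\mathbf{Z} &= \mathbf{M}(\mathbf{X} + \mathbf{Y}) \otimes \mathbf{X}
            + \bigl(1 - \mathbf{M}(\mathbf{X} + \mathbf{Y})\bigr) \otimes \mathbf{Y}.
\end{aligned}
```

Because the weights lie between 0 and 1, each element of Z is a convex combination of the identity input X and the residual output Y, rather than their plain sum as in a standard ResNet block.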

Table 2 Average pixel error and average overlap rate of this study and five current state-of-the-art tracking methods on eight video sequences (Tiger_01-Tiger_08)

Tracker	Average pixel error	Average overlap rate
This study	9.016	0.737
SiamFC	15.380	0.730
SiamRPN++	9.889	0.688
SiamCAR	19.630	0.692
SiamBAN	10.495	0.694
SiamGAT	11.828	0.719
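
The two columns of Table 2 are presumably the usual tracking measures: the Euclidean distance between predicted and ground-truth box centers, and the intersection over union (IoU) of the two boxes, each averaged over frames. The following is a minimal sketch under that assumption; the `(x, y, w, h)` box layout, the function names, and the sample boxes are illustrative and do not come from the paper.

```python
import math


def center_error(pred, truth):
    """Euclidean distance in pixels between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2, pred[1] + pred[3] / 2
    tx, ty = truth[0] + truth[2] / 2, truth[1] + truth[3] / 2
    return math.hypot(px - tx, py - ty)


def overlap_rate(pred, truth):
    """Intersection over union of two (x, y, w, h) boxes."""
    left = max(pred[0], truth[0])
    top = max(pred[1], truth[1])
    right = min(pred[0] + pred[2], truth[0] + truth[2])
    bottom = min(pred[1] + pred[3], truth[1] + truth[3])
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    union = pred[2] * pred[3] + truth[2] * truth[3] - inter
    return inter / union if union > 0 else 0.0


def sequence_averages(preds, truths):
    """Average pixel error and average overlap rate over one sequence."""
    errors = [center_error(p, t) for p, t in zip(preds, truths)]
    overlaps = [overlap_rate(p, t) for p, t in zip(preds, truths)]
    return sum(errors) / len(errors), sum(overlaps) / len(overlaps)


# Made-up boxes for illustration only, not data from the paper.
preds = [(10, 10, 40, 20), (13, 14, 40, 20)]
truths = [(10, 10, 40, 20), (10, 10, 40, 20)]
avg_error, avg_overlap = sequence_averages(preds, truths)
print(f"average pixel error: {avg_error:.3f}, average overlap rate: {avg_overlap:.3f}")
```

A lower pixel error and a higher overlap rate both indicate better tracking, which is how the rows of Table 2 should be read.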

Fig. 6 A graph of the pixel error results (a) and the overlap rate results (b) of this study and five current state-of-the-art tracking methods on the video sequences Tiger_02 and Tiger_03

Fig. 7 Comparison of tracking success rate and precision of this study and five current state-of-the-art tracking methods on the Amur tiger target tracking test dataset
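
Success rate and precision are commonly computed in the style of the OTB one-pass evaluation: success as the area under the curve of "fraction of frames whose overlap exceeds a threshold", and precision as the fraction of frames whose center error is within 20 pixels. The page does not state the exact protocol, so the sketch below assumes that convention; the threshold grid, the 20-pixel cutoff, and the sample values are assumptions.

```python
def success_score(overlaps, num_thresholds=21):
    """Area under the success plot: the mean, over evenly spaced IoU thresholds
    in [0, 1], of the fraction of frames whose overlap exceeds the threshold."""
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


def precision_score(center_errors, pixel_threshold=20.0):
    """Fraction of frames whose center error is within the pixel threshold."""
    return sum(e <= pixel_threshold for e in center_errors) / len(center_errors)


# Made-up per-frame values for illustration only, not data from the paper.
overlaps = [0.9, 0.8, 0.75, 0.1]
center_errors = [3.0, 5.0, 8.0, 42.0]
print(f"success: {success_score(overlaps):.3f}")
print(f"precision: {precision_score(center_errors):.3f}")
```

Under this convention a single badly tracked frame lowers both scores, but success also rewards tight boxes on every frame, so a tracker can rank differently on the two measures.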

Table 3 Tracking success rate of this study and five current state-of-the-art tracking methods on six types of challenges in the Amur tiger target tracking test dataset

Tracker	Illumination variation	Object rotation	Partial occlusion	Similarity interference	Object deformation	Scale variation
This study	0.719	0.723	0.728	0.711	0.740	0.755
SiamFC	0.794	0.694	0.750	0.673	0.734	0.743
SiamRPN++	0.663	0.651	0.728	0.649	0.728	0.711
SiamCAR	0.694	0.680	0.712	0.680	0.709	0.719
SiamBAN	0.672	0.672	0.722	0.656	0.722	0.718
SiamGAT	0.681	0.683	0.729	0.728	0.728	0.744

Table 4 Tracking precision of this study and five current state-of-the-art tracking methods on six types of challenges in the Amur tiger target tracking test dataset

Tracker	Illumination variation	Object rotation	Partial occlusion	Similarity interference	Object deformation	Scale variation
This study	0.972	0.970	0.887	0.849	0.904	0.982
SiamFC	0.864	0.748	0.724	0.563	0.745	0.916
SiamRPN++	0.964	0.959	0.880	0.840	0.897	0.982
SiamCAR	0.767	0.786	0.753	0.776	0.769	0.894
SiamBAN	0.924	0.944	0.878	0.829	0.892	0.973
SiamGAT	0.937	0.841	0.756	0.853	0.773	0.918

Fig. 8 Qualitative comparison of tracking results of Amur tigers under six types of challenges. a-f represent illumination variation, object rotation, partial occlusion, similarity interference, object deformation, and scale variation challenges, respectively.

Table 5 Comparison of tracking success rate and precision of SiamBAN with various feature fusion methods

Model	Success rate (%)	Precision (%)
SiamBAN	68.4	91.6
SiamBAN + Multi-scale channel attention module (MS-CAM)	67.0	93.5
SiamBAN + Attentional feature fusion (AFF)	72.5	93.9
SiamBAN + Feature pyramid networks (FPN)	63.6	94.1
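
The gains over the baseline quoted in the abstract can be checked directly against Table 5. The short sketch below transcribes the table and subtracts the SiamBAN row from each variant; the shortened model names and the function name `gains_over_baseline` are illustrative choices, not the authors' code.

```python
# Success rate and precision (%) transcribed from Table 5.
TABLE_5 = {
    "SiamBAN": (68.4, 91.6),
    "SiamBAN + MS-CAM": (67.0, 93.5),
    "SiamBAN + AFF": (72.5, 93.9),
    "SiamBAN + FPN": (63.6, 94.1),
}


def gains_over_baseline(table, baseline="SiamBAN"):
    """Difference from the baseline in percentage points, rounded to 0.1."""
    base_success, base_precision = table[baseline]
    return {
        name: (round(success - base_success, 1), round(precision - base_precision, 1))
        for name, (success, precision) in table.items()
        if name != baseline
    }


gains = gains_over_baseline(TABLE_5)
for name, (d_success, d_precision) in gains.items():
    print(f"{name}: success {d_success:+.1f}, precision {d_precision:+.1f}")
```

The AFF variant is the only one that improves on SiamBAN in both measures (by 4.1 and 2.3 percentage points); MS-CAM alone and FPN raise precision but lower the success rate.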

References 40

[1]	Altobel MZ, Sah M (2021) Tiger detection using Faster R-CNN for wildlife conservation. In: 14th International Conference on Theory and Application of Fuzzy Systems and Soft Computing-ICAFS-2020, pp. 572-579. Springer, Cham.
[2]	Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PHS (2016) Fully-convolutional Siamese networks for object tracking. In: Computer Vision - ECCV 2016 Workshops, pp. 850-865. Springer, Cham.
[3]	Bhattacharya S, Sultana M, Das B, Roy B (2022) A deep neural network framework for detection and identification of Bengal tigers. Innovations in Systems and Software Engineering, 18, 1-9.
[4]	Chen X, Yan B, Zhu JW, Wang D, Yang XY, Lu HC (2021) Transformer tracking. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8122-8131. Nashville, TN, USA.
[5]	Chen ZD, Zhong BN, Li GR, Zhang SP, Ji RR (2020) Siamese box adaptive network for visual tracking. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6667-6676. Seattle, WA, USA.
[6]	Dai YM, Gieseke F, Oehmcke S, Wu YQ, Barnard K (2021) Attentional feature fusion. In: 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 3559-3568. Waikoloa, HI, USA.
[7]	Gong YN, Tan MY, Wang Z, Zhao GJ, Jiang PL, Jiang SM, Zhang DJ, Ge JP, Feng LM (2019) AI recognition of infrared camera image of wild animals based on deep learning: Northeast Tiger and Leopard National Park for example. Acta Theriologica Sinica, 39, 458-465. (in Chinese with English abstract)
	[ 宫一男, 谭孟雨, 王震, 赵国静, 蒋沛林, 蒋仕铭, 张鼎基, 葛剑平, 冯利民 (2019) 基于深度学习的红外相机动物影像人工智能识别: 以东北虎豹国家公园为例. 兽类学报, 39, 458-465.]
[8]	Guo DY, Shao YY, Cui Y, Wang ZH, Zhang LY, Shen CH (2021) Graph attention tracking. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9538-9547. Nashville, TN, USA.
[9]	Guo DY, Wang J, Cui Y, Wang ZH, Chen SY (2020) SiamCAR: Siamese fully convolutional classification and regression for visual tracking. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6268-6276. Seattle, WA, USA.
[10]	Hayward GD, Miquelle DG, Smirnov EN, Nations C (2002) Monitoring Amur tiger populations: Characteristics of track surveys in snow. Wildlife Society Bulletin, 30, 1150-1159.
[11]	He KM, Zhang XY, Ren SQ, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770-778. Las Vegas, NV, USA.
[12]	Karanth KU, Nichols JD, Kumar NS, Link WA, Hines JE (2004) Tigers and their prey: Predicting carnivore densities from prey abundance. Proceedings of the National Academy of Sciences, USA, 101, 4854-4858.
[13]	Kays R, Crofoot MC, Jetz W, Wikelski M (2015) Terrestrial animal tracking as an eye on life and planet. Science, 348, eaaa2478.
[14]	Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60, 84-90.
[15]	Li B, Wu W, Wang Q, Zhang FY, Xing JL, Yan JJ (2019) SiamRPN++: Evolution of Siamese visual tracking with very deep networks. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4277-4286. Long Beach, CA, USA.
[16]	Li B, Yan JJ, Wu W, Zhu Z, Hu XL (2018) High performance visual tracking with siamese region proposal network. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8971-8980. Salt Lake City, UT, USA.
[17]	Li SY, Li JG, Tang HL, Qian R, Lin WY (2020) ATRW: A benchmark for Amur tiger re-identification in the wild. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 2590-2598. Seattle, WA, USA.
[18]	Lin TY, Dollár P, Girshick R, He KM, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 936-944. Honolulu, HI, USA.
[19]	Liu PJ, Fu XF, Sun HF, He L, Liu SJ (2023) A highly robust target tracking algorithm merging a CNN and Transformer. Journal of System Simulation, 35, 1-15. (in Chinese with English abstract)
	[ 刘沛津, 付雪峰, 孙浩峰, 何林, 刘淑婕 (2023) 一种融合CNN与Transformer的高鲁棒性目标跟踪算法. 系统仿真学报, 35, 1-15.]
[20]	Palanisamy V, Ratnarajah N (2021) Detection of wildlife animals using deep learning approaches: A systematic review. In: 2021 21st International Conference on Advances in ICT for Emerging Regions (ICter), pp. 153-158. Colombo, Sri Lanka.
[21]	Qi JZ, Gu JY, Ning Y, Miquelle DG, Holyoak M, Wen DS, Liang X, Liu SY, Roberts NJ, Yang EY, Lang JM, Wang FY, Li C, Liang Z, Liu PQ, Ren Y, Zhou SC, Zhang MH, Ma JZ, Chang J, Jiang GS (2021) Integrated assessments call for establishing a sustainable meta-population of Amur tigers in northeast Asia. Biological Conservation, 261, 109250.
[22]	Rai P, Golchha V, Srivastava A, Vyas G, Mishra S (2016) An automatic classification of bird species using audio feature extraction and support vector machines. In: 2016 International Conference on Inventive Computation Technologies (ICICT), pp. 1-5. Coimbatore, India.
[23]	Riordan P (1998) Unsupervised recognition of individual tigers and snow leopards from their footprints. Animal Conservation, 1, 253-262.
[24]	Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: Lecture Notes in Computer Science (eds Goos G, Hartmanis J), pp. 234-241. Springer International Publishing, Cham.
[25]	Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma SA, Huang ZH, Karpathy A, Khosla A, Bernstein M, Berg AC, Li FF (2015) ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115, 211-252.
[26]	Scanes CG (2018) Human activity and habitat loss: Destruction, fragmentation, and degradation. In: Animals and Human Society (eds Scanes CG, Toukhsati SR), pp. 451-482. Elsevier, Amsterdam.
[27]	Schindler F, Steinhage V (2021) Identification of animals and recognition of their actions in wildlife videos using deep learning techniques. Ecological Informatics, 61, 101215.
[28]	Shi CM, Liu D, Cui YL, Xie JJ, Roberts NJ, Jiang GS (2020) Amur tiger stripes: Individual identification based on deep convolutional neural network. Integrative Zoology, 15, 461-470.
[29]	Su QG, Tang JL, Zhai MX, He DJ (2022) An intelligent method for dairy goat tracking based on Siamese network. Computers and Electronics in Agriculture, 193, 106636.
[30]	Vo XT, Hoang VD, Nguyen DL, Jo K (2022) Pedestrian head detection and tracking via global vision transformer. In: Frontiers of Computer Vision, IW-FCV 2022, pp. 155-167. Springer, Cham.
[31]	Wan P, Zhao JW, Zhu M, Tan HQ, Deng ZY, Huang YY, Wu WJ, Ding AZ (2021) Freshwater fish species identification method based on improved ResNet 50 model. Transactions of the Chinese Society of Agricultural Engineering, 37(12), 159-168. (in Chinese with English abstract)
	[ 万鹏, 赵竣威, 朱明, 谭鹤群, 邓志勇, 黄毓毅, 吴文锦, 丁安子 (2021) 基于改进ResNet50模型的大宗淡水鱼种类识别方法. 农业工程学报, 37(12), 159-168.]
[32]	Wang C, Chen HQ, Zhang XB, Meng CY (2016) Evaluation of a laying-hen tracking algorithm based on a hybrid support vector machine. Journal of Animal Science and Biotechnology, 7, 1-10.
[33]	Wang D, Hu YB, Ma TX, Nie YG, Xie Y, Wei FW (2016) Noninvasive genetics provides insights into the population size and genetic diversity of an Amur tiger population in China. Integrative Zoology, 11, 16-24.
[34]	Xie JJ, Li AQ, Zhang JG, Cheng ZA (2019) An integrated wildlife recognition model based on multi-branch aggregation and squeeze-and-excitation network. Applied Sciences, 9, 2794.
[35]	Xie YH, Jiang JZ, Bao H, Zhai PH, Zhao Y, Zhou XY, Jiang GS (2023) Recognition of big mammal species in airborne thermal imaging based on YOLO V5 algorithm. Integrative Zoology, 18, 333-352.
[36]	Zhang J, Yang SQ, Hu SR, Ning JF, Lan XY, Wang YS (2023) A dairy goat tracking method via lightweight fusion and Kullback Leibler divergence. Computers and Electronics in Agriculture, 213, 108189.
[37]	Zhao TT, Zhou ZF, Li DX, Liu S, Li M (2018) Individual identification of leopard based on improved Cifar-10 deep learning model. Journal of Taiyuan University of Technology, 49, 585-591, 598. (in Chinese with English abstract)
	[ 赵婷婷, 周哲峰, 李东喜, 刘松, 李明 (2018) 基于改进的Cifar-10深度学习模型的金钱豹个体识别研究. 太原理工大学学报, 49, 585-591, 598.]
[38]	Zheng ZY, Zhang XQ, Qin LF, Yue S, Zeng PB (2023) Cows’ legs tracking and lameness detection in dairy cattle using video analysis and Siamese neural networks. Computers and Electronics in Agriculture, 205, 107618.
[39]	Zhou ZY, Hou JP, Liu P, Chen P, Duan C (2023) Giant panda head image segmentation based on dual model fusion. Acta Theriologica Sinica, 43, 82-88. (in Chinese with English abstract)
	[ 周章玉, 侯佳萍, 刘鹏, 陈鹏, 段昶 (2023) 基于双模型融合的大熊猫头部图像分割. 兽类学报, 43, 82-88.]
[40]	Zhu YX, Wang DW, Li ZL, Feng JW, Wang TM (2022) Restoring tiger population in Asia: Challenges, opportunities, and future prospects. Biodiversity Science, 30, 22421. (in Chinese with English abstract)
	[ 朱逸晓, 王大伟, 李治霖, 冯佳伟, 王天明 (2022) 亚洲虎种群恢复的机遇与挑战. 生物多样性, 30, 22421.]