Rare bird recognition method in Beijing based on the TC-YOLO model

Biodiversity Science, 2024, Vol. 32, Issue 5: 24056. DOI: 10.17520/biods.2024056; CSTR: 32101.14.biods.2024056

Baican Li¹,²,³; Junguo Zhang¹,²,³,*; Changchun Zhang¹,²,³; Lifeng Wang¹,²,³; Jiliang Xu⁴; Li Liu⁵,*

1 School of Technology, Beijing Forestry University, Beijing 100083
2 State Key Laboratory of Efficient Production of Forest Resources, Beijing 100083
3 Key Laboratory of State Forestry and Grassland Administration on Forestry Equipment and Automation, Beijing 100083
4 School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083
5 School of Biological Science and Technology, Baotou Teachers’ College, Baotou, Nei Mongol 014030

Received: 2024-02-16; Accepted: 2024-04-12; Released online: 2024-04-19; Issue published: 2024-05-20
* Corresponding authors. E-mail: zhangjunguo@bjfu.edu.cn; liuli4304842@126.com
Funding: National Natural Science Foundation of China (32371874); Outstanding Youth Team Project of the Central Universities (QNTD202304); Beijing Natural Science Foundation (6244053); Forestry Science and Technology Achievement Promotion Program of the National Forestry and Grassland Administration ([2019]04)

Supplementary material: Appendix (附录.pdf, 509 KB)

Abstract

Aim: Protecting rare birds in the Beijing area is important for maintaining local biodiversity. Bird recognition underpins bird surveys and conservation, but traditional recognition relies on manual work, which is costly and demands specialist expertise. With the development of artificial intelligence, automatic bird recognition based on deep learning has become an important tool for bird survey and protection. In practice, however, bird images have complex backgrounds, and birds of closely related families and genera look alike, so model recognition accuracy is poor.

Methods: To address these problems, this paper proposes a bird recognition method based on the TC-YOLO model. First, to reduce missed detections caused by complex backgrounds, the method incorporates the CARAFE (content-aware reassembly of features) mechanism, which adaptively generates an upsampling kernel for each feature point and aggregates contextual semantic information within a larger receptive field, so that upsampling focuses on the bird (foreground) regions of the feature map. Second, to reduce false detections caused by similar appearance, the method introduces TSCODE (task-specific context decoupling) to decouple the localization and classification tasks: target boundaries are regressed from information across multi-level feature maps, and species are classified from features that combine low-level texture with high-level semantics, which improves the model's recognition accuracy.

Results: Comparative experiments were carried out to verify the performance of the model. The mean average precision of TC-YOLO (mAP@0.5:0.95, as listed in Tables 2 and 3) reached 78.7% on Beijing-28, a self-built dataset of 28 species of national first-class protected birds in the Beijing area, and 75.3% on the public bird dataset CUB200-2011, in both cases higher than the compared methods. To test generalization to other kinds of data, experiments were also run on the public dataset MS COCO, where TC-YOLO outperformed the compared models, indicating strong generalization.

Conclusion: The TC-YOLO model proposed in this paper can effectively recognize bird images with complex backgrounds or similar appearances, with low missed-detection and false-detection rates and strong generalization. It can provide important technical support for bird conservation and has practical value for biodiversity conservation in Beijing.

Key words: rare birds, image recognition, YOLOv5s, upsampling, decoupled head

Baican Li, Junguo Zhang, Changchun Zhang, Lifeng Wang, Jiliang Xu, Li Liu (2024) Rare bird recognition method in Beijing based on TC-YOLO model. Biodiversity Science, 32, 24056. DOI: 10.17520/biods.2024056.


Link to this article: https://www.biodiversity-science.net/CN/10.17520/biods.2024056

https://www.biodiversity-science.net/CN/Y2024/V32/I5/24056

Figures and tables (15)

Fig. 1 Sample images from the dataset of national first-class protected birds in the Beijing area

Fig. 2 Examples of images generated with the Cycle-GAN method

Fig. 3 TC-YOLO model structure. Conv and Conv2d, convolution; C3, feature extraction; SPPF, spatial pyramid pooling structure; Concat and c, feature fusion; CARAFE, CARAFE upsampling; C2, feature map from the second feature-extraction layer; $P_l$, feature map of the l-th pyramid level; Head, TSCODE decoupled head; NMS, non-maximum suppression.
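Fig. 3 lists non-maximum suppression (NMS) as the final post-processing step of the detector. The following is a minimal sketch of greedy NMS, assuming boxes in (x1, y1, x2, y2) format; the function names and the IoU threshold of 0.5 are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Intersection is zero when the boxes do not overlap.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(np.asarray(scores))[::-1]  # indices, best score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_threshold]  # survivors go to the next round
    return keep
```

In practice NMS is usually applied per class, so that overlapping boxes of different species do not suppress each other.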

Fig. 4 CARAFE model structure. $\mathcal{X}$, input feature map; $C$, number of channels of the input feature map; $H$, height of the input feature map; $W$, width of the input feature map; $C_m$, number of channels after channel compression; $\sigma$, upsampling ratio; $\sigma^2 \times k_{up}^2$, number of output channels after content encoding; $k_{up}^2$, predicted upsampling kernel size; $l$, source location; $l'$, target location; $N(\mathcal{X}_l, k_{up})$, square region centered on the source location; $W_{l'}$, reassembly (upsampling) kernel; $\mathcal{X}'$, output feature map; $\sigma H$, height of the output feature map; $\sigma W$, width of the output feature map.
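The reassembly step named in the Fig. 4 caption can be sketched in NumPy as below. This is a sketch under stated assumptions, not the authors' implementation: the kernel logits (the output of the channel compressor and content encoder, with $\sigma^2 \times k_{up}^2$ channels) are assumed to be given, the channel layout of the pixel shuffle follows the common PyTorch convention, borders are zero-padded, and the function name `carafe_reassemble` is hypothetical. Each target location takes a softmax-weighted sum over the $k_{up} \times k_{up}$ neighborhood of its source location.

```python
import numpy as np


def carafe_reassemble(x, kernel_logits, sigma=2, k_up=5):
    """Content-aware reassembly of a feature map.

    x:             (C, H, W) input feature map
    kernel_logits: (sigma**2 * k_up**2, H, W) predicted kernel logits (assumed given)
    returns:       (C, sigma*H, sigma*W) upsampled feature map
    """
    c, h, w = x.shape
    assert kernel_logits.shape == (sigma**2 * k_up**2, h, w)
    r = k_up // 2

    # Pixel shuffle: (k_up^2 * sigma^2, H, W) -> (k_up^2, sigma*H, sigma*W).
    k = kernel_logits.reshape(k_up**2, sigma, sigma, h, w)
    k = k.transpose(0, 3, 1, 4, 2).reshape(k_up**2, sigma * h, sigma * w)

    # Softmax over the k_up^2 kernel entries of every target location.
    k = np.exp(k - k.max(axis=0, keepdims=True))
    k = k / k.sum(axis=0, keepdims=True)

    # Zero-pad so every source location has a full neighborhood (assumption).
    xp = np.pad(x, ((0, 0), (r, r), (r, r)))
    rows = np.arange(sigma * h) // sigma  # source row of each target row
    cols = np.arange(sigma * w) // sigma  # source column of each target column

    out = np.zeros((c, sigma * h, sigma * w), dtype=float)
    for n in range(k_up):
        for m in range(k_up):
            # Neighbor at offset (n - r, m - r) from the source location.
            neigh = xp[:, rows[:, None] + n, cols[None, :] + m]
            out += k[n * k_up + m] * neigh
    return out
```

If the predicted kernel puts all of its weight on the center entry, the operation reduces to nearest-neighbor upsampling, which is a convenient sanity check; content-dependent kernels are what let the upsampling adapt to foreground regions.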

Fig. 5 Detail-preserving encoding structure. $P_l$, feature map of the l-th pyramid level; $G_l^{loc}$, feature map for localization; $H$, height of the feature map; $W$, width of the feature map; $f_{loc}(\cdot)$, feature projection function for localization; $\mathcal{R}(\cdot)$, final layer for localization.

Fig. 6 Semantic context encoding structure. $P_l$, feature map of the l-th pyramid level; $G_l^{cls}$, feature map for classification; $H$, height of the feature map; $W$, width of the feature map; $C$, number of channels of the feature map; $f_{cls}(\cdot)$, feature projection function for classification; $\mathcal{C}(\cdot)$, final layer for classification.
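The two encodings in Figs. 5 and 6 can be illustrated with a shape-level NumPy sketch. It follows one reading of the TSCODE description (Zhuang et al, 2023) and makes several assumptions that are not confirmed by this page: the localization feature is an element-wise sum of the current level, the upsampled coarser level, and the downsampled finer level; the classification feature concatenates the downsampled current level with the coarser level; nearest-neighbor upsampling and 2 × 2 average pooling stand in for the learned upsampling and stride-2 convolution. All function names are hypothetical.

```python
import numpy as np


def upsample2x(p):
    """Nearest-neighbor 2x upsampling of a (C, H, W) map (stand-in for a learned upsampler)."""
    return p.repeat(2, axis=1).repeat(2, axis=2)


def downsample2x(p):
    """2x2 average pooling of a (C, H, W) map (stand-in for a stride-2 convolution)."""
    c, h, w = p.shape
    return p.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def detail_preserving_encoding(p_finer, p_l, p_coarser):
    """Localization feature: fuse three neighboring levels at the resolution of p_l.

    Assumed fusion: element-wise sum, so the channel count is unchanged.
    """
    return p_l + upsample2x(p_coarser) + downsample2x(p_finer)


def semantic_context_encoding(p_l, p_coarser):
    """Classification feature: fuse two levels at the resolution of p_coarser.

    Assumed fusion: channel concatenation, so the channel count doubles.
    """
    return np.concatenate([downsample2x(p_l), p_coarser], axis=0)
```

The localization branch keeps fine spatial detail from the finer level, while the classification branch works at lower resolution with more semantic context, which is the intuition behind decoupling the two tasks.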

Table 1 Comparison of experimental results on the Beijing-28 dataset among different models①

Algorithm model	mAP@0.5 (%)	mAP@0.5:0.95 (%)	Parameters (×10⁶)	Frame rate (frames/s)	Reference
TC-YOLO	90.6	78.7	28.89	56	This study
Faster R-CNN	67.9	45.2	28.56	16	Ren et al, 2016
SSD	85.3	68.3	26.29	47	Liu et al, 2016
YOLOv3-tiny	80.9	54.8	8.73	86	Adarsh et al, 2020
YOLOv4-tiny	75.1	48.6	6.06	70	Wang et al, 2021
YOLOv5s	90.2	76.3	7.09	85	Jocher, 2022
YOLOv6n	76.6	67.6	4.64	67	Li et al, 2022
YOLOv7-tiny	81.4	66.1	6.09	64	Wang CY et al, 2023
YOLOv8n	84.9	74.5	3.01	67	Jocher, 2023

① Hong Y (2022) Research on small-object detection algorithms for forest wildfire early warning. Master's thesis, University of Electronic Science and Technology of China, Chengdu. (in Chinese)
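The two accuracy columns differ only in the IoU threshold used to decide whether a detection counts as correct. A minimal single-class sketch follows, assuming the usual COCO-style convention (ten IoU thresholds from 0.5 to 0.95 in steps of 0.05, all-point interpolation of the precision-recall curve). The function names and the input format (each detection already matched to at most one ground-truth box) are illustrative assumptions; the reported mAP additionally averages over classes.

```python
import numpy as np

# IoU thresholds 0.50, 0.55, ..., 0.95 (COCO-style convention, assumed here).
IOU_THRESHOLDS = np.linspace(0.5, 0.95, 10)


def average_precision(scores, matched_iou, n_gt, iou_thr):
    """AP for one class at one IoU threshold.

    scores:      confidence of each detection
    matched_iou: IoU of each detection with its assigned ground-truth box
                 (0 if unmatched; each ground truth is assigned at most once)
    n_gt:        number of ground-truth boxes of this class
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = (np.asarray(matched_iou, dtype=float)[order] >= iou_thr).astype(float)
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / n_gt
    precision = cum_tp / (cum_tp + cum_fp)
    # Precision envelope: make precision non-increasing in recall.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    # Area under the precision-recall curve.
    recall = np.concatenate([[0.0], recall])
    return float(np.sum(np.diff(recall) * precision))


def ap_50_95(scores, matched_iou, n_gt):
    """Mean of AP over the ten IoU thresholds, for one class."""
    aps = [average_precision(scores, matched_iou, n_gt, t) for t in IOU_THRESHOLDS]
    return float(np.mean(aps))
```

Because the stricter thresholds reward tight boxes, mAP@0.5:0.95 separates the models in the table more clearly than mAP@0.5 does.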

Fig. 7 Comparison of experimental results on the CUB200-2011 dataset among different models

Table 2 Comparison of experimental results on the Beijing-28 dataset among different improved YOLOv5s methods

Method	Model composition	mAP@0.5 (%)	mAP@0.5:0.95 (%)	Reference
TC-YOLO	YOLOv5s + CARAFE + TSCODE	90.6	78.7	This study
Method 1	YOLOv5s	90.2	76.3	Jocher, 2022
Method 2	YOLOv5s + CBAM	89.9	76.2	Xue et al, 2022
Method 3	YOLOv5s + SA	89.8	75.2	Hao et al, 2023
Method 4	YOLOv5s + CA	89.3	74.4	Zhang et al, 2023
Method 5	YOLOv5s + WIoUv3	89.9	76.1	Zhao et al, 2023

Fig. 8 Comparison of experimental results on the CUB200-2011 dataset among different improved YOLOv5s methods

Fig. 9 Comparison of recognition results before and after improving YOLOv5s (missed detections)

Fig. 10 Comparison of recognition results before and after improving YOLOv5s (false detections)

Table 3 Ablation experiment results of the TC-YOLO model

Dataset	Algorithm model	Precision (%)	Recall (%)	mAP@0.5 (%)	mAP@0.5:0.95 (%)
Beijing-28	YOLOv5s	86.8	87.4	90.2	76.3
	C-YOLO	87.5	88.3	90.3	77.3
	T-YOLO	87.3	88.1	90.2	77.4
	TC-YOLO	89.4	89.5	90.6	78.7
CUB200-2011	YOLOv5s	81.7	82.2	85.4	72.8
	C-YOLO	81.9	82.5	85.6	73.5
	T-YOLO	84.7	82.0	85.4	74.7
	TC-YOLO	85.1	82.5	85.5	75.3
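The ablation is easiest to read as gains over the YOLOv5s baseline. The short sketch below computes those gains from the mAP@0.5:0.95 column of Table 3; the values are copied from the table, while the helper name `gains_over_baseline` is hypothetical. C-YOLO and T-YOLO presumably denote YOLOv5s with only CARAFE and only TSCODE, respectively, which is an inference from the naming and is not stated on this page.

```python
# mAP@0.5:0.95 (%) copied from Table 3.
MAP_50_95 = {
    "Beijing-28": {"YOLOv5s": 76.3, "C-YOLO": 77.3, "T-YOLO": 77.4, "TC-YOLO": 78.7},
    "CUB200-2011": {"YOLOv5s": 72.8, "C-YOLO": 73.5, "T-YOLO": 74.7, "TC-YOLO": 75.3},
}


def gains_over_baseline(table, baseline="YOLOv5s"):
    """Percentage-point gain of each variant over the baseline, per dataset."""
    return {
        dataset: {
            model: round(value - row[baseline], 1)
            for model, value in row.items()
            if model != baseline
        }
        for dataset, row in table.items()
    }


if __name__ == "__main__":
    for dataset, gains in gains_over_baseline(MAP_50_95).items():
        print(dataset, gains)
```

On both datasets the combined model gains more than either single-module variant, and on Beijing-28 the combined gain exceeds the sum of the two single-module gains.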

Fig. 11 Comparison of ablation experiment results (low-quality images)

Table 4 Comparison of experimental results on the MS COCO dataset among different models (* indicates results as reported in the respective papers)

Algorithm model	mAP@0.5 (%)	mAP@0.5:0.95 (%)	Reference
TC-YOLO	61.0	42.1	This study
Faster R-CNN*	59.2	39.8	Ren et al, 2016
SSD*	43.1	25.1	Liu et al, 2016
YOLOv4-tiny*	42.1	24.9	Wang et al, 2021
YOLOv5s*	56.8	37.4	Jocher, 2022
YOLOv6n*	52.7	37.0	Li et al, 2022
YOLOv7-tiny*	52.8	35.2	Wang CY et al, 2023
YOLOv8n*	52.6	37.3	Jocher, 2023

References (47)

[1]	Adarsh P, Rathi P, Kumar M (2020) YOLO v3-Tiny: Object detection and recognition using one stage improved model. In: 2020 International Conference on Advanced Computing and Communication Systems (ICACCS), pp. 687-694. IEEE, Coimbatore.
[2]	Alqaysi H, Fedorov I, Qureshi FZ, O’Nils M (2021) A temporal boosted YOLO-based model for birds detection around wind farms. Journal of Imaging, 7, 227.
[3]	Cai JM, He PY, Yang ZP, Li LY, Zhao QJ, Pan F (2023) A deep feature fusion-based method for bird sound recognition and its interpretability analysis. Biodiversity Science, 31, 23087. (in Chinese with English abstract)
[4]	Cheng G, Yuan X, Yao XW, Yan KB, Zeng QH, Xie XX, Han JW (2023) Towards large-scale small object detection: Survey and benchmarks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, 13467-13488.
[5]	Feng CJ, Zhong YJ, Gao Y, Scott MR, Huang WL (2021) TOOD: Task-aligned one-stage object detection. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 3490-3499. IEEE, Montreal.
[6]	Ge Z, Liu ST, Wang F, Li ZM, Sun J (2021) YOLOX: Exceeding YOLO series in 2021. arXiv, doi: 10.48550/arXiv.2107.08430.
[7]	Gou JP, Xiong XS, Yu BS, Du L, Zhan YB, Tao DC (2023) Multi-target knowledge distillation via student self-reflection. International Journal of Computer Vision, 131, 1857-1874.
[8]	Hao WL, Zhang L, Han M, Zhang K, Li FZ, Yang GQ, Liu ZY (2023) YOLOv5-SA-FC: A novel pig detection and counting method based on shuffle attention and focal complete intersection over union. Animals, 13, 3201.
[9]	Hong YY, Lu XL, Zhao HP (2021) Bird diversity and interannual dynamics in different habitats of agricultural landscape in Huanghuai Plain. Acta Ecologica Sinica, 41, 2045-2055. (in Chinese with English abstract)
[10]	Huang RR, Wang Y, Yang HZ (2022) Cross-layer attention network for fine-grained visual categorization. arXiv, doi: 10.48550/arXiv.2210.08784.
[11]	Jocher G (2022) YOLOv5 Release v6.0. https://github.com/ultralytics/yolov5/releases/tag/v6.0. (accessed on 2022-11-22)
[12]	Jocher G (2023) YOLOv8 Release v8.1.0. https://github.com/ultralytics/ultralytics/releases/tag/v8.1.0. (accessed on 2023-01-10)
[13]	Lei JL, Gao SH, Rasool MA, Fan R, Jia YF, Lei GC (2023) Optimized small waterbird detection method using surveillance videos based on YOLOv7. Animals, 13, 1929.
[14]	Li CY, Li LL, Jiang HL, Weng KH, Geng YF, Li L, Ke ZD, Li QY, Cheng M, Nie WQ, Li YD, Zhang B, Liang YF, Zhou LY, Xu XM, Chu XX, Wei XX, Wei XL (2022) YOLOv6: A single-stage object detection framework for industrial applications. arXiv, doi: 10.48550/arXiv.2209.02976.
[15]	Lin TY, Dollár P, Girshick R, He KM, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2117-2125. IEEE, Honolulu.
[16]	Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick C (2014) Microsoft COCO: Common objects in context. In: 2014 European Conference on Computer Vision (ECCV), pp. 740-755. Springer International Publishing, Zurich.
[17]	Liu S, Qi L, Qin HF, Shi JP, Jia JY (2018) Path aggregation network for instance segmentation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8759-8768. IEEE, Salt Lake City, UT.
[18]	Liu SL, Li YL, Qu JY, Wu RB (2022) Airport UAV and birds detection based on deformable DETR. Journal of Physics: Conference Series, 2253, 012024.
[19]	Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: Single shot MultiBox detector. In: 2016 European Conference on Computer Vision (ECCV), pp. 21-37. Springer International Publishing, Amsterdam.
[20]	Mokany K, Ware C, Harwood TD, Schmidt RK, Ferrier S (2022) Habitat-based biodiversity assessment for ecosystem accounting in the Murray-Darling Basin. Conservation Biology, 36, e13915.
[21]	Ren SQ, He KM, Girshick R, Sun J (2016) Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149.
[22]	Ronneberger O, Fischer P, Brox T (2015) U-Net: Convolutional networks for biomedical image segmentation. In: 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234-241. Springer International Publishing, Munich.
[23]	Roth K, Vinyals O, Akata Z (2022) Non-isotropy regularization for proxy-based deep metric learning. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7420-7430. IEEE, New Orleans.
[24]	Song GL, Liu Y, Wang XG (2020) Revisiting the sibling head in object detector. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11563-11572. IEEE, Seattle.
[25]	Sun HB, He XT, Peng YX (2022) SIM-Trans: Structure information modeling transformer for fine-grained visual categorization. In: 30th ACM International Conference on Multimedia (ACM MM), pp. 5853-5861. ACM, Lisboa.
[26]	Wah C, Branson S, Welinder P, Perona P, Belongie S (2011) The Caltech-UCSD Birds-200-2011 Dataset, Technical Report CNS-TR-2011-001. California Institute of Technology, California, USA.
[27]	Wang CY, Bochkovskiy A, Liao HYM (2021) Scaled-YOLOv4: Scaling cross stage partial network. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13029-13038. IEEE, Nashville.
[28]	Wang CY, Bochkovskiy A, Liao HYM (2023) YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7464-7475. IEEE, Vancouver.
[29]	Wang JQ, Chen K, Xu R, Liu ZW, Loy CC, Lin DH (2019) CARAFE: Content-aware reassembly of features. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 3007-3016. IEEE, Seoul.
[30]	Wang JX, Su YH, Yao JH, Liu M, Du YR, Wu X, Huang L, Zhao MH (2023) Apple rapid recognition and processing method based on an improved version of YOLOv5. Ecological Informatics, 77, 102196.
[31]	Wang K, Yang F, Chen ZB, Chen YX, Zhang Y (2023) A fine-grained bird classification method based on attention and decoupled knowledge distillation. Animals, 13, 264.
[32]	Wu KY, Ruan WD, Zhou DF, Chen QC, Zhang CY, Pan XY, Yu S, Liu Y, Xiao RB (2023) Syllable clustering analysis-based passive acoustic monitoring technology and its application in bird monitoring. Biodiversity Science, 31, 22370. (in Chinese with English abstract)
[33]	Wu Y, Chen YP, Yuan L, Liu ZC, Wang LJ, Li HZ, Fu Y (2020) Rethinking classification and localization for object detection. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10186-10195. IEEE, Seattle.
[34]	Xiang WB, Song ZY, Zhang GX, Wu XC (2022) Birds detection in natural scenes based on improved faster RCNN. Applied Sciences, 12, 6094.
[35]	Xiao ZS, Xiao WH, Wang TM, Li S, Lian XM, Song DZ, Deng XQ, Zhou QH (2022) Wildlife monitoring and research using camera-trapping technology across China: The current status and future issues. Biodiversity Science, 30, 22451. (in Chinese with English abstract)
[36]	Xie JJ, Zhong YJ, Zhang JG, Liu S, Ding CQ, Triantafyllopoulos A (2023a) A review of automatic recognition technology for bird vocalizations in the deep learning era. Ecological Informatics, 73, 101927.
[37]	Xie JJ, Zhong YJ, Zhang JG, Zhang CC, Schuller BW (2023b) A weakly supervised spatial group attention network for fine-grained visual recognition. Applied Intelligence, 53, 23301-23315.
[38]	Xie ZF, Li DZ, Sun HX, Zhang AM (2023) Deep learning techniques for bird chirp recognition task. Biodiversity Science, 31, 22308. (in Chinese with English abstract)
[39]	Xue ZY, Lin HF, Wang F (2022) A small target forest fire detection model based on YOLOv5 improvement. Forests, 13, 1332.
[40]	Yang CHY, Huang ZH, Wang NY (2022) QueryDet: Cascaded sparse query for accelerating high-resolution small object detection. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13668-13677. IEEE, New Orleans.
[41]	Zhang H, Shao FM, He XH, Zhang ZH, Cai YG, Bi SH (2023) Research on object detection and recognition method for UAV aerial images based on improved YOLOv5. Drones, 7, 402.
[42]	Zhao Q, Wei HL, Zhai XY (2023) Improving tire specification character recognition in the YOLOv5 network. Applied Sciences, 13, 7310.
[43]	Zhao YF, Li J, Chen XW, Tian YH (2021) Part-guided relational transformers for fine-grained visual recognition. IEEE Transactions on Image Processing, 30, 9470-9481.
[44]	Zhu JY, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2223-2232. IEEE, Venice.
[45]	Zhu SM (2024) The number of terrestrial wild animal species in the city has increased to 612. https://www.beijing.gov.cn/ywdt/yaowen/202404/t20240414_3617537.html. (accessed on 2024-04-14) (in Chinese)
[46]	Zhuang JY, Qin Z, Yu H, Chen XC (2023) Task-specific context decoupling for object detection. arXiv, doi: 10.48550/arXiv.2303.01047.
[47]	Zou C, Liang YQ (2021) Bird detection of transmission line based on YOLO V3 algorithm. Computer Applications and Software, 38(10), 164-167, 241. (in Chinese with English abstract)
