生物多样性 ›› 2024, Vol. 32 ›› Issue (5): 24056.  DOI: 10.17520/biods.2024056  cstr: 32101.14.biods.2024056

• 技术与方法 •

基于TC-YOLO模型的北京珍稀鸟类识别方法

李柏灿1,2,3, 张军国1,2,3,*, 张长春1,2,3, 王丽凤1,2,3, 徐基良4, 刘利5,*

    1.北京林业大学工学院, 北京 100083
    2.林木资源高效生产全国重点实验室, 北京 100083
    3.林业装备与自动化国家林业和草原局重点实验室, 北京 100083
    4.北京林业大学生态与自然保护学院, 北京 100083
    5.包头师范学院生物科学与技术学院, 内蒙古包头 014030
  • 收稿日期:2024-02-16 接受日期:2024-04-12 出版日期:2024-05-20 发布日期:2024-04-19
  • 通讯作者: E-mail: zhangjunguo@bjfu.edu.cn; E-mail: liuli4304842@126.com
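  • Funding:
    National Natural Science Foundation of China (32371874); Outstanding Youth Team Project of Central Universities (QNTD202304); Beijing Natural Science Foundation (6244053); Forestry Science and Technology Achievement Promotion Program of the National Forestry and Grassland Administration ([2019]04)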

Rare bird recognition method in Beijing based on TC-YOLO model

Baican Li1,2,3, Junguo Zhang1,2,3,*, Changchun Zhang1,2,3, Lifeng Wang1,2,3, Jiliang Xu4, Li Liu5,*

    1 School of Technology, Beijing Forestry University, Beijing 100083
    2 State Key Laboratory of Efficient Production of Forest Resources, Beijing 100083
    3 Key Laboratory of State Forestry and Grassland Administration on Forestry Equipment and Automation, Beijing 100083
    4 School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083
    5 School of Biological Science and Technology, Baotou Teachers’ College, Baotou, Nei Mongol 014030
  • Received: 2024-02-16 Accepted: 2024-04-12 Published: 2024-05-20 Online: 2024-04-19
  • Contact: E-mail: zhangjunguo@bjfu.edu.cn; E-mail: liuli4304842@126.com

摘要:

北京地区珍稀鸟类的保护对维护当地生物多样性具有重要意义。随着人工智能技术的发展, 利用深度学习技术自动识别鸟类成为鸟类调查保护的重要手段。实际鸟类图像存在背景复杂以及相近科属鸟类具有外观相似等特点, 导致模型识别精度不佳。针对以上问题, 本文提出一种基于TC-YOLO模型的鸟类识别方法。首先, 为解决鸟类识别中复杂背景导致的漏检问题, 本文方法结合CARAFE (content-aware reassembly of features)机制, 自适应生成不同特征点所对应的上采样核, 在更大的感受野内聚合上下文语义信息, 有效聚焦鸟类前景区域。其次, 为解决鸟类识别中相似外观导致的误检问题, 本文方法引入TSCODE (task-specific context decoupling)解耦定位和分类任务, 通过获取多层级特征图的信息以回归目标边界, 并利用包含底层纹理和高层语义的特征进行物种分类, 进而提高模型的鸟类识别精度。最后, 本文开展对比实验以验证模型的性能。实验结果表明, TC-YOLO模型的平均精度均值在包含北京地区28种国家一级保护鸟类的自建数据集Beijing-28和鸟类公开数据集CUB200-2011上分别达到78.7%和75.3%, 均优于已有方法, 而且在公开数据集MS COCO上验证了TC-YOLO模型拥有较强的泛化性。本文提出的TC-YOLO模型对背景复杂或外观相似的鸟类图像都能有效识别, 漏检率和误检率较低, 能够为鸟类保护提供重要技术支撑。

关键词: 珍稀鸟类, 图像识别, YOLOv5s, 上采样, 解耦头

Abstract

Aim: Bird recognition supports bird conservation, but traditional recognition relies on manual work, which is costly and demands specialized expertise. With the development of artificial intelligence, automatic bird identification using deep learning has become an important tool for bird surveys and conservation. However, real-world bird images have complex backgrounds, and birds from closely related families and genera look alike, so model recognition accuracy suffers.

Methods: To address these problems, this paper proposes a bird recognition method based on the TC-YOLO model. First, to reduce missed detections caused by complex backgrounds, the method incorporates the CARAFE (content-aware reassembly of features) mechanism, which adaptively generates an upsampling kernel for each feature point and aggregates contextual semantic information within a larger receptive field, so that upsampling focuses on the bird regions of the feature map. Second, to reduce false detections caused by similar appearances, the method introduces TSCODE (task-specific context decoupling) to decouple the localization and classification tasks: object boundaries are regressed from multi-level feature maps, and species are classified using features that combine low-level texture with high-level semantics, which improves the model’s recognition accuracy.
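The reassembly step of CARAFE described above can be illustrated with a minimal sketch. It is not the authors' implementation: it handles a single channel in plain Python, and the per-location kernel scores (`kernel_logits`, a hypothetical name) are passed in as inputs, whereas in CARAFE they are produced by a learned kernel-prediction branch. The function name `carafe_reassemble` and the defaults `scale=2`, `k_up=3` are likewise assumptions made for illustration.

```python
import math


def softmax(logits):
    """Normalize a flat list of scores so the resulting weights sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def carafe_reassemble(feature, kernel_logits, scale=2, k_up=3):
    """Upsample a 2-D feature map by content-aware reassembly.

    feature:       H x W nested list of floats (one channel, for brevity).
    kernel_logits: (H*scale) x (W*scale) x (k_up*k_up) raw scores, one
                   kernel per OUTPUT location. Assumed to be given here;
                   CARAFE predicts them from the input features.
    """
    h, w = len(feature), len(feature[0])
    r = k_up // 2
    out = [[0.0] * (w * scale) for _ in range(h * scale)]
    for oi in range(h * scale):
        for oj in range(w * scale):
            # Each output location maps back to one source location.
            ci, cj = oi // scale, oj // scale
            weights = softmax(kernel_logits[oi][oj])
            acc = 0.0
            # Weighted sum over the k_up x k_up neighbourhood of the source.
            for n in range(-r, r + 1):
                for m in range(-r, r + 1):
                    si, sj = ci + n, cj + m
                    if 0 <= si < h and 0 <= sj < w:  # zero padding outside
                        idx = (n + r) * k_up + (m + r)
                        acc += weights[idx] * feature[si][sj]
            out[oi][oj] = acc
    return out


if __name__ == "__main__":
    feature = [[1.0, 2.0], [3.0, 4.0]]
    # Uniform scores: every output is an average over its neighbourhood.
    uniform = [[[0.0] * 9 for _ in range(4)] for _ in range(4)]
    for row in carafe_reassemble(feature, uniform):
        print(row)
```

Because every output location has its own normalized kernel, the weights can differ between a bird region and its background. When all of a kernel's weight sits on its centre, the operation reduces to nearest-neighbour upsampling.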

Results: Comparative experiments were carried out to verify the model’s performance. On the self-built Beijing-28 dataset, which contains 28 species of national first-class protected birds found in Beijing, and on the public bird dataset CUB200-2011, the TC-YOLO model reached a mean average precision of 78.7% and 75.3%, respectively, outperforming the compared methods on both. To test generalization beyond bird datasets, experiments were also carried out on the public dataset MS COCO, where the TC-YOLO model again outperformed the comparison model, indicating strong generalization.
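For readers unfamiliar with the metric, the following sketch shows one common way to compute mean average precision. It rests on assumptions the abstract does not settle: detections are taken to be already matched to ground-truth boxes at some IoU threshold, and average precision is computed as the area under the monotone precision-recall envelope (all-point interpolation). The function names and the input layout are hypothetical, and the abstract does not state which mAP variant the reported figures use.

```python
def average_precision(scored_hits, num_gt):
    """Average precision for one class.

    scored_hits: list of (confidence, is_true_positive) pairs, one per
                 detection, already matched against ground truth.
    num_gt:      number of ground-truth boxes of this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_hit in ranked:
        if is_hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas under the precision-recall envelope.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (scored_hits, num_gt)."""
    if not per_class:
        return 0.0
    aps = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(aps) / len(aps)


if __name__ == "__main__":
    # Toy input with made-up class names; not data from the paper.
    toy = {
        "species_a": ([(0.9, True), (0.8, False), (0.7, True)], 2),
        "species_b": ([(0.95, True), (0.6, True)], 2),
    }
    print(mean_average_precision(toy))
```

A missed detection lowers the recall a class can reach, and a false detection lowers its precision, so both error types named in the abstract reduce this score.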

Conclusion: The TC-YOLO model proposed in this paper effectively recognizes birds in images with complex backgrounds or similar-looking species, with low missed-detection and false-detection rates and strong generalization. It can provide important technical support for bird conservation and for biodiversity conservation in Beijing.

Key words: rare birds, image recognition, YOLOv5s, upsampling, decoupled head