Biodiv Sci ›› 2024, Vol. 32 ›› Issue (5): 24056.  DOI: 10.17520/biods.2024056

• Technology and Methodology •

Rare bird recognition method in Beijing based on TC-YOLO model

Baican Li1,2,3, Junguo Zhang1,2,3,*, Changchun Zhang1,2,3, Lifeng Wang1,2,3, Jiliang Xu4, Li Liu5,*

    1 School of Technology, Beijing Forestry University, Beijing 100083
    2 State Key Laboratory of Efficient Production of Forest Resources, Beijing 100083
    3 Key Laboratory of State Forestry and Grassland Administration on Forestry Equipment and Automation, Beijing 100083
    4 School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083
    5 School of Biological Science and Technology, Baotou Teachers’ College, Baotou, Nei Mongol 014030
  • Received: 2024-02-16 Accepted: 2024-04-12 Online: 2024-05-20 Published: 2024-04-19
  • Contact: E-mail: zhangjunguo@bjfu.edu.cn; E-mail: liuli4304842@126.com

Abstract:

Aim: Bird recognition is an important tool for bird conservation, but traditional recognition relies mainly on manual work, which is costly, demands a high level of professional expertise, and is subject to other limitations. With the development of artificial intelligence, automatic bird recognition based on deep learning has become an important means of bird survey and protection. However, real-world bird images often have complex backgrounds, and birds of closely related families look alike, which lowers recognition accuracy.

Methods: To address these problems, this paper proposes a bird recognition method based on the TC-YOLO model. First, to reduce missed detections caused by complex backgrounds, the method incorporates the CARAFE (content-aware reassembly of features) mechanism, which adaptively generates an upsampling kernel for each feature point and aggregates contextual semantic information within a larger receptive field. This lets the upsampling step focus on the distribution of bird regions in the global feature map and better preserve bird features, so that the model can recognize bird targets accurately. Second, to reduce false detections caused by similar appearances, the method introduces TSCODE (task-specific context decoupling) to decouple the localization and classification tasks: multi-level feature maps are used to regress the target boundary, while features that combine low-level texture with higher-level semantics are used to classify species, which in turn improves the model’s recognition accuracy.
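
As a rough illustration of the content-aware reassembly idea behind CARAFE, the following NumPy sketch upsamples a feature map by weighting a k_up × k_up neighbourhood of each source location with a softmax-normalized kernel that is specific to each output location. It is not the authors' implementation: the function name `carafe_reassemble`, the parameters `scale` and `k_up`, and the choice to pass the kernel logits in as an argument (rather than predicting them from the features with a learned convolutional module, as the original CARAFE operator does) are assumptions made for illustration.

```python
import numpy as np


def softmax(z, axis=0):
    """Numerically stable softmax along one axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def carafe_reassemble(x, kernel_logits, scale=2, k_up=5):
    """Content-aware reassembly (illustrative sketch, not the paper's code).

    x             : feature map, shape (C, H, W)
    kernel_logits : unnormalized reassembly kernels, one per output pixel,
                    shape (k_up * k_up, H * scale, W * scale).
                    Assumed to come from some kernel-prediction module.
    Returns the upsampled map, shape (C, H * scale, W * scale).
    """
    c, h, w = x.shape
    assert kernel_logits.shape == (k_up * k_up, h * scale, w * scale)
    r = k_up // 2

    # Each output pixel gets its own kernel whose weights sum to 1.
    kernels = softmax(kernel_logits, axis=0)

    # Zero-pad spatially so every source pixel has a full neighbourhood.
    xp = np.pad(x, ((0, 0), (r, r), (r, r)))

    out = np.zeros((c, h * scale, w * scale), dtype=float)
    for oi in range(h * scale):
        for oj in range(w * scale):
            i, j = oi // scale, oj // scale           # source location
            patch = xp[:, i:i + k_up, j:j + k_up]     # centred on (i, j)
            wk = kernels[:, oi, oj].reshape(k_up, k_up)
            # The same kernel is shared by all channels.
            out[:, oi, oj] = (patch * wk).sum(axis=(1, 2))
    return out


# Hypothetical usage with random features and random kernel logits.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))
logits = rng.normal(size=(25, 8, 8))
upsampled = carafe_reassemble(features, logits, scale=2, k_up=5)
```

When the logits place all weight on the kernel centre, this sketch reduces to nearest-neighbour upsampling; content-dependent logits are what allow the operator to draw on a wider context around each point.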

Results: Comparative experiments were carried out to verify the performance of the model. The mean average precision of the TC-YOLO model reached 78.7% on the self-built Beijing-28 dataset, which contains 28 species of national first-class protected birds in Beijing, and 75.3% on the public bird dataset CUB200-2011. Both results were better than those of the comparison methods, indicating that the TC-YOLO model performs well in bird recognition. In addition, to test how the model generalizes to other kinds of data, experiments were carried out on the public dataset MS COCO, where the TC-YOLO model also outperformed the comparison model, indicating strong generalization.
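
For readers unfamiliar with the metric, the sketch below shows one common way to compute average precision for a single class and then average it over classes. The abstract does not state the IoU threshold or interpolation scheme used, so the all-point interpolation here is an assumption, as are the function names `average_precision` and `mean_average_precision`. Detections are assumed to be already matched against ground truth, so that each one is flagged as a true or false positive.

```python
import numpy as np


def average_precision(scores, is_tp, num_gt):
    """Area under the precision-recall curve for one class.

    scores : confidence of each detection
    is_tp  : 1 if the detection matched a ground-truth box, else 0
    num_gt : number of ground-truth boxes of this class
    Uses all-point interpolation (an assumption, not taken from the paper).
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_tp, dtype=float)[order]
    tp_cum = np.cumsum(tp)
    fp_cum = np.cumsum(1.0 - tp)

    recall = tp_cum / max(num_gt, 1)
    precision = tp_cum / (tp_cum + fp_cum)

    # Add sentinels, then make precision monotonically non-increasing.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    mpre = np.maximum.accumulate(mpre[::-1])[::-1]

    # Sum rectangle areas wherever recall changes.
    idx = np.nonzero(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


def mean_average_precision(per_class):
    """Mean of per-class AP; per_class maps name -> (scores, is_tp, num_gt)."""
    aps = [average_precision(*v) for v in per_class.values()]
    return float(np.mean(aps))


# Hypothetical toy data for two classes (not results from the paper).
toy = {
    "class_a": ([0.9, 0.8, 0.7], [1, 0, 1], 2),
    "class_b": ([0.9, 0.8], [1, 1], 2),
}
toy_map = mean_average_precision(toy)
```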

Conclusion: The TC-YOLO model proposed in this paper can effectively recognize birds in images with complex backgrounds or similar-looking species, with low missed-detection and false-detection rates and strong generalization. It can provide important technical support for bird conservation and has practical application value for biodiversity conservation in Beijing.

Key words: rare birds, image recognition, YOLOv5s, upsampling, decoupling head