Biodiversity Science ›› 2026, Vol. 34 ›› Issue (2): 25256.  DOI: 10.17520/biods.2025256


A wildlife recognition method for skewed distributions in Ulanba

Lin Ji1,2,3, Chenxun Deng1,2,3, Lifeng Wang1,2,3, Degang Wang1,2,3, Jiantao Wang4, Yongyong Yu4, Junguo Zhang1,2,3*

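  1. School of Technology, Beijing Forestry University, Beijing 100083; 2. State Key Laboratory of Efficient Production of Forest Resources, Beijing 100083; 3. Research Center for Biodiversity Intelligent Monitoring, Beijing Forestry University, Beijing 100083; 4. Administration of Ulanba National Nature Reserve, Chifeng, Inner Mongolia 025450
  • Received: 2025-07-02 Revised: 2026-01-12 Accepted: 2026-02-28 Published: 2026-02-20
  • Corresponding author: Junguo Zhang
  • Funding:
    National Natural Science Foundation of China (32371874); Xiong'an New Area Science and Technology Innovation Special Project of the Ministry of Science and Technology (2023XAGG0065); Beijing Natural Science Foundation (6192019)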

A wildlife recognition method for skewed distributions in the Ulanba Nature Reserve

Lin Ji1,2,3, Chenxun Deng1,2,3, Lifeng Wang1,2,3, Degang Wang1,2,3, Jiantao Wang4, Yongyong Yu4, Junguo Zhang1,2,3*   

    1 School of Technology, Beijing Forestry University, Beijing 100083, China 

    2 State Key Laboratory of Efficient Production of Forest Resources, Beijing 100083, China 

    3 Research Center for Biodiversity Intelligent Monitoring, Beijing Forestry University, Beijing 100083, China 

    4 Administration of Ulanba National Nature Reserve, Chifeng, Inner Mongolia 025450, China

  • Received: 2025-07-02 Revised: 2026-01-12 Accepted: 2026-02-28 Online: 2026-02-20
  • Contact: Junguo Zhang

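Abstract: Protecting wildlife in the Ulanba National Nature Reserve of Inner Mongolia is important for maintaining local biodiversity. With the rapid development of artificial intelligence, automatic recognition of wildlife images using deep learning has become a key tool for wildlife survey and conservation. Wildlife image data collected in practice usually follow a skewed distribution: a few high-frequency species have abundant samples, while most low-frequency species have very few, which degrades the overall recognition performance of the model. To address this problem, this paper proposes Diff-SCC, a wildlife recognition method for skewed distributions. First, the method uses a large language model to generate rich semantic descriptions of each category, which guide a diffusion model to generate additional samples. It also introduces a multi-scale negative sample filtering strategy that evaluates and filters image quality along three dimensions (pixel space, feature space, and semantic space), increasing the feature diversity of low-frequency categories and balancing the data distribution. Second, an SCConv module is introduced into the ResNet50 backbone to reduce redundant features in spatial and channel modeling and to strengthen the model's perception of foreground regions, thereby improving recognition of low-frequency categories. Finally, comparative experiments on the self-built dataset ULB-12 and the public wildlife dataset NACTI are carried out to verify the model's performance. The results show that Diff-SCC achieves overall recognition accuracies of 78.71% and 80.84% on the two datasets, respectively, and that recognition accuracy on low-frequency categories improves by 9.96% and 9.99%, respectively, over the baseline model. These results verify the effectiveness of Diff-SCC on datasets with skewed distributions and show that it can provide reliable technical support for intelligent wildlife monitoring and conservation.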

Key words: wildlife, image recognition, skewed distribution, diffusion model, feature reconstruction

Abstract

Aim: The protection of wildlife in the Ulanba National Nature Reserve of Inner Mongolia plays a vital role in maintaining regional biodiversity. With the rapid development of artificial intelligence, deep learning has become a key tool for automating wildlife image recognition and advancing intelligent ecological monitoring. However, real-world wildlife image datasets typically exhibit a skewed distribution, where a few common species have abundant samples, while most species are underrepresented, thereby limiting the overall recognition performance of the model. 

Methods: To address this issue, this study proposes Diff-SCC, a wildlife recognition method that integrates diffusion-based data generation and feature reconstruction. Specifically, rich semantic descriptions of low-frequency categories are first generated using a large language model to guide a diffusion model in synthesizing additional samples. A multi-scale negative sample filtering strategy is then introduced to assess image quality at the pixel, feature, and semantic levels, increasing the feature diversity of low-frequency categories and balancing the data distribution. Furthermore, an SCConv module is incorporated into the backbone network to improve spatial and channel modeling, focusing more effectively on foreground regions while reducing redundant computation. 
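The filtering step can be pictured as a conjunction of three independent quality gates. The sketch below is illustrative only: the abstract does not state which metrics or thresholds Diff-SCC uses, so `GeneratedSample`, `Thresholds`, `filter_generated`, the three score fields, and the default cut-offs of 0.5 are all hypothetical names and values. It assumes each synthetic image already carries one precomputed score per level, scaled to [0, 1].

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class GeneratedSample:
    """A synthetic image with three precomputed quality scores in [0, 1].

    The score definitions are placeholders, not the paper's metrics.
    """
    image_id: str
    pixel_score: float     # e.g. a sharpness or exposure check (assumed)
    feature_score: float   # e.g. closeness to real-image features (assumed)
    semantic_score: float  # e.g. agreement with the class description (assumed)


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical minimum score required at each level."""
    pixel: float = 0.5
    feature: float = 0.5
    semantic: float = 0.5


def filter_generated(
    samples: Iterable[GeneratedSample],
    thresholds: Thresholds = Thresholds(),
) -> List[GeneratedSample]:
    """Keep only samples that pass all three gates; the rest are treated
    as negative samples and discarded."""
    kept = []
    for s in samples:
        passes = (
            s.pixel_score >= thresholds.pixel
            and s.feature_score >= thresholds.feature
            and s.semantic_score >= thresholds.semantic
        )
        if passes:
            kept.append(s)
    return kept


if __name__ == "__main__":
    candidates = [
        GeneratedSample("gen_001", 0.9, 0.8, 0.7),
        GeneratedSample("gen_002", 0.9, 0.2, 0.9),  # fails the feature gate
    ]
    print([s.image_id for s in filter_generated(candidates)])
```

Requiring every gate to pass is the strictest way to combine the three levels; a weighted sum of the scores would be an equally plausible design, and the abstract does not say which combination rule the authors chose.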

Results: Comparative experiments were conducted on a self-built wildlife dataset from the Ulanba Nature Reserve, which includes 12 wildlife categories, and on the public NACTI dataset. The proposed Diff-SCC model achieves overall recognition accuracies of 78.71% and 80.84% on the two datasets, respectively. Notably, the recognition accuracy of low-frequency classes improves by 9.96% and 9.99%, respectively, over the baseline model, demonstrating the effectiveness of the proposed method in handling skewed data and recognizing rare species. 
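To make the two reported quantities concrete, the following minimal sketch computes an overall accuracy and a separate accuracy restricted to low-frequency classes. It is not the authors' evaluation code: the function name, the `low_freq_max` cut-off that decides which classes count as low-frequency, and the toy labels are all assumptions introduced for illustration, since the abstract does not define how the low-frequency group is selected.

```python
from typing import Dict, Sequence, Tuple


def overall_and_low_freq_accuracy(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    train_counts: Dict[str, int],
    low_freq_max: int = 20,  # hypothetical cut-off on training samples per class
) -> Tuple[float, float]:
    """Return (overall accuracy, accuracy on low-frequency-class samples)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one test sample is required")

    # Classes with few training images form the low-frequency group.
    low_freq = {c for c, n in train_counts.items() if n <= low_freq_max}

    correct = sum(t == p for t, p in zip(y_true, y_pred))
    low_idx = [i for i, t in enumerate(y_true) if t in low_freq]
    low_correct = sum(y_true[i] == y_pred[i] for i in low_idx)

    overall = correct / len(y_true)
    # NaN signals that the test set contains no low-frequency samples.
    low = low_correct / len(low_idx) if low_idx else float("nan")
    return overall, low


if __name__ == "__main__":
    # Toy labels only; these are not classes or counts from ULB-12 or NACTI.
    truth = ["common", "common", "common", "rare", "rare"]
    preds = ["common", "common", "rare", "rare", "common"]
    counts = {"common": 500, "rare": 12}
    print(overall_and_low_freq_accuracy(truth, preds, counts))
```

Reporting the two numbers side by side matters for skewed data because a high overall accuracy can be reached by classifying only the common species well.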

Conclusion: The Diff-SCC model proposed in this study demonstrates strong capability in mitigating the challenges of skewed distributions in wildlife image classification. It offers a reliable and practical solution for intelligent wildlife monitoring and contributes to the advancement of biodiversity conservation.

Key words: wildlife, image classification, skewed distributions, diffusion model, feature reconstruction.