Biodiversity Science, 2026, Vol. 34, Issue 4: 25287. DOI: 10.17520/biods.2025287 cstr: 32101.14.biods.2025287

• Technology and Methods •

Wildlife pose estimation based on the SCD-HRNet model and its application in biodiversity monitoring: A case study of the Saihanwula Region, Inner Mongolia

Ziyi Kong1,2,3, Degang Wang1,2,3, Jiantao Wang4, Zhiyong Pei5, Jing Sun6, Changchun Zhang1,2,3,*, Junguo Zhang1,2,3,*

    1 College of Engineering, Beijing Forestry University, Beijing 100083, China
    2 National Key Laboratory of Efficient Production of Forest Resources, Beijing 100083, China
    3 Research Center for Intelligent Biodiversity Monitoring, Beijing Forestry University, Beijing 100083, China
    4 Administration of Ulanba National Nature Reserve, Chifeng, Inner Mongolia 025450, China
    5 College of Energy and Transportation Engineering, Inner Mongolia Agricultural University, Hohhot 010018, China
    6 Wulanhe Local Nature Reserve Administration, Hinggan League, Ulanhot, Inner Mongolia 137400, China
  • Received: 2025-07-20 Accepted: 2025-10-22 Online: 2026-04-20 Published: 2026-05-27
  • Contact: Changchun Zhang, Junguo Zhang
  • Supported by:
    National Natural Science Foundation of China (32371874); National Natural Science Foundation of China (32401569); Beijing Natural Science Foundation (6244053); Science and Technology Program of Shaanxi Academy of Sciences (2025K-32)

Abstract

Aims: Conserving wildlife in the Saihanwula region of Inner Mongolia is important for maintaining regional biodiversity. Behavioral analysis strengthens the scientific basis of biodiversity conservation and supports its intelligent management, and pose estimation is the prerequisite and core support for such analysis.
Methods: To address the loss of pose estimation accuracy caused by illumination changes, high-speed animal movement, and occlusion in complex environments during wildlife monitoring, this paper proposes a wildlife pose estimation method that combines attention mechanisms with dynamic confidence suppression (selective coordinate-enhanced decoupling-HRNet, SCD-HRNet). First, a squeeze-and-excitation (SE) attention mechanism extracts channel-level context features through global average pooling, which strengthens the network's ability to discriminate species morphological features and effectively addresses the feature distortion caused by illumination changes. Second, to reduce the localization error caused by high-speed animal movement, a coordinate attention (CA) mechanism decomposes the two-dimensional coordinates into horizontal and vertical components and applies attention along both directions to establish long-range dependencies across them, improving keypoint localization accuracy under motion blur. Finally, a dynamic confidence suppression (DCS) module establishes an adaptive threshold function based on model inference accuracy so that keypoints on occluded body parts are detected robustly.
Results: Comparative experiments were carried out to verify the model's performance. The mean average precision of SCD-HRNet reached 82.61% on the wildlife dataset collected and annotated in the Saihanwula region and 69.79% on the public AP-10K animal dataset, outperforming existing methods on both.
Conclusion: The proposed SCD-HRNet method significantly improves pose estimation accuracy for wildlife images in complex ecological scenes and provides reliable technical support for wildlife behavior analysis in biodiversity monitoring.

Key words: wild animals, pose estimation, HRNet, squeeze-and-excitation attention mechanism, coordinate attention mechanism, dynamic confidence suppression, infrared camera monitoring, biodiversity conservation
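
The squeeze-and-excitation step named in the Methods can be illustrated with a minimal NumPy sketch of the generic SE operation: global average pooling per channel, a small bottleneck, and a sigmoid gate that rescales each channel. This is not the authors' implementation. The function name `se_attention`, the reduction ratio `r`, the tensor sizes, and the random untrained weights are all assumptions made for illustration, and the abstract does not say where the block sits inside HRNet.

```python
import numpy as np


def se_attention(x, w1, w2):
    """Generic squeeze-and-excitation reweighting of one feature map.

    x  : (C, H, W) feature map
    w1 : (C // r, C) reduction weights (hypothetical, untrained)
    w2 : (C, C // r) expansion weights (hypothetical, untrained)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = x.mean(axis=(1, 2))                          # shape (C,)
    # Excitation: bottleneck with ReLU, then a sigmoid gate in (0, 1).
    hidden = np.maximum(w1 @ z, 0.0)                 # shape (C // r,)
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # shape (C,)
    # Recalibration: every channel is multiplied by its own gate value.
    return x * scale[:, None, None], scale


# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 6, 4
x = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))

y, scale = se_attention(x, w1, w2)
print(y.shape, scale.round(3))
```

Because the gate depends only on channel-wide averages, it changes the relative weight of whole channels and leaves the spatial layout within each channel untouched.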
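
The coordinate attention step named in the Methods splits the spatial information into a horizontal and a vertical component. A minimal NumPy sketch of the generic operation, under stated assumptions, is given here: the feature map is averaged along each axis separately, the two pooled strips pass through a shared 1x1 transform, and two direction-specific sigmoid gates are multiplied back onto the input. The function name `coordinate_attention`, the channel sizes, and the random untrained weights are hypothetical. ReLU stands in for the normalization and activation a trained network would use, and the authors' exact variant is not given in the abstract.

```python
import numpy as np


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def coordinate_attention(x, w_shared, w_h, w_w):
    """Generic coordinate attention on one feature map.

    x        : (C, H, W) feature map
    w_shared : (C_mid, C) shared 1x1 transform (hypothetical, untrained)
    w_h, w_w : (C, C_mid) per-direction 1x1 transforms (hypothetical)
    """
    C, H, W = x.shape
    # Directional pooling: average over width and over height separately,
    # so each pooled strip keeps position along one axis.
    pooled_h = x.mean(axis=2)                        # (C, H)
    pooled_w = x.mean(axis=1)                        # (C, W)
    # A 1x1 convolution on a strip is a matrix product over channels.
    stacked = np.concatenate([pooled_h, pooled_w], axis=1)   # (C, H + W)
    mixed = np.maximum(w_shared @ stacked, 0.0)              # (C_mid, H + W)
    feat_h, feat_w = mixed[:, :H], mixed[:, H:]
    # One gate per row and one per column, for every channel.
    attn_h = sigmoid(w_h @ feat_h)                   # (C, H)
    attn_w = sigmoid(w_w @ feat_w)                   # (C, W)
    # Each position (i, j) is scaled by its row gate and its column gate.
    out = x * attn_h[:, :, None] * attn_w[:, None, :]
    return out, attn_h, attn_w


# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(1)
C, C_mid, H, W = 8, 4, 5, 7
x = rng.normal(size=(C, H, W))
w_shared = rng.normal(size=(C_mid, C))
w_h = rng.normal(size=(C, C_mid))
w_w = rng.normal(size=(C, C_mid))

out, attn_h, attn_w = coordinate_attention(x, w_shared, w_h, w_w)
print(out.shape, attn_h.shape, attn_w.shape)
```

Unlike a purely channel-wise gate, the two directional gates retain positional information along rows and columns, which is the property the Methods links to keypoint localization under motion blur.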