Biodiversity Science ›› 2026, Vol. 34 ›› Issue (4): 25287.  DOI: 10.17520/biods.2025287  cstr: 32101.14.biods.2025287


Wildlife pose estimation method based on the SCD-HRNet model: A case study of the Saihanwula region, Inner Mongolia

Ziyi Kong1,2,3, Degang Wang1,2,3, Jiantao Wang4, Zhiyong Pei5, Jing Sun6, Changchun Zhang1,2,3*, Junguo Zhang1,2,3*

  1 College of Engineering, Beijing Forestry University, Beijing 100083, China
  2 National Key Laboratory of Efficient Production of Forest Resources, Beijing 100083, China
  3 Research Center for Intelligent Biodiversity Monitoring, Beijing Forestry University, Beijing 100083, China
  4 Administration of Ulanba National Nature Reserve, Chifeng, Inner Mongolia 025450, China
  5 College of Energy and Transportation Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia 010018, China
  6 Administration of Wulanhe Local Nature Reserve, Hinggan League, Ulanhot, Inner Mongolia 137400, China

  • Received: 2025-07-20 Revised: 2025-09-30 Accepted: 2025-11-07 Online: 2026-04-20
  • Contact: Changchun Zhang, Junguo Zhang
  • Supported by:
    High-Level Talent Recruitment Program - Junguo Zhang (Science and Technology Program of Shaanxi Academy of Sciences, 2025K-32); Research on Incremental Learning Mechanisms and Methods for Wildlife Monitoring Images in Open Environments (National Natural Science Foundation of China, 32371874); Research on Generalization Mechanisms and Methods for Open-Environment Wildlife Monitoring Images in Beijing (Beijing Natural Science Foundation, 6244053); Open Set Domain Adaptation Recognition Mechanisms and Methods for Wetland Waterfowl Monitoring Images (32401569)

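Abstract: Protecting wildlife in the Saihanwula region of Inner Mongolia is important for maintaining local biodiversity. Pose estimation is the foundation of behavioral analysis, which in turn provides key technical support for biodiversity conservation. To address the loss of pose estimation accuracy caused by illumination changes, high-speed animal movement, and occlusion in complex environments during wildlife monitoring, this paper proposes a wildlife pose estimation method that fuses attention mechanisms with dynamic confidence suppression (Selective Coordinate-enhanced Decoupling-HRNet, SCD-HRNet). First, a squeeze-and-excitation (SE) module extracts channel-level context features through global average pooling, which strengthens the network's ability to discriminate species morphological features and effectively addresses the feature distortion caused by illumination changes. Second, to handle the localization error introduced by high-speed animal movement, a coordinate attention (CA) mechanism decomposes the two-dimensional coordinates into horizontal and vertical components for sinusoidal position encoding and uses a bidirectional attention mechanism to establish cross-direction long-range dependencies, improving joint localization accuracy under motion blur. Finally, a dynamic confidence suppression (DCS) module builds an adaptive threshold function from the model's inference accuracy to achieve robust detection of keypoints on occluded body parts. Comparative experiments were carried out to verify the model's performance. The results show that the mean average precision of SCD-HRNet reaches 82.61% on the collected and annotated Saihanwula wildlife dataset and 69.79% on the public AP-10K animal dataset, both better than existing methods. The proposed SCD-HRNet method significantly improves pose estimation accuracy for wildlife images in complex ecological scenes and provides reliable technical support for wildlife behavior analysis in ecological monitoring.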

Keywords: wildlife, pose estimation, HRNet, SE attention mechanism, coordinate attention mechanism, dynamic confidence suppression

Abstract

Aims: Conserving wildlife in the Saihanwula region of Inner Mongolia is important for maintaining regional biodiversity. Behavioral analysis strengthens the scientific basis of biodiversity conservation and supports its intelligent management, and pose estimation is the prerequisite for behavioral analysis.

Methods: Pose estimation accuracy in wildlife monitoring degrades under illumination changes, high-speed animal movement, and occlusion by complex environments. To address this, we propose a wildlife pose estimation method that combines attention mechanisms with dynamic confidence suppression: selective coordinate-enhanced decoupling-HRNet (SCD-HRNet). First, a squeeze-and-excitation (SE) module extracts channel-level context features through global average pooling, which strengthens the network's ability to discriminate species morphological features and effectively addresses the feature distortion caused by illumination changes. Second, to counter the localization error caused by high-speed animal movement, a coordinate attention (CA) mechanism decomposes the two-dimensional coordinates into horizontal and vertical components for sinusoidal position encoding and uses bidirectional attention to establish long-range dependencies across the two directions, which improves joint localization accuracy under motion blur. Finally, a dynamic confidence suppression (DCS) module builds an adaptive threshold function from the model's inference accuracy so that keypoints on occluded body parts are detected robustly.
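The squeeze-and-excitation step named above can be illustrated with a minimal sketch in plain Python. It shows only the generic SE pattern (global average pooling, a two-layer gate, per-channel rescaling), not the authors' implementation. The function name `se_reweight`, the nested-list tensor layout, and all weight values are assumptions made for illustration.

```python
import math


def se_reweight(feature_map, w1, w2):
    """Generic squeeze-and-excitation over a feature map shaped [C][H][W].

    w1: [C_reduced][C] bottleneck weights (hypothetical values).
    w2: [C][C_reduced] expansion weights (hypothetical values).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expansion layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every activation in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Two 2x2 channels with made-up activations.
features = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[1.0, 2.0], [3.0, 4.0]],
]
w1 = [[1.0, 0.5]]      # assumed weights: 2 channels -> 1 hidden unit
w2 = [[2.0], [-2.0]]   # assumed weights: 1 hidden unit -> 2 channel gates
scaled, gates = se_reweight(features, w1, w2)
print(gates)  # roughly [0.953, 0.047]
```

With these assumed weights the first channel is kept almost unchanged and the second is strongly attenuated. This is the sense in which a gate computed from channel-level context can re-weight features. In a trained network the weights are learned rather than fixed by hand.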

Results: Comparative experiments were conducted to verify the model's performance. SCD-HRNet reaches a mean average precision of 82.61% on a wildlife dataset collected and annotated in the Saihanwula region and 69.79% on the public AP-10K animal dataset, outperforming existing methods on both.
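The abstract does not say how the mean average precision is computed. Assuming it follows the COCO-style keypoint convention, in which a predicted pose is scored against its annotation by object keypoint similarity (OKS), the per-instance score can be sketched as follows. The function name, the coordinates, the object area, and the per-keypoint `sigmas` are hypothetical values chosen for illustration, not figures from the paper.

```python
import math


def object_keypoint_similarity(pred, truth, visible, area, sigmas):
    """COCO-style OKS between predicted and annotated keypoints.

    pred, truth: lists of (x, y) pixel coordinates.
    visible: list of bools, True where the keypoint is annotated.
    area: object area in square pixels.
    sigmas: per-keypoint falloff constants (hypothetical values here).
    """
    total, count = 0.0, 0
    for (px, py), (tx, ty), vis, sigma in zip(pred, truth, visible, sigmas):
        if not vis:
            continue  # unannotated keypoints do not contribute
        d2 = (px - tx) ** 2 + (py - ty) ** 2  # squared pixel distance
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    return total / count if count else 0.0


truth = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
pred = [(10.0, 10.0), (23.0, 24.0), (0.0, 0.0)]
visible = [True, True, False]   # third keypoint is not annotated
oks = object_keypoint_similarity(pred, truth, visible, area=2500.0,
                                 sigmas=[0.05, 0.05, 0.05])
print(round(oks, 4))  # 0.8033
```

Under this convention a prediction counts as correct when its OKS exceeds a threshold, and the mean average precision averages the resulting precision over a range of thresholds (0.50 to 0.95 in steps of 0.05 in COCO).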

Conclusion: The proposed SCD-HRNet method significantly improves pose estimation accuracy for wildlife images in complex ecological scenes and provides reliable technical support for wildlife behavior analysis in ecological monitoring.

Key words: wild animals, pose estimation, HRNet, SE attention mechanism, coordinate attention mechanism, dynamic confidence suppression