Biodiv Sci, 2024, Vol. 32, Issue (10): 24259.  DOI: 10.17520/biods.2024259  cstr: 32101.14.biods.2024259

• Technology and Methodologies •

Cross-regional bird species recognition method integrating audio and ecological niche information

Jiangjian Xie1,2,#, Chen Shen1,#, Feiyu Zhang1, Zhishu Xiao3,*

    1. School of Technology, Beijing Forestry University, Beijing 100083, China
    2. State Key Laboratory of Efficient Production of Forest Resources, Beijing 100083, China
    3. State Key Laboratory of Integrated Management of Pest Insects and Rodents in Agriculture, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2024-06-25 Accepted: 2024-08-24 Online: 2024-10-20 Published: 2024-09-26
  • Contact: *E-mail: xiaozs@ioz.ac.cn
  • About author: #Co-first authors

  • Supported by:
    National Natural Science Foundation of China (62303063); National Key Research and Development Program of China (E31OP01233)

Abstract:

Aim: Passive acoustic monitoring plays a pivotal role in studying avian populations, community dynamics, and behaviors. For extensive passive acoustic monitoring, employing deep learning techniques to automatically identify bird species from their vocalizations is essential. However, closely related species often produce highly similar calls, leading to confusion and false positives, which can compromise the effectiveness of deep learning models. This paper presents a novel method that integrates audio data with ecological niche information to enhance species recognition accuracy. Here, ‘ecological niche’ encompasses a species’ role in its environment, including its habitat, diet, and behavior.

Methods: The approach begins with the development of an audio recognition model based on ResNet18, a widely used deep residual network architecture capable of extracting high-level features from audio signals. Subsequently, a maximum entropy model is employed to estimate the distribution of each bird species and to derive ecological suitability indices for various locations. These indices supply the ecological niche information. An integrated model, NicheNet, is then constructed to combine audio features with ecological niche data for improved species recognition.
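As an illustration of why location-specific suitability can help separate vocally similar species, the following minimal sketch re-weights the scores of an audio classifier by a per-species suitability index. This is not the NicheNet architecture, which the abstract does not specify; it is a simple late-fusion stand-in. The function names, the `weight` parameter, the species labels, and all numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_audio_and_niche(audio_logits, suitability, weight=1.0, eps=1e-6):
    """Re-weight audio class scores by habitat suitability at the recording site.

    audio_logits: per-species scores from an audio classifier (hypothetical).
    suitability:  per-species suitability index in [0, 1] for the location,
                  standing in for the output of a maximum entropy model.
    weight:       assumed knob controlling how strongly suitability counts.
    eps:          small constant so that log(0) is never evaluated.
    """
    if len(audio_logits) != len(suitability):
        raise ValueError("need one suitability value per species")
    fused = [
        logit + weight * math.log(s + eps)
        for logit, s in zip(audio_logits, suitability)
    ]
    return softmax(fused)


# Hypothetical example: species_a and species_b sound alike, but only
# species_b is likely to occur at the recording location.
species = ["species_a", "species_b", "species_c"]
audio_logits = [2.0, 1.9, 0.1]
suitability = [0.05, 0.90, 0.50]

audio_only = softmax(audio_logits)
fused = fuse_audio_and_niche(audio_logits, suitability)

best_audio = species[audio_only.index(max(audio_only))]
best_fused = species[fused.index(max(fused))]
print(best_audio, best_fused)
```

In this toy case the audio scores alone slightly favor species_a, while the low suitability of species_a at the site shifts the fused prediction to species_b. A learned fusion model would replace the fixed `weight` with trained parameters.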

Results & Conclusion: The integration of audio and ecological niche information through NicheNet demonstrates substantial improvements in recognition accuracy. Specifically, NicheNet enhances Top-1 recognition accuracy by 12.9% and Top-5 recognition accuracy by 10.6% compared to the ResNet18 model. Additionally, NicheNet reduces the near species error rate by 3.1%, the near genus error rate by 1.8%, and the near family error rate by 8.0%. Analysis of recognition outcomes for congeneric species with similar vocalizations reveals that NicheNet significantly refines classification by leveraging ecological niche information, thereby improving the discrimination of vocally similar but ecologically distinct species. This method effectively addresses the challenge of misidentification among closely related and vocally similar bird species that differ in their ecological niches, thereby advancing the accuracy of cross-regional bird species recognition based on vocalizations.
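For readers unfamiliar with the Top-1 and Top-5 metrics reported above, the sketch below shows how Top-k accuracy is conventionally computed from per-class scores. It assumes the standard definition (a prediction counts as correct when the true class is among the k highest-scoring classes); the function name and the toy data are hypothetical and are not taken from the study.

```python
def top_k_accuracy(score_rows, true_labels, k):
    """Fraction of samples whose true class is among the k top-scoring classes.

    score_rows:  list of per-sample score lists, one score per class.
    true_labels: list of true class indices, one per sample.
    """
    if len(score_rows) != len(true_labels):
        raise ValueError("need one label per sample")
    if not true_labels:
        raise ValueError("need at least one sample")
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Toy data: three recordings, three candidate species.
scores = [
    [0.70, 0.20, 0.10],  # true class 0, ranked first
    [0.35, 0.40, 0.25],  # true class 0, ranked second
    [0.10, 0.30, 0.60],  # true class 2, ranked first
]
labels = [0, 0, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

With these toy values, two of the three recordings are correct at k=1 and all three at k=2, which illustrates why Top-5 accuracy is always at least as high as Top-1 accuracy.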

Key words: passive acoustic monitoring, bird vocalization recognition, residual network, deep learning, ecological niche