Biodiv Sci ›› 2023, Vol. 31 ›› Issue (1): 22308.  DOI: 10.17520/biods.2022308

• Original Papers: Animal Diversity

Deep learning techniques for bird chirp recognition task

Zhuofan Xie1,2,3, Dingzhao Li2,3, Haixin Sun2,3,*, Anmin Zhang4

    1. School of Electronic Science and Engineering (National Model Microelectronics College), Xiamen University, Xiamen, Fujian 361005
    2. School of Informatics, Xiamen University, Xiamen, Fujian 361000
    3. Key Laboratory of Southeast Coast Marine Information Intelligent Perception and Application, Ministry of Natural Resources, Xiamen, Fujian 361005
    4. School of Marine Science and Technology, Tianjin University, Tianjin 300072
  • Received:2022-06-08 Accepted:2022-07-28 Online:2023-01-20 Published:2022-09-22
  • Contact: *Haixin Sun, E-mail: hisensessun@163.com

Abstract:

Background: Birds are an important component of ecosystems. They are crucial for regulating the ecological environment and for monitoring biodiversity, and tracking their movements and listening for abnormal calls can even assist in predicting natural disasters such as earthquakes and tsunamis. Bird sound recognition and abnormal call detection have therefore become popular research directions. However, traditional bird sound recognition methods extract features insufficiently, which leads to low recognition rates.

Method: We combined a fused-feature method with deep learning to extract bird sound features. The fused features were obtained by concatenating the original signal parameters with modified log-Mel spectrogram difference parameters. The deep learning model was based on the DenseNet121 network structure and incorporated a self-attention module and the center loss function for bird sound recognition. The self-attention module partly improved the feature representation of key channels, and the center loss function addressed the problem of insufficiently compact intra-class features. We tested recognition accuracy on the sounds of 10 bird species from the public Xeno-Canto wild bird sound dataset.
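The center loss mentioned in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `center_loss` and `update_centers`, the step size `alpha`, the toy data, and the choice to average over the batch are all assumptions made for illustration. Plain Python lists stand in for the embeddings that a DenseNet121-style network would produce. The loss penalizes the squared distance between each embedding and the center of its class, which is what pulls intra-class features together.

```python
from typing import Dict, List, Sequence


def center_loss(features: Sequence[Sequence[float]],
                labels: Sequence[int],
                centers: Dict[int, List[float]]) -> float:
    """Batch mean of 0.5 * ||x_i - c_{y_i}||^2 (averaging is an assumption)."""
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return total / len(features)


def update_centers(features: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   centers: Dict[int, List[float]],
                   alpha: float = 0.5) -> Dict[int, List[float]]:
    """Move each class center toward the samples of that class in the batch."""
    new_centers = {k: list(v) for k, v in centers.items()}
    for k, c in centers.items():
        members = [x for x, y in zip(features, labels) if y == k]
        if not members:
            continue  # classes absent from the batch keep their center
        n = len(members)
        for d in range(len(c)):
            # Damped update: delta = sum(c - x) / (1 + n), then c <- c - alpha * delta
            delta = sum(c[d] - x[d] for x in members) / (1 + n)
            new_centers[k][d] = c[d] - alpha * delta
    return new_centers


# Toy batch: three 2-D embeddings from two hypothetical classes.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = [0, 0, 1]
centers = {0: [2.0, 0.0], 1: [0.0, 0.0]}

before = center_loss(features, labels, centers)
centers = update_centers(features, labels, centers)
after = center_loss(features, labels, centers)
print(before, after)
```

In training, a term like this is typically added to the softmax cross-entropy loss with a small weighting factor, so that features stay separable between classes while becoming more compact within each class.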

Conclusion: This paper proposes a neural network structure containing a self-attention mechanism and a center loss function for bird sound recognition. Its validation accuracy reaches 96.9%. The code is open source on GitHub: https://github.com/CarrieX6/-Xeno-Canto-.git.

Key words: bird chirp recognition, feature fusion, self-attention module, center loss function