Deep learning techniques for bird chirp recognition task

doi:10.17520/biods.2022308

Abstract

Background: Birds are an important component of ecosystems. They are crucial for regulating the ecological environment and for monitoring biodiversity, and monitoring their movements and listening for abnormal calls can even assist in predicting natural disasters such as earthquakes and tsunamis. Bird sound recognition and abnormal call detection have therefore become popular research directions. However, traditional bird sound recognition methods suffer from low recognition rates because of insufficient feature extraction.

Method: We used a fusion feature method combined with deep learning to extract bird sound features. The fusion features were obtained by splicing (concatenating) the original signal parameters with modified log-Mel spectral difference parameters. The deep learning method was based on the DenseNet121 network structure and incorporated a self-attention module and the center loss function for bird sound recognition. The self-attention module partially improved the feature representation of key channels, and the center loss function was used to address insufficiently compact intra-class features. We used recordings of 10 bird species from the public Xeno-Canto dataset of wild bird sounds to test the accuracy of bird chirp recognition.
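
The fusion step described in the Method paragraph can be illustrated with a minimal sketch. It assumes that the "difference parameters" are first-order frame-to-frame differences of a log-Mel spectrogram and that splicing means concatenating them with the original features along the feature axis. The paper's "modified" variant is not specified on this page, and the names `delta` and `fuse` are hypothetical.

```python
from typing import List

# Assumed layout: rows are time frames, columns are Mel bands.
Matrix = List[List[float]]


def delta(frames: Matrix) -> Matrix:
    """First-order difference along time; the first frame gets zeros."""
    n_bands = len(frames[0])
    out = [[0.0] * n_bands]
    for prev, cur in zip(frames, frames[1:]):
        out.append([c - p for p, c in zip(prev, cur)])
    return out


def fuse(frames: Matrix) -> Matrix:
    """Concatenate each frame with its difference features (assumed fusion)."""
    return [orig + d for orig, d in zip(frames, delta(frames))]


# Toy log-Mel spectrogram: 3 frames x 2 bands (made-up values).
log_mel = [[1.0, 2.0], [1.5, 1.0], [3.0, 1.0]]
fused = fuse(log_mel)  # 3 frames x 4 features
```

Each fused frame carries both the static spectrum and its change from the previous frame, which is one plausible way a network could see temporal dynamics alongside the original features.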

Conclusion: This paper proposes a neural network structure containing a self-attention mechanism and the center loss function for bird song recognition. Its validation accuracy reaches 96.9%. The code is open source on GitHub: https://github.com/CarrieX6/-Xeno-Canto-.git.

Key words: bird chirp recognition, feature fusion, self-attention module, center loss function

Zhuofan Xie, Dingzhao Li, Haixin Sun, Anmin Zhang. Deep learning techniques for bird chirp recognition task[J]. Biodiv Sci, 2023, 31(1): 22308.

URL: https://www.biodiversity-science.net/EN/10.17520/biods.2022308

https://www.biodiversity-science.net/EN/Y2023/V31/I1/22308

Common name	Latin name	Registered label	Sample length (s)	Data source
American Bittern	Botaurus lentiginosus	amebit	23,960.7	https://xeno-canto.org/species/Botaurus-lentiginosus
Bald Eagle	Haliaeetus leucocephalus	baleag	22,744.8	https://xeno-canto.org/species/Haliaeetus-leucocephalus
Brewer's Sparrow	Spizella breweri	brespa	23,880.6	https://xeno-canto.org/species/Spizella-breweri
Common Grackle	Quiscalus quiscula	comgra	23,517.5	https://xeno-canto.org/species/Quiscalus-quiscula
Double-crested Cormorant	Phalacrocorax auritus	doccor	22,183.7	https://xeno-canto.org/species/Phalacrocorax-auritus
Eurasian Collared Dove	Streptopelia decaocto	eucdov	22,837.2	https://xeno-canto.org/species/Streptopelia-decaocto
Hairy Woodpecker	Leuconotopicus villosus	haiwoo	23,009.2	https://xeno-canto.org/species/Leuconotopicus-villosus
Lesser Goldfinch	Spinus psaltria	lesgol	22,521.4	https://xeno-canto.org/species/Spinus-psaltria
Ring-necked Duck	Aythya collaris	rinduc	21,653.2	https://xeno-canto.org/species/Aythya-collaris
White-throated Swift	Aeronautes saxatalis	whtswi	22,491.7	https://xeno-canto.org/species/Aeronautes-saxatalis

Parameter name	Parameter value
Batch size	256
Epochs	50
Learning rate	0.001
Optimizer	Adam optimizer
Loss function	Categorical cross-entropy
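
To make the loss choice concrete, the sketch below computes categorical cross-entropy for a single sample and adds a center-loss term of the kind the abstract describes. The center-loss form (half the squared distance between a feature vector and its class center), the weight `lam`, and all function names are assumptions for illustration. How the paper weights the two terms is not stated on this page.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one sample's class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Categorical cross-entropy for one sample with an integer class label."""
    return -math.log(softmax(logits)[label])


def center_loss(feature: Sequence[float], center: Sequence[float]) -> float:
    """Half the squared Euclidean distance to the class center (assumed form)."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def total_loss(logits, feature, centers, label: int, lam: float = 0.01) -> float:
    """Cross-entropy plus a weighted center-loss term; lam is a hypothetical weight."""
    return cross_entropy(logits, label) + lam * center_loss(feature, centers[label])


# Toy example: 2 classes, 2-dimensional features (made-up values).
centers = [[0.0, 0.0], [4.0, 4.0]]
loss = total_loss([0.0, 0.0], [1.0, 2.0], centers, label=0, lam=0.1)
```

The cross-entropy term separates classes, while the center term pulls each sample's features toward its own class center, which is the intra-class compactness the abstract refers to.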

Feature extraction method	Accuracy	No. of parameters
VGG11 + original features	0.906	1.38e8
VGG11 + log-Mel spectral difference features	0.926	1.38e8
VGG11 + fusion features	0.935	1.38e8
ResNet18 + original features	0.896	1.11e7
ResNet18 + log-Mel spectral difference features	0.912	1.11e7
ResNet18 + fusion features	0.933	1.11e7
DenseNet121 + original features	0.901	6.94e6
DenseNet121 + log-Mel spectral difference features	0.932	6.96e6
DenseNet121 + fusion features	0.939	6.96e6
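
As a quick reading aid for the table above, the following sketch computes how many percentage points the fusion features add over the original features for each backbone. The accuracy values are copied from the table; the variable names are illustrative only.

```python
# Accuracy per backbone: (original features, log-Mel difference features, fusion features).
results = {
    "VGG11": (0.906, 0.926, 0.935),
    "ResNet18": (0.896, 0.912, 0.933),
    "DenseNet121": (0.901, 0.932, 0.939),
}

# Gain of fusion over original features, in percentage points.
gains = {
    name: round((fusion - original) * 100, 1)
    for name, (original, _difference, fusion) in results.items()
}

# Backbone with the highest fusion-feature accuracy.
best = max(results, key=lambda name: results[name][2])
```

On these numbers the fusion features help every backbone, and DenseNet121 reaches the highest accuracy with the fewest parameters of the three.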