Biodiv Sci ›› 2023, Vol. 31 ›› Issue (1): 22308. DOI: 10.17520/biods.2022308
• Original Papers: Animal Diversity •
Zhuofan Xie1,2,3, Dingzhao Li2,3, Haixin Sun2,3,*, Anmin Zhang4
Received: 2022-06-08
Accepted: 2022-07-28
Online: 2023-01-20
Published: 2022-09-22
Contact: *Haixin Sun, E-mail: hisensessun@163.com
Zhuofan Xie, Dingzhao Li, Haixin Sun, Anmin Zhang. Deep learning techniques for bird chirp recognition task[J]. Biodiv Sci, 2023, 31(1): 22308.
Common name | Latin name | Registered label | Sample length (s) | Data source |
---|---|---|---|---|
American Bittern | Botaurus lentiginosus | amebit | 23,960.7 | |
Bald Eagle | Haliaeetus leucocephalus | baleag | 22,744.8 | |
Brewer's Sparrow | Spizella breweri | brespa | 23,880.6 | |
Common Grackle | Quiscalus quiscula | comgra | 23,517.5 | |
Double-crested Cormorant | Phalacrocorax auritus | doccor | 22,183.7 | |
Eurasian Collared-Dove | Streptopelia decaocto | eucdov | 22,837.2 | |
Hairy Woodpecker | Leuconotopicus villosus | haiwoo | 23,009.2 | |
Lesser Goldfinch | Spinus psaltria | lesgol | 22,521.4 | |
Ring-necked Duck | Aythya collaris | rinduc | 21,653.2 | |
White-throated Swift | Aeronautes saxatalis | whtswi | 22,491.7 | |
Table 1 Description of the model training dataset used in this study
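As a quick check on class balance, the sample lengths in Table 1 can be totalled. The sketch below is illustrative only: the dictionary and the helper name `summarize` are assumptions introduced here, and only the durations themselves come from the table.

```python
# Sample lengths in seconds, copied from Table 1 (keys are the registered labels).
SAMPLE_SECONDS = {
    "amebit": 23960.7,
    "baleag": 22744.8,
    "brespa": 23880.6,
    "comgra": 23517.5,
    "doccor": 22183.7,
    "eucdov": 22837.2,
    "haiwoo": 23009.2,
    "lesgol": 22521.4,
    "rinduc": 21653.2,
    "whtswi": 22491.7,
}


def summarize(durations):
    """Return total audio length and a simple class-balance ratio (hypothetical helper)."""
    values = list(durations.values())
    total = sum(values)
    return {
        "total_seconds": total,
        "total_hours": total / 3600.0,
        # Longest class divided by shortest class; 1.0 would be perfectly balanced.
        "max_min_ratio": max(values) / min(values),
    }


summary = summarize(SAMPLE_SECONDS)
print(summary)
```

The ten classes add up to about 228,800 s (roughly 63.6 h) of audio, and the longest class is only about 1.11 times the shortest, so the training set is close to balanced.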
Parameter name | Parameter value |
---|---|
Batch size | 256 |
Epochs | 50 |
Learning rate | 0.001 |
Optimizer | Adam (adaptive moment estimation) |
Loss function | Categorical cross-entropy |
Table 2 Training parameters of the DenseNet model
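The settings in Table 2 can be captured in a small configuration object. The following is a minimal sketch, not the authors' training code: the class name `TrainConfig`, its field names, and the NumPy implementation of categorical cross-entropy are assumptions made for illustration. Only the values (256, 50, 0.001, Adam, categorical cross-entropy) come from the table.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters from Table 2 (field names are hypothetical)."""
    batch_size: int = 256
    epochs: int = 50
    learning_rate: float = 1e-3
    optimizer: str = "adam"
    loss: str = "categorical_crossentropy"


def categorical_cross_entropy(probs, onehot, eps=1e-12):
    """Mean categorical cross-entropy over a batch.

    probs:  (batch, classes) predicted probabilities, rows summing to 1.
    onehot: (batch, classes) one-hot encoded true labels.
    """
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(onehot * np.log(probs), axis=1)))


cfg = TrainConfig()
# A classifier that is maximally unsure over the 10 species has loss ln(10).
uniform = np.full((4, 10), 0.1)
targets = np.eye(10)[[0, 3, 5, 9]]
baseline_loss = categorical_cross_entropy(uniform, targets)
```

With 10 classes, a loss near ln(10) ≈ 2.30 corresponds to chance-level predictions, which gives a reference point when reading loss curves.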
Feature extraction method | Accuracy | No. of parameters |
---|---|---|
VGG11 + Original feature | 0.906 | 1.38e8 |
VGG11 + Log-Mel spectral differential feature | 0.926 | 1.38e8 |
VGG11 + Fusion feature | **0.935** | 1.38e8 |
ResNet18 + Original feature | 0.896 | 1.11e7 |
ResNet18 + Log-Mel spectral differential feature | 0.912 | 1.11e7 |
ResNet18 + Fusion feature | **0.933** | 1.11e7 |
DenseNet121 + Original feature | 0.901 | 6.94e6 |
DenseNet121 + Log-Mel spectral differential feature | 0.932 | 6.96e6 |
DenseNet121 + Fusion feature | **0.939** | 6.96e6 |
Table 3 Comparison of recognition accuracy across feature types for VGG11, ResNet18 and DenseNet121. Bold values are the accuracies obtained with the fusion feature.
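To make the "differential" and "fusion" rows of Table 3 more concrete, the sketch below computes regression-based delta features along the time axis of a log-Mel spectrogram and stacks them with the static spectrogram as channels. This is an assumption about how such features are commonly built, not a description of the authors' exact pipeline: the function names `delta` and `fuse_features`, the window width of 2, and the three-channel layout are all hypothetical, and the input here is synthetic.

```python
import numpy as np


def delta(feat, width=2):
    """Regression delta along the time axis.

    feat: (n_mels, n_frames) array, e.g. a log-Mel spectrogram.
    Uses d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2), n = 1..width,
    with edge padding so the output has the same shape as the input.
    """
    n_frames = feat.shape[1]
    padded = np.pad(feat, ((0, 0), (width, width)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros(feat.shape, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[:, width + n : width + n + n_frames]
        behind = padded[:, width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def fuse_features(log_mel):
    """Stack static, first-order and second-order delta features as channels."""
    d1 = delta(log_mel)
    d2 = delta(d1)
    return np.stack([log_mel, d1, d2], axis=0)  # (3, n_mels, n_frames)


# Synthetic input: every Mel band rises by 0.5 per frame.
ramp = np.tile(np.arange(20, dtype=float) * 0.5, (4, 1))
fused = fuse_features(ramp)
```

On a signal that rises linearly in time, the first-order delta away from the padded edges equals the slope, which is a convenient sanity check before feeding the stacked array to a convolutional network.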
Model | Accuracy |
---|---|
DenseNet121 + Fusion feature | 0.939 |
DenseNet121 + Fusion feature + Attention | 0.953 |
DenseNet121 + Fusion feature + Attention + Center loss function | 0.969 |
Table 4 Comparison experiment based on DenseNet121
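The last row of Table 4 adds a center loss term. As a rough illustration, the sketch below follows the commonly used formulation in which each class keeps a feature center, the loss penalizes the squared distance between a sample's embedding and its class center, and the centers are moved toward the batch features after each step. The function names, the update rate `alpha`, and the toy data are assumptions; the weighting between this term and the cross-entropy loss used by the authors is not given in this section.

```python
import numpy as np


def center_loss(features, labels, centers):
    """Half the mean squared distance between embeddings and their class centers.

    features: (batch, dim) embeddings, labels: (batch,) integer class ids,
    centers:  (n_classes, dim) current class centers.
    """
    diffs = features - centers[labels]
    return 0.5 * float(np.mean(np.sum(diffs ** 2, axis=1)))


def update_centers(features, labels, centers, alpha=0.5):
    """Move each class center toward the batch features of that class."""
    new_centers = centers.astype(float)  # astype returns a copy
    for k in np.unique(labels):
        mask = labels == k
        # Averaged offset, with +1 in the denominator to damp small batches.
        step = (centers[k] - features[mask]).sum(axis=0) / (1.0 + mask.sum())
        new_centers[k] = centers[k] - alpha * step
    return new_centers


# Toy example: two classes, two-dimensional embeddings, centers at the origin.
feats = np.array([[1.0, 0.0], [0.0, 1.0]])
labs = np.array([0, 1])
cents = np.zeros((2, 2))

loss_before = center_loss(feats, labs, cents)
cents_updated = update_centers(feats, labs, cents)
loss_after = center_loss(feats, labs, cents_updated)
```

In training, this term would typically be added to the classification loss with a small weight, so that embeddings of the same species cluster more tightly while cross-entropy keeps the classes separable.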
Fig. 6 Test loss and recognition accuracy during the experiment. The upper panel is a dot plot of the loss values for epochs 0-14, and the lower panel is a dot plot of the accuracies for epochs 0-14.
Copyright © 2022 Biodiversity Science
Editorial Office of Biodiversity Science, 20 Nanxincun, Xiangshan, Beijing 100093, China
Tel: 010-62836137, 62836665 E-mail: biodiversity@ibcas.ac.cn