Biodiversity Science, 2024, Vol. 32, Issue 10: 24259. DOI: 10.17520/biods.2024259; CSTR: 32101.14.biods.2024259
Jiangjian Xie1,2,#, Chen Shen1,#, Feiyu Zhang1, Zhishu Xiao3,*
Received: 2024-06-25
Accepted: 2024-08-24
Published: 2024-10-20
Released online: 2024-09-26
Corresponding author: *E-mail: xiaozs@ioz.ac.cn
Author note: #Co-first authors
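Abstract:
Passive acoustic monitoring of birds is important for understanding their population and community dynamics and the behavior and function of the species involved. Automatically identifying bird species from their vocalization features with deep learning is the key to large-scale passive acoustic monitoring. The vocalizations of closely related bird species are highly similar and easily confused, which increases false positives and limits the recognition accuracy of deep learning models. To address this problem, this paper proposes a bird species recognition method that integrates audio and ecological niche information. First, an audio recognition model is built on the residual network ResNet18. Next, a maximum entropy model is used to predict bird species distributions, and the habitat suitability index of each species at different locations is taken as its niche information. Finally, NicheNet, a bird species recognition model that fuses audio and niche information, is constructed. Experimental results show that, compared with ResNet18, NicheNet improves Top-1 accuracy by 12.9 percentage points and Top-5 accuracy by 10.6 percentage points, while its near-species, near-genus, and near-family error rates fall by 3.1, 1.8, and 8.0 percentage points, respectively. Recognition results for two pairs of similar-sounding species from the same family show that NicheNet can use niche information to correct classifications based on audio features, which improves recognition of closely related species in the same family that sound alike but differ widely in distribution. The proposed method effectively reduces the misidentification rate for bird species that are closely related and similar in vocalization but differ in ecological niche, and thereby improves the accuracy of vocalization-based bird species recognition across regions.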
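The idea of correcting an audio-based classification with niche information can be illustrated with a minimal late-fusion sketch. This is not the NicheNet architecture, whose fusion mechanism is not described on this page; it simply multiplies each species' audio probability by its habitat suitability at the recording site and renormalizes. The function name `fuse_audio_and_niche`, the `eps` floor, and all probability and suitability values are hypothetical.

```python
def fuse_audio_and_niche(audio_probs, suitability, eps=1e-6):
    """Reweight audio class probabilities by habitat suitability and renormalize.

    audio_probs: dict species -> softmax probability from the audio model
    suitability: dict species -> suitability index in [0, 1] at the recording site
    eps: small floor so that a suitability of 0 cannot fully veto a species
    """
    weighted = {
        species: p * max(suitability.get(species, 0.0), eps)
        for species, p in audio_probs.items()
    }
    total = sum(weighted.values())
    return {species: w / total for species, w in weighted.items()}


# Hypothetical values: the audio model slightly prefers one species,
# but that species is unlikely to occur at the recording location.
audio_probs = {"Cossypha caffra": 0.40, "Cercotrichas leucophrys": 0.45, "other": 0.15}
suitability = {"Cossypha caffra": 0.80, "Cercotrichas leucophrys": 0.10, "other": 0.50}

fused = fuse_audio_and_niche(audio_probs, suitability)
best_audio_only = max(audio_probs, key=audio_probs.get)
best_fused = max(fused, key=fused.get)
print(best_audio_only, "->", best_fused)
```

With these illustrative numbers the top-ranked species changes once suitability is taken into account, which is the kind of correction the abstract describes for similar-sounding species with different distributions.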
Jiangjian Xie, Chen Shen, Feiyu Zhang, Zhishu Xiao (2024) Cross-regional bird species recognition method integrating audio and ecological niche information. Biodiversity Science, 32, 24259. DOI: 10.17520/biods.2024259.
Fig. 1 Map of the bird audio files used in this study. Orange dots mark the recording locations of all bird audio used.
Layer name | Output size | Layer parameters |
---|---|---|
Input | 3 × 80 × 157 | - |
conv1_x | 64 × 40 × 79 | |
conv2_x | 64 × 20 × 40 | |
Double stacked convolutional module | | |
conv3_x | 128 × 10 × 20 | |
conv4_x | 256 × 5 × 10 | |
conv5_x | 512 × 3 × 5 | |
Average pool | 512 × 1 × 1 | Adaptive average pool |
Output | 1 × 156 | Fully connected layer: Softmax |
Table 1 Structure and parameters of the audio recognition model
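The output sizes in Table 1 can be reproduced with the usual convolution size formula, floor((n + 2p - k) / s) + 1. The sketch below assumes the standard ResNet18 kernel, stride, and padding settings for each stage; these are an assumption, because the layer-parameter cells of Table 1 are empty on this page.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


# (stage name, kernel, stride, padding) under standard ResNet18 settings.
# These hyperparameters are assumed; Table 1 does not list them here.
STAGES = [
    ("conv1_x", 7, 2, 3),  # 7x7 convolution, stride 2
    ("conv2_x", 3, 2, 1),  # 3x3 max pooling, stride 2 (the blocks keep the size)
    ("conv3_x", 3, 2, 1),  # first 3x3 convolution of the stage has stride 2
    ("conv4_x", 3, 2, 1),
    ("conv5_x", 3, 2, 1),
]

height, width = 80, 157  # spatial size of the 3 x 80 x 157 input
sizes = {}
for name, kernel, stride, padding in STAGES:
    height = conv_out(height, kernel, stride, padding)
    width = conv_out(width, kernel, stride, padding)
    sizes[name] = (height, width)
    print(name, height, "x", width)
```

Under these assumed settings the computed spatial sizes match the Output size column of Table 1 stage by stage.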
Fig. 2 Examples of species distribution predictions. Colors from blue to red indicate a predicted probability of species presence from 0 to 1.
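A predicted distribution map of this kind can be turned into niche information for a recording by reading the map value at the recording's coordinates. The sketch below is a hypothetical illustration only: it assumes the prediction is stored as a regular latitude/longitude grid with the northernmost row first, and the function name `suitability_at`, the grid layout, the cell size, and the toy values are all assumptions rather than details taken from the paper.

```python
def suitability_at(grid, lat, lon, lat_max=90.0, lon_min=-180.0, cell_deg=1.0):
    """Return the suitability value of the grid cell containing (lat, lon).

    grid: list of rows; row 0 is the northernmost band, column 0 the westernmost.
    """
    row = int((lat_max - lat) // cell_deg)
    col = int((lon - lon_min) // cell_deg)
    # Clamp so that points on the southern or eastern edge stay inside the grid.
    row = min(max(row, 0), len(grid) - 1)
    col = min(max(col, 0), len(grid[0]) - 1)
    return grid[row][col]


# Toy 2 x 2 grid covering latitude 0..2 and longitude 0..2 in 1-degree cells.
toy_grid = [
    [0.1, 0.9],  # latitude band 1..2
    [0.3, 0.6],  # latitude band 0..1
]

north_east = suitability_at(toy_grid, lat=1.5, lon=1.2, lat_max=2.0, lon_min=0.0)
south_west = suitability_at(toy_grid, lat=0.2, lon=0.7, lat_max=2.0, lon_min=0.0)
print(north_east, south_west)
```

Repeating the lookup for each candidate species yields one suitability value per species for a given recording location.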
Model name | Top-1 accuracy | Top-5 accuracy | Near species error rate | Near genus error rate | Near family error rate |
---|---|---|---|---|---|
ResNet18 | 0.6140 | 0.8007 | 0.0531 | 0.0810 | 0.2518 |
NicheNet | 0.7432 | 0.9062 | 0.0220 | 0.0630 | 0.1719 |
Table 2 Performance comparison of the different models
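The improvements quoted in the abstract are absolute differences between the two rows of Table 2, expressed in percentage points. A short check, using `Decimal` so that values such as 10.55 round half up exactly rather than depending on binary floating point (the dictionary keys are illustrative names, not from the paper):

```python
from decimal import Decimal, ROUND_HALF_UP

# Values copied from Table 2 as strings so that Decimal keeps them exact.
resnet18 = {"top1": "0.6140", "top5": "0.8007",
            "near_species": "0.0531", "near_genus": "0.0810", "near_family": "0.2518"}
nichenet = {"top1": "0.7432", "top5": "0.9062",
            "near_species": "0.0220", "near_genus": "0.0630", "near_family": "0.1719"}


def change_in_points(metric):
    """Absolute change (NicheNet minus ResNet18) in percentage points, one decimal."""
    diff = (Decimal(nichenet[metric]) - Decimal(resnet18[metric])) * 100
    return diff.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


changes = {metric: change_in_points(metric) for metric in resnet18}
for metric, delta in changes.items():
    print(f"{metric}: {delta} points")
```

Positive values are accuracy gains and negative values are reductions in error rate.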
Fig. 5 Comparison of the number of recognition errors made by the different models. ResNet18, 18-layer residual neural network; NicheNet, the recognition model integrating audio and niche information.
Fig. 6 Spectrograms and distribution prediction results of Cossypha caffra and Cercotrichas leucophrys. In panels (c) and (d), colors from blue to red indicate a predicted probability of species presence from 0 to 1.
Model name | Number of misidentified samples (%) | Number of near genus misidentified samples (%) | Number of samples misidentified as Cercotrichas leucophrys (%) |
---|---|---|---|
ResNet18 | 283 (41.6%) | 124 (18.2%) | 68 (10.0%) |
NicheNet | 188 (27.6%) | 46 (6.8%) | 14 (2.1%) |
Table 3 Misidentification of Cossypha caffra
Fig. 7 Spectrograms and distribution prediction results of Delichon urbicum and Hirundo rustica. In panels (c) and (d), colors from blue to red indicate a predicted probability of species presence from 0 to 1.
Model name | Number of misidentified samples (%) | Number of near genus misidentified samples (%) | Number of samples misidentified as Hirundo rustica (%) |
---|---|---|---|
ResNet18 | 328 (35.2%) | 104 (11.2%) | 56 (6.0%) |
NicheNet | 314 (33.7%) | 98 (10.5%) | 48 (5.1%) |
Table 4 Misidentification of Delichon urbicum
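One way to compare the two species pairs is the relative reduction in misidentified samples when moving from ResNet18 to NicheNet. The sketch below uses only the sample counts reported in Tables 3 and 4; the dictionary keys and the helper name `relative_reduction` are illustrative.

```python
# (ResNet18 count, NicheNet count) copied from Tables 3 and 4.
counts = {
    "Cossypha caffra": {
        "all": (283, 188),
        "near_genus": (124, 46),
        "as_similar_species": (68, 14),  # misidentified as Cercotrichas leucophrys
    },
    "Delichon urbicum": {
        "all": (328, 314),
        "near_genus": (104, 98),
        "as_similar_species": (56, 48),  # misidentified as Hirundo rustica
    },
}


def relative_reduction(before, after):
    """Percentage of the ResNet18 errors that NicheNet removes, one decimal."""
    return round(100 * (before - after) / before, 1)


reductions = {
    species: {kind: relative_reduction(*pair) for kind, pair in kinds.items()}
    for species, kinds in counts.items()
}
for species, kinds in reductions.items():
    print(species, kinds)
```

On these counts the reduction is far larger for Cossypha caffra than for Delichon urbicum, so the benefit of the niche information differs considerably between the two pairs.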