Biodiversity Science, 2023, Vol. 31, Issue 1: 22370. DOI: 10.17520/biods.2022370

• Special Feature: Vocalization Monitoring and Bioacoustics Research on Wild Vertebrates in China •

Syllable clustering analysis-based passive acoustic monitoring technology and its application in bird monitoring

Keyi Wu1, Wenda Ruan1, Difeng Zhou1, Qingchen Chen1,*, Chengyun Zhang1, Xinyuan Pan2, Shang Yu3, Yang Liu4, Rongbo Xiao5

  1. School of Electronics and Communication Engineering, Guangzhou University, Guangzhou 510006
  2. College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou 510642
  3. Guangzhou Naturesense Ecological Technology Co., Ltd., Guangzhou 510630
  4. School of Ecology, Sun Yat-sen University, Guangzhou 510006
  5. School of Environmental Science and Engineering, Guangdong University of Technology, Guangzhou 510006
  • Received: 2022-06-30 Accepted: 2022-11-24 Published: 2023-01-20 Online: 2022-12-02
  • Contact: *Qingchen Chen, E-mail: qcchen@gzhu.edu.cn
  • Funding:
    Guangzhou Basic Research Program, City-University (Institute) Joint Funding Project (202201020141); National Natural Science Foundation of China (32171520)

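Summary:

Passive acoustic monitoring identifies species by analyzing the information in bird vocalizations, and it offers a practical technical approach to monitoring bird diversity. Because bird vocalizations are complex and variable, voiceprint-based bird diversity monitoring faces several technical challenges: how to distinguish species quickly and accurately from voiceprints, how to analyze bird abundance, and how to reduce the need for manual work. This paper proposes a bird vocalization monitoring framework based on syllable clustering. Syllables are first extracted from voiceprint data using audio features such as pitch and spectral flatness. The syllables are then used for deep unsupervised clustering training that combines unsupervised representation learning with a Dirichlet process mixture model, which yields the syllable clusters and automatically infers the syllable types. The results show that the proposed framework achieves a clustering accuracy of nearly 90% on the song repertoire of the white-rumped munia (Lonchura striata) from an open-source dataset. Building on this, the study applied unsupervised syllable clustering to the vocalizations of 10 bird species recorded at a fixed monitoring site in Baiyun Mountain Park, Guangzhou, in April and May 2022, which verified the effectiveness of the framework: the technique not only supports rapid species identification but can also count and analyze changes in the timing, frequency, and number of vocalizations of different species. These results indicate that the framework can significantly reduce the need for manually annotated training data, overcomes the difficulty that traditional bird song identification frameworks have in identifying multiple species in overlapping vocalizations, and provides a comprehensive solution for passive acoustic monitoring of bird diversity that covers rapid species identification, syllable sequence analysis, and fine-grained population abundance analysis.

Keywords: passive acoustic monitoring, bird syllables, deep unsupervised clustering, species identification, syllable sequence analysis, abundance analysis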

Abstract

Aims: Passive acoustic monitoring is an effective method for monitoring bird diversity because it identifies species from the information carried in bird songs and calls. However, the complexity and variability of these vocalizations make it difficult to identify species quickly and accurately, to estimate abundance, and to reduce the amount of manual work involved. Solving these problems is essential for implementing a voiceprint-based bird diversity monitoring scheme.

Methods: This paper proposes a bird song/call monitoring framework based on syllable clustering analysis. The first step is to extract syllables from voiceprint data using audio features such as pitch and spectral flatness. A deep unsupervised clustering model that combines unsupervised representation learning with a Dirichlet process mixture model is then trained on these syllables. The final steps are clustering the syllables and automatically inferring their categories.
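A Dirichlet process mixture does not require the number of clusters to be fixed in advance, which is what makes automatic inference of syllable categories possible. As a rough illustration of that idea only, the sketch below uses DP-means, a simplified hard-assignment relative of Dirichlet process mixtures, and not the deep clustering model described in this paper. The function `dp_means`, the `penalty` value, and the 2-D "syllable embeddings" are hypothetical and chosen purely for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def dp_means(points, penalty, max_iter=100):
    """Cluster points without fixing the number of clusters beforehand.

    A point whose squared distance to every existing center exceeds
    `penalty` opens a new cluster; otherwise it joins the nearest one.
    """
    n = len(points)
    # Start from a single cluster centered on the global mean.
    centers = [tuple(sum(col) / n for col in zip(*points))]
    labels = [0] * n
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [squared_distance(p, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > penalty:
                centers.append(tuple(p))  # open a new cluster at this point
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Recompute each center as the mean of its members; drop empty clusters.
        members = {}
        for lab, p in zip(labels, points):
            members.setdefault(lab, []).append(p)
        kept = sorted(members)
        renumber = {old: new for new, old in enumerate(kept)}
        labels = [renumber[lab] for lab in labels]
        centers = [
            tuple(sum(col) / len(members[old]) for col in zip(*members[old]))
            for old in kept
        ]
        if not changed:
            break
    return labels, centers


# Hypothetical 2-D embeddings of nine syllables (three well-separated groups).
EMBEDDINGS = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.2),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (0.0, 5.0), (0.1, 5.1), (0.2, 4.8),
]

labels, centers = dp_means(EMBEDDINGS, penalty=4.0)
print(len(centers), labels)
```

The `penalty` parameter plays a role loosely similar to the concentration parameter of a Dirichlet process: a smaller value makes it easier to open new clusters, so more syllable categories are inferred.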

Results: (1) The proposed framework achieves a clustering accuracy of nearly 90% on the song repertoire of Lonchura striata from a published open-source dataset. (2) On this basis, unsupervised syllable clustering analysis was applied to the vocalizations of ten bird species recorded in Baiyun Mountain Forest Park, Guangzhou, in April and May 2022. The analysis verifies that the proposed framework not only supports rapid species identification, but can also be used to count and analyze changes in the timing, frequency, and number of songs/calls of different species.
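Because cluster IDs produced by an unsupervised method are arbitrary, a clustering accuracy has to be computed after matching clusters to reference syllable types. The sketch below shows one common definition, the fraction of items that agree under the best one-to-one mapping between clusters and classes; whether the paper uses exactly this definition is an assumption. The function `clustering_accuracy` and the toy labels are hypothetical.

```python
from collections import Counter
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Accuracy under the best one-to-one mapping of clusters to classes.

    Uses brute force over permutations, so it is only suitable for a
    small number of clusters.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    # Pad the shorter side with None so surplus clusters or classes
    # can simply stay unmatched (they contribute zero agreement).
    size = max(len(classes), len(clusters))
    classes = classes + [None] * (size - len(classes))
    clusters = clusters + [None] * (size - len(clusters))
    # counts[(cluster, class)] = number of items with that pair.
    counts = Counter(zip(cluster_ids, true_labels))
    best = 0
    for assignment in permutations(classes):
        agreed = sum(counts[pair] for pair in zip(clusters, assignment))
        best = max(best, agreed)
    return best / len(true_labels)


# Toy example: ten syllables of three reference types, clustered imperfectly.
reference = ["a", "a", "a", "b", "b", "b", "c", "c", "c", "c"]
predicted = [2, 2, 0, 0, 0, 0, 1, 1, 1, 2]

accuracy = clustering_accuracy(reference, predicted)
print(accuracy)  # 0.8
```

For larger numbers of clusters, the same matching is normally solved with the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`) instead of enumerating permutations.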

Conclusion: The results show that the syllable clustering-based bird song/call monitoring framework can significantly reduce the need for manually annotated training data. It also overcomes the difficulty that traditional frameworks have in identifying multiple species in overlapping bird songs. It therefore provides a comprehensive solution for rapid species recognition, syllable sequence analysis, and population abundance analysis in bird diversity monitoring.

Key words: passive acoustic monitoring, bird syllables, deep unsupervised clustering, species identification, syllable sequence analysis, abundance analysis