Biodiversity Science ›› 2023, Vol. 31 ›› Issue (11): 23272.  DOI: 10.17520/biods.2023272

• Technology and Methods •

Research progress of birdsong recognition algorithms based on machine learning

Xiaohu Shen1,2,*, Xiangyu Zhu1, Hongfei Shi2, Chuanzhi Wang3

  1. Department of Forensic Science and Technology, Jiangsu Police Institute, Nanjing 210031
  2. National Forestry and Grassland Administration, Key Laboratory of State Forest and Grassland Administration on Wildlife Evidence Technology, Nanjing 210023
  3. iFLYTEK CO. LTD., Hefei 230088
  • Received: 2023-07-31 Accepted: 2023-10-12 Online: 2023-11-20 Published: 2023-12-08
  • Contact: * E-mail: shenxiaohu@jspi.cn
  • Funding:
    Open Project of the Key Laboratory of State Forest and Grassland Administration on Wildlife Evidence Technology (KLNPC2102); platform funding from the Jiangsu Provincial Engineering Research Center for Intelligent Networking of Traffic Safety Facilities

Abstract

Background & Aim: Birds occupy the upper levels of the ecological food chain and serve as important indicators of environmental quality and pollution. However, monitoring the status and trends of bird diversity in ecosystems remains a significant challenge. Establishing an all-weather bird diversity detection system requires a widely applicable machine learning-based birdsong recognition algorithm. To clarify the current state of research on machine learning-based birdsong recognition algorithms and their development trends, we introduce the fundamental concepts of birdsong recognition and provide an overview of these algorithms from the perspective of model structure design.

Summary: Given the interdisciplinary nature of machine learning-based birdsong recognition technology, we classify the algorithms by research direction into the following categories: probabilistic model, template matching, time series analysis, transfer learning, data fusion, ensemble learning, metric learning, and unsupervised clustering. We review the technical development of each category as applied to birdsong recognition tasks. We then analyze the characteristics and limitations of these algorithms and compare their effectiveness in birdsong recognition. We also discuss commonly used standardized open-source birdsong datasets and evaluation metrics. Finally, we outline the challenges facing existing methods and identify potential future research directions in this field.

Perspectives: We aim to provide scholars and developers engaged in birdsong recognition research with a comprehensive reference framework that helps them better understand existing technologies and potential development trends. At present, the accuracy and robustness of machine learning-based birdsong recognition methods still need improvement, especially for large-scale data samples. In addition, the promotion and application of these methods still face several unresolved challenges. Future investigations should focus on the following aspects: (1) optimization and improvement of models; (2) integration of multimodal data; (3) application of transfer learning; (4) expansion of application scenarios; and (5) establishment and standardization of databases.

Key words: birdsong recognition, machine learning, deep learning, bird diversity, birdsong datasets, evaluation metrics