Advances in bird sound annotation methods for passive acoustic monitoring

doi:10.17520/biods.2024313

Biodiversity Science, 2024, Vol. 32, Issue 10: 24313. DOI: 10.17520/biods.2024313. CSTR: 32101.14.biods.2024313

Advances in bird sound annotation methods for passive acoustic monitoring

Qianrong Guo¹, Shufei Duan¹,*, Jie Xie², Xueyan Dong³, Zhishu Xiao⁴

1. College of Electronic Information Engineering, Taiyuan University of Technology, Taiyuan 030600, China
2. College of Computer and Electronic Information/College of Artificial Intelligence, Nanjing Normal University, Nanjing 210023, China
3. College of Special Education, Beijing Union University, Beijing 100075, China
4. State Key Laboratory of Integrated Management of Pest Insects and Rodents in Agriculture, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China

Received: 2024-07-12; Accepted: 2024-09-27; Online: 2024-10-20; Published: 2024-12-09
*Corresponding author, E-mail: duanshufei@tyut.edu.cn
Supported by:
National Natural Science Foundation of China (32371556); National Natural Science Foundation of China (12004275); Natural Science Foundation of Shanxi Province (202403021211098); Shanxi Scholarship Council of China (2024-060)

Supplementary material: Appendix.pdf (255 KB)

Abstract

Background & Aim: Bird sound annotation marks bird-related information in audio data, such as species identity and sound structure. It is a crucial foundation for passive acoustic monitoring of birds, for the analysis of bird acoustic data, and for automatic species identification and classification. This review aims to help bird sound dataset creators and annotators better understand existing annotation techniques and their likely development trends, and to provide technical support for efficient automatic species identification in large-scale avian acoustic monitoring data.

Summary: This paper compares the advantages of common annotation methods, namely manual, automatic, and semi-automatic annotation, and highlights the challenges each faces in data quality, annotation consistency, and annotation efficiency. It also discusses progress in applying these methods to passive acoustic monitoring and proposes future directions: optimizing automatic annotation models, establishing cross-regional datasets, and improving semi-automatic annotation systems.

Perspectives: Despite significant progress in automatic annotation methods, bird sound annotation still faces the cold-start problem. The field urgently needs larger-scale cross-regional datasets and efficient semi-automatic annotation systems with quality checking to meet the dual demands of annotation quantity and quality.

Key words: bird sound dataset, manual annotation, semi-automatic annotation, automatic annotation, bird sound recognition, passive acoustic monitoring

Qianrong Guo, Shufei Duan, Jie Xie, Xueyan Dong, Zhishu Xiao (2024) Advances in bird sound annotation methods for passive acoustic monitoring. Biodiversity Science, 32, 24313. DOI: 10.17520/biods.2024313.

Link to this article: https://www.biodiversity-science.net/CN/10.17520/biods.2024313

https://www.biodiversity-science.net/CN/Y2024/V32/I10/24313

Figures/Tables: 7

References: 148

[1]	Abdullah NA, Rahman MM, Rahman MM, Ghauth KI (2020) A framework for optimal worker selection in spatial crowdsourcing using Bayesian network. IEEE Access, 8, 120218-120233.
[2]	Abraham I, Alonso O, Kandylas V, Patel R, Shelford S, Slivkins A (2016) How many workers to ask? Adaptive exploration for collecting high quality labels. In:Proceedings of the 39th International ACM SIGIR Conference on Research Development in Information Retrieval, pp. 473-482. Pisa, Italy.
[3]	Acconcjaioco M, Ntalampiras S (2021) One-shot learning for acoustic identification of bird species in non-stationary environments. In:Proceedings of 2020 25th International Conference on Pattern Recognition, pp. 755-762. Milan, Italy.
[4]	Akter SN, Sinthia AK, Roy P, Razzaque MA, Hassan MM, Pupo F, Fortino G (2023) Reputation aware optimal team formation for collaborative software crowdsourcing in industry 5.0. Journal of King Saud University-Computer and Information Sciences, 35, 101710.
[5]	Allahbakhsh M, Benatallah B, Ignjatovic A, Motahari-Nezhad HR, Bertino E, Dustdar S (2013) Quality control in crowdsourcing systems: Issues and directions. IEEE Internet Computing, 17, 76-81.
[6]	Arriaga JG, Cody ML, Vallejo EE, Taylor CE (2015) Bird-DB: A database for annotated bird song sequences. Ecological Informatics, 27, 21-25.
[7]	Awwad T, Bennani N, Ziegler K, Sonigo V, Brunie L, Kosch H (2017) Efficient worker selection through history-based learning in crowdsourcing. In: 2017 IEEE 41st Annual Computer Software and Applications Conference, pp. 923-928. Turin, Italy.
[8]	Aydin BI, Yilmaz YS, Li YL, Li Q, Gao J, Demirbas M (2014) Crowdsourcing for multiple-choice question answering. In:Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, pp. 2946-2953. Québec, Canada.
[9]	Bastas S, Majid MW, Mirzaei G, Ross J, Jamali MM, Gorsevski PV, Frizado J, Bingman VP (2012) A novel feature extraction algorithm for classification of bird flight calls. In: 2012 IEEE International Symposium on Circuits and Systems, pp. 1676-1679. Seoul, Korea (South).
[10]	Bian Q, Wang C, Cheng H, Han D, Zhao YL, Yin LQ (2023) Exploring the application of acoustic indices in the assessment of bird diversity in urban forests. Biodiversity Science, 31, 22080. (in Chinese with English abstract) DOI
	[边琦, 王成, 程贺, 韩丹, 赵伊琳, 殷鲁秦 (2023) 声学指数在城市森林鸟类多样性评估中的应用. 生物多样性, 31, 22080.] DOI
[11]	Bravo Sanchez FJ, Hossain MR, English NB, Moore ST (2021) Bioacoustic classification of avian calls from raw sound waveforms with an open-source deep learning architecture. Scientific Reports, 11, 15733. DOI PMID
[12]	Briggs F, Fern XZ, Raich R (2012) Rank-loss support instance machines for MIML instance annotation. In: 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 534-542. Association for Computing Machinery, Beijing, China.
[13]	Briggs F, Huang YH, Raich R, Eftaxias K, Lei Z, Cukierski W, Hadley SF, Hadley A, Betts M, Fern XZ, Irvine J, Neal L, Thomas A, Fodor G, Tsoumakas G, Ng HW, Nguyen TNT, Huttunen H, Ruusuvuori P, Manninen T, Diment A, Virtanen T, Marzat J, Defretin J, Callender D, Hurlburt C, Larrey K, Milakov M (2013) The 9th annual MLSP competition: New methods for acoustic classification of multiple simultaneous bird species in a noisy environment. In: 2013 IEEE International Workshop on Machine Learning for Signal Processing, pp. 1-8. Southampton, UK.
[14]	Cakmak M, Chao C, Thomaz AL (2010) Designing interactions for robot active learners. IEEE Transactions on Autonomous Mental Development, 2, 108-118.
[15]	Callaghan W, Goh J, Mohareb M, Lim A, Law E (2018) Mechanicalheart:A human-machine framework for the classification of phonocardiograms. In: Proceedings of the ACM on Human-Computer Interaction, pp. 1-17. New York, United States.
[16]	Cañas JS, Toro-Gómez MP, Sugai LSM, Benítez Restrepo HD, Rudas J, Posso Bautista B, Toledo LF, Dena S, Rosa Domingos AH, de Souza FL, Neckel-Oliveira S, da Rosa A, Carvalho-Rocha V, Bernardy JV, Sugai JLMM, dos Santos CE, Pereira Bastos R, Llusia D, Ulloa JS (2023) A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring. Scientific Data, 10, 771. DOI PMID
[17]	Cartwright M, Mendez AEM, Cramer A, Lostanlen V, Dove G, Wu HH, Salamon J, Nov O, Bello J (2019) SONYC urban sound tagging (SONYC-UST):A multilabel dataset from an urban acoustic sensor network. In: Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, pp. 35-39. New York, USA.
[18]	Chakraborty D, Mukker P, Rajan P, Dileep AD (2016) Bird call identification using dynamic kernel based support vector machines and deep neural networks. In: 2016 15th IEEE International Conference on Machine Learning and Applications, pp. 280-285. Anaheim, CA, USA.
[19]	Chen G, Xia CW, Zhang YY (2020) Individual identification of birds with complex songs: The case of green-backed flycatchers Ficedula elisae. Behavioural Processes, 173, 104063.
[20]	Clark ML, Salas L, Baligar S, Quinn CA, Snyder RL, Leland D, Schackwitz W, Goetz SJ, Newsam S (2023) The effect of soundscape composition on bird vocalization classification in a citizen science biodiversity monitoring project. Ecological Informatics, 75, 102065.
[21]	Colonna JG, Carvalho JRH, Rosso OA (2020) Estimating ecoacoustic activity in the Amazon rainforest through information theory quantifiers. PLoS ONE, 15, e0229425.
[22]	Colonna JG, Cristo M, Salvatierra M Jr, Nakamura EF (2015) An incremental technique for real-time bioacoustic signal segmentation. Expert Systems with Applications, 42, 7367-7374.
[23]	Connor EF, Li SD, Li S (2012) Automating identification of avian vocalizations using time-frequency information extracted from the Gabor transform. Journal of the Acoustical Society of America, 132, 507-517. DOI PMID
[24]	Cragg JL, Burger AE, Piatt JF (2015) Testing the effectiveness of automated acoustic sensors for monitoring vocal activity of marbled murrelets Brachyramphus marmoratus. Marine Ornithology, 43, 151-160.
[25]	da Silva JL, Tabata AN, Broto LC, Cocron MP, Zimmer A, Brandmeier T (2020) Open source multipurpose multimedia annotation tool. In: Image Analysis and Recognition (eds Campilho A, Karray F, Wang Z), pp.356-367. Springer, Cham.
[26]	Dai P, Rzeszotarski JM, Paritosh P, Chi EH (2015) And now for something completely different:Improving crowdsourcing workflows with micro-diversions. In: CSCW’15:Computer Supported Cooperative Work and Social Computing, pp. 628-638. Vancouver, Canada.
[27]	Darras KFA, Pérez N, Liu DL, Hanf-Dressler T, Markolf M, Wanger TC, Cord AF (2020) ecoSound-web: An open-source, online platform for ecoacoustics. F1000 Research, 9, 1224.
[28]	Das N, Mondal A, Chaki J, Padhy N, Dey N (2020) Machine learning models for bird species recognition based on vocalization: A succinct review. In: Information Technology Intelligent Transportation Systems, pp.1-9, Xi’an, China.
[29]	Domínguez M, Latorre I, Farrús M, Codina-Filba J, Wanner L (2016) Praat on the web:An upgrade of praat for semi-automatic speech annotation. In: The 26th International Conference on Computational Linguistics:System Demonstrations, pp. 218-222. Osaka, Japan.
[30]	Dong XY, Jia JP (2020) Advances in automatic bird species recognition from environmental audio. In:2020 5th International Conference on Intelligent Computing and Signal Processing. Suzhou, China.
[31]	Dooling RJ, Prior NH (2017) Do we hear what birds hear in birdsong? Animal Behaviour, 124, 283-289. DOI PMID
[32]	Duan SF (2014) Automated Species Recognition in Environmental Recordings. PhD dissertation, Queensland University of Technology, Brisbane.
[33]	Duan SF, Zhang JL, Roe P, Wimmer J, Dong XY, Truskinger A, Towsey M (2013) Timed probabilistic automaton:A bridge between Raven and Song Scope for automatic species recognition. In: Conference on Innovative Applications of Artificial Intelligence, pp. 1519-1524. Bellevue, Washington, USA.
[34]	Fagerlund S (2004) Automatic Recognition of Bird Species by Their Sound. PhD dissertation, Helsinki University of Technology, Espoo.
[35]	Ganchev TD, Jahn O, Marques MI, de Figueiredo JM, Schuchmann KL (2015) Automated acoustic detection of Vanellus chilensis lampronotus. Expert Systems with Applications, 42, 6098-6111.
[36]	Geng H, Li DF, Jiang JC (2005) Avian vocal organ and vocal control mechanism. Acta Biochimica et Biophysica Sinica, 21, 397-403. (in Chinese with English abstract)
	[耿慧, 李东风, 蒋锦昌 (2005) 鸟类的发声器官及其调控机制. 生物物理学报, 21, 397-403.]
[37]	Goëau H, Glotin H, Vellinga WP, Planqué R, Rauber A, Joly A (2014) Lifeclef bird identification task 2014. In: Conference and Labs of the Evaluation Forum, pp. 585-597. Sheffield, UK.
[38]	Guo AQ, Chen Y, Liu YK, Zhang X, Yu XW, Luo L, Gao JJ, Yang CY (2022) Song rhythm of Nomascus hainanus and its relationship with meteorological factors. Terrestrial Ecosystems and Conservation, 2(6), 40-50. (in Chinese with English abstract)
	[郭安琪, 陈艳, 刘昱坤, 张旭, 于新文, 罗丽, 高家军, 杨蔡芸 (2022) 海南长臂猿鸣叫节律及其与气象因子的关系. 陆地生态系统与保护学报, 2(6), 40-50.]
[39]	Gupta G, Kshirsagar M, Zhong M, Gholami S, Lavista Ferres J (2021) Comparing recurrent convolutional neural networks for large scale bird species classification. Scientific Reports, 11, 17085. DOI PMID
[40]	He KM, Zhang XY, Ren SQ, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770-778. Las Vegas, NV, USA.
[41]	Hossfeld T, Keimel C, Timmerer C (2014) Crowdsourcing quality-of-experience assessments. Computer, 47, 98-102.
[42]	Hu SP, Chu YH, Tang L, Zhou GX, Chen AB, Sun YR (2023) A lightweight multi-sensory field-based dual-feature fusion residual network for bird song recognition. Applied Soft Computing, 146, 110678.
[43]	Hua XZ, Zhou R, Ye GH, Hua R, Bao D, Tang ZS, Hua LM (2020) Analysis and observation of behavior in plateau pika. Grassland and Turf, 40(6), 1-9. (in Chinese with English abstract)
	[华铣泽, 周睿, 叶国辉, 花蕊, 包达尔罕, 唐庄生, 花立民 (2020) 高原鼠兔长鸣鸣声分析及行为学观察. 草原与草坪, 40(6), 1-9.]
[44]	Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700-4708. Honolulu, HI, USA.
[45]	Huang SW, Fu WT (2013) Enhancing reliability using peer consistency evaluation in human computation. In: CSCW’13:Computer Supported Cooperative Work, pp. 639-648. San Antonio, Texas, USA.
[46]	Huang YR, Chen M (2019) Improve reputation evaluation of crowdsourcing participants using multidimensional index and machine learning techniques. IEEE Access, 7, 118055-118067.
[47]	Iqbal T, Cao Y, Plumbley M, Wang WW (2020) Incorporating auxiliary data for urban sound tagging. In: Detection and Classification of Acoustic Scenes and Events 2020 Challenge. Tokyo, Japan.
[48]	Irwin A (2002) Citizen Science: A Study of People, Expertise and Sustainable Development. Routledge, London.
[49]	Jiao YX, Lin ZK, Yu L, Wu XZ (2022) A fine-grain batching-based task allocation algorithm for spatial crowdsourcing. ISPRS International Journal of Geo- Information, 11, 203.
[50]	Jin YH, Wei MF, Li QX (2023) An RF fingerprint extraction method based on time-frequency domain feature fusion. In: 2022 International Conference on Algorithms, Network and Computer Technology, Wuhan, China.
[51]	Joly A, Champ J, Buisson O (2014) Instance-based bird species identification with undiscriminant features pruning. In: Conference and Labs of the Evaluation Forum Working Notes 2014, pp. 625-633. Sheffield, UK.
[52]	Kahl S, Wood CM, Eibl M, Klinck H (2021) BirdNET: A deep learning solution for avian diversity monitoring. Ecological Informatics, 61, 101236.
[53]	Kazai G, Kamps J, Milic-Frayling N (2012) The face of quality in crowdsourcing relevance labels: Demographics, personality and labeling accuracy. In: CIKM’12: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 2583-2586. Hawaii, USA.
[54]	Kim B, Pardo B (2018) A human-in-the-loop system for sound event detection and annotation. ACM Transactions on Interactive Intelligent Systems, 8, 1-23.
[55]	Kim MJ, Kim H (2012) Audio-based objectionable content detection using discriminative transforms of time-frequency dynamics. IEEE Transactions on Multimedia, 14, 1390-1400.
[56]	Koluguri NR, Meenakshi GN, Ghosh PK (2017) Spectrogram enhancement using multiple window Savitzky-Golay (MWSG) filter for robust bird sound detection. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25, 1183-1192.
[57]	Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60, 84-90.
[58]	Kulkarni A, Can M, Hartmann B (2012) Collaboratively crowdsourcing workflows with turkomatic. In: CSCW’12: Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, pp. 1003-1012. Seattle, Washington, USA.
[59]	Lasseck M (2015) Improved automatic bird identification through decision tree based feature selection and bagging. In:Conference and Labs of the Evaluation Forum Working Notes 2015. Toulouse, France.
[60]	Lasseck M (2019) Bird species identification in soundscapes. In:Conference and Labs of the Evaluation Forum Working Notes 2019. Lugano, Switzerland.
[61]	LeBien J, Zhong M, Campos-Cerqueira M, Velev JP, Dodhia R, Lavista Ferres J, Aide TM (2020) A pipeline for identification of bird and frog species in tropical soundscape recordings using a convolutional neural network. Ecological Informatics, 59, 101113.
[62]	LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, 2278-2324.
[63]	Lee CH, Hsu SB, Shih JL, Chou CH (2013) Continuous birdsong recognition using Gaussian mixture modeling of image shape features. IEEE Transactions on Multimedia, 15, 454-464.
[64]	Li SY, Wei ML, Huang SJ (2022) Deep generative crowdsourcing learning with worker correlation utilization. Journal of Software, 33, 1274-1286. (in Chinese with English abstract)
	[李绍园, 韦梦龙, 黄圣君 (2022) 利用标注者相关性的深度生成式众包学习. 软件学报, 33, 1274-1286.]
[65]	Lin YH, Chen YY, Rubenstein DR, Liu M, Liu M, Shen SF (2023) Environmental quality mediates the ecological dominance of cooperatively breeding birds. Ecology Letters, 26, 1145-1156.
[66]	Liu HH, Liu XB, Mei XH, Kong QQ, Wang WW, Plumbley MD (2022) Surrey system for DCASE 2022 task 5: Few-shot bioacoustic event detection with segment-level metric learning. arXiv, 2207.10547.
[67]	Liu Y, Xie SN, Zhong ZW, Li JB, Ren QQ (2018) Topic-interest based influence maximization algorithm in social networks. Journal of Computer Research and Development, 55, 2406-2418. (in Chinese with English abstract)
	[刘勇, 谢胜男, 仲志伟, 李金宝, 任倩倩 (2018) 社会网中基于主题兴趣的影响最大化算法. 计算机研究与发展, 55, 2406-2418.]
[68]	Machado RB, Aguiar L, Jones G (2017) Do acoustic indices reflect the characteristics of bird communities in the savannas of Central Brazil? Landscape and Urban Planning, 162, 36-43.
[69]	Marin-Cudraz T, Muffat-Joly B, Novoa C, Aubry P, Desmet JF, Mahamoud-Issa M, Nicolè F, Van Niekerk MH, Mathevon N, Sèbe F (2019) Acoustic monitoring of rock ptarmigan: A multi-year comparison with point-count protocol. Ecological Indicators, 101, 710-719. DOI
[70]	Martín-Morató I, Mesaros A (2021) What is the ground truth? Reliability of multi-annotator data for audio tagging. In: 2021 29th European Signal Processing Conference, pp. 76-80. Dublin, Ireland.
[71]	Méndez Méndez AE (2024) A Framework of Interaction Between Machine, Experts and Crowd Towards the Generation of Large-scale, High-quality Audio Datasets. PhD dissertation, New York University, New York.
[72]	Méndez Méndez AE, Cartwright M, Bello JP (2019) Machine- crowd-expert model for increasing user engagement and annotation quality. In: CHI EA’19: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, pp.1-6. Scotland, UK.
[73]	Mesaros A, Heittola T, Virtanen T, Plumbley MD (2021) Sound event detection: A tutorial. IEEE Signal Processing Magazine, 38, 67-83.
[74]	Mulimani M, Koolagudi SG (2019) Segmentation and characterization of acoustic event spectrograms using singular value decomposition. Expert Systems with Applications, 120, 413-425.
[75]	Narasimhan R, Fern XZ, Raich R (2017) Simultaneous segmentation and classification of bird song using CNN. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 146-150. New Orleans, LA, USA.
[76]	Ntalampiras S (2018) Bird species identification via transfer learning from music genres. Ecological Informatics, 44, 76-81.
[77]	Pahuja R, Kumar A (2021) Sound-spectrogram based automatic bird species recognition using MLP classifier. Applied Acoustics, 180, 108077.
[78]	Piczak KJ (2016) Recognizing bird species in audio recordings using deep convolutional neural networks. In:Conference and Labs of the Evaluation Forum Working Notes 2016, pp. 534-543. Évora, Portugal.
[79]	Pimm SL, Alibhai S, Bergl R, Dehgan A, Giri C, Jewell Z, Joppa L, Kays R, Loarie S (2015) Emerging technologies to conserve biodiversity. Trends in Ecology & Evolution, 30, 685-696.
[80]	Priyadarshani N, Marsland S, Castro I (2018) Automated birdsong recognition in complex acoustic environments: A review. Journal of Avian Biology, 49, 01447.
[81]	Ptacek L, Machlica L, Linhart P, Jaska P, Muller L (2016) Automatic recognition of bird individuals on an open set using as-is recordings. Bioacoustics, 25, 55-73.
[82]	Qiao Y, Qian K, Zhao ZP (2020) A survey on Chinese literature for bird sound recognition based on machine listening. Journal of Fudan University (Natural Science), 59, 375-380. (in Chinese with English abstract)
	[乔玉, 钱昆, 赵子平 (2020) 基于机器听觉的鸟声识别的中文研究综述. 复旦学报(自然科学版), 59, 375-380.]
[83]	Rabiner L, Juang B (1993) Fundamentals of Speech Recognition. Prentice-Hall, New Jersey.
[84]	Rahman MM, Abdullah NA (2023) A trustworthiness-aware spatial task allocation using a fuzzy-based trust and reputation system approach. Expert Systems with Applications, 211, 118592.
[85]	Rahman SMT, Sinthia AK, Akter SN, Roy P, Razzaque MA (2021) Reputation aware fair worker selection in collaborative software crowdsourcing. In: 2021 3rd International Conference on Sustainable Technologies for Industry 4.0 (STI), pp. 1-6. Dhaka, Bangladesh.
[86]	Randler C (2021) Users of a citizen science platform for bird data collection differ from other birdwatchers in knowledge and degree of specialization. Global Ecology and Conservation, 27, e01580.
[87]	Ren JF, Jiang XD, Yuan JS, Magnenat-Thalmann N (2017) Sound-event classification using robust texture features for robot hearing. IEEE Transactions on Multimedia, 19, 447-458.
[88]	Reynolds DA (1994) Experimental evaluation of features for robust speaker identification. IEEE Transactions on Speech and Audio Processing, 2, 639-643.
[89]	Rogowitz BE, Treinish LA, Bryson S (1996) How not to lie with visualization. Computers in Physics, 10, 268-273.
[90]	Ross SRPJ, O’Connell DP, Deichmann JL, Desjonquères C, Gasc A, Phillips JN, Sethi SS, Wood CM, Burivalova Z (2023) Passive acoustic monitoring provides a fresh perspective on fundamental ecological questions. Functional Ecology, 37, 959-975.
[91]	Ruff ZJ, Lesmeister DB, Jenkins JMA, Sullivan CM (2023) PNW-Cnet v4: Automated species identification for passive acoustic monitoring. SoftwareX, 23, 101473.
[92]	Ruiz-Muñoz JF, Castellanos-Dominguez G, Orozco-Alzate M (2016) Enhancing the dissimilarity-based classification of birdsong recordings. Ecological Informatics, 33, 75-84.
[93]	Sainath TN, Weiss RJ, Senior A, Wilson KW, Vinyals O (2015) Learning the speech front-end with raw waveform CLDNNs. In: Interspeech 2015, pp. 1-5. Dresden, Germany.
[94]	Salamon J, Bello JP, Farnsworth A, Kelling S (2017) Fusing shallow and deep learning for bioacoustic bird species classification. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 141-145. New Orleans, LA, USA.
[95]	Salamon J, Bello JP, Farnsworth A, Robbins M, Keen S, Klinck H, Kelling S (2016) Towards the automatic classification of avian flight calls for bioacoustic monitoring. PLoS ONE, 11, e0166866.
[96]	Settles B (2010) Active learning literature survey. Machine Learning, 15, 201-221.
[97]	Sevilla A, Glotin H (2017) Audio bird classification with Inception-v4 extended with time and time-frequency attention mechanisms. In: Conference and Labs of the Evaluation Forum Working Notes 2017, pp. 1-8. Dublin, Ireland.
[98]	Shen XH, Zhu XY, Shi HF, Wang CZ (2023) Research progress of birdsong recognition algorithms based on machine learning. Biodiversity Science, 31, 23272. (in Chinese with English abstract) DOI
	[申小虎, 朱翔宇, 史洪飞, 王传之 (2023) 基于机器学习鸟声识别算法研究进展. 生物多样性, 31, 23272.] DOI
[99]	Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations, p. 1556. Banff, Alberta, Canada.
[100]	Sinha R, Rajan P (2018) A deep autoencoder approach to bird call enhancement. In: 2018 IEEE 13th International Conference on Industrial and Information Systems, pp. 22-26. Rupnagar, India.
[101]	Somervuo P, Harma A, Fagerlund S (2006) Parametric representations of bird sounds for automatic species recognition. IEEE Transactions on Audio, Speech, and Language Processing, 14, 2252-2263.
[102]	Stowell D, Plumbley MD (2014a) Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning. PeerJ, 2, e448.
[103]	Stowell D, Plumbley MD (2014b) Large-scale analysis of frequency modulation in birdsong data bases. Methods in Ecology and Evolution, 5, 901-912.
[104]	Stowell D, Wood MD, Pamuła H, Stylianou Y, Glotin H (2019) Automatic acoustic detection of birds through deep learning: The first bird audio detection challenge. Methods in Ecology and Evolution, 10, 368-380. DOI
[105]	Sueur J, Pavoine S, Hamerlynck O, Duvail S (2008) Rapid acoustic survey for biodiversity appraisal. PLoS ONE, 3, e4065.
[106]	Sugai LSM, Llusia D (2019) Bioacoustic time capsules: Using acoustic monitoring to document biodiversity. Ecological Indicators, 99, 149-152.
[107]	Sun R, Marye YW, Zhao HA (2013) Wavelet transform digital sound processing to identify wild bird species. In: 2013 International Conference on Wavelet Analysis and Pattern Recognition, pp. 306-309. Tianjin, China.
[108]	Szegedy C, Liu W, Jia YQ, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9. Boston, MA, USA.
[109]	Tacchetti M (2017) User’s Guide for ELAN Linguistic Annotator, 5th edn. https://www.mpi.nl/corpus/manuals/manual-elan_ug.pdf (accessed on 2024-06-12)
[110]	Tan MX, Le QV (2019) EfficientNet: Rethinking model scaling for convolutional neural networks. arXiv, 1905.11946.
[111]	Tang JG, Zhang XY, Gao T, Liu DY, Fang X, Pan J, Wang Q, Du J, Xu KL, Pan QH (2022) Few-shot embedding learning and event filtering for bioacoustic event detection. In: Detection and Classification of Acoustic Scenes and Events 2022 Challenge. Nancy, France.
[112]	Tang XY, Yu W, Li SJ (2017) D3MOPSO: An evolutionary method for metasearch rank aggregation based on user preferences. Journal of Computer Research and Development, 54, 1665-1681. (in Chinese with English abstract)
	[汤小月, 余伟, 李石君 (2017) D3MOPSO: 一种基于用户偏好的元搜索排序聚合演化方法. 计算机研究与发展, 54, 1665-1681.]
[113]	Thompson MR (2021) Sonic visualiser: Visualisation, analysis, and annotation of music audio recordings. Journal of the American Musicological Society, 74, 701-714.
[114]	Towsey MW, Planitz B (2011) Technical Report: Acoustic analysis of the natural environment. QUT ePrints.
[115]	Usman AM, Ogundile OO, Versfeld DJJ (2020) Review of automatic detection and classification techniques for cetacean vocalization. IEEE Access, 8, 105181-105206.
[116]	Van den Oord A, Dieleman S, Zen H, Simonyan K, Vinyals O, Graves A, Kalchbrenner N, Senior A, Kavukcuoglu K (2016) WaveNet: A generative model for raw audio. In: Speech Synthesis Workshop. Sunnyvale, California, USA.
[117]	Wang EZ, He DJ (2014) Bird recognition based on MFCC and dual-GMM. Computer Engineering and Design, 35, 1868-1871. (in Chinese with English abstract)
	[王恩泽, 何东健 (2014) 基于MFCC和双重GMM的鸟类识别方法. 计算机工程与设计, 35, 1868-1871.]
[118]	Wang HC, Wei Q, Du W, Wang F, Ge BJ, Ju RR (2023) Application of passive acoustic monitoring technology in bird monitoring in Jiuli Lake Wetland Park. Journal of Jiangsu Forestry Science & Technology, 50(3), 30-36. (in Chinese)
	[王虎诚, 魏齐, 杜伟, 王菲, 葛秉珏, 鞠然然 (2023) 被动声学监测技术在九里湖湿地公园鸟类监测中的应用研究. 江苏林业科技, 50(3), 30-36.]
[119]	Wang HL, Xu YF, Yu Y, Lin YC, Ran JH (2022) An efficient model for a vast number of bird species identification based on acoustic features. Animals, 12, 2434.
[120]	Wang JC, Cui JG, Shi HT, Brauth SE, Tang YZ (2012) Effects of body size and environmental factors on the acoustic structure and temporal rhythm of calls in Rhacophorus dennysi. Asian Herpetological Research, 3, 205-212.
[121]	Wang X, Li Y (2014) Multi-band spectral subtraction method applied to natural sounds classification. Computer Engineering and Applications, 50, 190-220. (in Chinese with English abstract)
	[王熙, 李应 (2014) 多频带谱减法用于生态环境声音分类. 计算机工程与应用, 50, 190-220.]
[122]	Wei JM, Li Y (2015) Rapid bird sound recognition using anti-noise texture features. Acta Electronica Sinica, 43, 185-190. (in Chinese with English abstract)
	[魏静明, 李应 (2015) 利用抗噪纹理特征的快速鸟鸣声识别. 电子学报, 43, 185-190.] DOI
[123]	Wu HL, Ma TF, Wu LF, Xu FL, Ji SL (2021) Exploiting heterogeneous graph neural networks with latent worker/ task correlation information for label aggregation in crowdsourcing. ACM Transactions on Knowledge Discovery from Data, 16, 1-18.
[124]	Wu KY, Ruan WD, Zhou DF, Chen QC, Zhang CY, Pan XY, Yu S, Liu Y, Xiao RB (2023) Syllable clustering analysis-based passive acoustic monitoring technology and its application in bird monitoring. Biodiversity Science, 31, 22370. (in Chinese with English abstract) DOI
	[吴科毅, 阮文达, 周棣锋, 陈庆春, 张承云, 潘新园, 余上, 刘阳, 肖荣波 (2023) 基于音节聚类分析的被动声学监测技术及其在鸟类监测中的应用. 生物多样性, 31, 22370.] DOI
[125]	Xiao ZS, Cui JG, Wang DP, Wang ZT, Luo JH, Xie J (2023) Interdisciplinary development trends of contemporary bioacoustics and the opportunities for China. Biodiversity Science, 31, 22423. (in Chinese with English abstract) DOI
	[肖治术, 崔建国, 王代平, 王志陶, 罗金红, 谢捷 (2023) 现代生物声学的学科发展趋势及中国机遇. 生物多样性, 31, 22423.] DOI
[126]	Xie J, Colonna JG, Zhang JL (2021a) Bioacoustic signal denoising: A review. Artificial Intelligence Review, 54, 3575-3597.
[127]	Xie J, Hu K, Guo Y, Zhu QB, Yu JH (2021b) On loss functions and CNNs for improved bioacoustic signal classification. Ecological Informatics, 64, 101331.
[128]	Xie J, Towsey M, Eichinski P, Zhang JL, Roe P (2015) Acoustic feature extraction using perceptual wavelet packet decomposition for frog call classification. In: 2015 IEEE 11th International Conference on e-Science, pp. 237-242. Munich, Germany.
[129]	Xie J, Towsey M, Zhang JL, Roe P (2016) Adaptive frequency scaled wavelet packet decomposition for frog call classification. Ecological Informatics, 32, 134-144.
[130]	Xie JJ, Li WB, Zhang JG, Ding CQ (2018) Bird species recognition method based on Chirplet spectrogram feature and deep learning. Journal of Beijing Forestry University, 40(3), 122-127. (in Chinese with English abstract)
	[谢将剑, 李文彬, 张军国, 丁长青 (2018) 基于Chirplet语图特征和深度学习的鸟类物种识别方法. 北京林业大学学报, 40(3), 122-127.]
[131]	Xie JJ, Li XG, Xing ZL, Zhang BW, Bao WD, Zhang JG (2019) Improved distributed minimum variance distortionless response (MVDR) beamforming method based on a local average consensus algorithm for bird audio enhancement in wireless acoustic sensor networks. Applied Sciences, 9, 3153.
[132]	Xie JJ, Yang J, Ding CQ, Li WB (2020) High accuracy individual identification model of Crested Ibis (Nipponia nippon) based on autoencoder with self-attention. IEEE Access, 8, 41062-41070.
[133]	Xie JJ, Yang J, Xing ZL, Zhang Z, Chen X (2020) Bird species recognition method based on multi-feature fusion. Journal of Applied Acoustics, 39, 199-206. (in Chinese with English abstract)
	[谢将剑, 杨俊, 邢照亮, 张卓, 陈新 (2020) 多特征融合的鸟类物种识别方法. 应用声学, 39, 199-206.]
[134]	Xie JJ, Zhao SB, Li XG, Ni DM, Zhang JG (2022) KD-CLDNN: Lightweight automatic recognition model based on bird vocalization. Applied Acoustics, 188, 108550.
[135]	Xie JJ, Zhong YJ, Zhang JG, Liu S, Ding CQ, Triantafyllopoulos A (2023) A review of automatic recognition technology for bird vocalizations in the deep learning era. Ecological Informatics, 73, 101927.
[136]	Yan X, Li Y (2013) Anti-noise power normalized cepstral coefficients in bird sounds recognition. Acta Electronica Sinica, 41, 295-300. (in Chinese with English abstract)
	[颜鑫, 李应 (2013) 利用抗噪幂归一化倒谱系数的鸟类声音识别. 电子学报, 41, 295-300.]
[137]	Yang DC, Wang HL, Ye ZJ, Zou YX (2021) Few-shot bioacoustic event detection = A good transductive inference is all you need. In: Detection and Classification of Acoustic Scenes and Events 2021 Challenge.
[138]	Yang HF, Zhang JL, Roe P (2013) Reputation modelling in citizen science for environmental acoustic data analysis. Social Network Analysis and Mining, 3, 419-435.
[139]	Yang JR, Fan J, Wei ZW, Li GL, Liu TY, Du XY (2020) A game-based framework for crowdsourced data labeling. The VLDB Journal, 29, 1311-1336.
[140]	Ye B, Wang Y (2016) CrowdRec: Trust-aware worker recommendation in crowdsourcing environments. In: 2016 IEEE International Conference on Web Services, pp. 1-8. San Francisco, CA, USA.
[141]	Yu Q, Liu RS (1995) Introduction to bird sound research—A review at home and abroad. Journal of Zoology, 30(1), 52-55. (in Chinese)
	[俞清, 刘如笋 (1995) 鸟声研究介绍——国内外综述. 动物学杂志, 30(1), 52-55.]
[142]	Yuen MC, King I, Leung KS (2015) TaskRec: A task recommendation framework in crowdsourcing systems. Neural Processing Letters, 41, 223-238.
[143]	Zhang FY, Zhang LY, Chen HX, Xie JJ (2021) Bird species identification using spectrogram based on multi-channel fusion of DCNNs. Entropy, 23, 1507.
[144]	Zhang HD, Huang WX, Su ZH, Chen JY, Jiang D, Fan LX, Zhang C, Lian DF, Wu KS (2023) Hierarchical crowdsourcing for data labeling with heterogeneous crowd. In: 2023 IEEE 39th International Conference on Data Engineering, pp. 1234-1246. Anaheim, CA, USA.
[145]	Zhang XX, Li Y (2015) Adaptive energy detection for bird sound detection in complex environments. Neurocomputing, 155, 108-116.
[146]	Zhao Y, Zheng K, Wang ZW, Deng LW, Yang B, Pedersen TB, Jensen CS, Zhou XF (2024) Coalition-based task assignment with priority-aware fairness in spatial crowdsourcing. The VLDB Journal, 33, 163-184.
[147]	Zou L, Yan GW, Wang RY, Du J, Lei M, Gao T, Fang X (2024) Multitask frame-level learning for few-shot sound event detection. arXiv, 2403.11091.
[148]	Zsebők S, Nagy-Egri MF, Barnaföldi GG, Laczi M, Nagy G, Vaskuti É, Garamszegi L (2019) Automatic bird song and syllable segmentation with an open-source deep-learning object detection method—A case study in the Collared flycatcher. Ornis Hungarica, 27, 59-66.

数据集类型 Dataset type	优点 Advantage	缺点 Disadvantage
网站收集的数据集 Datasets collected from websites	扩大数据集, 研究个体的行为模式和活动范围、追踪物种的迁徙路径、监测不同物种的分布变化和评估地区的物种多样性 To expand datasets, study individual behavioural patterns and ranges, track species migration pathways, monitor changes in the distribution of different species, and assess species diversity in an area	(1)数据收集主要集中在参与者密集的地区, 缺乏某些区域或鸟类的样本, 导致数据集在地理分布上的偏差 (1) Data collection is concentrated in areas with many participants, and samples are lacking for certain areas or birds, resulting in a geographical bias in the dataset (2)数据采集依赖于公众, 参与者的专业水平参差不齐, 导致录音质量和准确性不一致 (2) Data collection relies on the public, and participants' levels of expertise vary, resulting in inconsistent recording quality and accuracy
博物馆收集的数据集 Datasets collected by museums	数据集质量高, 有助于深入研究物种的生理和行为特征、精准识别物种和对比不同时期的物种分布和种群变化等 The high-quality datasets collected are useful for in-depth study of the physiological and behavioral characteristics of species, accurate identification of species, and comparison of species distribution and population changes over time	更新频率较低, 无法及时反映当前鸟类种群的变化和动态 The datasets are updated infrequently and do not reflect the changes and dynamics of current bird populations in a timely manner
鸟类挑战赛公开数据集 Public datasets of bird challenge	数据集质量高、标签完备、应用于小区域的特定物种监测和保护研究 The datasets are high-quality, well-labeled, and can be used for species-specific monitoring and conservation research in small areas	集中于特定地理区域或物种研究, 导致数据集的代表性不足, 标签缺乏迁移性 The datasets focus on specific geographic regions or species studies, resulting in under-representation of the datasets and a lack of transferability of labels
自建数据集 Self-built datasets	及时反映当前物种种群动态和变化, 补充未被广泛覆盖的小区域数据 The datasets reflect current species population dynamics and changes in a timely manner, and supplement data for small regions that are not widely covered	难以覆盖广泛的地理区域和多样的鸟类种类 The datasets can hardly cover a wide geographical area or a diverse range of bird species

标注技术 Annotation technique		技术特征 Technical characteristic	优点 Advantage	缺点 Disadvantage	针对缺点采取的措施 Measures taken in response to disadvantages
人工标注 Manual annotation	专家标注 Expert annotation	完全依赖专家完成标注工作 Rely solely on experts to annotate	准确率高 High accuracy	时间成本高 High time cost	提出众包标注 Proposing crowdsourced annotations
人工标注 Manual annotation	公民科学 Citizen science	依赖爱好者完成标注工作 Rely on hobbyists to annotate	效率高 High efficiency	标签准确率不高 Low label accuracy	标签质量控制: 提升数据标签质量、任务请求者细化任务内容、限制众包参与者 Label quality control: improving the quality of data labels, having task requesters refine task content, and restricting crowdsourcing participants
自动标注 Automatic annotation		完全依赖模型完成标注工作 Rely entirely on the model for annotation	效率高 High efficiency	依赖模型性能 Depend on model performance	优化模型、训练数据集专家标注 Optimizing the model and using expert-annotated datasets for training
半自动标注 Semi-automatic annotation		标注工作依赖机器和人工 Rely on machines and humans	效率高、准确率高 High efficiency and high accuracy	需要人员和标签管理 Require personnel and label management	信誉管理、任务分配、激励机制等方面优化管理 Optimizing management in terms of reputation management, task allocation, and incentive mechanism
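
The division of labour between machine and human in semi-automatic annotation can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a model has already produced one species label and one confidence score per clip; the `Prediction` class, the `route_predictions` function, the 0.9 threshold, and the example clips are all hypothetical and are not taken from any system reviewed here.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One model output for one audio clip (hypothetical structure)."""
    clip_id: str
    species: str
    confidence: float  # assumed to lie in [0, 1]


def route_predictions(predictions, accept_threshold=0.9):
    """Split predictions into auto-accepted labels and labels needing human review.

    Predictions with confidence >= accept_threshold are accepted as-is;
    all others are queued for a human annotator.
    """
    auto_accepted, needs_review = [], []
    for p in predictions:
        if p.confidence >= accept_threshold:
            auto_accepted.append(p)
        else:
            needs_review.append(p)
    return auto_accepted, needs_review


# Made-up example data, for illustration only.
example = [
    Prediction("clip_001", "Parus major", 0.97),
    Prediction("clip_002", "Pica pica", 0.62),
    Prediction("clip_003", "Passer montanus", 0.91),
]
accepted, review_queue = route_predictions(example, accept_threshold=0.9)
```

In such a scheme the threshold trades annotation effort against label quality: raising it sends more clips to human reviewers, lowering it accepts more machine labels unchecked.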

人工特征 Handcrafted feature	特征提取方法 Feature extraction method	参考文献 Reference
时域特征 Time domain feature	短时能量 Short-term energy
	短时平均幅度 Short-term average amplitude
	短时过零率 Short-term zero-crossing rate	Marin-Cudraz et al, 2019
频域特征 Frequency domain feature	基频 Fundamental frequency
	子带能量比 Subband energy ratio
	梅尔频率倒谱系数 Mel frequency cepstrum coefficient	Chakraborty et al, 2016
	线性预测倒谱系数 Linear prediction cepstrum coefficient	Rabiner & Juang, 1993
	感知线性预测倒谱系数 Perceptual linear prediction cepstrum coefficient	Reynolds, 1994
图像特征 Image feature	图像频率统计 Image frequency statistics	Bastas et al, 2012
	形状特征 Shape features	Lee et al, 2013
	纹理特征 Texture features	Ren et al, 2017
	边缘特征 Edge features	Kim & Kim, 2012
	深度学习特征 Deep learning features	Sevilla & Glotin, 2017
时频特征 Time-frequency feature	离散小波变换 Discrete wavelet transformation	Sun et al, 2013
	小波包分解 Wavelet packet decomposition	Xie et al, 2016
	Gabor变换特征 Gabor transform features	Connor et al, 2012
	短时傅里叶变换 Short-time Fourier transformation	Mulimani & Koolagudi, 2019
	梅尔频率倒谱变换 Mel frequency cepstrum transformation	Usman et al, 2020
	Chirplet变换 Chirplet transformation	谢将剑等, 2018
	匹配追踪 Matching pursuit	Stowell & Plumbley, 2014b
	Gammatone听觉滤波器 Gammatone auditory filters	Stowell & Plumbley, 2014b
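
The three time-domain features listed above can be sketched in a few lines of plain Python. This is a minimal illustration using the common textbook definitions, not the procedure of any cited study; the frame length, hop size, sampling rate, and synthetic test signal (silence followed by a 1 kHz tone) are all assumptions chosen for the example.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a sequence into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared sample values in the frame."""
    return sum(x * x for x in frame)


def short_time_average_amplitude(frame):
    """Mean absolute sample value in the frame."""
    return sum(abs(x) for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Synthetic signal (assumption): 0.1 s of silence, then 0.1 s of a 1 kHz tone at 8 kHz.
sr = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(800)]
signal = silence + tone

frames = frame_signal(signal, frame_len=200, hop=100)
energy = [short_time_energy(f) for f in frames]
amplitude = [short_time_average_amplitude(f) for f in frames]
zcr = [zero_crossing_rate(f) for f in frames]
```

Frames that fall entirely within the tone show high energy and a zero-crossing rate close to 2 × 1000 / 8000 = 0.25, whereas the silent frames give zero for all three features; this contrast is what simple energy-based detectors exploit.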

软件名称 Software	输入 Input	模型 Model	免费 Free	网址 Website
Kaleidoscope Pro	音节 Syllable	隐马尔柯夫模型、K-means聚类算法 Hidden Markov model, K-means clustering algorithm	否 No	https://www.wildlifeacoustics.com/products/kaleidoscope-pro
BirdNET	3秒声谱图 3 s spectrogram	BirdNET	是 Yes	https://birdnet.cornell.edu/ https://github.com/kahst/BirdNET-Analyzer
Avisoft-SASLab Pro	音节 Syllable	轴平行阈值、线性判别分析 Axis parallel threshold, linear discriminant analysis	否 No	https://avisoft.com/
Arbimon	音节 Syllable	模板匹配 Template matching	是 Yes	https://arbimon.org/
AviaNZ	手动设置声谱图长度 Manually set the spectrogram length	小波识别器 Wavelet detector	是 Yes	http://www.avianz.net/
Luscinia	音节 Syllable	动态时间扭曲 Dynamic time warping	是 Yes	https://github.com/rflachlan/Luscinia/releases
ChirpOMatic	12秒的语音片段 12 s audio clips	机器学习 Machine learning	否 No	https://www.chirpomatic.com/
Merlin Bird ID		深度学习 Deep learning	是 Yes	https://merlin.allaboutbirds.org/
Shiny PNW-Cnet	12秒的语音片段 12 s audio clips	PNW-Cnet	是 Yes	https://github.com/zjruff/Shiny_PNW-Cnet/tree/main/scripts
Raven Pro	音节 Syllable		否 No	https://www.ravensoundsoftware.com/software/raven-pro/
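
Dynamic time warping, listed above as the model used by Luscinia, can be illustrated with a textbook implementation. The sketch below is not Luscinia's code; it assumes that each syllable has already been reduced to a one-dimensional contour (for example, a sequence of peak frequencies in kHz), and the function name `dtw_distance` and the example contours are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the minimal accumulated absolute difference when
    aligning the first i elements of a with the first j elements of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats the match with b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats the match with a[i-1]
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


# Made-up frequency contours (kHz): the second is a time-stretched copy of the first.
contour_a = [2.0, 3.0, 4.0, 3.0]
contour_b = [2.0, 3.0, 3.0, 4.0, 4.0, 3.0]
contour_c = [5.0, 5.0, 5.0, 5.0]

same_shape = dtw_distance(contour_a, contour_b)
different_shape = dtw_distance(contour_a, contour_c)
```

Because the alignment may stretch or compress time, two renditions of the same syllable sung at different tempos obtain a small distance, while contours of different shape remain far apart.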

标注技术 Annotation technology	适用场景 Applicable scenarios
人工标注 Manual annotation	标注小型数据集、发布基准数据集、标注新物种或不常见的物种、行为生态学研究和种间关系研究 Labeling small datasets, publishing benchmark datasets, labeling new or uncommon species, behavioral ecology studies, and interspecific relationship studies
半自动标注 Semi-automatic annotation	标注中型数据集 Labeling medium-sized datasets
自动标注 Automatic annotation	标注大型数据集、长期的被动声学监测、实时监控系统、环境变化评估, 以及大规模研究项目的初步筛选 Labeling large datasets, long-term passive acoustic monitoring, real-time monitoring systems, environmental change assessments, and initial screening of large-scale research projects
