Biodiv Sci


A comparative evaluation of bioinformatic pipelines for invertebrate biodiversity profiling via environmental DNA metabarcoding

Ziling Yan1,2, Xiaoyu Chen2, Meng Yao1,2*   

  1 Institute of Ecology, College of Urban and Environmental Sciences, Peking University, Beijing 100871, China

  2 State Key Laboratory of Gene Function and Modulation Research, School of Life Sciences, Peking University, Beijing 100871, China

  • Received: 2025-09-15 Revised: 2025-12-09
  • Contact: Meng Yao

Abstract:

Aims: Environmental DNA (eDNA) technology is increasingly applied in biodiversity research, but its rapid development has also sparked methodological debates. A key issue is the selection of bioinformatic pipelines, particularly for extremely diverse taxa such as invertebrates. The choice of pipeline significantly affects eDNA-based biodiversity profiles, yet a systematic comparative evaluation of the relevant pipelines is lacking. This study therefore compares and evaluates bioinformatic pipelines commonly used to analyze eDNA-derived invertebrate sequencing data.

Methods: Invertebrate metabarcoding sequencing was carried out on freshwater eDNA samples, and the performance of various bioinformatic pipelines in processing the resulting invertebrate sequences was compared. Four commonly used clustering or denoising methods (UPARSE, Swarm, UNOISE, and DADA2) were each paired with three taxonomic assignment methods (BOLDigger, BLASTN, and Naïve Bayesian Classifier), giving 12 bioinformatic pipelines in total.
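
The 4 × 3 design described above can be sketched as a simple cross product. The snippet below is only an illustration of how the 12 pipeline combinations arise from the method names given in the abstract; the function and variable names are hypothetical, and it does not run any of the tools themselves.

```python
from itertools import product

# Method names exactly as listed in the abstract.
CLUSTERING_OR_DENOISING = ["UPARSE", "Swarm", "UNOISE", "DADA2"]
TAXONOMIC_ASSIGNMENT = ["BOLDigger", "BLASTN", "Naïve Bayesian Classifier"]


def enumerate_pipelines(first_stage, second_stage):
    """Return every (first-stage, second-stage) pairing as a list of tuples.

    Hypothetical helper for illustration; it only pairs names and does not
    invoke any bioinformatic software.
    """
    return list(product(first_stage, second_stage))


pipelines = enumerate_pipelines(CLUSTERING_OR_DENOISING, TAXONOMIC_ASSIGNMENT)

# Print a numbered list of the combinations.
for index, (motu_method, assignment_method) in enumerate(pipelines, start=1):
    print(f"{index:2d}. {motu_method} + {assignment_method}")
```

Each tuple pairs one clustering or denoising method with one taxonomic assignment method, which is the unit of comparison in the Results.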

Results: Of the 12 evaluated pipelines, the combination of DADA2 denoising and BOLDigger taxonomic assignment yielded the largest number of invertebrate molecular operational taxonomic units (MOTUs), along with the highest levels of taxonomic coverage and resolution. Among the four clustering or denoising methods, UNOISE and DADA2 denoising yielded more invertebrate MOTUs than UPARSE and Swarm clustering. Among the three taxonomic assignment methods, BOLDigger and BLASTN yielded higher taxonomic coverage and resolution than Naïve Bayesian Classifier. 

Conclusion: These findings have significant implications for eDNA-based research on freshwater invertebrate biodiversity. Furthermore, our results suggest that bioinformatic pipelines should be tailored to the study taxa and barcode regions in order to obtain accurate and reliable biodiversity data.

Key words: environmental DNA, invertebrate biodiversity, bioinformatic pipeline, clustering, denoising, taxonomic assignment