Biodiversity Science ›› 2022, Vol. 30 ›› Issue (12): 22252. DOI: 10.17520/biods.2022252

Special topic: Soil Biota and Soil Health

• Technology and Methods •

Evaluation of molecular taxonomy assignment strategies for soil fauna

Cong Xu1, Feiyu Zhang1, Daoyuan Yu2, Xin Sun3, Feng Zhang1,*

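  1. College of Plant Protection, Nanjing Agricultural University, Nanjing 210095
  2. College of Resources and Environmental Science, Nanjing Agricultural University, Nanjing 210095
  3. Institute of Urban Environment, Chinese Academy of Sciences, Xiamen, Fujian 130102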
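  • Received: 2022-05-09 Accepted: 2022-08-18 Publication date: 2022-12-20 Release date: 2022-11-25
  • Corresponding author: *E-mail: fzhang@njau.edu.cn
  • Funding:
    National Science and Technology Fundamental Resources Investigation Program of China (2018FY100303); National Natural Science Foundation of China (31970434); National Natural Science Foundation of China (32270470)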

Performance evaluation of molecular taxonomy assignment tools for soil invertebrates

Cong Xu1, Feiyu Zhang1, Daoyuan Yu2, Xin Sun3, Feng Zhang1,*

  1. College of Plant Protection, Nanjing Agricultural University, Nanjing 210095
  2. College of Resources and Environmental Science, Nanjing Agricultural University, Nanjing 210095
  3. Institute of Urban Environment, Chinese Academy of Sciences, Xiamen, Fujian 130102
  • Received:2022-05-09 Accepted:2022-08-18 Online:2022-12-20 Published:2022-11-25
  • Contact: *E-mail: fzhang@njau.edu.cn

Summary:

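Soil fauna harbor enormous biodiversity. Because traditional morphological identification can hardly meet the huge demand for surveying and monitoring the diversity of these groups, identification at the molecular level based on DNA and other genetic material (molecular taxonomy assignment) has gradually come to the fore. However, two major problems currently face its application to soil fauna: whether molecular taxonomy assignment can achieve effective identification in soil fauna taxonomy, where reference sequences are severely lacking, and how it can be used to obtain taxonomic information on soil fauna more accurately and efficiently. To explore these two problems, this study used metabarcoding to compare and evaluate five commonly used molecular taxonomy assignment tools (VSEARCH, HS-BLASTN, EPA-NG, RAPPAS and APPLES; the first two use similarity-based algorithms and the others use phylogenetic placement algorithms) in terms of accuracy (at the family and genus ranks), running speed and memory usage. Assignment accuracy was evaluated on four groups of soil fauna (Collembola, Acari, Clitellata and Chromadorea) and three molecular markers (COI, 16S and 18S). The results show that EPA-NG was the most accurate in most cases, and with the COI marker in particular its accuracy was far higher than that of the other tools. VSEARCH and HS-BLASTN were also fairly accurate; with the 16S and 18S markers their accuracy was comparable to that of EPA-NG. In addition, VSEARCH ran fastest and used the least memory of all the tools, which makes it more competitive than EPA-NG for 16S and 18S. RAPPAS and APPLES had low false-positive rates but very high false-negative rates; their relatively conservative algorithms left them unable to identify some species to lower ranks. Overall, even when the reference database lacked the target species and the delimitation of a small number of species was taxonomically disputed, all five tools assigned soil fauna to the family rank very accurately, so molecular taxonomy assignment has broad prospects for soil fauna. The COI marker has the widest coverage of soil fauna at the family, genus and species ranks and enables effective molecular identification, making it currently the most suitable marker for soil fauna, especially soil arthropods. With the COI marker and a reference database of modest size, EPA-NG is the best choice for molecular taxonomy assignment; with the 16S or 18S marker, or with a large reference database, VSEARCH is recommended instead.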
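The false-positive and false-negative behaviour mentioned above can be illustrated with a small scoring sketch. The definitions here are assumptions, not taken from the study: a query left unassigned at the rank is counted as a false negative, and a query assigned to the wrong taxon is counted as a false positive. The function name `score_assignments`, the query IDs and the example family labels are hypothetical.

```python
from collections import Counter


def score_assignments(truth, predicted):
    """Return the fraction of correct, false-positive and false-negative
    assignments at a single taxonomic rank (e.g. family).

    truth:     dict mapping query ID -> true taxon name at that rank
    predicted: dict mapping query ID -> assigned taxon name, or None when
               the tool declined to assign the query at that rank
    """
    # Assumed definitions (not from the source):
    #   false negative = no assignment at this rank
    #   false positive = an assignment that differs from the true taxon
    counts = Counter(correct=0, false_positive=0, false_negative=0)
    for query_id, true_taxon in truth.items():
        assigned = predicted.get(query_id)  # a missing query counts as unassigned
        if assigned is None:
            counts["false_negative"] += 1
        elif assigned == true_taxon:
            counts["correct"] += 1
        else:
            counts["false_positive"] += 1
    total = len(truth)
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Illustrative data only: four hypothetical Collembola queries scored at family rank.
truth = {"q1": "Entomobryidae", "q2": "Isotomidae",
         "q3": "Tomoceridae", "q4": "Isotomidae"}
predicted = {"q1": "Entomobryidae", "q2": "Entomobryidae", "q3": None}
rates = score_assignments(truth, predicted)
```

Under these assumed definitions, a conservative tool that frequently declines to assign a query would show a low false-positive rate and a high false-negative rate, which is the pattern reported for RAPPAS and APPLES.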

Keywords: molecular taxonomy assignment, soil fauna, bioinformatics software, species identification, biodiversity

Abstract

Aims: Soil invertebrate communities are extremely diverse but remain poorly studied in DNA-based diversity assessments. Because traditional morphological identification struggles to complete thousands of taxonomic assignments accurately within a limited time, a growing number of biodiversity surveys are turning to molecular taxonomy assignment. To promote biodiversity surveys of soil invertebrates, we comprehensively compared five popular taxonomy assignment tools (VSEARCH, HS-BLASTN, EPA-NG, RAPPAS and APPLES) across three molecular markers (COI, 16S and 18S). Four soil invertebrate groups (Collembola, Acari, Clitellata and Chromadorea), representing three phyla of varied body sizes, were selected for the comparison.
Methods: Reference databases for the four soil invertebrate groups and the three molecular markers were built with a filtering step. The commands of the five taxonomy assignment tools were integrated into a script that outputs the taxonomic information of the query sequences. The assignment accuracy, running speed and memory usage of the five tools were estimated and compared.
Results: EPA-NG was the most accurate tool in most cases, especially for COI. VSEARCH and HS-BLASTN also maintained high accuracy, and with the 16S and 18S markers their accuracy was comparable to that of EPA-NG. Moreover, its shorter running time and lower memory usage made VSEARCH more competitive than EPA-NG for 16S and 18S. RAPPAS and APPLES were unstable in accuracy and were often too conservative to assign some species at the genus or family level.
Conclusion: Molecular taxonomy assignment can identify soil invertebrates accurately and efficiently. COI is the most recommended marker for molecular taxonomy assignment of soil invertebrates because its reference sequences are abundant at the species, genus and family levels. When COI is used as the marker, EPA-NG is the most recommended tool unless the reference database is too large. When 16S or 18S is used as the marker, VSEARCH is the most highly recommended tool.
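The recommendation above amounts to a simple decision rule, sketched here as a small function. This is only an illustration: the function name `recommend_tool` is hypothetical, and because the abstract gives no numeric threshold for a reference database that is "too large", that judgment is left to the caller as a boolean.

```python
def recommend_tool(marker, large_reference_db):
    """Return the tool recommended in the abstract for a marker and database size.

    marker:             "COI", "16S" or "18S" (case-insensitive)
    large_reference_db: True if the caller considers the reference database
                        too large for phylogenetic placement; the abstract
                        states no threshold, so this is an assumption
    """
    marker = marker.upper()
    if marker not in {"COI", "16S", "18S"}:
        raise ValueError(f"marker not covered by the study: {marker}")
    # EPA-NG is recommended only for COI with a reference database that is not too large.
    if marker == "COI" and not large_reference_db:
        return "EPA-NG"
    # VSEARCH is recommended for 16S, 18S, or a large reference database.
    return "VSEARCH"
```

For example, `recommend_tool("COI", large_reference_db=False)` returns `"EPA-NG"`, while any 16S or 18S call returns `"VSEARCH"`.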

Key words: taxonomy assignment, soil invertebrate, bioinformatics tool, identification, biodiversity