Biodiv Sci ›› 2024, Vol. 32 ›› Issue (4): 23435.  DOI: 10.17520/biods.2023435  cstr: 32101.14.biods.2023435

• Technology and Methodology •

Identification of common native grassland plants in northern China using deep learning

Yongcai Wang1, Huawei Wan2, Jixi Gao2,*, Zhuowei Hu1,*, Chenxi Sun2, Na Lü2, Zhiru Zhang2

  1 College of Resource Environment and Tourism, Capital Normal University, Beijing 100048
  2 Satellite Application Center for Ecology and Environment, Ministry of Ecology and Environment, Beijing 100094
  • Received: 2023-11-15 Accepted: 2024-03-30 Online: 2024-04-20 Published: 2024-05-17
  • Contact: * E-mail: gjx@nies.org; huzhuowei@cnu.edu.cn

Abstract:

Aims: The classification and identification of grassland plants are essential parts of grassland resource surveillance and biodiversity monitoring. Rapid advances in computer vision and deep learning have created opportunities to automate this process. However, datasets and models tailored specifically to the identification of grassland plants are currently in short supply.

Methods: This study established a dataset comprising images of 831 species of native grassland plants in northern China. Employing state-of-the-art image classification architectures based on convolutional neural networks (CNN) and vision transformers (ViT), we trained models for the recognition of grassland plant images. Four models (Eva-02, ResNet_RS, MobileNetV3, and MobileViTv2) were evaluated for accuracy, recognition speed, and size.
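As a rough illustration of how the recognition speed and size criteria could be measured, the sketch below times a prediction function over a set of images and estimates storage size from a parameter count. The names `mean_latency_ms`, `model_size_mb`, and `dummy_predict` are hypothetical, and the 4 bytes per parameter (32-bit floats) is an assumption; the study's actual measurement procedure is not described in this abstract.

```python
import time
from typing import Callable, Sequence


def mean_latency_ms(predict: Callable[[object], object],
                    images: Sequence[object],
                    warmup: int = 2) -> float:
    """Average wall-clock time per image, in milliseconds."""
    if not images:
        raise ValueError("images must not be empty")
    # Warm-up calls are excluded from timing (caches, lazy initialisation).
    for image in images[:warmup]:
        predict(image)
    start = time.perf_counter()
    for image in images:
        predict(image)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(images)


def model_size_mb(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Approximate weight storage in MiB, assuming a fixed width per parameter."""
    return num_parameters * bytes_per_parameter / (1024 ** 2)


def dummy_predict(image: object) -> int:
    """Hypothetical stand-in for a trained classifier's inference call."""
    return 0


latency = mean_latency_ms(dummy_predict, images=list(range(100)))
size = model_size_mb(num_parameters=1024 ** 2)  # 1,048,576 parameters -> 4.0 MiB
```

In practice `dummy_predict` would be replaced by the real inference call, and the comparison is only meaningful if every model is timed on the same hardware with the same inputs.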

Results: The Top-1 accuracies of the Eva-02, MobileViTv2, ResNet_RS, and MobileNetV3 models on the test set were 96.78%, 94.29%, 95.57%, and 91.53%, respectively, and the corresponding Top-5 accuracies were 99.17%, 98.93%, 98.79%, and 97.56%. In terms of model size and recognition speed, MobileNetV3 had the fewest parameters and the fastest recognition speed, followed by MobileViTv2, which makes these two models suitable for deployment on mobile devices. Conversely, Eva-02 had the most parameters and the slowest recognition speed. All four models developed in this study outperformed the Pl@ntNet, HuaBanLv, and Baidu-Shitu recognition systems.
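For readers unfamiliar with the Top-1 and Top-5 metrics, the following minimal sketch shows how top-k accuracy is commonly computed: a prediction counts as correct when the true class is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not the evaluation code used in the study.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int) -> float:
    """Fraction of samples whose true label is among the k top-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example (made-up scores): 4 images, 3 classes.
scores = [
    [0.7, 0.2, 0.1],  # true class 0: correct at k=1
    [0.1, 0.3, 0.6],  # true class 1: wrong at k=1, correct at k=2
    [0.5, 0.4, 0.1],  # true class 2: wrong at k=1 and k=2
    [0.2, 0.5, 0.3],  # true class 1: correct at k=1
]
labels = [0, 1, 2, 1]

top1 = top_k_accuracy(scores, labels, k=1)  # 2 of 4 correct -> 0.5
top2 = top_k_accuracy(scores, labels, k=2)  # 3 of 4 correct -> 0.75
```

Top-k accuracy can never decrease as k grows, which is why the Top-5 figures are higher than the Top-1 figures for every model.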

Conclusion: Compared with other popular recognition systems, the plant recognition models trained in this study recognize the largest number of native grassland plant species with the highest accuracy. Together, the four models balance recognition accuracy against computational performance, making them suitable for deployment on both desktop and mobile platforms and for both indoor and outdoor application scenarios.

Key words: grassland, species recognition, deep learning, convolutional neural network, vision transformer