Biodiv Sci


Interpretable Machine Learning and Its Applications in Ecology

Yafei Shi1*, Furong Niu2, Xiaomin Huang3, Xing Hong1, Xiangwen Gong4, Yanli Wang2, Dong Lin1, Xiaoni Liu1   

  1. Pratacultural College, Gansu Agricultural University, Lanzhou 730070, China

  2. College of Forestry, Gansu Agricultural University, Lanzhou 730070, China

  3. Agricultural College, Yangzhou University, Yangzhou, Jiangsu 225009, China

  4. College of Geographical Sciences, Southwest University, Chongqing 400715, China

  • Received: 2025-06-05   Revised: 2025-09-02
  • Contact: Yafei Shi

Abstract:

Aims: The increasing adoption of machine learning (ML) in ecological research has enabled the modeling of complex, nonlinear ecological patterns. However, the "black-box" nature of many ML models limits their interpretability, hindering the extraction of ecological insights. This review aims to introduce the core concepts, methods, and practical tools of interpretable machine learning (IML), and to demonstrate how these techniques can enhance ecological understanding from predictive models. 

Methods: We first clarify the key distinctions between white-box and black-box models, between global and local interpretability, and between intrinsic and post-hoc explanation frameworks. Using a simulated dataset representing plant diversity and environmental variables (e.g., elevation, temperature, soil moisture), we apply both white-box models (e.g., linear regression, decision trees) and black-box models (e.g., random forest) to illustrate major interpretability techniques, including regression coefficients, permutation importance, partial dependence plots (PDP), accumulated local effects (ALE), SHapley Additive exPlanations (SHAP) values, and Local Interpretable Model-agnostic Explanations (LIME).

Results: White-box models offer direct and transparent interpretability through their model structure, while black-box models require additional tools to derive explanations. Our case study shows that both model types can yield consistent insights about variable importance and ecological relationships. Furthermore, methods such as ALE and SHAP effectively address common limitations in conventional approaches like PDP by accounting for feature interactions and dependencies. 
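To make concrete how ALE sidesteps PDP's extrapolation into unrealistic feature combinations, the sketch below implements a minimal first-order ALE estimator by hand: instead of averaging predictions over the whole dataset at each grid value (as PDP does), it averages prediction *differences* only within local quantile bins of the feature, then accumulates and centers them. This is a didactic implementation under simplifying assumptions, not a production library.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def ale_1d(model, X, feature, n_bins=10):
    """Minimal first-order accumulated local effects for one feature."""
    x = X[:, feature]
    # Quantile edges keep every bin populated with realistic observations
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    effects = np.zeros(n_bins)
    for k in range(n_bins):
        lo, hi = edges[k], edges[k + 1]
        in_bin = (x >= lo) & (x <= hi) if k == 0 else (x > lo) & (x <= hi)
        if not in_bin.any():
            continue
        X_lo, X_hi = X[in_bin].copy(), X[in_bin].copy()
        X_lo[:, feature], X_hi[:, feature] = lo, hi
        # Local effect: mean prediction change across the bin's samples only
        effects[k] = (model.predict(X_hi) - model.predict(X_lo)).mean()
    ale = np.cumsum(effects)
    return edges[1:], ale - ale.mean()  # centered ALE at bin upper edges

# Toy demonstration on simulated data (illustrative values)
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (400, 3))
y = np.sin(3 * X[:, 0]) + X[:, 1] + rng.normal(0, 0.1, 400)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
grid, ale = ale_1d(rf, X, feature=0)
print("ALE grid:", grid.round(2))
print("Centered ALE:", ale.round(2))
```

Because every prediction difference is evaluated on observed points within a narrow bin, correlated features (such as elevation and temperature) are never combined in implausible ways, which is the failure mode of PDP that ALE and SHAP are designed to avoid.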

Conclusion: IML provides a valuable toolkit for improving model transparency and interpretability in ecological research. It serves as a crucial complement to traditional statistical modeling, enabling researchers to extract meaningful ecological interpretations from complex models. As ecological data and modeling complexity continue to grow, the integration of IML techniques will become increasingly important for hypothesis generation and ecological decision-making.

Key words: machine learning, ecological interpretation, random forest, black-box model, plant diversity