Alpha-diversity index selection: Simulation comparison under unequal sampling

Yi Zou

doi:10.17520/biods.2025278

Biodiversity Science, 2026, Vol. 34, Issue 1: 25278

Special Feature: Methods for Ecological Data Analysis

Department of Health and Environmental Sciences, School of Science, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China

Received date: 2025-07-18

Accepted date: 2025-09-17

Online published: 2025-09-28

Abstract

Aims: Unequal sampling is a common issue in field-based community ecology. Choosing α-diversity metrics that remain robust when sample sizes vary among plots is critical for reliable biodiversity assessment. This study evaluated the performance of nine diversity indices: five “observed” indices calculated directly from the data, namely (1) species richness, (2) the Shannon index, (3) the Simpson index, (4) Hurlbert’s rarefied richness, and (5) Fisher’s α; and four “richness-estimator” indices, namely (1) the Chao1 index, (2) the abundance-based coverage estimator (ACE), (3) the extrapolated value from iNEXT (interpolation/extrapolation), and (4) total expected species (TES).
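
As a rough illustration of what these indices compute, the sketch below implements the five observed indices and Chao1 from a vector of per-species abundance counts, using textbook formulas. It is not the code used in the study. Several choices are assumptions: the Simpson index is given in its Gini-Simpson form (1 − Σp²), Chao1 in its bias-corrected form, and Fisher’s α is solved numerically by bisection. All function names are hypothetical.

```python
import math
from collections import Counter


def richness(counts):
    """Observed species richness: number of species with at least one individual."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i)."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def simpson(counts):
    """Simpson index in the Gini-Simpson form 1 - sum(p_i^2) (assumed variant)."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts if c > 0)


def rarefied_richness(counts, m):
    """Hurlbert's expected richness in a random subsample of m individuals."""
    n = sum(counts)
    if m > n:
        raise ValueError("subsample size exceeds the number of individuals")
    total = math.comb(n, m)
    # math.comb(k, m) is 0 when k < m, so abundant species contribute exactly 1.
    return sum(1 - math.comb(n - c, m) / total for c in counts if c > 0)


def fisher_alpha(counts):
    """Fisher's alpha: solve S = alpha * ln(1 + N / alpha) by bisection."""
    s, n = richness(counts), sum(counts)
    if s >= n:  # every species is a singleton: alpha diverges
        return math.inf

    def f(a):
        return a * math.log(1 + n / a) - s

    lo, hi = 1e-9, 1.0
    while f(hi) < 0:  # f increases with alpha and tends to N - S > 0
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def chao1(counts):
    """Bias-corrected Chao1: S_obs + f1 * (f1 - 1) / (2 * (f2 + 1))."""
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]  # numbers of singleton and doubleton species
    return richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical sample: abundances of six species at one site.
site = [10, 5, 3, 2, 1, 1]
print(richness(site), shannon(site), simpson(site))
print(rarefied_richness(site, 10), fisher_alpha(site), chao1(site))
```

ACE, the iNEXT extrapolation, and TES are omitted because they need more machinery than fits in a short example.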

Methods: Using simulation, the performance of each index was evaluated along a gradient of minimum-sample thresholds, and for each case the accuracy and precision of the between-site variance explained (linear regression R²) were recorded. The simulation constructed 20 sites in which “true” species richness (S) was linearly related to an environmental gradient (x) with a theoretical coefficient of determination of R² = 0.80. Four unequal-sampling scenarios were then generated by imposing different minimum sample sizes per site. For each scenario, a linear model was fitted between each diversity index and x, and the corresponding R² was recorded.
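
The simulation design can be sketched roughly as follows. The abstract gives only the number of sites (20) and the target R² (0.80). Everything else here is an illustrative assumption: the intercept and slope, the log-normal species abundance distribution, the upper bound on sample size, and the four minimum sample sizes. Observed richness stands in for the full set of nine indices. The noise standard deviation follows from R² = b²·Var(x) / (b²·Var(x) + σ²).

```python
import math
import random


def r_squared(x, y):
    """R² of the simple linear regression y ~ x (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return 0.0 if sxx == 0 or syy == 0 else sxy ** 2 / (sxx * syy)


def simulate_true_richness(rng, n_sites=20, intercept=50.0, slope=5.0, target_r2=0.80):
    """Return (x, S): a gradient and 'true' richness with theoretical R² = target_r2."""
    x = [float(i) for i in range(1, n_sites + 1)]
    mean_x = sum(x) / n_sites
    var_x = sum((a - mean_x) ** 2 for a in x) / n_sites
    # Solve R² = b²·Var(x) / (b²·Var(x) + σ²) for the noise standard deviation σ.
    sigma = slope * math.sqrt(var_x * (1 - target_r2) / target_r2)
    s_true = [max(2, round(intercept + slope * a + rng.gauss(0, sigma))) for a in x]
    return x, s_true


def sample_observed_richness(rng, s_true, n_individuals):
    """Draw individuals from an assumed log-normal community; count species seen."""
    weights = [rng.lognormvariate(0, 1) for _ in range(s_true)]
    draws = rng.choices(range(s_true), weights=weights, k=n_individuals)
    return len(set(draws))


def run_scenario(min_n, max_n=500, seed=1):
    """One replicate: unequal sample sizes drawn uniformly from [min_n, max_n]."""
    rng = random.Random(seed)
    x, s_true = simulate_true_richness(rng)
    sizes = [rng.randint(min_n, max_n) for _ in x]
    s_obs = [sample_observed_richness(rng, s, n) for s, n in zip(s_true, sizes)]
    return {
        "x": x, "s_true": s_true, "s_obs": s_obs, "sizes": sizes,
        "r2_true": r_squared(x, s_true), "r2_obs": r_squared(x, s_obs),
    }


# Four hypothetical minimum-sample thresholds (not the values used in the study).
for min_n in (10, 20, 50, 100):
    result = run_scenario(min_n)
    print(min_n, round(result["r2_true"], 3), round(result["r2_obs"], 3))
```

A full comparison would repeat each scenario many times and summarize the distribution of R² for each index, since a single replicate is noisy.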

Results: Sample size (the number of individuals recorded at a sampling site, and equivalently the sampling completeness) was the primary factor determining index performance. As sample size increased, the model R² of all diversity metrics improved significantly. Under extremely low sampling (minimum < 20 individuals; sampling coverage < 20%), rarefied richness had a higher R² than the other indices. When the minimum sample size reached 100 individuals, the richness-estimator indices outperformed the observed indices. The study further identified the minimum sample size and the corresponding sampling completeness required for each index to recover the predetermined R².
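
Sampling coverage can be estimated from the abundance counts alone. The sketch below shows two standard estimators: the Good-Turing form, 1 − f1/n, and the refinement by Chao and Jost that also uses doubletons. The abstract does not say which definition of coverage or completeness the study applied, so this is only one plausible reading. The function names are hypothetical.

```python
from collections import Counter


def good_turing_coverage(counts):
    """Good-Turing sample coverage: 1 - f1 / n."""
    n = sum(counts)
    f1 = Counter(counts)[1]  # number of singleton species
    return 1.0 - f1 / n


def chao_jost_coverage(counts):
    """Coverage estimator 1 - (f1/n) * ((n-1)*f1 / ((n-1)*f1 + 2*f2))."""
    n = sum(counts)
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]  # singleton and doubleton species
    if f1 == 0:
        return 1.0  # no singletons: the sample is estimated to be complete
    denom = (n - 1) * f1 + 2 * f2
    if denom == 0:
        return 0.0  # only a single individual was recorded
    return 1.0 - (f1 / n) * ((n - 1) * f1 / denom)


# Hypothetical sample: 22 individuals, two singletons, one doubleton.
site = [10, 5, 3, 2, 1, 1]
print(good_turing_coverage(site), chao_jost_coverage(site))
```

A coverage estimate below 0.2 would correspond to the extremely low sampling regime described above.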

Conclusion: Overall, rarefied richness is recommended for highly unequal scenarios with small sample sizes. In practice, the rarefaction threshold should be set at a relatively high level (e.g., > 40 individuals) to enhance overall comparability among sampling sites, even if this excludes extremely undersampled sites. Once sampling completeness is adequate, richness estimators are preferable, as they generate extrapolated richness values that are close to the true gradient.
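
The thresholding recommendation might be applied as in the minimal sketch below. Each site is rarefied to a common number of individuals using Hurlbert’s formula, and sites with fewer individuals than the threshold are set aside rather than compared. The default threshold of 40 follows the example value in the text. The site names, counts, and function names are hypothetical.

```python
import math


def rarefied_richness(counts, m):
    """Hurlbert's expected number of species in a random subsample of m individuals."""
    n = sum(counts)
    total = math.comb(n, m)
    return sum(1 - math.comb(n - c, m) / total for c in counts if c > 0)


def rarefy_sites(sites, threshold=40):
    """Rarefy every site with at least `threshold` individuals; list the rest."""
    kept, excluded = {}, []
    for name, counts in sites.items():
        if sum(counts) < threshold:
            excluded.append(name)  # too undersampled to compare fairly
        else:
            kept[name] = rarefied_richness(counts, threshold)
    return kept, excluded


# Hypothetical abundance counts for three sites.
sites = {
    "A": [30, 20, 10],     # 60 individuals
    "B": [5, 3, 1],        # 9 individuals, below the threshold
    "C": [20, 20, 5, 1],   # 46 individuals
}
kept, excluded = rarefy_sites(sites)
print(kept, excluded)
```

Raising the threshold drops more sites but compares the remaining ones at a larger common sample size, which is the trade-off the recommendation accepts.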

Key words: community ecology; biodiversity metrics; species richness; incomplete detection; environmental factor; sampling completeness

Cite this article

Yi Zou. Alpha-diversity index selection: Simulation comparison under unequal sampling[J]. Biodiversity Science, 2026, 34(1): 25278. DOI: 10.17520/biods.2025278

