Biodiv Sci ›› 2012, Vol. 20 ›› Issue (1): 76-85.  DOI: 10.17520/biods.2011131

• Methodologies

Applying cluster analysis and Google Maps in the study of large-scale species occurrence data

Kunchi Lai1, Youhua Cheng1, Yuehchih Chen1, Yousheng Li2, Kwangtsao Shao1,*

  1 Biodiversity Research Center, Academia Sinica, Taipei 11529
  2 Research Center for Information Technology Innovation, Academia Sinica, Taipei 11529
  • Received: 2011-07-29 Accepted: 2012-01-15 Online: 2012-01-20 Published: 2012-02-14
  • Contact: Kwangtsao Shao


Primary species occurrence data comprise records of animal and plant specimens held in museums and herbaria, as well as species observations. The TaiBIF (Taiwan Biodiversity Information Facility) data portal has so far integrated 26 datasets containing more than 1.5 million species occurrence records, 85% of which are geo-referenced. This study applies three types of clustering algorithms (grid-based, partition-based, and density-based) to more than 8,800 Cyprinidae occurrence records from 11 datasets and compares the spatial visualizations they produce. The aim is to resolve the problems of poor efficiency and poor visualization that arise when large volumes of species occurrence data are presented in Google Maps. The study also compares the results of the three clustering algorithms with expert-opinion range maps of Cyprinidae. The goal is to identify a quick and efficient way to present species distribution data and, in turn, to help researchers extract knowledge from large amounts of data that can serve as an important reference for ecological conservation efforts.

Key words: species occurrence data, cluster analysis, biodiversity informatics, visualization
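
As an illustration of the grid-based approach named in the abstract, the following minimal sketch groups occurrence points into fixed-size latitude/longitude cells and returns one summary marker per cell, so that a map has to draw far fewer markers than there are records. It is not the authors' implementation: the function name `grid_cluster`, the `cell_deg` parameter, and the sample coordinates are all hypothetical and chosen only for this example.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_deg):
    """Group (lat, lon) points into square cells of cell_deg degrees.

    Returns one (centroid_lat, centroid_lon, count) tuple per non-empty
    cell, ordered by cell index. Hypothetical helper for illustration.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        # The cell index is the integer grid position of the point.
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))

    clusters = []
    for key in sorted(cells):
        members = cells[key]
        n = len(members)
        centroid_lat = sum(p[0] for p in members) / n
        centroid_lon = sum(p[1] for p in members) / n
        clusters.append((centroid_lat, centroid_lon, n))
    return clusters


# Made-up coordinates standing in for occurrence records.
records = [
    (25.04, 121.51),
    (25.06, 121.53),
    (24.15, 120.67),
    (24.16, 120.68),
    (22.63, 120.30),
]
markers = grid_cluster(records, 0.5)
```

Each returned tuple could be drawn as a single map marker labeled with its record count. A larger `cell_deg` yields fewer, coarser markers, which is the usual trade-off when tying the cell size to the map zoom level.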