Biodiversity Science ›› 2012, Vol. 20 ›› Issue (1): 76-85.  DOI: 10.17520/biods.2011131

• Methods •

Applying cluster analysis and Google Maps in the study of large-scale species occurrence data

Kunchi Lai1, Youhua Cheng1, Yuehchih Chen1, Yousheng Li2, Kwangtsao Shao1,*

  1 Biodiversity Research Center, Academia Sinica, Taipei 11529
  2 Research Center for Information Technology Innovation, Academia Sinica, Taipei 11529
  • Received: 2011-07-29 Accepted: 2012-01-15 Online: 2012-01-20 Published: 2012-02-14
  • Contact: Kwangtsao Shao
  • *E-mail: zoskt@gate.sinica.edu.tw

Abstract

Primary species occurrence data include animal specimens in museums, plant specimens in herbaria, ecological surveys, and species observations. The data portal of the Taiwan Biodiversity Information Facility (TaiBIF) has so far integrated 26 datasets from Taiwan, comprising more than 1.5 million species occurrence records, about 85% of which are georeferenced. This study uses more than 8,800 Cyprinidae occurrence records from 11 datasets and applies three types of clustering algorithms (grid-based, partition-based, and density-based) to produce different spatial visualizations. The aim is to resolve the poor rendering performance and poor visualization that arise when large numbers of species occurrence records are displayed on Google Maps. The study also compares the results of the three clustering algorithms with the expert opinion range maps of Cyprinidae. We hope this work offers a quick and effective way to present species distribution data, which in turn helps researchers extract the knowledge implicit in large amounts of data and provides an important reference for ecological conservation.

Key words: species occurrence data, cluster analysis, biodiversity informatics, visualization
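
As a rough illustration of the grid-based approach named in the abstract, the sketch below bins georeferenced records into square latitude/longitude cells and replaces each cell with one marker carrying a record count. It is not the authors' implementation: the function name `grid_cluster`, the `cell_deg` parameter, and the sample coordinates are all assumptions made for this example.

```python
from collections import defaultdict
from math import floor


def grid_cluster(points, cell_deg=0.5):
    """Group (lat, lon) points into square cells that are cell_deg degrees wide.

    Returns one summary per non-empty cell: the centroid of its points
    and the number of points it contains. Hypothetical helper for illustration.
    """
    if cell_deg <= 0:
        raise ValueError("cell_deg must be positive")

    cells = defaultdict(list)
    for lat, lon in points:
        # The integer cell index identifies which grid square the point falls in.
        key = (floor(lat / cell_deg), floor(lon / cell_deg))
        cells[key].append((lat, lon))

    clusters = []
    for key in sorted(cells):  # sorted for a deterministic output order
        members = cells[key]
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    return clusters


# Made-up occurrence coordinates (illustrative only, not TaiBIF data).
records = [
    (25.03, 121.52), (25.04, 121.53), (25.10, 121.60),
    (23.50, 120.90), (23.70, 120.95),
]
markers = grid_cluster(records, cell_deg=0.5)
```

Each returned dictionary could then be drawn as a single map marker labelled with its `count`, so the number of markers depends on the number of occupied cells and not on the number of records. A larger `cell_deg` yields fewer, coarser markers; in practice the cell size would likely be tied to the map zoom level.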