Biodiversity Science, 2015, Vol. 23, Issue 4: 550-555. doi: 10.17520/biods.2015120

• Original Article

Using NCBIminer to search and download nucleotide sequences from GenBank

Xiaoting Xu1, Zhiheng Wang1,*, Dimitar Dimitrov2, Carsten Rahbek3,4

    1 Department of Ecology and Key Laboratory for Earth Surface Processes of the Ministry of Education, College of Urban and Environmental Sciences, Peking University, Beijing 100871
    2 Natural History Museum, University of Oslo, Oslo, Norway
    3 Center for Macroecology, Evolution and Climate, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
    4 Imperial College London, Grand Challenges in Ecosystems and the Environment Initiative, Silwood Park Campus, Berkshire, UK
  • Received: 2015-05-07 Accepted: 2015-07-09 Online: 2015-08-03
  • Corresponding author: Wang Zhiheng

GenBank is the leading public database of genetic resources and currently contains over 10^12 base pairs from about 300,000 formally described species. It offers valuable resources for studies on the evolution of species, genes, and genomes. However, difficulties in mining GenBank data hinder its wider use for large-scale data collection. To address this issue, we introduce a new bioinformatics program, NCBIminer. NCBIminer is a freely available, cross-platform, and user-friendly program for mining nucleotide sequences from GenBank. Its main purpose is to download sequences for user-specified genes and taxonomic groups based on gene names, types, and one or several reference sequences. The program's algorithms have been described elsewhere; here we focus on how to use the program, including how to install it, run it, and set its parameters.
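To illustrate what searching GenBank by gene name and taxonomic group involves, the following minimal sketch builds a query URL for NCBI's public E-utilities ESearch service. It is not NCBIminer's own code or algorithm. The function names, the example taxon, and the example gene names are assumptions chosen for illustration, and the sketch only constructs the URL without contacting NCBI.

```python
from urllib.parse import urlencode

# Base URL of NCBI's public E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_genbank_query(taxon, gene_names):
    """Combine a taxon and alternative gene names into one Entrez query term.

    Hypothetical helper for illustration; not part of NCBIminer.
    """
    genes = " OR ".join(f"{name}[Gene]" for name in gene_names)
    return f"{taxon}[Organism] AND ({genes})"


def build_esearch_url(taxon, gene_names, retmax=20):
    """Return an ESearch URL that would list matching nucleotide record IDs."""
    params = {
        "db": "nuccore",    # nucleotide database that includes GenBank records
        "term": build_genbank_query(taxon, gene_names),
        "retmax": retmax,   # maximum number of record IDs to return
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Example taxon and genes are arbitrary choices for demonstration.
    print(build_esearch_url("Quercus", ["rbcL", "matK"]))
```

A query built this way relies only on the annotated gene names, so records with missing or inconsistent annotation are easily missed. This is one reason a tool that also uses reference sequences can be useful.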

Key words: GenBank, bioinformatics, gene, phylogenetic evolution, DNA, nucleotide sequences

Appendix 1

Data format for a sequence in GenBank. The items in the left box are feature types defined in GenBank, while those in the right box are GenBank annotation information.
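As a rough illustration of the split between feature types and annotation information in a GenBank record, the sketch below reads the feature table of a flat-file record using only the Python standard library. The sample record is hypothetical and heavily shortened, the function name is an assumption, and the parser ignores qualifier values that continue over several lines. It is not the parser used by NCBIminer.

```python
# Hypothetical, shortened GenBank flat-file record used only for demonstration.
SAMPLE_RECORD = """\
LOCUS       EXAMPLE01                 60 bp    DNA     linear   PLN 01-JAN-2015
DEFINITION  Hypothetical example record.
FEATURES             Location/Qualifiers
     source          1..60
                     /organism="Quercus example"
                     /mol_type="genomic DNA"
     gene            1..60
                     /gene="rbcL"
     CDS             1..60
                     /gene="rbcL"
                     /product="hypothetical product"
ORIGIN
//
"""


def parse_features(record_text):
    """Return a list of (feature_type, location, qualifiers) tuples."""
    features = []
    in_table = False
    for line in record_text.splitlines():
        if line.startswith("FEATURES"):
            in_table = True
            continue
        if not in_table or not line.strip():
            continue
        if not line.startswith(" "):
            break  # the next top-level section (e.g. ORIGIN) ends the table
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if indent < 21:
            # Feature-key lines are indented less than qualifier lines.
            key, _, location = stripped.partition(" ")
            features.append((key, location.strip(), {}))
        elif stripped.startswith("/") and features:
            # Qualifier lines carry the annotation, e.g. /gene="rbcL".
            name, _, raw = stripped[1:].partition("=")
            features[-1][2][name] = raw.strip('"')
    return features


if __name__ == "__main__":
    for feature_type, location, qualifiers in parse_features(SAMPLE_RECORD):
        print(feature_type, location, qualifiers)
```

Here the feature types (source, gene, CDS) play the role of the left-hand items in the caption, and the qualifiers attached to each feature play the role of the annotation information on the right.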

Appendix 2

NCBIminer workflow. a, Major steps of NCBIminer's workflow; b, The algorithms for establishing improved reference sequences and combining sequences from multiple queries. Modified from Xu et al. (2015).

1 Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. Journal of Molecular Biology, 215, 403-410.
2 Chen ZD (陈之端), Li DZ (李德铢) (2013) On Barcode of Life and Tree of Life. Plant Diversity and Resources (植物分类与资源学报), 35, 675-681. (in Chinese with English abstract)
3 Driskell AC, Ané C, Burleigh JG, McMahon MM, O'Meara BC, Sanderson MJ (2004) Prospects for building the Tree of Life from large sequence databases. Science, 306, 1172-1174.
4 Holt B, Lessard JP, Borregaard MK, Fritz SA, Araujo MB, Dimitrov D, Fabre PH, Graham CH, Graves GR, Jonsson KA, Nogues-Bravo D, Wang ZH, Whittaker RJ, Fjeldsa J, Rahbek C (2013) An update of Wallace's zoogeographic regions of the world. Science, 339, 74-78.
5 Jones M, Koutsovoulos G, Blaxter M (2011) iPhy: an integrated phylogenetic workbench for supermatrix analyses. BMC Bioinformatics, 12, 30.
6 Li DC (2013) Similarity analysis of DNA sequences based on CLZ complexity. Journal of Computational and Theoretical Nanoscience, 10, 481-487.
7 Li DZ, Gao LM, Li HT, Wang H, Ge XJ, Liu JQ, Chen ZD, Zhou SL, Chen SL, Yang JB, Fu CX, Zeng CX, Yan HF, Zhu YJ, Sun YS, Chen SY, Zhao L, Wang K, Yang T, Duan GW, China Plant BOL Group (2011) Comparative analysis of a large dataset indicates that internal transcribed spacer (ITS) should be incorporated into the core barcode for seed plants. Proceedings of the National Academy of Sciences, USA, 108, 19641-19646.
8 Lu LM (鲁丽敏), Sun M (孙苗), Zhang JB (张景博), Li HL (李洪雷), Lin L (林立), Yang T (杨拓), Chen M (陈闽), Chen ZD (陈之端) (2014) Tree of Life and its applications. Biodiversity Science (生物多样性), 22, 3-20. (in Chinese with English abstract)
9 Pearse WD, Purvis A (2013) phyloGenerator: an automated phylogeny generation tool for ecologists. Methods in Ecology and Evolution, 4, 692-698.
10 Pei NC (裴男才) (2015) Applications of DNA barcoding in evolutionary ecology. Biodiversity Science (生物多样性), 23, 291-292. (in Chinese)
11 Qiu Q, Zhang GJ, Ma T, Qian WB, Wang JY, Ye ZQ, Cao CC, Hu QJ, Kim J, Larkin DM, Auvil L, Capitanu B, Ma J, Lewin HA, Qian XJ, Lang YS, Zhou R, Wang LZ, Wang K, Xia JQ, Liao SG, Pan SK, Lu X, Hou HL, Wang Y, Zang XT, Yin Y, Ma H, Zhang J, Wang ZF, Zhang YM, Zhang DW, Yonezawa T, Hasegawa M, Zhong Y, Liu WB, Zhang Y, Huang ZY, Zhang SX, Long RJ, Yang HM, Wang J, Lenstra JA, Cooper DN, Wu Y, Wang J, Shi P, Wang J, Liu JQ (2012) The yak genome and adaptation to life at high altitude. Nature Genetics, 44, 946-949.
12 Ren BQ (任保青), Chen ZD (陈之端) (2010) DNA barcoding plant life. Chinese Bulletin of Botany (植物学报), 45, 1-12. (in Chinese with English abstract)
13 Sanderson M, Boss D, Chen D, Cranston K, Wehe A (2008) The PhyLoTA browser: processing GenBank for molecular phylogenetics research. Systematic Biology, 57, 335-346.
14 Xu X, Wang Z, Rahbek C, Lessard J-P, Fang J (2013) Evolutionary history influences the effects of water-energy dynamics on oak diversity in Asia. Journal of Biogeography, 40, 2146-2155.
15 Xu XT, Dimitrov D, Rahbek C, Wang ZH (2015) NCBIminer: sequences harvest from Genbank. Ecography, 38, 426-430.
16 Yang ZY, Ran JH, Wang XQ (2012) Three genome-based phylogeny of Cupressaceae s.l.: further evidence for the evolution of gymnosperms and southern hemisphere biogeography. Molecular Phylogenetics and Evolution, 64, 452-470.
17 Zanne AE, Tank DC, Cornwell WK, Eastman JM, Smith SA, FitzJohn RG, McGlinn DJ, O'Meara BC, Moles AT, Reich PB, Royer DL, Soltis DE, Stevens PF, Westoby M, Wright IJ, Aarssen L, Bertin RI, Calaminus A, Govaerts R, Hemmings F, Leishman MR, Oleksyn J, Soltis PS, Swenson NG, Warman L, Beaulieu JM (2013) Three keys to the radiation of angiosperms into freezing environments. Nature, 506, 89-92.