Biodiversity Science, 2020, Vol. 28, Issue (5): 587-595. doi: 10.17520/biods.2020156

Verification of virus identity and host association using genomics technology

Benfeng Han, Xin Zhou, Xue Zhang

  1. Department of Entomology, College of Plant Protection, China Agricultural University, Beijing 100193
  • Received: 2020-04-16 Accepted: 2020-06-09 Online: 2020-06-18
  • Xue Zhang

Genomics technology, especially metagenomic sequencing, has played an important role in identifying and tracing unknown viruses. Whereas classical virus taxonomy relies on phenotypic traits, a metagenomics pipeline assembles new virus genomes from short nucleotide fragments without requiring any a priori reference sequences. This increases the efficiency of identifying viruses and their associated hosts, which is particularly useful for viruses that can cause epidemics. A current challenge is tracing the original and intermediate hosts of a virus. Doing so requires a comprehensive virus sequence library with definite host information, and such information is still limited. Because wild animals and livestock are the main sources of pathogenic viruses, an extensive survey of the global virome is vitally important for identifying and preventing zoonotic epidemics. This review summarizes the application of genomics technologies to identifying viruses and their associated hosts, using the outbreak of SARS-CoV-2 as an example. We also address intrinsic drawbacks of current methodologies and the incompleteness of available virus libraries. We argue that it is both necessary and feasible to construct a comprehensive virus database with host associations that emphasizes the diversity of viruses and their interactions with other organisms.

Key words: SARS-CoV-2, high-throughput sequencing, virus diversity, host association, virus evolution
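The abstract's point that genomes can be assembled from short fragments without a reference sequence can be illustrated with a toy example. The sketch below is a minimal greedy overlap merger written for this illustration only: the function names, the `min_len` threshold, and the three toy reads are assumptions, and it is not the assembler used in any study cited here. Production metagenomic assemblers handle sequencing errors, repeats, and uneven coverage, none of which this sketch attempts.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of sequences until no overlaps remain.

    No reference genome is consulted: only read-to-read overlaps are used.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Toy reads (hypothetical) sampled from one short sequence.
toy_reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(toy_reads))
```

With these toy reads the three fragments collapse into a single contig. Reads that share no overlap of at least `min_len` bases are returned as separate contigs.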

Table 1

Methods used in the phylogenetic analysis of viruses

| Method | Application | Software | Reference |
| --- | --- | --- | --- |
| Phylogenetic trees | Analyzing the relatedness of different organisms, visualizing their relationships as tree branches, and inferring evolutionary history | | Zhou et al, 2020; Wu et al, 2020; Frias-De-Diego et al, 2019; Yuen et al, 2019 |
| Reconstructing ancestral state in phylogenies (RASP) | Inferring historical biogeography by reconstructing ancestral geographic distributions on phylogenetic trees | RASP | Luo et al, 2015; Frias-De-Diego et al, 2019; Yuen et al, 2019 |
| Phylogenetic network | Combining a set of phylogenetic trees into one visualization that displays character-conflict events such as recombination more intuitively | | Yu et al, 2020 |
| Haplotype network | Visualizing relationships between individual genotypes at the population level | PopART | Tang et al, 2020; Leigh & Bryant, 2015 |
| Bayesian evolutionary analysis | Inferring population divergence times from time-calibrated evolutionary trees modeled in BEAST | BEAST | Luo et al, 2015; Bouckaert et al, 2014; Suchard et al, 2018 |
| Recombination analysis | Identifying possible recombination signals and revealing the role of recombination in gene evolution | | Wu et al, 2020; Lam et al, 2020 |
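Several methods in Table 1 start from a comparison of aligned sequences. As a hedged illustration, the sketch below computes pairwise p-distances (the proportion of differing sites), a common input to distance-based tree building. The function names and the three toy sequences are assumptions made for this example. The studies cited in the table used dedicated software, and nothing here reproduces their analyses.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped rather than counted as differences.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of named sequences."""
    return {
        (n1, n2): p_distance(aligned[n1], aligned[n2])
        for n1, n2 in combinations(aligned, 2)
    }


# Toy alignment (hypothetical sequences, not real viral genomes).
toy_alignment = {
    "virus_A": "ATGCATGCAT",
    "virus_B": "ATGCATGCAA",
    "virus_C": "TTGCTTGGAA",
}
for pair, dist in distance_matrix(toy_alignment).items():
    print(pair, round(dist, 2))
```

The p-distance ignores multiple substitutions at the same site, so real analyses usually apply a substitution model before building a tree or network.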

Table 2

Virus sequences deposited in the ViPR database (Data are from


Percent without host identification
Arenaviridae 177 4,532 3,012 6,949 23.23%
Caliciviridae 225 52,189 49,073 96,673 28.16%
Coronaviridae 1,043 34,864 28,823 119,573 22.80%
Filoviridae 16 3,577 3,390 22,038 22.53%
Flaviviridae 367 345,546 261,780 877,286 39.36%
Hantaviridae 304 10,189 6,867 10,603 14.02%
Hepeviridae 33 17,838 15,022 19,203 17.95%
Herpesviridae 782 58,371 45,281 300,180 60.18%
Nairoviridae 38 3,669 1,931 3,553 9.92%
Paramyxoviridae 574 50,355 44,898 67,728 24.22%
Peribunyaviridae 183 5,031 2,434 6,367 22.80%
Phasmaviridae 16 1,106 340 1,114 0.27%
Phenuiviridae 215 5,189 2,934 7,133 13.37%
Picornaviridae 1,038 127,336 116,298 346,846 26.91%
Pneumoviridae 17 37,289 33,516 60,235 30.36%
Poxviridae 283 10,444 7,487 125,948 40.87%
Reoviridae 363 107,566 39,985 108,677 15.93%
Rhabdoviridae 530 33,347 26,510 46,761 22.12%
Togaviridae 60 12,800 10,854 46,764 36.25%
Total 6,264 921,238 700,435 2,273,631
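The Total row can be checked by summing the four count columns over the 19 families. The sketch below does this with the figures transcribed from the rows above. Because the column headers did not survive in this copy of the table, the columns are treated only by position, and the names `TABLE_2`, `parse_rows`, and `column_totals` are introduced here for illustration.

```python
# Count columns transcribed from Table 2; the percentage column is omitted.
TABLE_2 = """\
Arenaviridae 177 4,532 3,012 6,949
Caliciviridae 225 52,189 49,073 96,673
Coronaviridae 1,043 34,864 28,823 119,573
Filoviridae 16 3,577 3,390 22,038
Flaviviridae 367 345,546 261,780 877,286
Hantaviridae 304 10,189 6,867 10,603
Hepeviridae 33 17,838 15,022 19,203
Herpesviridae 782 58,371 45,281 300,180
Nairoviridae 38 3,669 1,931 3,553
Paramyxoviridae 574 50,355 44,898 67,728
Peribunyaviridae 183 5,031 2,434 6,367
Phasmaviridae 16 1,106 340 1,114
Phenuiviridae 215 5,189 2,934 7,133
Picornaviridae 1,038 127,336 116,298 346,846
Pneumoviridae 17 37,289 33,516 60,235
Poxviridae 283 10,444 7,487 125,948
Reoviridae 363 107,566 39,985 108,677
Rhabdoviridae 530 33,347 26,510 46,761
Togaviridae 60 12,800 10,854 46,764
"""


def parse_rows(text: str) -> dict[str, list[int]]:
    """Map each family name to its counts, stripping thousands separators."""
    rows = {}
    for line in text.strip().splitlines():
        family, *counts = line.split()
        rows[family] = [int(c.replace(",", "")) for c in counts]
    return rows


def column_totals(rows: dict[str, list[int]]) -> list[int]:
    """Sum each count column (by position) across all families."""
    return [sum(col) for col in zip(*rows.values())]


rows = parse_rows(TABLE_2)
print(len(rows), column_totals(rows))
```

The four sums agree with the Total row, which suggests the family rows were transcribed without loss.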
