Biodiv Sci ›› 2011, Vol. 19 ›› Issue (1): 3-16. DOI: 10.3724/SP.J.1003.2011.14256
Special Issue: Insect Diversity and Ecological Function
An Li1,2, Guixia Xu1, Hongzhi Kong1,*
Received: 2010-10-26
Accepted: 2010-12-30
Online: 2011-01-20
Published: 2011-04-01
Contact: Hongzhi Kong
An Li, Guixia Xu, Hongzhi Kong. Mechanisms underlying copy number variation in F-box genes: evidence from comparison of 12 Drosophila species[J]. Biodiv Sci, 2011, 19(1): 3-16.
Fig. 1 Copy number and domain structure of F-box proteins from 12 Drosophila species. The phylogenetic tree of the 12 Drosophila species (modified from http://rana.lbl.gov/drosophila/index.html) is above the table. The first three letters of each specific epithet are used as the abbreviation for each species. On the left side of the table is a neighbor-joining (NJ) tree for 48 clades of F-box proteins, whose domain structures are shown on the right. Numbers in the table indicate the copy numbers of F-box genes belonging to each clade. Clades with copy number variation are shaded.
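The abbreviation rule stated in the Fig. 1 caption can be illustrated with a short Python sketch. This is only an illustration, not the authors' code: the function name `abbreviate` is hypothetical, and the species list is limited to names that appear in the captions and appendices on this page.

```python
def abbreviate(species_name: str) -> str:
    """Return the first three letters of the specific epithet.

    Accepts forms such as 'Drosophila melanogaster', 'D. melanogaster',
    or 'D.yakuba' (no space after the genus initial).
    """
    # Put a space after any period so 'D.yakuba' splits like 'D. yakuba',
    # then take the last word, which is the specific epithet.
    epithet = species_name.replace(".", ". ").split()[-1]
    return epithet[:3].lower()


# Species named in the captions and appendices of this page (assumed spelling).
species = [
    "D. melanogaster", "D. yakuba", "D. erecta", "D. ananassae",
    "D. pseudoobscura", "D. persimilis", "D. willistoni",
    "D. mojavensis", "D. virilis", "D. grimshawi",
]

abbreviations = {name: abbreviate(name) for name in species}

for name, abbr in abbreviations.items():
    print(f"{name} -> {abbr}")
```

For the species listed here the three-letter abbreviations are all distinct, which is what makes the scheme usable as a column label.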
Fig. 2 Examples of copy number variation of F-box genes. Domain legends are the same as those in Fig. 1. A, D. ananassae gained a new copy of an F-box gene; B, The most recent common ancestor (MRCA) of D. yakuba and D. erecta lost the ortholog of Clade7; C, The MRCA of D. mojavensis and D. virilis gained a new gene; D, The MRCA of the top six species gained a new copy of an F-box gene, and D. yakuba lost this gene after their divergence; E, Four gene loss events occurred independently in D. erecta, D. ananassae, D. willistoni, and D. virilis. A gene duplication event in the MRCA of D. pseudoobscura and D. persimilis generated two copies, but one copy was then lost in D. persimilis; F, D. willistoni lost the Clade24 gene. The MRCA of the top five species gained a new copy of this gene, but D. melanogaster then lost the new copy.
Fig. 3 Evolutionary change of the copy number of F-box genes in Drosophila species. The numbers in circles and rectangles represent the copy numbers of genes in extant and ancestral species, respectively. Numbers above and below each branch indicate the numbers of gains (+) and losses (-) of genes, respectively. The phylogenetic tree is modified from Assembly/Alignment/Annotation of 12 related Drosophila species [http://rana.lbl.gov/drosophila/index.html].
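The bookkeeping implied by the Fig. 3 caption is that the copy number at the end of a branch equals the copy number at its start plus the gains and minus the losses on that branch. A minimal Python sketch follows; the function name and the numbers in the usage lines are hypothetical and are not values read from the figure.

```python
def descendant_copy_number(ancestral: int, gains: int, losses: int) -> int:
    """Copy number at the end of a branch, given the ancestral copy number
    and the numbers of gene gains (+) and losses (-) along that branch."""
    if min(ancestral, gains, losses) < 0:
        raise ValueError("copy numbers and event counts must be non-negative")
    result = ancestral + gains - losses
    if result < 0:
        raise ValueError("losses exceed the genes available on this branch")
    return result


# Hypothetical example values, chosen only to show the arithmetic.
ancestor = 40
print(descendant_copy_number(ancestor, gains=2, losses=1))  # 40 + 2 - 1
```

Applying the function branch by branch from the root reproduces the extant copy numbers, which gives a simple consistency check on any inferred set of gains and losses.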
Fig. 4 Chromosomal locations of F-box genes and their orthologous and paralogous relationships in the 12 Drosophila genomes. Horizontal bars represent the chromosomal arms/scaffolds/segments of the 12 Drosophila genomes. L and R indicate the left and right arms of chromosomes, respectively. Hexagons designate F-box genes. Orthologs are connected by lines. Filled circles indicate gene duplication events. Orthologs connected by a curve are absent from the species between them. Numbers below genes are clade numbers (those in bold indicate that the copy numbers differ among the 12 species). Because the 12 species differ in karyotype, we arranged the genes from chromosome X to 3R as in D. melanogaster, and the chromosomes of the other species are arranged to correspond to those of D. melanogaster. Lines in bold indicate three examples of copy number variation, which show one gain (Clade22), one gain and one loss (Clade14), and one gain and two losses (Clade24), respectively.
Fig. 5 Mechanisms for copy number variation of F-box genes. A, The mechanism that caused the increase in copy number of Clade22 was retroposition. (a) Comparison of the exon/intron structures of the paralogs (the upper gene is similar to the ancestral gene). Regions that match each other are connected with thin lines, and the numbers show the lengths of the matched parts. (b) The intronless F-box gene (underlined and in bold) likely possesses a poly-A tail (in gray) and two direct repeats (in boxes). B, The mechanism that caused the increase in copy number of Clade18 was de novo origination. The new gene was derived from non-coding sequences between CG1792 and pasha (named after the orthologous genes in D. melanogaster). (c) and (d) are alignments between non-coding sequences and F-box genes (see Appendix V for details). C, Mechanisms for the decrease in copy number of F-box genes. Comparison of the exon/intron structures of the orthologs is shown (the upper gene is ancestral).
Redundant sequences | Selected sequence |
---|---|
FBpp0073103, FBpp0073101, FBpp0073102 | FBpp0073102 |
FBpp0078592, FBpp0110535, FBpp0078594, FBpp0078593 | FBpp0078593 |
FBpp0084796, FBpp0084795 | FBpp0084795 |
FBpp0110196, FBpp0078693 | FBpp0078693 |
FBpp0087190, FBpp0111982, FBpp0111980, FBpp0111981 | FBpp0111981 |
FBpp0086875, FBpp0086876 | FBpp0086876 |
FBpp0087274, FBpp0087275, FBpp0087276 | FBpp0087276 |
FBpp0083216, FBpp0083215 | FBpp0083215 |
FBpp0099839, FBpp0099383 | FBpp0099383 |
FBpp0268227, FBpp0264637 | FBpp0264637 |
FBpp0259602, FBpp0267478 | FBpp0267478 |
FBpp0267102, FBpp0267103 | FBpp0267103 |
FBpp0153606, FBpp0157663 | FBpp0157663 |
FBpp0157330, FBpp0144755 | FBpp0144755 |
FBpp0157205, FBpp0157374 | FBpp0157374 |
Appendix I Redundant sequences and the sequence finally selected from each group
Re-annotated genes | Corresponding proteins |
---|---|
Dsim\GD12690 | FBpp0211092 |
Dmel\CG14102 | FBpp0074697 |
Dyak\GE20089 | FBpp0265099 |
Dpse\GA21694 | FBpp0280727 |
Appendix II Re-annotated genes
Appendix III Phylogenetic tree of F-box proteins from 12 Drosophila species. The name of each protein is composed of the first three letters of the specific epithet, followed by the name of the sequence and the C-terminal domain (N means none).
Clade no. | Total | |||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | 6 | 14 | 16 | 17 | 18 | 22 | 23 | 24 | 31 | 36 | 37 | 38 | ||
Tandem duplication | 1 | 1 | 3 | 2 | 7 (29%) | |||||||||
Dispersed duplication | 1 | 2 | 1 | 6 | 1 | 1 | 1 | 1 | 1 | 15 (63%) | ||||
Retroposition | 1 | 1 (4%) | ||||||||||||
De novo origination | 1 | 1 (4%) | ||||||||||||
Total | 1 | 2 | 1 | 1 | 7 | 1 | 1 | 4 | 1 | 1 | 1 | 1 | 2 | 24 (100%) |
Appendix IV Events that caused increases in the copy number of F-box genes in Drosophila
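The percentages in the Total column of Appendix IV follow directly from the event counts (7, 15, 1 and 1 out of 24). The Python sketch below recomputes them. It assumes half-up rounding to whole percentages, since that is what reproduces the 63% reported for dispersed duplication (15/24 = 62.5%); the helper `round_half_up` is introduced here for illustration only.

```python
import math

# Event counts taken from the Total column of Appendix IV.
event_counts = {
    "Tandem duplication": 7,
    "Dispersed duplication": 15,
    "Retroposition": 1,
    "De novo origination": 1,
}


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 rounding up (assumed convention).

    Python's built-in round() uses round-half-to-even and would give 62
    for 62.5, so it is not used here.
    """
    return math.floor(x + 0.5)


total = sum(event_counts.values())
percentages = {
    mechanism: round_half_up(100 * count / total)
    for mechanism, count in event_counts.items()
}

for mechanism, pct in percentages.items():
    print(f"{mechanism}: {event_counts[mechanism]} of {total} ({pct}%)")
```

Under this rounding convention the four percentages sum to 100, matching the table.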
Appendix V Sequence alignments of Clade18 genes from D. mojavensis and D. virilis with non-coding sequences from D. willistoni and D. grimshawi, respectively.
Protein name | C-terminal domain | CNV | Biological processes involved |
---|---|---|---|
FBpp0073102 | WD40 | No | negative regulation of growth; regulation of mitosis; DNA endoreduplication. |
FBpp0078846 | WD40 | No | protein ubiquitination during ubiquitin-dependent protein catabolic process; WD40 protein FBW5 promotes ubiquitination of tumor suppressor TSC2 by DDB1-CUL4-ROC1 ligase. |
FBpp0083434 | WD40 | No | anatomical structure development; ovarian follicle cell development; gamete generation; regulation of biological process; circadian rhythm; learning or memory; cell motion; regulation of cellular component organization; olfactory learning; positive regulation of protein import into nucleus; locomotory behavior; catabolic process; negative regulation of nurse cell apoptosis; regulation of signal transduction; regulation of Wnt receptor signaling pathway. |
FBpp0086876 | SPRY | No | negative regulation of synaptic growth at neuromuscular junction; neuromuscular synaptic transmission. |
FBpp0077147 | UBCc | No | apoptosis; induction of compound eye retinal cell programmed cell death. |
FBpp0078693 | LRR | No | circadian behavior; locomotor rhythm; entrainment of circadian clock by photoperiod. |
FBpp0078970 | IBR | No | compound eye morphogenesis; negative regulation of protein catabolic process; G2 phase of mitotic cell cycle; G1/S transition of mitotic cell cycle; eye-antennal disc morphogenesis; regulation of mitosis. |
FBpp0076199 | N | Yes | engulfment of apoptotic cell. |
FBpp0073740 | LRR | Yes | deactivation of rhodopsin mediated signaling. |
Appendix VI Relationships between functions and copy number variation of F-box proteins