Allopolyploidy combines two progenitor genomes in the same nucleus. It is a common speciation process, especially in plants. Deciphering the origins of polyploid species is complicated by, among other things, extinct progenitors, multiple origins, gene flow between different polyploid populations, and loss of parental contributions through gene or chromosome loss. The perennial species of Glycine, the plant genus that includes the cultivated soybean (G. max), include eight allopolyploid species, three of which are studied here. Previous crossing studies and molecular systematic results from two nuclear gene sequences led to hypotheses that these species originated from among extant diploid species. We use several phylogenetic and population genomics approaches, applied to single nucleotide polymorphism data and a guided transcriptome assembly, to clarify the origins of the genomes of these three allopolyploid species. The results support the hypothesis that all three polyploid species are fixed hybrids combining the genomes of the two putative parents proposed on the basis of previous work. Based on mapping to the soybean reference genome, there appear to be no large regions for which one homoeologous contribution is missing. Phylogenetic analyses of 27 selected transcripts using a coalescent approach are also consistent with multiple origins for these allopolyploid species and suggest that the origins occurred within the last several hundred thousand years.