|
The impact of incorporating molecular evolutionary model into predictions of phylogenetic signal and noiseKeywords: model selection, signal, noise, phylogenetic informativeness, phylogenetic inference, maximum likelihood, Bayesian estimation Abstract: Phylogenetic inference can be improved by the development and use of better models for inference given the data available, or by gathering more appropriate data given the potential inferences to be made. Numerous studies have demonstrated the crucial importance of selecting a best-fit model to conducting accurate phylogenetic inference given a data set, explicitly revealing how model choice affects the results of phylogenetic inferences. However, the importance of specifying a correct model of evolution for predictions of the best data to be gathered has never been examined. Here, we extend analyses of phylogenetic signal and noise that predict the potential to resolve nodes in a phylogeny to incorporate all time-reversible Markov models of nucleotide substitution. Extending previous results on the canonical four-taxon tree, our theory yields an analytical method that uses estimates of the rates of evolution and the model of molecular evolution to predict the distribution of signal, noise, and polytomy. We applied our methods to a study of 29 taxa of the yeast genus Candida and allied members to predict the power of five markers, COX2, ACT1, RPB1, RPB2, and D1/D2 LSU, to resolve a poorly supported backbone node corresponding to a clade of haploid Candida species, as well as 19 other nodes that are reasonably short and at least moderately deep in the consensus tree. The use of simple, unrealistic models that did not take into account transition/transversion rate differences led to some discrepancies in predictions, but overall our results demonstrate that predictions of signal and noise in phylogenetics are fairly robust to model specification.
|