
RE: New paper on Neoaves



Mickey Mortimer wrote:

I wonder.... how many well supported (>95% Bayesian or bootstrap) molecular nodes have later been 'disproven'? And of the ones that have been, in how many cases was it a matter of needing to throw more taxa or bases in the analysis?

You and David hold to the assumption that homoplasy is random, and can be overcome simply by enlarging the dataset or expanding the taxon sample. Thus, the phylogenetic signal, which denotes a common and shared ancestry, should eventually overcome this randomly distributed 'noise'. But with genes and proteins, this is often a brave assumption to make.


Here's one case that highlights the problem. Naylor and Brown (1998) used over 12,000 bases (from all 13 protein-coding mitochondrial genes) and recovered 100% bootstrap support for a clade that comprised vertebrates and echinoderms to the exclusion of amphioxus (lancelet). This topology was recovered regardless of the method of analysis. The authors didn't believe their own tree, since it is contradicted by compelling morphological evidence that vertebrates are closer to lancelets than to echinoderms. They argued *against* simply accruing even more taxa or even longer datasets (the latter not being possible for the mitochondrial genome - they had already used every gene!). Instead, they favored investigating the underlying factor(s) that were pulling the vertebrate and echinoderm sequences together, and disputed the assumption that homoplasy ('noise') is distributed randomly in the dataset.

I know I'm on my hobby horse here, and I apologize for the verbose postings. But the thing that is obvious about *morphology*-based phylogenetic analyses is that they are almost always followed by a discussion of which morphological characters (synapomorphies) unite which taxa. In other words, it's plain to see the identity of the characters that diagnose certain clades. This rarely happens with *molecular* clades. Here, the characters are at the level of genes and amino acids, and the structural and functional properties of the sequences are skimmed over. Instead, researchers tend to focus on bootstrap support (or posterior probabilities, in the Bayesian world) as the final determinant for a 'good' tree, and move on. However, I'd like to see more discussion of the gene- or amino-acid-level factors that are responsible for the topology of the tree. For example, what sequence-level characters are putting Coronaves together, or pulling Falconiformes apart?
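
To make that last question concrete, here is a rough sketch of the kind of bookkeeping I have in mind: scan an alignment for columns where every member of a proposed clade shares a state that no other taxon has. Everything in it is made up for illustration (the function name `clade_supporting_sites`, the toy taxa, the toy sequences), and it assumes the sequences are already aligned and of equal length. It is not how any published analysis was done.

```python
def clade_supporting_sites(alignment, clade):
    """List (column, state) pairs where all clade members share one state
    that is absent from every taxon outside the clade.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths assumed)
    clade:     set of taxon names forming the proposed clade (hypothetical input)
    """
    inside = [seq for name, seq in alignment.items() if name in clade]
    outside = [seq for name, seq in alignment.items() if name not in clade]
    hits = []
    for i in range(len(inside[0])):
        states_in = {seq[i] for seq in inside}
        states_out = {seq[i] for seq in outside}
        # Uniform within the clade, and not found anywhere outside it.
        if len(states_in) == 1 and not (states_in & states_out):
            state = next(iter(states_in))
            # Skip gaps and ambiguity codes; they are not real shared states.
            if state not in "-?N":
                hits.append((i, state))
    return hits


# Toy alignment, invented purely for illustration.
toy = {
    "taxon_A": "ACGTTGCA",
    "taxon_B": "ACGATGCA",
    "taxon_C": "ACCTTGTA",
    "taxon_D": "ACCTAGTA",
}

print(clade_supporting_sites(toy, {"taxon_A", "taxon_B"}))
```

A list like this only gives *candidate* molecular synapomorphies. Whether a shared state reflects common ancestry or convergence is exactly the question that then has to be argued, site by site, the way morphologists argue over characters.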

Cheers

Tim