
Re: Cladobabble



>>Also from Part 1, you asserted knowledge on the part of the analyst of
when a difference becomes significant.<
Actually, this is absolutely the opposite of what I meant. In fact, that was
what my post was aimed against. I was not trying to show how systematists
can pick and choose the characters which they want, I was trying to show how
they SHOULDN'T. I think if you reread the post from that
perspective, you may find we agree on more issues than you originally
thought.<

Thanks for your prompt and thoughtful response.
Before asking for further clarification, I have to mention an observation of
my own:  flukes happen!!  I work on gambling for a state government, and
part of my job includes calculating odds and, for fixed prize lottery games,
figuring out the likelihood that so much will be won that annual
profitability will be affected.  In one game I calculated that
a certain number of winners would occur once every 50 years.  Happened inside
6 months.  Another event had a probability of occurring of 0.4%, once in 250
trials.  Happened the first time.
When you're dealing with small numbers of samples, the idea that the
patterns you identify will apply to the entire population is very iffy.  The
total number of dino samples is very small, and is considered likely to be
skewed.  My skepticism reflex is strong.
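
To put rough numbers on those two flukes, here is a minimal sketch. It assumes the "once every 50 years" event arrives as a Poisson process and that the two events are independent; both are illustrative assumptions, not part of the original odds calculations, and the function names are made up for this example.

```python
import math

def prob_at_least_once(rate_per_year: float, years: float) -> float:
    """Chance of at least one occurrence within `years`, assuming a Poisson process."""
    return 1.0 - math.exp(-rate_per_year * years)

def prob_success_within(p: float, trials: int) -> float:
    """Chance of at least one success in `trials` independent tries of probability p."""
    return 1.0 - (1.0 - p) ** trials

# Once-in-50-years event turning up inside 6 months: roughly 1%.
p_lottery = prob_at_least_once(1 / 50, 0.5)

# 0.4% event (once in 250 trials) turning up on the first trial: 0.4%.
p_first_try = prob_success_within(0.004, 1)

# Chance of seeing at least one of the two flukes, if they are independent.
p_either = 1.0 - (1.0 - p_lottery) * (1.0 - p_first_try)

print(f"50-year event within 6 months: {p_lottery:.4f}")
print(f"0.4% event on first trial:     {p_first_try:.4f}")
print(f"at least one of the two:       {p_either:.4f}")
```

Neither event is impossible, just unlikely, and anyone who watches enough such events will eventually see one.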

Now, concerning what you said above, please remember that you did observe
'character identification is a manual process and is therefore somewhat
subjective.'  You also gave reasons for intentionally excluding a character,
such as warping the result of the analysis by including too many related
features.  (Plato, he of the 'featherless biped', would be intrigued by the
bird-human clade.)  You've added another reason: a character should not be
included if it is the ancestral state in a single taxon.  (Leaving aside how
you would know the ancestral state without having already done an analysis.)
Now, if a character is removed, that does have an impact on the analysis.
So, you're arguing for restraint in managing the characters, but not an
entire absence of discernment.  This is the 'little bit pregnant' problem.

<But only analysis of the most parsimonious tree(s) (and, perhaps, less
parsimonious trees or some consensus of them) will suggest which characters
are "meaningful." This is, of course, assuming that "meaningful" means
"helpful in elucidating phylogeny.">

Here, too, you're describing an intervention.  This is a feedback loop in
which characters would be (implicitly) excluded as they become irrelevant to
the analysis.

<This is why I advocate including all reasonable characters (reasonable
being non-redundant and phylogenetically informative).
This is, of course, one of the great things about a cladistic
analysis: if you don't like a character, YOU can rerun the dataset without
it.>

See the contradiction?

Now, the OTHER piece of the problem:


 < As for your latter point, this is, I feel, the only real issue at the
heart of the matter: when is an observation significant? The only answer I
can really give is: when the resulting phylogeny says it is. This is, of
course, unsatisfactory for most people, because you may not see that the
marginal 5% increase in bone thickness is a significant difference. If, as I
alluded to in my post, your animals cluster such that this 5% difference or
whatever really represents a great rift in bone morphology in the ingroup, I
don't think anyone will contest the character. Unfortunately, evolution has
a nasty habit of providing us with intermediate, or at least POTENTIALLY
intermediate, steps in any character. So it is very hard (contrary to the
implications of my earlier post) to say that the difference is significant
or not a priori of the analysis.>

Let's assume you have done your analysis and found 2 taxa at one
measurement (as you've chosen to code it) and 50 others (not sure 500
samples of different taxa will be available to you) are 5% different.
First, you have to reassure yourself this is not just an artifact of your
coding decision.  I assume you've set ranges for assigning each coded
state.  How do you know that a difference is large enough to be important,
and that you're not accidentally coding just individual variation or some
other factor like preservation mechanism?  You have made an analytic
conclusion here a priori, before applying the algorithm.
Even if your range decision is correct, you still have a problem:  you are
assuming that if you had the whole populations in question the results would
be the same, and that you have not found one or two fluke members of a
taxon.  As mentioned, flukes happen.  You may have discovered intermediate forms,
unusual individuals, or even a temporary and isolated phenomenon.  For
example, if all you'd seen were red wolves, you'd assume that all wolves
were red with this logic.
In the absence of sufficient data, you have identified a difference among
the available samples, but you cannot say with certainty that the data
should be extrapolated to even a single taxon, let alone a group of
scattered taxa.
Let me assert that if the distinctions you're observing have no reason to be
considered important, you're doing pure probability analysis with an
infinitesimal, skewed sample.  You know what your mathematical confidence,
or rather lack of confidence, is in that case.
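
As a rough illustration of that lack of confidence, here is a minimal sketch. It assumes the specimens are an independent random draw from the taxon, which is generous given that the fossil sample is probably skewed. If all n specimens show one character state, it asks how large a fraction of the population could show a different state while still producing that all-alike sample at least 5% of the time. The function name and the 5% cutoff are illustrative choices.

```python
def max_hidden_fraction(n_specimens: int, alpha: float = 0.05) -> float:
    """Largest fraction q of the population that could differ from the sample
    while n all-alike specimens would still occur with probability alpha.

    Solves (1 - q) ** n = alpha for q, assuming independent random sampling.
    """
    if n_specimens < 1:
        raise ValueError("need at least one specimen")
    return 1.0 - alpha ** (1.0 / n_specimens)

for n in (1, 2, 5, 50):
    q = max_hidden_fraction(n)
    print(f"{n:>3} specimens: up to {q:.1%} of the population could differ")
```

With two specimens the bound is above three quarters of the population; it takes something like 50 independent specimens to bring it down to around 6%.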
Anyway, thanks for considering my concerns.