Tuesday, May 6, 2008

Examining cases for or against homoplasy at designated DNA loci

Part of this topic is a carry-over from the earlier topic of the 12-repeat allele at the DYS392 microsatellite in certain PN2 clades, including E-M78.

Argument #1 about the 12-repeat allele at DYS392: Characteristic of homoplasy...

— seems to reappear in different subclades of M78, particularly in regions where M78 appears to be fairly frequent [Cruciani et al.'s emphasis, 2008], but it is almost never found in all the haplotypes of the subclades in which it appears.

Argument #2: Characteristic of a relatively stable site and, hence, part of the hereditary package of a monophyletic unit...

— while not necessarily geographically restricted, it appears to be sharply geographically structured, with a preponderance in eastern Africa, particularly sub-Saharan East Africa. [Semino et al.'s emphasis, 2004]

— occurs in E-P2*, E-M35*, and E-M78 but is almost *absent in all other haplogroups* [Semino et al. 2004]

— an ancient differentiation of the E-P2 haplogroup occurred in loco (East Africa). However, this also implies a low mutability of the associated microsatellite motif (DYS392-12/DYS19-11) [Semino et al. 2004]

High stability of the DYS392 locus (Brinkmann et al. 1998; Nebel et al. 2001) and of the shorter alleles of DYS19 (Carvalho-Silva et al. 1999) has been reported elsewhere. [Semino et al. 2004]

The first argument seems pretty self-explanatory, but let's take a quick look at the second one:

If we assume that this was the product of an ancient differentiation of E-P2* in loco (East Africa), and not of homoplasic events, then one has to account for the 12-repeat allele's absence in subsets of haplotypes of the various E-P2-derived clades. If it were just about the E-P2* paragroup alone [sans the major downstream mutations], then the geographical structuring of the slightly differentiated E-P2 paragroup via the 12-repeat allele would be sensible, as the allele would have arisen after the P2 mutational event and been carried by a single recent common ancestor.

However, this scenario would imply that the descendants of the just-mentioned recent common ancestor should have inherited the 12-repeat allele. Now, since the downstream mutational events at M35 and M78 are supposed to represent single events, it should not make sense that some M35* and M78 lineages have the 12-repeat allele while other chromosomes carrying those same markers don't. Again, that is if the 12-repeat allele is treated as a highly stable state that became part of the hereditary package of a PN2*-bearing common ancestor. But...

...what if one were to account for the above-mentioned inconsistency by suggesting homoplasic revertant mutational events at DYS392 in M35* and M78 chromosomes? This would tend to weaken the argument about the "low mutability" of the DYS392-12 motif, which is one of the more important points of the second argument. Though the 12-repeat allele does appear in chromosomes ancestral at either of the M35 and M78 sites, these two SNPs are highly unlikely to be products of recurrent nucleotide polymorphisms, because they have long been demonstrated to form a monophyletic unit [branch] of the PN2 clade, each with its own sub-branches.
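
The inconsistency described above can be put as a small check. This is only a toy sketch: the clade names come from the discussion, but the allele sets, the function name, and the choice of 11 as the "other" allele are hypothetical placeholders, not published haplotype data. It flags any clade that carries the supposedly inherited allele alongside some other allele, since each such clade implies at least one later change at the locus.

```python
# Toy illustration: clade names are from the post; the allele sets below are
# hypothetical placeholders, NOT published DYS392 haplotype data.
observed_dys392 = {
    "E-P2*": {12},
    "E-M35*": {11, 12},
    "E-M78": {11, 12},
}


def clades_needing_str_change(observed, inherited_allele=12):
    """Return clades that carry the inherited allele alongside other alleles.

    If the inherited allele was present in the common ancestor, any other
    allele seen in a descendant clade implies at least one later change
    at the locus within that clade.
    """
    return sorted(
        clade
        for clade, alleles in observed.items()
        if inherited_allele in alleles and alleles - {inherited_allele}
    )


print(clades_needing_str_change(observed_dys392))
```

With real data, a long list of flagged clades would sit uneasily with the "low mutability" reading, while a short one would be easier to reconcile with it.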

On the other hand, the overwhelming preponderance of the 12-repeat allele in the ancestral P2 paragroup and its derivative clades in East Africa, particularly sub-Saharan East Africa, does seem to favor its origin there, provided the 12-repeat allele is treated as a reasonably stable state that is part of the hereditary package of a PN2*-bearing common ancestor [one lacking the major downstream mutations associated with the M215 and M35* phylogeny]. What the aforementioned "preponderant" distribution of 12-repeat-bearing PN2 clades in East Africa demonstrates less arguably is the commanding frequency of the aforementioned three main PN2+ clades in East Africa.

In another case study, Cruciani et al. [2008] note:

We recently refined the phylogenetic relationship
between markers M215 and M35, which identify the
human Y chromosome haplogroup E3b in Underhill et al.
(2001). In the new topology of the Y chromosome tree, the
E3b haplogroup is defined by M215, E-M35 becoming a
subhaplogroup of E3b (E3b1) (Cruciani et al., 2004). We
also showed that a subhaplogroup of E-M35, E-M78, is
highly valuable for understanding the male-mediated
gene geography of Eurasia and Africa. We described short
tandem repeat (STR) alleles useful to define clusters of
related haplotypes within E-M78 (Cruciani et al., 2004),
single nucleotide polymorphisms (SNPs) which validate to
a large extent the same clusters (Cruciani et al., 2006),
and the geographic distribution of the E-M78 subclades
defined by the above markers in 517 subjects (Cruciani
et al., 2007). Dating results and population inferences
based on the above data set relied on the assumption of a
monophyletic origin of the C to T mutation at M78.

In an interesting paper recently published in the Journal,
Fernandes et al. (2008) present data on the typing of
130 E-M35 subjects for SNPs and STRs. They suggest
that three cases of equality in state at 11 STR loci between
pairs of chromosomes belonging to different E-M35 subclades,
denote the repeated occurrence of mutations at
M78 and M81, as either forward or reversion events. In
addition, the authors prompt caution in using novel
arrangements of SNPs supported by a reduced number of
subjects as new clades, and thus reject the novel branching
pattern and the ensuing nomenclature proposed by
Cruciani et al. (2004) for the E3b (E-M215) haplogroup.


Here, we introduce a novel polymorphic marker (V68),
potentially useful to investigate the issue of hidden recurrent
mutational events at M78. V68 is an A to C transversion
obtained with the primers V68 FOR (5′-CAACTGAAAATCAGAACTTTGG)
and V68 REV (5′-GTGGATCACGAGGTCAGG). In discussing the position of V68 within the Y
phylogenetic tree, we retain the haplogroup nomenclature
based on the defining mutation (The Y Chromosome Consortium,
2002).

A total of 239 male subjects of African ancestry, previously
analyzed for the allelic state at the E3b (E-M215)
markers (Cruciani et al., 2004, 2007), were assayed for
V68 in order to assess the position of this marker in the Y
chromosome phylogenetic tree. The subjects represented
the four main E-M35 sub-clades (E-M78, E-M81, E-M123,
E-V6), as well as the E-M35*, E-M215*, and Y(xM215)*
paragroups, and were selected to equalize, when possible,
the number of chromosomes within each haplogroup/paragroup.
Thus, they do not faithfully represent each clade’s
frequency in the overall population sample (see Fig. 1).

The results shown in Figure 1 unequivocally assign the
marker V68 to the same branch as M78. For the time
being, the two markers are phylogenetically equivalent.
These results are strong evidence against the presence, in
our sample, of homoplasic M78 derived alleles, as well as
possible M78 revertants. In fact, in the first case any non-
M78 chromosome undergoing a recurrent M78 forward
mutation would result in an apparent E-M78 chromosome
carrying the ancestral allele at V68, a situation not yet
observed. The mirror event, a M78 reversion to the ancestral
state, would produce E-V68 chromosomes ancestral at
M78, again a pattern not yet found. The finding of the yet
unidentified chromosomes described earlier for either scenario
would raise two alternatives, i.e. the presence of a
novel Y tree branch resulting from the phylogenetic resolution
between V68 and M78, versus a recurrence/reversion
event. While the low mutation rate for single nucleotide
substitutions usually favors the first hypothesis (identifying
SNPs as Unique Event Polymorphisms, or UEPs)
(Underhill and Kivisild, 2007), the recurrence/reversion
hypothesis can be nevertheless considered. In this case,
the analysis of the involved SNP loci at the level of nucleotide
identity is mandatory.


Present author's take:
According to the extract, Fernandes et al. (2008) have reservations, but no mention is made of an actual refutation of Cruciani et al.'s argument that the V68 allele is strongly linked to the M78 UEP, the two together marking a monophyletic unit, or of their argument that both nucleotide polymorphisms (M78 and V68) have the hallmarks of UEPs as opposed to relics of homoplasic events. To put it in the words of Cruciani et al. themselves, this argument "cannot be questioned until compelling evidence is put forward."
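
The consistency test that the extract describes can be sketched in a few lines. This is a minimal illustration under stated assumptions: the marker names M78 and V68 and the four possible outcomes come from the extract, while the function name, the label strings, and the sample records are hypothetical and do not reproduce the 239 subjects actually typed.

```python
from collections import Counter

# Allelic states; the string values are illustrative labels only.
ANCESTRAL, DERIVED = "ancestral", "derived"


def classify(m78, v68):
    """Label a chromosome by its joint allelic state at M78 and V68."""
    for state in (m78, v68):
        if state not in (ANCESTRAL, DERIVED):
            raise ValueError(f"unknown allelic state: {state!r}")
    if m78 == v68:
        # Both derived (E-M78) or both ancestral (non-M78): the pattern
        # expected if the two markers sit on the same branch.
        return "concordant"
    if m78 == DERIVED:
        # Apparent E-M78 chromosome that is ancestral at V68: either a
        # recurrent M78 forward mutation or V68 arose downstream of M78.
        return "M78 recurrence or new branch"
    # V68-derived chromosome that is ancestral at M78: either an M78
    # reversion or M78 arose downstream of V68.
    return "M78 reversion or new branch"


# Hypothetical sample records (m78 state, v68 state), not real data.
sample = [(DERIVED, DERIVED), (ANCESTRAL, ANCESTRAL), (DERIVED, DERIVED)]
print(Counter(classify(m78, v68) for m78, v68 in sample))
```

A sample that yields only "concordant" labels corresponds to what Cruciani et al. report. Either discordant label would call for the nucleotide-level analysis they mention before choosing between a new branch and a recurrence or reversion event.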

At any rate, this study will likely be revisited here for further examination as more of its details come to light and as reactions from peer reviewers are published.
_________________________________________________________________
*References:

—Semino O. et al. 2004, Origin, Diffusion, and Differentiation of Y‐Chromosome Haplogroups E and J.

—Cruciani F. et al. 2007, Tracing Past Human Male Movements in Northern/Eastern Africa and Western Eurasia: New Clues from Y-chromosomal Haplogroups E-M78 and J-M12.

—Cruciani F. et al. 2008, Recurrent mutation in SNPs within Y chromosome E3b (E-M215) haplogroup: A rebuttal.