Tuesday, January 29, 2008

RFLPs: Lucotte et al., A case study — Pt. 1

From Lucotte et al., we have:

North African Berber and Arab Influences in the Western Mediterranean Revealed by Y-Chromosome DNA Haplotypes

Nathalie Gérard, Sala Berriche, Annie Aouizérate, Florent Diéterlen, Gérard Lucotte. Human Biology. Detroit: Jun 2006. Vol.78, Iss. 3; pg. 307, 10 pgs

During the 7th century A.D., Muslim people coming from the Arabian peninsula and the Middle East invaded North Africa. The most important population movement relating both sides of the Mediterranean Sea was the conquest of the Iberian peninsula by North African populations (with recruited Berbers), soon after the first Muslim invasion. More than eight centuries (8th to 15th centuries) of Muslim domination in the southern part of Iberia imparted an important cultural legacy (Conrad 1998) and probable gene exchanges between North African and Iberian populations.

Variations in DNA sequences specific to the nonrecombinant part of the Y chromosome, relating to paternal ancestry, are particularly interesting from a human population genetics point of view. The first published and most informative probe used in Southern blots for this objective is p49 (locus DYS1), which is able to identify at least five TaqI male-specific fragments (A, C, D, F, and I) that are polymorphic between individuals (Lucotte and Ngo 1985). Sixteen main corresponding haplotypes (numbered I-XVI) were identified using the p49 probe on DNA samples of unrelated males living in France (Ngo et al. 1986). Only recently has the molecular basis of the p49 TaqI polymorphisms been established (Jovelin et al. 2003); the polymorphisms correspond to variable TaqI sites located in the four DAZ genes located in the AZF-c region of the Y chromosome.

In fact, the conventional p49 TaqI polymorphisms were the most popular markers used in various populations because of their ability to detect more than 100 different haplotypes [for a compilation on the subject until the end of 1995, see Poloni et al. (1997)]. Haplotype XV (A3,C1,D2,F1,I1) was the most widespread haplotype in our initial study (Ngo et al. 1986). Haplotype XV was also predominant in the first European study we published (Lucotte and Hazout 1996), with elevated frequencies in French Basques. The geographic distribution of haplotype XV in Europe reveals a gradient of decreasing frequencies from this Basque focus toward eastern peripheral countries (Lucotte and Loirat 1999) but also toward southwestern countries. According to the Y Chromosome Consortium (2002) nomenclature, haplotype XV corresponds to the M173 lineage (Diéterlen and Lucotte 2005).

Haplotype V (A2,C0,D0,F1,I1) is the most frequent haplotype in North Africa (Lucotte et al. 2000), with a particularly high frequency (55%) in the populations with a relative predominance of Berber origin. Our previous study on the subject examined the relative frequencies of haplotype V in four Iberian populations compared with a Berber population living in North Africa (Lucotte et al. 2001). The highest frequency of haplotype V (68.9%) was observed in Berbers from Morocco, and the geographic distribution of haplotype V revealed a gradient of decreasing frequencies with latitude in Iberia (40.8% in Andalusia, 36.2% in Portugal, 12.1% in Catalonia, and 11.3% in the Basque Country) (Lucotte et al. 2001); such a cline of decreasing haplotype V frequencies from the south to the north in Iberia clearly established a gene flow from North Africa toward Iberia.

According to the Y Chromosome Consortium (2002) nomenclature, haplogroup E is characterized by the mutations SRY4064, M96, and P29 on a background defined by the insertion of an Alu element (YAP+). The third clade, E3 (defined by the mutation P2), of haplogroup E is further subdivided into two monophyletic forms, the second one (E3b) being characterized by mutations M35 and M125. All of the 110 p49 TaqI haplotype V subjects from Morocco (51 Berbers and 59 Arabs) that we had previously tested correspond to haplogroup E3b.

In the present study we have subdivided haplotype V into its Berber (Vb) and Arab (Va) components in order to distinguish the relative contributions of these two ethnicity-specific markers in the gene pools of the populations living in Iberia and in other populations in the northern part of the western Mediterranean area.

Materials and Methods
DNA Samples. This study concerns 2,196 unrelated male DNA samples (Table 1). We collected 904 new unrelated male subjects, from three different countries (Portugal, France, and Italy): 79 from North Portugal and 59 from South Portugal; 243 from the Marseilles region of France; 192 from Genoa, 64 from Rome, and 128 from Naples in continental Italy; 39 from Sicily; and 100 from Sardinia. All these new samples correspond to adult males, whose origin is based on the local birthplace of their fathers and (at least) grandfathers. We have obtained informed consent from each of the French subjects studied.
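As a quick sanity check on the sampling figures quoted above, the per-site counts can be added up to confirm they match the stated 904 new subjects. This is only a bookkeeping sketch; the dictionary layout and variable names are my own, and the numbers are copied from the paragraph above.

```python
# New samples per site, as listed in the quoted Materials and Methods text.
new_samples = {
    "Portugal": {"North Portugal": 79, "South Portugal": 59},
    "France": {"Marseilles region": 243},
    "Italy": {"Genoa": 192, "Rome": 64, "Naples": 128,
              "Sicily": 39, "Sardinia": 100},
}

# Subtotal per country, then the grand total across all three countries.
per_country = {country: sum(sites.values())
               for country, sites in new_samples.items()}
total_new = sum(per_country.values())

print(per_country)
print(total_new)  # the text states 904 new subjects
```

The site counts do add up to the 904 stated in the text.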

We add for comparison the following subjects, already tested as bearing haplotype V in previous studies: 11 subjects from Mauritania, 51 Berbers from Morocco, 59 Arabs from Rabat, 80 subjects from Algeria, 39 subjects from Tunisia, and 17 subjects from Libya (Lucotte et al. 2000); 29 Spaniards from Sevilla (Lucotte et al. 2001); 4 Spaniards from Barcelona and 9 French Catalans from Perpignan (Lucotte and Loirat 1999); 11 French Basques, 1 subject from Montpellier, and 7 subjects from Grasse in France and 6 subjects from Milan in Italy (Lucotte and Hazout 1996); and 44 subjects from Corsica (Lucotte et al. 2002).
Figure 1 indicates the representative geographic points in the western Mediterranean area for each of the 22 populations studied.

Genetic Analysis. Blood samples were collected by venipuncture using EDTA as an anticoagulant. Genomic DNA was extracted as described by Gautreau et al. (1983), using proteinase K and phenol-chloroform extractions.

Y-chromosome haplotypes were obtained using Southern blot analysis by hybridizing TaqI-restricted DNA successively with the p49f,a-specific probes, oligolabeled by random priming, according to the method described by Lucotte et al. (1994). Subdivision of haplotype V detected by Southern blot analysis into subhaplotypes Va and Vb was realized by polymerase chain reaction (PCR), using the assay published by Gonçalves and Lavinha (1994); the presence of the "low" XY275 allele (275G) defines subhaplotype Vb, and the other allele defines subhaplotype Va.
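The subdivision rule described above is simple enough to write down as a lookup. The sketch below only restates that rule: the function name, the argument names, and the convention of passing the XY275 allele as a single letter are assumptions of mine, not anything taken from the paper or from the Gonçalves and Lavinha assay.

```python
def classify_subhaplotype(p49_haplotype, xy275_allele):
    """Split p49/TaqI haplotype V into Vb or Va from the XY275 allele.

    Rule as stated in the quoted text: the "low" allele (275G) defines Vb;
    any other allele defines Va. Non-V haplotypes are not subdivided.
    """
    if p49_haplotype != "V":
        return None  # the rule only applies to haplotype V
    # Assumption: the allele is given as a one-letter base call.
    return "Vb" if xy275_allele.strip().upper() == "G" else "Va"


print(classify_subhaplotype("V", "G"))   # Vb
print(classify_subhaplotype("V", "A"))   # Va
print(classify_subhaplotype("XV", "G"))  # None
```

Note that Va is defined here purely by exclusion, which is the same point the authors concede later when they describe Va as a probably heterogeneous group.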

To compare subhaplotypes Va and Vb with the E3b1 and E3b2 subhaplogroups [according to the Y Chromosome Consortium (2002) nomenclature], we further analyzed our Berber and Arabic DNA samples from Morocco for biallelic markers M78 and M81 using PCR (Underhill et al. 2000).

Realization of the Isofrequency Haplotype Maps. The maps of subhaplotype Vb and Va isofrequencies have been drawn with the Spatial Analyst program (Arcview software) using the Kriging procedure. We have used the inverse distance weighting method (with a power of 2), which is well adapted to scarce data in coastal areas and on islands in the western Mediterranean area. Five neighbors are calculated for each quadrant.
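For readers unfamiliar with inverse distance weighting, here is a minimal sketch of the idea with a power of 2. It is not the Arcview Spatial Analyst implementation: it uses a plain k-nearest-neighbor search rather than the per-quadrant search mentioned above, it treats longitude and latitude as flat x/y coordinates, and the sample points and frequencies are made-up toy values, not data from the paper.

```python
import math


def idw(points, x, y, power=2.0, k=None):
    """Inverse distance weighted estimate at (x, y).

    points: list of (px, py, value) tuples.
    power:  distance exponent (the quoted text uses 2).
    k:      if given, use only the k nearest points (simplified search).
    """
    dists = []
    for px, py, value in points:
        d = math.hypot(px - x, py - y)
        if d == 0.0:
            return value  # exactly on a sample point: return its value
        dists.append((d, value))
    dists.sort(key=lambda pair: pair[0])
    if k is not None:
        dists = dists[:k]
    weights = [(1.0 / d ** power, value) for d, value in dists]
    total = sum(w for w, _ in weights)
    return sum(w * value for w, value in weights) / total


# Hypothetical toy samples: (x, y, frequency in %).
toy = [(0.0, 0.0, 0.0), (3.0, 0.0, 90.0)]
print(idw(toy, 1.0, 0.0))  # nearer point dominates the estimate
```

With a power of 2, a sample twice as far away gets a quarter of the weight, so the interpolated surface stays close to the nearest measured populations. That is worth keeping in mind when reading isofrequency maps drawn from only 22 points.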

Table 1 summarizes the frequencies we obtained for haplotype V and subhaplotypes Vb and Va in the 22 study populations. For the 2,196 males typed, 491 (22.3%) bear haplotype V. The frequency of haplotype V is 35.5% in Portugal, with a more elevated proportion in the south (49.2%) than in the north (25.3%). The frequency of haplotype V in the Marseilles region (11.1%) has a value similar to the mean value in continental France (9%). In Italy the highest frequency is attained in Sicily (28.2%), followed by Naples at 17.2%. As previously shown (Lucotte et al. 2000), haplotype V is found at the highest frequency (68.9%) in Berbers from Marrakech in Morocco; an apparently increasing east-west cline in haplotype V frequencies is shown in North Africa from Libya (44.7%) to Rabat (57.7%), with intermediate values for Tunisia (53.4%) and Algeria (56.7%). In Spain haplotype V is much more frequent (40.9%) in the south of the country [in Andalusia (Sevilla)] than in the north (12.9%) [in Catalonia (Barcelona)].

Subhaplotype Vb is the Berber subhaplotype because its most elevated relative value (63.5%) is obtained for the Berber population of Marrakech. In the non-Berber population of Rabat in Morocco, the frequency of subhaplotype Vb is only 20.6%, whereas the frequency of subhaplotype Va (Arab) is 37.3%. In order of decreasing values, the subhaplotype Vb frequencies are 40% in Mauritania, 35.9% in South Portugal, 25.4% in Andalusia, and 15.8% in Libya. Low frequencies of subhaplotype Vb are found in Sicily (5.1%), Algeria (2.8%), Tunisia (2.7%), and North Portugal (2.5%); frequencies less than 2% are found in French Basques (1.9%), in Naples (0.8%), and in Corsica (0.6%). Subhaplotype Vb is absent in Catalonia (Barcelona and Perpignan), in the south of France (Montpellier, Grasse, and the region of Marseilles), in continental Italy (Milan, Genoa, and Rome), and in Sardinia.

Table 2 summarizes the frequencies of subhaplotype Vb in North Africa, Iberia, the south of France, and Italy. The maximum value (63.5%) concerns the Berber population, but this frequency is notably lower (9.3%) for other populations from North Africa. In southern Iberia an elevated value (30%) is observed, but the frequency of subhaplotype Vb is only 1.8% in northern Iberia. These frequencies are less than 1% in France and Italy.

Figure 2 shows the isofrequency map of subhaplotype Vb in the western Mediterranean area (coordinates on the map: x = longitude, y = latitude). From the Berber focus in Berbers from southern Morocco, the frequencies of subhaplotype Vb decrease in North Africa to the north of Morocco and to the east in Algeria and Tunisia. For Iberia the most elevated value of subhaplotype Vb frequencies is in southern Portugal; relatively elevated values are observed in Andalusia, moderate values are observed in the southern part of Spain, and low values are seen in Catalonia.

In the present study all haplotype V non-subhaplotype Vb subjects are termed subhaplotype Va (Arab) subjects. Their maximum relative frequencies are 53.9% (Algeria), 50% (Tunisia), and 37.3% (Rabat) in North Africa. Table 3 summarizes the frequencies of subhaplotype Va in North Africa, Iberia, southern France, and Italy. The maximum value (45.8%) is found in North Africa. In northern Iberia a slightly more elevated value is observed (20%) compared to southern Iberia (14.6%). A frequency of 10.3% is seen in France, and in Italy the 14.6% value observed in the south is relatively more elevated than in the north (3.4%).

Figure 3 gives the isofrequency map of subhaplotype Va. In North Africa frequencies decrease from east to west and southward. For southern Europe the map shows the relatively higher percentages observed in the south of Italy versus the north and (to a lesser degree) in the north of Iberia versus the south.

In our PCR assay the 68 Moroccan subjects with subhaplotype Vb (47 Berbers and 21 Arabs) were tested for the M81 marker: All subjects were positive for the M81 marker, so subhaplotype Vb is homologous with subhaplogroup E3b2. The 38 Moroccan non-Berber subjects were further tested for the M78 marker: Only 31 of them (80.8%) were positive for the M78 marker; we conclude that, in Morocco at least, subhaplotype Va corresponds only partly to subhaplogroup E3b1.

P49a,f TaqI haplotype V, which is homologous with haplogroup E3b according to the Y Chromosome Consortium (2002) nomenclature, is the predominant Y-chromosome haplotype in North Africa (Lucotte et al. 2000), where its geographic distribution shows an east to west cline. In the present study we have extended the research of haplotype V frequencies (Lucotte et al. 2001) in various European populations located in the western Mediterranean basin to include France, Portugal, and Italy. The frequency of haplotype V in the Marseilles region is 11.1%, a value similar to the mean value we obtained previously for continental France (Lucotte and Hazout 1996). In continental Italy we observed the highest haplotype V frequency in Naples (17.2%); Sicily, with a frequency of 28.2%, corresponds to the most elevated value we observed for Italy. In South Portugal the frequency of haplotype V is very high (49.2%); we had previously obtained a similar value for Libya and for Mauritania. The frequency of haplotype V for North Portugal (25.3%) is similar to the value we obtained for Sicily in the present study.

To better divide haplotype V into its ethnic components, we have subdivided it into subhaplotypes Vb (Berber) and Va (Arab). We have established that subhaplotype Vb is the Berber haplotype, because it is present at very elevated frequencies (63.5%) in our Berber population from Morocco but at relatively low frequencies (20.6%) in our non-Berber population of Rabat. Such a distinction of a Berber component was also realized by Scozzari et al. (2001), because they observed that the haplogroup they named 25.2 was also more frequent in the Berber population from Morocco than in Arabs. Our present results show that subhaplotype Vb frequencies in North Africa decrease from west to east, starting from the Berber focus in Morocco; in the western Mediterranean area subhaplotype Vb is at low frequencies along the south coast of Europe but occurs at relatively elevated frequencies in southern Iberia (peaking at 35.9% in South Portugal). Flores et al. (2004), in their important study of various locations in Iberia, observed that subhaplogroup E3b2 is more frequent in southern Iberia, attaining a maximum value of 11.5% in the region of Málaga.

In the present study all the non-subhaplotype Vb subjects bearing haplotype V are classified as subhaplotype Va (Arab); they probably correspond to a heterogeneous group representing various ethnicities (our results concerning the incomplete correspondence between subhaplotypes Va and E3b1 in Morocco suggest that). We have shown here that in North Africa the focus of subhaplotype Va frequencies is in Algeria (53.9%) and Tunisia (50.6%); from this focus frequencies of subhaplotype Va decrease in the south and the west of the region.
Subhaplotype Va attains substantial frequencies along the southern coast of Europe; these frequencies reached relatively elevated frequencies in France (Perpignan, 11.8%) and in southern Italy (Naples, 16.4%; Sicily, 23.1%). For Iberia, relatively more elevated values are attained for Andalusia (15.5%) and for North Portugal (22.8%). Brion et al. (2004) also showed relatively higher frequencies of haplogroup E* (xE3a) (up to 18.3%) in their study concerning northern Iberia.

We had previously established (Lucotte et al. 2001) that haplotype V showed a gradient of decreasing frequencies with latitude in Iberia, and we interpreted this pattern as a consequence of the historical Islamic occupation of the peninsula (Conrad 1998). The results reported in the present study concerning subhaplotypes Vb and Va (subhaplotype isofrequencies maps given in Figures 2 and 3) have again shown both of these gradients. From this perspective, the opposite pattern of gradient frequencies observed in Iberia for the western European haplotype XV (Diéterlen and Lucotte 2005) is reconciled with the slow reconquest of the Iberian peninsula from the north by the Christians, which lasted seven centuries and ended in Granada in 1492.

Examination of the above, with the assistance of references to older Lucotte et al. studies:

With regard to Lucotte et al.'s earlier reference to RFLP haplotype V as "Arab", Keita is right about the "Arabic" label being misleading, but in fact, if one reads Lucotte et al.'s later work, it is clear that they associate this haplotype with North Africans. Lucotte et al. refer to North Egyptians as Egyptian "Arabs", and make reference to groups in other parts of North Africa as "Arabs" as well. So, in actuality, Lucotte et al. were associating haplotype V with what they perceived as "Arabized" north Africans. And so, as one can see, they referred to haplotype V as "Arab" and "Berberian", and made note of the fact that the Falasha had a high frequency of "haplotype V and XI", which attests to their African provenance.

Keita associates haplotypes V and XI with African origin, but so do Lucotte et al. Keita associates V and XI [barring his reference to other contexts used by other researchers] with M35/215, but if Lucotte et al. associate these with "North Africans" and Ethiopian Jews, and proclaim that they are of African provenance, they too must be associating them with M35/215. M81 is the predominant "Berber" variant of M35. So the question is, if haplotype V is predominantly "Berber" and associated with "Berber origin", and haplotype XI is noted to have high frequencies in Eastern Africa, decreasing as one moves west across the African continent, then what are haplotype V and haplotype XI, as presented by Lucotte et al.?...In the meantime, from Keita's publication:

Some TaqI 49 a,f variants have multiple associations; for example VIII is affiliated with several lineages (Al-Zahery, 2003). So far research indicates that haplotype V in Africa is associated with the M35/215 (or 215/M35) subclade, **as is XI**, and IV with the M2/PN1/M180 lineage, both of the YAP/M145/M213 cluster. These lineages that in Africa that affiliate with haplotypes V, XI, and IV (called “sub-Saharan”), are joined by a transition mutation: “(M)ost notably the PN2 transition…unites two high frequency subclades, defined by M2/PN1/M180 mutations in sub-Saharan Africa, and M35/215 in north and east Africa…” (Underhill et al., 2001, p.50; see also Cruciani et al., 2002).

Hence, Lucotte et al.'s 'Arab' and 'Berber' appellations to V haplotypes in north Africa, appear to be what they deem 'Arabic' speaking North Africans and 'Berber' speaking North Africans. On another note, it also appears that haplotype XI is also affiliated with E-M35 in the Lucotte et al. data Keita used. To be certain about any of this, one might as well examine primary texts from Lucotte et al. themselves:

Y-chromosome DNA haplotypes in North African populations

The frequency distribution of Y-chromosome haplotypes at DNA polymorphism p49/TaqI was studied in a sample of 505 North Africans from Mauritania, Morocco, Algeria, Tunisia, Libya, and Egypt. A particularly high frequency (55.0%) of Y-haplotype 5 (A2,C0,D0,F1,I1) was observed in these populations, with a relative predominance in those of Berber origin. Examination of the relative frequencies of other haplotypes in these populations, mainly haplotype 4 (the "African" haplotype), haplotype 15 (the "European" haplotype), and haplotypes 7 and 8 (the "Near-East" haplotypes), permit useful comparisons with neighboring peoples living in sub-Saharan Africa, Europe, and the Near East.

The highest frequency of haplotype 5 (68.9%) was previously observed in Berbers from Morocco, and it has been established that this haplotype is a characteristic Berber haplotype in North Africa....

Haplotype 5 (A2, C0, D0, F1, I1) has a particularly high frequency (55%) in North Africa (Lucotte et al. 2000), and is of predominantly Berber origin. — Lucotte et al.; North African genes in Iberia studied by Y-chromosome DNA haplotype 5; 2001.

At least based on this, with caution, it is strongly suggestive of M81 derivatives. The authors note that it is supposed to be a characteristic Berber haplotype in North Africa, and is of predominantly Berber origin. However, we also know that the Lucotte et al. data cited by Keita also show V haplotypes in Egypt, along with XI haplotypes. Haplotype V in Egypt has a gradient that increases as one moves from south to north, while that of XI is the opposite, with a gradient increasing as one moves from north to south. So in the Egyptian context, does this mean that V is still suggestive of E-M81 chromosomes? Who knows; but M81 is certainly attested to in Egypt. What about XI? Could that be suggestive of an M78 derivative? Plausible, given its high frequencies in sub-Saharan Africa, particularly east Africa. One has to ascertain this plausibility. To be certain, one would have to be familiar with the specific binary markers that Lucotte et al. would have searched for [usually done once a restriction digest, using a restriction enzyme, is undertaken to cut DNA into fragments] and amplified [PCR] for haplotype V in the 2001 study above and their 2003 study cited by Keita, to see if they continued to be in the same exact contexts or if variant binary markers were used in the respective studies. As noted in the linked discussion, in Lucotte et al.'s case, this doesn't appear to be the case. However, the authors of the current head topic have addressed this issue, reassuring us, with relatively more precision, about what specific V haplogroups were involved in Lucotte et al.'s several studies.

To be continued.

Link to part 2:
RFLPs: Lucotte et al., A case study — Pt. 2


Anonymous said...

In an earlier writing, perhaps (Studies of ancient crania from North Africa, etc.. ) Keita notes that the terms "Berber" and "black" are not mutually exclusive in the broad sense of physical variation, and references the Taita (I think) along with some others as an illustration. What is the general genetic makeup of the peoples typically dubbed "Berber"? Can they be conveniently plugged into a "white" checkbox or is there a range of DNA variation within that population? I wonder if the so called "Berber" haplotype noted in the post can be attributed exclusively to any one checkbox. How does the M78 and 81 tie-in?

Mystery Solver said...

anonymous writes:

In an earlier writing, perhaps (Studies of ancient crania from North Africa, etc.. ) Keita notes that the terms "Berber" and "black" are not mutually exclusive in the broad sense of physical variation, and references the Taita (I think) along with some others as an illustration. What is the general genetic makeup of the peoples typically dubbed "Berber"?

The folks so designated, i.e. "Berber", are not a monolithic group by any stretch; what ties them together, primarily, is that their languages are subphyla of a common ancestral language, which in turn was ultimately a subphylum of the proto-Afrasan (otherwise also called proto-Afro-Asiatic) ancestral language. "Berber" was a name that stuck to coastal north African Amazighan speakers, and became something of a standardized reference to these people in European vocabulary in the "Medieval" era thanks to Arab reference to that region as al-Barbar [I'll repost more elaborate material on this matter under its own heading]. It is now generally used in Eurocentric scholarship mainly as a linguistic construct. The other thing that seems to tie in these groups, though relatively more loosely than the language connection, is recurring uniparental markers that are suggestive of ultimate descent from a common recent ancestor in a proto-Amazighan speaking population; from the paternal side, this is primarily the E-M81 marker, closely followed by various clusters of the E-M78 marker. On the maternal end, an interesting pattern is observed: a clinal distribution along geographic lines finds expression, with recent European ancestry substantially represented along the north coast regions, its frequency progressively fading as one proceeds further into the continent, first through the Sahara to the Sahel, and then on to sub-Saharan Africa. By the same token, the more traditional/typical African markers are more considerably represented in Sahelian and Saharan Amazighan speakers. This maternal distribution pattern seems to parallel phenotypic trends like that of skin pigmentation variation, with darker skinned Amazighan speakers predominant in the Saharan and Sahelian areas, while the coastal north regions notably sport considerable segments of populations, though not exclusively, of lighter toned Amazighan and Arabized Amazighan speakers.

anonymous writes:

Can they be conveniently plugged into a "white" checkbox or is there a range of DNA variation within that population?

Well, that would be left to the discretion of the person checking the checkbox; Amazighan (Berber) speakers sport a noticeable degree of variation, from tawny looking groups to noticeably dark hued groups, with intermediary grades in between the extremes of this phenotypic manifestation amongst Amazighan speakers.

anonymous writes:

I wonder if the so called "Berber" haplotype noted in the post can be attributed exclusively to any one checkbox. How does the M78 and 81 tie-in?

For the reasons just pointed out, that would be inaccurate. And as already pointed out, while there are certain recurring markers that have come to typify Amazighan speakers notwithstanding clinal variations in lineage distribution patterns [e.g. E-M81 paternally, followed by E-M78 clusters, and on the maternal end...recurring U6 and M1 markers, for example, in the north coasts where European maternal contribution has been substantial and in the Sahel where typical African maternal contribution has been more substantial], there are other ancestries amongst Amazighan/Berber speakers, as in any other population; hence they cannot be boxed into a single checkbox for a lineage.