
Introduction to historical population genetics

Origins, spread and ethnic association of European haplogroups and subclades

DNA studies have made it possible to categorise all humans on Earth into genealogical groups, each sharing one common ancestor at a given point in prehistory. These groups are called haplogroups. There are two kinds of haplogroups: the paternally inherited Y-chromosome DNA (Y-DNA) haplogroups, and the maternally inherited mitochondrial DNA (mtDNA) haplogroups. They indicate the agnatic (or patrilineal) and cognatic (or matrilineal) ancestry respectively.

Y-DNA haplogroups are useful for determining whether two apparently unrelated individuals sharing the same surname do indeed descend from a common ancestor in the not too distant past (3 to 20 generations). This is achieved by comparing their haplotypes through STR markers. Deep SNP testing makes it possible to go back much farther in time, and to identify the ancient ethnic group to which one's ancestors belonged (e.g. Celtic, Germanic, Slavic, Greco-Roman, Basque, Iberian, Phoenician, Jewish, etc.).
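As a rough illustration of what comparing haplotypes through STR markers can look like, here is a minimal Python sketch. It assumes a simple stepwise measure (the sum of absolute differences in repeat counts), which is only one of several possible ways to score a comparison. The marker names are real Y-STR loci, but the repeat values and the function name `genetic_distance` are invented for this example.

```python
# Hypothetical STR haplotypes: marker name -> number of repeats.
# The values below are invented for illustration only.
haplotype_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS388": 12}
haplotype_b = {"DYS393": 13, "DYS390": 23, "DYS19": 14, "DYS391": 10, "DYS388": 12}


def genetic_distance(h1, h2):
    """Sum of absolute repeat-count differences over the markers both men were tested for."""
    shared_markers = h1.keys() & h2.keys()
    return sum(abs(h1[marker] - h2[marker]) for marker in shared_markers)


# The two haplotypes differ by one repeat at DYS390 and one repeat at DYS391.
print(genetic_distance(haplotype_a, haplotype_b))  # 2
```

A smaller distance over a larger number of markers points to a more recent common ancestor; what counts as "close enough" depends on the number of markers tested and is not specified here.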

In Europe, mtDNA haplogroups are quite evenly spread over the continent, and therefore cannot be associated easily with ancient ethnicities. However, they can sometimes reveal potential medical conditions (see diseases associated with mtDNA mutations). Some mtDNA subclades are associated with Jewish ancestry, notably K1a1b1a, K1a9, K2a2a and N1b.

The study of Y-chromosomes is far more interesting than that of mitochondrial DNA for two reasons.

1.      Firstly, the Y chromosome is a sequence of 60 million "characters" (nucleobases), against only 16,569 for mtDNA. The Y chromosome therefore offers a much greater resolution as mutations are more common, and indeed happen every generation. In contrast, mtDNA mutations happen much more infrequently. Since the time of the Mitochondrial Eve, approximately 200,000 years ago, modern humans have acquired on average 20 mtDNA mutations in each lineage - about one every ten thousand years. Even though the number of mutations has risen with the soaring of the human population over the last 10,000 years, the dating of lineages based on mtDNA alone remains very approximate, and practically useless for historical times. By sequencing the full Y chromosome, it is theoretically possible to map the entire patrilineal genealogy of humanity (or any other species) to within a few generations (or even within one generation). This is a colossal task, and an expensive one too, since full chromosome sequencing (reading every nucleobase one by one) remains very expensive compared to SNP genotyping (checking only for mutations already discovered in other individuals). The arrival of full Y-chromosome sequencing (or even whole genome sequencing) on the market has made it possible to achieve an optimal resolution, but the price of these tests remains well above that of the standard commercial tests. This restricts their overall reach, and the most common haplogroups in rich countries are at present much better studied than the others.

2.      The second advantage of Y-DNA over mtDNA is that men have traditionally been less mobile than women (except during military invasions, such as those of the Indo-Europeans, the Vikings or the Arabs). In almost every settled, agricultural society, men were the ones who inherited their parents' property, and therefore remained in the same location generation after generation. Women, on the other hand, were often sent away to marry in another village or town, so that their lineages spread more evenly over time, thus progressively erasing the traces of ancient settlement patterns.
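The figures quoted in the first point can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text (60 million nucleobases, 16,569 nucleobases, 200,000 years and 20 mutations); the variable names are chosen here for illustration.

```python
# Figures taken from the text above.
Y_CHROMOSOME_LENGTH = 60_000_000   # nucleobases in the Y chromosome
MTDNA_LENGTH = 16_569              # nucleobases in mitochondrial DNA
YEARS_SINCE_MT_EVE = 200_000       # approximate age of the Mitochondrial Eve
MUTATIONS_PER_LINEAGE = 20         # average mtDNA mutations acquired per lineage

# Average time between two mtDNA mutations in a single lineage.
years_per_mutation = YEARS_SINCE_MT_EVE / MUTATIONS_PER_LINEAGE

# How many times longer the Y chromosome is than the mtDNA sequence.
length_ratio = Y_CHROMOSOME_LENGTH / MTDNA_LENGTH

print(years_per_mutation)   # 10000.0
print(round(length_ratio))  # 3621
```

With one mutation every ten thousand years on average, two maternal lineages that split within historical times will usually show no difference at all, which is why mtDNA dating is so coarse.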


Paternal and maternal haplogroups in prehistoric Europe

Mesolithic Europe

Following the end of the last Ice Age approximately 12,000 years ago, European hunter-gatherers recolonised the continent from the Ice Age refugia in southern Europe. The vast majority of Mesolithic Europeans would have belonged to Y-haplogroup I. This included I* (the * means that no further subclade was identified), pre-I1, I1, I2*, I2a*, I2a2 and I2c, but the most widespread appears to have been I2a1, which was found in most parts of Europe. Northeast Europeans would have belonged mostly to haplogroup R1a, and to a lesser extent also I2a2 and R1b. Other minor male lineages were certainly also present in parts of Europe, notably haplogroups A1a, C-V20, and possibly even Q1a.

The maternal lineages of Mesolithic Europeans appear to have been predominantly U4 and U5, but also included several H subclades (H1, H3, H17), T, U2 (U2d and U2e) and V. The presence of mt-haplogroups I and W in Eastern Europe or the North Caucasus is possible but hasn't been confirmed yet.

Based on their modern distributions, mtDNA haplogroups H10 and H11 might well have Mesolithic/Palaeolithic European origins.

There seem to have been several Palaeolithic and/or Mesolithic migrations from Northwest Africa to Iberia. The oldest might have brought West African paternal haplogroup A1a to Western and Northern Europe during the Palaeolithic. A1a has been found in modern populations as far north as Ireland, Scotland, Scandinavia and Finland. The presence of African maternal lineages (L2, L3 and possibly L1b1) has been attested in Neolithic Iberia. Northwest Africans would also have brought U6 and possibly HV0/V lineages to Europe.

A small percentage of sub-Saharan African admixture has been identified in Late Mesolithic Swedes from the Pitted Ware culture (2800-2000 BCE), which would imply that A1a was already present in northern Europe at the time. Another Mesolithic sample from Loschbour in Luxembourg had dark hair and considerably darker skin than modern Europeans.


Neolithic and Chalcolithic Europe

Agriculture first developed in the Levant, then spread to Anatolia, Greece, the Balkans, Italy, Central and Eastern Europe. These Neolithic farmers were confirmed to have belonged primarily to Y-DNA haplogroup G2a, but also included minorities of C1a2, E1b1b, H2 (formerly F3), J1, J2 and T1a lineages, who could have been assimilated in Anatolia before entering Europe. As they advanced across Europe, Neolithic farmers also increasingly assimilated European lineages, notably I2a1 in Southeast Europe, I1 and I2a1 in Central Europe, I2a1 and I2a2a in Western Europe, and E-M78, I2a1 and I2a2a in Southwest Europe.

Hundreds of Neolithic samples from all over Europe (but especially Central Europe and Iberia) have been tested. The new lineages brought by these Near Eastern immigrants included mt-haplogroups HV, J1, J2, K1, K2, N*, N1, T1a, T2b, T2c, T2e, T2f, U3, W, X1, X2, and many subclades of H (including H2, H5, H7, H13 and H20). H4, H8 and H9 seem to have originated in the Near East as well, although no Neolithic sample has been identified in Europe yet.

However, due to the proximity of the Caucasus to the Indo-European homeland, many of these mt-haplogroups were almost certainly also transported by the Indo-Europeans themselves. This would notably be the case of H5, K1a, T2b, U3, W and X2.

The Bronze Age and the Indo-European migrations

The origin of the Indo-European peoples is a subject that has caused much ink to flow among archaeologists and historians. Their Urheimat (original homeland) has been speculated to lie in Anatolia, around the Caucasus, in Iran, in India, in Central Asia, in Russia, or even in Scandinavia. Thanks to palaeogenetics we now know that these people expanded during the Late Copper and Early Bronze Age from the Pontic Steppe, to the north of the Black Sea and the Caucasus. There seem to have been two distinct, though closely related, groups of tribes speaking the Proto-Indo-European language, from which descend almost all the European languages today (apart from Basque, Hungarian, Estonian, Finnish and Sami) as well as Armenian, Kurdish, Persian and most North Indian languages. Tribes belonging mainly to the paternal haplogroup R1a reportedly occupied the north of the steppe (forest-steppe and tundra), while in the south (open steppe) were nomadic cow herders belonging mainly to haplogroup R1b.

Their migration both westward to Europe and eastward to Central and South Asia makes it easy to infer which mtDNA haplogroups they carried (=> see also Identifying the original Indo-European mtDNA from isolated settlements). The best matches for R1a are C4a, H1b, H1c, H2a1, H6, H11, K1b1b, K1c, K2b, T1a1a1, T2a1b1, T2b2, T2b4, U2e, U4, U5a1a, W, and several I subclades.

The R1b branch would have originated in eastern Anatolia and/or northern Mesopotamia/Syria during the Early Neolithic period, where its members probably domesticated cattle and became primarily cattle herders. They would then have migrated to the western part of the Iranian plateau and crossed the Caucasus to the Pontic Steppe in search of pasture for their cattle, where they mixed to some extent with the I2a2 and R1a tribes that inhabited those lands. The maternal lineages of these Near Eastern R1b people would have included haplogroups H5a, H6, H8, H15, I1a1, J1b1a, K1a3, K2a6, U5, and some V subclades (like V15).

MtDNA haplogroup H4 has not been found in Europe before the Late Chalcolithic (Corded Ware culture) and the Early Bronze Age (Unetice culture) and might have been brought by the Indo-Europeans. Likewise, H6 is absent from all Mesolithic or Neolithic samples, and its strong presence in the North Caucasus and Central Asia supports an Indo-European connection.


Y-DNA Haplogroups

Chronological development of Y-DNA haplogroups

·         C => 66,000 years ago (in East Africa)

·         E => 62,500 years ago (in Africa)

·         G => 48,000 years ago (in the Middle East)

·         K => 46,000 years ago (between the Caucasus and India)

·         I => 43,000 years ago (around the Black Sea)

·         J => 43,000 years ago (in the Middle East or the Caucasus)

·         T => 42,000 years ago (around the Iranian Plateau)

·         C1a2 => 41,500 years ago (in the Middle East)

·         E1b1b => 35,000 years ago (in Northeast Africa)

·         Q & R => 32,000 years ago (in Central Asia or Siberia)

·         J1 => 31,000 years ago (in the Caucasus or Zagros mountains)

·         J2 => 31,000 years ago (in northern Mesopotamia or the Caucasus)

·         I1 & I2 => 27,500 years ago (in Europe)

·         T1a => 27,000 years ago (around the Iranian Plateau)

·         R1b => 23,000 years ago (around the Caspian Sea or in Russia)

·         R1a => 23,000 years ago (in Russia)

·         E-M78 => 20,000 years ago (in north-eastern Africa)

·         G2a => 20,000 years ago (in the Middle East)

·         I2a1a (M26) & I2a1b (M423) => 18,500 years ago (in southern Europe)

·         J2a1 => 18,500 years ago (in northern Mesopotamia or in the Caucasus)

·         E-M123 => 18,000 years ago (around the Red Sea or in the Levant)

·         I2a2a (M223) => 17,500 years ago (in southern Europe)

·         J2b1 & J2b2 => 16,000 years ago (around the Iranian Plateau or the Caucasus)

·         N1c1 => 15,500 years ago (in northern China)

·         E-M81 => 14,000 years ago (in North Africa)

·         R1b-M269 => 13,500 years ago (around the Caspian Sea)

·         I2a2b (L38) => 12,500 years ago (in central Europe)

·         J1-P58 => 11,500 years ago (in the Middle East or the Caucasus)

·         I2a2a-L801 => 9,500 years ago (in central or northern Europe)

·         R1a1a1 (M417) => 8,500 years ago (in Northeast Europe)

·         T1a-CTS2214 => 8,500 years ago (in the Middle East)

·         E-V13 => 7,500 years ago (in Central or Southeast Europe)

·         N1c1-L1026 => 6,500 years ago (in Northeast Europe)

·         Q1b1a-L245 => 6,500 years ago (in central Asia or in the Middle East)

·         R1b-L23 => 6,500 years ago (around the Caucasus)

·         J1-L858 => 5,500 years ago (in the Middle East)

·         R1b-U106 & R1b-P312 => 5,000 years ago (in Central Europe)

·         I1a (DF29) => 4,500 years ago (in Scandinavia)

·         Q1a2-Y4827 => 3,000 years ago (in Scandinavia)

Other haplogroups found in Europe

Haplogroup C (Y-DNA)

Haplogroup C is an extremely old lineage thought to have appeared before or soon after the first migration of Homo sapiens outside Africa, some 70,000 years ago. Men belonging to haplogroup C would have departed from East Africa during the Ice Age and followed the coasts of the Indian Ocean, settling in the Arabian peninsula, the Indian subcontinent, south-east Asia, north-east Asia and Oceania.

The first group to split away was C-Z1426, which colonised the Middle East and South Asia. One branch (CTS11043) might have moved north to Central Asia, then split into two: one tribe moving west to Europe (haplogroup C-V20) while the other migrated to East Asia and survives only in Japan today (haplogroup C-M8). Haplogroup C-V20 probably represents the first migration of Homo sapiens to Europe 45,000 years ago, and would therefore have been the first to come into contact with European Neanderthals, although Homo sapiens are likely to have interbred with Neanderthals in the Middle East before that.

The second branch of C-Z1426 spread around South Asia, Southwest Asia, and Central Asia, where it is found at low frequencies nowadays (haplogroup C-M356).

During that time, other C tribes continued their eastward migration to south-east Asia, where they split into four main regional clusters. The first branch colonised Indonesia, Melanesia, Micronesia, and Polynesia (haplogroup C2-M38). A second branch would have gone south to Australia, where they became the Aborigines (haplogroup C4-M347). Another settled in the highlands of New Guinea (haplogroup C-P55). The fourth branch went all the way up to north-east Asia (haplogroup C3-M217) and is found nowadays chiefly among the Mongols, tribes descended from the Mongols (Kalmyks, Hazaras) including Turkic people (Kazakhs, Kyrgyz, Uyghurs, Uzbeks, Tuvans, Yakuts), East Siberian tribes (Buryats, Chukchi, Itelmens, Nivkh, Tungusic peoples), Chinese (Han, Hui, Manchus, Oroqens, Tujia), Koreans and Japanese (especially the Ainus), but also among several indigenous peoples of North America, including some Na-Dené-, Algonquian-, or Siouan-speaking populations.

Haplogroup C is a very rare lineage in Europe. The few Europeans who belong to C belong either to the European C-V20, the Middle Eastern C-M356, or the Mongolian C3-M217. Haplogroup C3 has also been identified in one Hunnic skeleton from the Iron Age in present-day Mongolia. Its presence in Europe can therefore be linked to the Hunnic and Mongolian invasions, like haplogroup Q1a.

 

Haplogroup L (Y-DNA)

Geographic distribution

Haplogroup L is found mostly in West Asia and South Asia. Its overall frequency ranges between 5 and 15% in Pakistan and western India, with a peak of 23% among the Kalash of northwest Pakistan, and from 1 to 10% in central Asia (mostly in Uzbekistan, Tajikistan and Afghanistan). It is also found in the Middle East (5% in Lebanon, 4.5% in Turkish Kurdistan, 4% in Iran, 3% in Syria), in parts of the Caucasus (7% in Azerbaijan and Chechnya, 3% in Armenia and Ingushetia), and in isolated parts of Europe (3.5% in north-east Italy, from 0.2% to 1% in the Balkans and Greece, 0.5% in Flanders).

Subclades

Haplogroup L is divided into four main subclades:

·         L1a (M27) is mostly found in India and Sri Lanka, with frequencies decreasing towards Pakistan, southern Iran and the Arabian peninsula. It has also been found in Piedmont (Italy), Rhineland (Germany) and Flanders (Belgium).

·         L1b (M317) is found chiefly in the South Caucasus, eastern Anatolia and Lebanon. It has also been found in South Tyrol, Russia and Central Asia. Its main subclade L1b1 (M349) has been found in Italy, Switzerland, Austria, Germany, Belgium, England, northern Ireland, and scattered around most of central and eastern Europe and the eastern Mediterranean. The presence of L1b and L1b1 in Europe probably dates back to the Neolithic period.

·         L1c (M357) is an essentially Gedrosian subclade, found among the Burushos, Kalashs (L1c1-PK3 subclade), and Pashtuns of Pakistan and Afghanistan, but also among the Chechens in the north-east Caucasus. It is also found at low frequencies in other populations of Pakistan, in India, northern Iran, Georgia and Ingushetia. In Europe it has been found in Sicily.

·         At present L2 (L595) has been found exclusively in Europe (Greece, Italy, southern Germany, Russia) and in the South Caucasus.

Haplogroup H (Y-DNA)

Haplogroup H is typically found among Dravidian populations in the Indian subcontinent, especially in South India and Sri Lanka. In Europe it is found almost exclusively among the Gypsies (Romani), who belong predominantly (between 15% and 50%) to the H1a (M82) subclade of Indian origin. The highest frequencies of haplogroup H among non-Romani Europeans are found in regions with large Romani populations, such as Romania, Slovakia, the southern Balkans, and Andalusia, suggesting that these lineages are also of Romani origin.

Haplogroup H2 (P96, known as F3 until 2013) was a minor lineage of early Neolithic farmers in the Levant, Anatolia and Europe. It is still found at very low frequencies in western Europe, Armenia, Iran and India.

Haplogroup A (Y-DNA)

A is the oldest of all Y-DNA haplogroups. It originated in sub-Saharan Africa over 140,000 years ago, and possibly as much as 340,000 years ago if we include haplogroup A00. Modern populations with the highest percentages of haplogroup A are the Khoisan (such as the Bushmen) and the southern Sudanese.

There are only rare and isolated cases of European men belonging to haplogroup A. Commercial tests have identified a few Scottish and Irish families (surnames Boyd, Logan and Taylor) all belonging to the same A1b1b2 (M13) subclade. This subclade is normally found in East Africa (Ethiopia, Sudan), but has also been found in Egypt, the Arabian peninsula, Palestine, Jordan, Turkey, Sicily, Sardinia and Algeria. It was certainly brought to Europe by Levantine people, be it during the Neolithic or later (Phoenicians, Jews, immigration within the Roman Empire).

Haplogroup A1a* (M31) has been found in Finland, Norway and eastern England. This subclade is normally found along the west coast of Africa (Guinea-Bissau, Cape Verde, Mali, Morocco) and could have come to Europe during the Paleolithic. Indeed a few percent of sub-Saharan admixture was found among ancient DNA samples from Mesolithic Scandinavia tested by Skoglund et al. (2012).

MtDNA Haplogroups

All mtDNA haplogroups found in Europe descend from the N group, which is thought to represent one of the two initial migrations by modern humans out of Africa, some 60,000 to 80,000 years ago. Nowadays haplogroup N is only found at extremely low frequencies in various parts of Eurasia.

Unfortunately, the tiny size of mitochondrial DNA (approximately 16,500 base pairs as opposed to 60 million for Y-DNA) does not allow a very accurate tracing of ancestry. Basal mitochondrial haplogroups all arose during the Ice Age, a period when humans were nomadic hunter-gatherers, well before the establishment of cities and civilizations. Even deep subclades generally point to a common Neolithic or Bronze Age ancestry, but rarely later than that, and do not necessarily match any recognisable historical ethnic and linguistic groups. One likely reason is that women, through whom mtDNA is passed, tended to marry outside their ethnic group more often than men (e.g. to secure an alliance between two tribes or kingdoms). Haplogroups associated with European or Middle Eastern descent are H, I, J, K, T, U, V, W and X (except the branch X2a, which is found among Native Americans).

 

Chronological development of mtDNA haplogroups

Note that the age of mitochondrial haplogroups is much more difficult to estimate than that of Y-DNA haplogroups, due to the tiny sequence of mtDNA and the small number of mutations available. The error margin for the dates below is typically ±5,000 years, but could even exceed that for older haplogroups.

·         N => 75,000 years ago (arose in North-East Africa)

·         R => 70,000 years ago (in South-West Asia)

·         U => 60,000 years ago (in North-East Africa or South-West Asia)

·         pre-JT => 55,000 years ago (in the Middle East)

·         JT => 50,000 years ago (in the Middle East)

·         U5 => 50,000 years ago (in Western Asia)

·         U6 => 50,000 years ago (in North Africa)

·         U8 => 50,000 years ago (in Western Asia)

·         pre-HV => 50,000 years ago (in the Near East)

·         J => 45,000 years ago (in the Near East or Caucasus)

·         HV => 40,000 years ago (in the Near East)

·         H => over 35,000 years ago (in the Near East or Southern Europe)

·         X => over 30,000 years ago (in north-east Europe)

·         U5a1 => 30,000 years ago (in Europe)

·         I => 30,000 years ago (Caucasus or north-east Europe)

·         J1a => 27,000 years ago (in the Near East)

·         W => 25,000 years ago (in north-east Europe or north-west Asia)

·         U4 => 25,000 years ago (in Central Asia)

·         J1b => 23,000 years ago (in the Near East)

·         T => 17,000 years ago (in Mesopotamia)

·         K => 16,000 years ago (in the Near East)

·         V => 15,000 years ago (arose in Iberia and moved to Scandinavia)

·         H1b => 13,000 years ago (in Europe)

·         K1 => 12,000 years ago (in the Near East)

·         H3 => 10,000 years ago (in Western Europe)
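To see what a ±5,000-year margin means in practice, the following Python sketch turns a few of the dates listed above into ranges and checks whether two ranges overlap. The dates come from the list; the helper names `age_range` and `ranges_overlap` are made up for this example, and a fixed symmetric margin is a simplifying assumption.

```python
MARGIN = 5_000  # typical error margin in years, as stated in the note above

# A few estimated ages (years ago) taken from the list above.
ages = {"N": 75_000, "U": 60_000, "T": 17_000, "K": 16_000}


def age_range(age, margin=MARGIN):
    """Return the (youngest, oldest) plausible age for a haplogroup."""
    return (age - margin, age + margin)


def ranges_overlap(age_1, age_2, margin=MARGIN):
    """True if the two error ranges share at least one year."""
    low_1, high_1 = age_range(age_1, margin)
    low_2, high_2 = age_range(age_2, margin)
    return low_1 <= high_2 and low_2 <= high_1


print(age_range(ages["T"]))                  # (12000, 22000)
print(ranges_overlap(ages["T"], ages["K"]))  # True
print(ranges_overlap(ages["N"], ages["U"]))  # False
```

When two ranges overlap, as for T and K, the relative order of the two haplogroups in the list cannot be taken as certain on the basis of these estimates alone.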


The testing of ancient DNA helps us understand how long each haplogroup has been in Europe. Dozens of samples from the Paleolithic and Mesolithic, and hundreds from the Neolithic, Chalcolithic and Bronze Age have already been tested. You can check this non-exhaustive list of Prehistoric European mtDNA by period and culture.


European mtDNA haplogroups and their subclades

Haplogroups H & V (mtDNA)

​

Haplogroup H is by far the most common all over Europe, amounting to about 40% of the European population. It is also found (though at lower frequencies) in North Africa, the Middle East, Central Asia, Northern Asia, as well as along the East coast of Africa as far as Madagascar.

H1, H3 and V are the most common subclades of HV in Western Europe. H1 peaks in Norway (30% of the population) and Iberia (18 to 25%), and is also high among the Sardinians, Finns and Estonians (16%), as well as Western and Central Europeans in general (10 to 12%) and North-West Africans (10 to 20%). H3 is commonest in Portugal (12%), Sardinia (11%), Galicia (10%), the Basque country (10%), Ireland (6%), Norway (6%), Hungary (6%) and southwestern France (5%). Haplogroup V reaches its highest frequency in northern Scandinavia (40% of the Sami), northern Spain, the Netherlands (8%), Sardinia, the Croatian islands and the Maghreb. It is likely that H1, H3 and V, along with haplogroup U5, were the main haplogroups of Western European hunter-gatherers living in the Franco-Cantabrian refuge during the last Ice Age, and repopulated much of Central and Northern Europe from 15,000 years ago.

Haplogroup H13 is most common in Sardinia and around the Caucasus. Its distribution is reminiscent of Y-DNA haplogroup G2a. The same is true of H2 to a lesser extent. This would suggest a Caucasian or Anatolian origin.

H5 and H7 are also common in the Caucasus, but their lower incidence around the Mediterranean, and higher frequency from Anatolia to the Alps via the Danube suggest a possible link with the spread of agriculture (Y-DNA G2a, etc.) or of the Indo-Europeans (R1b-L23).

Main articles : Haplogroup HV (mtDNA), Haplogroup H (mtDNA) and Haplogroup V (mtDNA)

​

Haplogroups U & K (mtDNA)

​

Haplogroup U is extremely old. It originated some 60,000 years ago at the junction of North-East Africa and the Middle East, soon after the first Homo sapiens ventured out of Africa. This is why each of its top-level subclades (U1, U2, U3...) can be seen as a haplogroup in its own right. The main European subclades are U3, U4, U5 and U8/K. U1 is mostly found in the Middle East, U6 in North Africa, U7 from the Near East to India, and the rare U9 from Ethiopia and the Arabian peninsula to Pakistan.

Main article : Haplogroup U6 (mtDNA)

Haplogroup U2 is found primarily in South Asia, but probably is of Indo-European origin as it is found at low frequencies throughout the Pontic-Caspian steppe and has been identified in a 30,000 year-old Cro-Magnon from the middle Don valley in Russia. It might have been the dominant haplogroup of the northern forest-steppe foragers who later became the Proto-Indo-Iranian speakers (see R1a above) and moved massively to Central and South Asia.

Main article : Haplogroup U2 (mtDNA)

Haplogroup U3 is centered around the Black Sea, with a particularly strong concentration in the north-eastern part. It appears to be most strongly related to Y-haplogroups J1 and J2.

Main article : Haplogroup U3 (mtDNA)

Haplogroup U4 is strongly associated with Y-haplogroup R1a. It is found in most of Europe, especially in Balto-Slavic countries, as well as in Siberia, Central Asia, Afghanistan and northern Pakistan. U4 was already present in many parts of Europe (Russia, Sweden, Germany, Portugal) during the Mesolithic period, but seems to have almost disappeared from central Europe during the Neolithic, before being re-introduced by the Proto-Indo-European speakers from Russia and Ukraine during the Bronze Age.

Main article : Haplogroup U4 (mtDNA)

Haplogroup U5 is the most common in Western and Northern Europe. DNA tests on ancient skeletons have shown that U5 was the principal mitochondrial haplogroup of Paleolithic and Mesolithic hunter-gatherers in Northern Europe. Ancient DNA tests conducted in Britain, Germany and Scandinavia indicate that the frequency of U5 has progressively declined over time through the Neolithic, Bronze Age, Iron Age and Middle Ages. Nowadays it remains most common in the far north of Europe, where the Mesolithic population has been least affected by subsequent migrations. For instance, 30 to 50% of the Sami people of northern Scandinavia belong to haplogroup U5b (and about 40% to haplogroup V, which is also of pre-Neolithic European origin).

Main article : Haplogroup U5 (mtDNA)

Haplogroup K is the main subclade of U8. It is found throughout Europe and Western Asia, as far away as India. Its highest concentration is in North-West and Central Europe, Anatolia and the southern Arabian peninsula. It is believed to have first arisen somewhere between Egypt and Anatolia approximately 16,000 years ago (estimates range from 22,000 years to as little as 10,000 years before present). It has the largest number of subclades of any haplogroup in spite of its fairly recent age. K1a is the largest subclade. The relatively important presence of K1a in the Near East suggests that it predates the Neolithic migration to Europe. This has been supported by the ancient mtDNA from Neolithic sites. Haplogroup K was never found in Europe prior to the Neolithic, then suddenly appears at a frequency (17%) much higher than in modern Europeans and similar to that of the present-day Levant. Most of the Neolithic K belongs to the K1a subclade.

Most K1a4, K1a10, K1b, K1c and K2 subclades are typically European. K1a4 is also common in Anatolia and Greece, and could indeed have spread to the rest of Europe from there during the Neolithic period, along with haplogroups J and T (and Y-DNA haplogroups E1b1b, J2 and T). The Indo-Europeans from Anatolia could also have contributed to the propagation of K. K1a1b1a and K1a9 are found primarily among Ashkenazi Jews.

Main article : Haplogroup K (mtDNA)

​

Haplogroups J & T (mtDNA)

​

Haplogroup J originated in the Middle East 45,000 years ago, making it one of the oldest mitochondrial haplogroups in Europe and the Middle East. Haplogroups J1c and J2a1 might have been present in Southeast Europe since the Epipaleolithic, then were probably diffused by Neolithic farmers across the rest of Europe. J2b1a, a mostly Near Eastern subclade, has been found in Neolithic samples in Europe alongside J1c.

Haplogroup J1b is found across the Near East, particularly between the Caucasus, Iran and Arabia. J1b1a is the only J1b subclade typically found among Europeans. It is present all over Europe as well as around the Caucasus, in Central Asia and the Altai, and was almost certainly spread by the R1b branch of the Indo-Europeans.

J1d, J2a2, J2b2 are essentially confined to the Middle East (+ North Africa for J2a2).

Main article : Haplogroup J (mtDNA)

Haplogroup T is thought to have originated in the Middle East about 30,000 years ago. It is found throughout Europe and the northern half of Africa, and through the Near East to Central Asia and Siberia, with pockets in India and North-West China (Xinjiang). Some T1 and T2 subclades are thought to have entered Europe during the Late Glacial and the immediate postglacial periods, but to have been dispersed around Europe mostly by later population movements, first with agriculturalists during the Neolithic, then with the Proto-Indo-European speakers from the Pontic Steppe during the Bronze Age.

Main article : Haplogroup T (mtDNA)

​

Haplogroup W (mtDNA)

​

Present at low frequencies in most of Europe, in Anatolia, around the Caspian Sea, and from the Indo-Pakistani border to Xinjiang, haplogroup W is one of the best maternal markers of Indo-European ancestry (mtDNA equivalent of R1a and R1b). Its highest frequency is in Ukraine, European Russia, Baltic countries and Finland (3 to 5% overall), as well as in northern Pakistan (15%), Punjab (9%) and Gujarat (12%). In India, it is considerably more common among the upper castes and among Indo-European speakers.

Main article : Haplogroup W (mtDNA)

​

Haplogroup I (mtDNA)

​

Like haplogroup W, haplogroup I is found at low frequency over most of Europe, especially in northern and eastern Europe, and across Central Asia as far as Pakistan and North-West India, with a characteristic presence in the North Caucasus. Haplogroup I first appears in Europe with the arrival of Proto-Indo-European cultures, notably the Unetice culture associated with Y-haplogroup R1b. The absence of haplogroup I from Paleolithic, Mesolithic and Neolithic sites, and from modern non-Indo-European speaking populations such as the Saami, the Basques and the Maghrebians, argues in favour of an Indo-European origin.

Main article : Haplogroup I (mtDNA)

​

Haplogroup X (mtDNA)

​

Haplogroup X is a very old and scattered haplogroup found all over Eurasia and North Africa, as well as among Native North Americans. Its frequency rarely exceeds 5% of the population in any ethnic group, and is more often restricted to 1 or 2%. X1 is found almost exclusively in North Africa, while X2a is the only lineage present among Amerindians. X2d, X2e, X2n and X4 are found in Europe and Central Asia, and could therefore have been spread at least partially by the Proto-Indo-Europeans.

The strong presence of X2 around the Caucasus, progressively fading towards the Near East and Mediterranean, hints that it could be related to the spread of Y-DNA haplogroup G2a. Since R1b1b and G2a both have origins around the Caucasus, it is unsurprising to find X2 alongside these two Y-DNA haplogroups.

Main article : Haplogroup X (mtDNA)

​

Haplogroup R (mtDNA)

​

Haplogroup R is the main subclade of N, the one that was to generate the 6 most common European haplogroups (H, V, J, T, U, K). At the time of writing R subclades were numbered from R0 (a.k.a. pre-HV) to R31. Most of them are found in South Asia (R5, R6, R7, R8, R30, R31), Southeast Asia (R9, R21, R22, R24), East Asia (R9/F, R11/B), and even among Papuans (R14) and Australian Aborigines (R12). R0a peaks in the southern Arabian peninsula and is common among Arabs and Middle-Easterners. R1a (not to be confused with the homonymous Y-chromosome haplogroup) is found among the Adygei people from the North Caucasus (related to the Maykop culture => see R1b section), Brahmins from northern India, northwestern Russians and Poles - basically all people closely related with the Indo-European expansion. R2 is found from northwest India and Pakistan to Iran, Georgia and Turkey. It could be connected to the Indo-Iranians.

​

Finno-Uralic mtDNA

​

Finno-Uralic people have an overall mtDNA admixture similar to that of other Europeans, with a higher percentage of W and U5b, and a small percentage of Siberian haplogroups such as N or A. The Sami are characterised by a high percentage of haplogroups U5b1 and V.

Berber mtDNA

The Berbers are the indigenous population of north-west Africa. Although their Y-DNA is almost perfectly homogeneous, belonging to haplogroup E-M81, Berber maternal lineages show a much greater diversity, as well as regional disparity. At least half (and up to 90% in some regions) of the Berbers belong to Eurasian lineages, such as H, HV, R0, J, T, U, K, N1, N2 and X2, mostly of Middle or Near Eastern origin. Between 5 and 45% of the Berbers carry sub-Saharan mtDNA (L0, L1, L2, L3, L4, L5). There are only three native North African lineages, U6, X1 and M1, representing 0 to 35% of the people depending on the region.

Haplogroup U6 has been observed from Iberia and the Canary Islands to Senegal in the west, and from Syria to Ethiopia and Kenya in the east. It is also found at low density in Europe, though mostly limited to Iberia. Approximately 10% of all North Africans belong to this lineage.

Gypsy mtDNA

The Gypsies (Romani people) originated in the Indian subcontinent and mixed with local populations in the Middle East and Eastern Europe over the centuries. About half of the Gypsy population belongs to haplogroup M, and more specifically M5 (paralleled by Y-haplogroup H1a on the paternal side), which is otherwise exclusive to South Asia. The other mtDNA haplogroups found among the Gypsy community are mostly of Eastern European, Caucasian or Middle Eastern origin, such as H (H1, H2, H5, H9, H11, H20, among others), J (J1b, J1d, J2b), T, U3, U5b, I, W and X (X1b1, X2a1, X2f) (sources). The same diversity exists on the Y-DNA side (45% of H1a, followed by I1, I2a, J2a4b, E1b1b, R1b1b, R1a1a).

Sources

The list below is non-exhaustive and includes many of the numerous references linked on these websites. Some studies and databases not published on the Web were also used.

Haplogroup-specific studies (Y-DNA)

  • Phylogeography of Y-Chromosome Haplogroup I Reveals Distinct Domains of Prehistoric Gene Flow in Europe, Rootsi et al. (2004)

  • Origin, Diffusion, and Differentiation of Y-Chromosome Haplogroups E and J: Inferences on the Neolithization of Europe and Later Migratory Events in the Mediterranean Area, Semino et al. (2004)

  • Phylogeographic Analysis of Haplogroup E3b (E-M215) Y Chromosomes Reveals Multiple Migratory Events Within and Out Of Africa, Cruciani et al. (2004)

  • Y chromosomal haplogroup J as a signature of the post-neolithic colonization of Europe, Di Giacomo et al. (2004) (PDF)

  • Tracing Past Human Male Movements in Northern/Eastern Africa and Western Eurasia: New Clues from Y-Chromosomal Haplogroups E-M78 and J-M12, Cruciani et al. (2007)

  • The emergence of Y-chromosome haplogroup J1e among Arabic-speaking populations, Chiaroni et al. (2009)

  • J1-M267 Y lineage marks climate-driven pre-historical human displacements, Tofanelli et al. (2009)

  • Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a, Underhill et al. (2009)

  • Moors and Saracens in Europe: estimating the medieval North African male legacy in southern Europe, Capelli et al. (2009)

  • A major Y-chromosome haplogroup R1b Holocene era founder effect in Central and Western Europe, Myres et al. (2010)

  • Human Y chromosome haplogroup R-V88: a paternal genetic record of early mid Holocene trans-Saharan connections and the spread of Chadic languages, Cruciani et al. (2010)

  • The peopling of Europe and the cautionary tale of Y chromosome lineage R-M269, Busby et al. (2011)

British Isles

  • A Y Chromosome Census of the British Isles, Capelli et al. (2003) (PDF)

  • Excavating Past Population Structures by Surname-Based Sampling: The Genetic Legacy of the Vikings in Northwest England, Bowden et al. (2007) (PDF)

  • People of the British Isles: preliminary analysis of genotypes and surnames in a UK-control population, Winney et al. (2011) (PDF)

Iberian peninsula

  • Reduced genetic structure of the Iberian peninsula revealed by Y-chromosome analysis: implications for population demography, Flores et al. (PDF)

  • The Genetic Legacy of Religious Diversity and Intolerance: Paternal Lineages of Christians, Jews, and Muslims in the Iberian Peninsula, Adams et al. (PDF)

  • Micro-phylogeographic and demographic history of Portuguese male lineages, Beleza et al.

  • Y-chromosome Lineages from Portugal, Madeira and Açores Record Elements of Sephardim and Berber Ancestry, Gonçalves et al. (PDF)

  • Y Chromosome and Mitochondrial DNA Characterization of Pasiegos, a Human Isolate from Cantabria (Spain), Maca-Meyer et al. (PDF)

  • The Basques in the Genetic Landscape of Europe, by Kristin Leigh Young (PDF)

France & Benelux

  • Phylogeography of French male lineages, Ramos-Luis et al. (2009) (PDF)

  • Micro-geographic distribution of Y-chromosomal variation in the central-western European region Brabant, Larmuseau et al. (2010)

Scandinavia, Finland & Baltic

  • Estimating Scandinavian and Gaelic Ancestry in the Male Settlers of Iceland, Helgason et al. (2000) (PDF)

  • Y-chromosomal diversity suggests that Baltic males share common Finno-Ugric-speaking forefathers, Laitinen et al. (2002)

  • Different genetic components in the Norwegian population revealed by the analysis of mtDNA and Y chromosome polymorphisms, Passarino et al. (2002) (PDF)

  • Y Chromosome and Mitochondrial DNA Variation in Lithuanians, Kasperaviciute et al. (2004) (PDF)

  • The Western and Eastern Roots of the Saami—the Story of Genetic “Outliers” Told by Mitochondrial DNA and Y Chromosomes, Tambets et al. (2004) (PDF)

  • Y chromosome SNP haplogroups in Danes, Greenlanders and Somalis, Sanchez et al. (2004) (PDF)

  • Y-chromosome diversity in Sweden - A long-time perspective, Karlsson et al. (2006) (PDF)

  • Geographical heterogeneity of Y-chromosomal lineages in Norway, Dupuy et al. (2005) (PDF)

  • Regional differences among the Finns: A Y-chromosomal perspective, Lappalainen et al. (2006)

  • Population Structure in Contemporary Sweden: A Y-Chromosomal and Mitochondrial DNA Analysis, Lappalainen et al. (2008)

  • Migration Waves to the Baltic Sea Region, Lappalainen et al. (2008)

Central Europe

  • MtDNA and Y chromosome polymorphisms in Hungary: inferences from the palaeolithic, neolithic and Uralic influences on the modern Hungarian gene pool, Semino et al. (2000) (PDF)

  • Significant genetic differentiation between Poland and Germany follows present-day political borders, as revealed by Y-chromosome analysis, Kayser et al. (2005) (PDF)

  • Y-chromosome STR haplotypes in a population sample from Switzerland (Zurich area), Haas et al. (2006)

  • Y-Chromosomal Variation in the Czech Republic, Luca et al. (2006) (PDF)

  • Y-chromosome analysis of ancient Hungarian and two modern Hungarian-speaking populations from the Carpathian Basin, Csányi et al. (2008)

  • The genetic structure of the Slovak population revealed by Y-chromosome polymorphisms, Petrejcíková et al. (2009) (PDF)

  • New Y-chromosome binary markers improve phylogenetic resolution within haplogroup R1a1, Pamjav et al. (2012)

  • Pasture Names with Romance and Slavic Roots Facilitate Dissection of Y Chromosome Variation in an Exclusively German-Speaking Alpine Region, Niederstätter et al. (2012)

  • Contemporary paternal genetic landscape of Polish and German populations: from early medieval Slavic expansion to post-World War II resettlements, Rebala et al. (2013)

Italy & Greece

  • Clinal patterns of human Y chromosomal diversity in continental Italy and Greece are dominated by drift and founder effects, Di Giacomo et al. (2003) (PDF)

  • Peopling of three Mediterranean islands (Corsica, Sardinia, and Sicily) inferred by Y-chromosome biallelic variability, Francalacci et al. (2003)

  • Population Structure in the Mediterranean Basin: A Y Chromosome Perspective, Capelli et al. (2006) (PDF)

  • Y chromosome genetic variation in the Italian peninsula is clinal and supports an admixture model for the Mesolithic-Neolithic encounter, Capelli et al. (2007)

  • New genetic evidence supports isolation and drift in the Ladin communities of the South Tyrolean Alps but not an ancient origin in the Middle East, Thomas et al. (2007)

  • Y-chromosome genetic structure in sub-Apennine populations of Central Italy by SNP and STR analysis, Onofri et al. (2007)

  • Y-chromosomal evidence for a limited Greek contribution to the Pathan population of Pakistan, Firasat et al. (2007) (PDF)

  • Paleolithic Y-haplogroup heritage predominates in a Cretan highland plateau, Martinez et al. (2007) (PDF)

  • Slow and fast evolving markers typing in Modena males (North Italy), Ferri et al. (2008)

  • Male haplotypes and haplogroups differences between urban (Rimini) and rural area (Valmarecchia) in Romagna region (North Italy), Ferri et al. (2008)

  • Y-Chromosome Based Evidence for Pre-Neolithic Origin of the Genetically Homogeneous but Diverse Sardinian Population: Inference for Association Scans, Contu et al. (2008) (PDF)

  • Moors and Saracens in Europe: estimating the medieval North African male legacy in southern Europe, Capelli et al. (2009)

  • Differential Greek and northern African migrations to Sicily are supported by genetic evidence from the Y chromosome, Di Gaetano et al. (2009)

  • Genetic Structure in Contemporary South Tyrolean Isolated Populations Revealed by Analysis of Y-Chromosome, mtDNA, and Alu Polymorphisms, Pichler et al. (2009)

  • The coming of the Greeks to Provence and Corsica: Y-chromosome models of archaic Greek colonization of the western Mediterranean, King et al. (2011) (PDF)

  • Uniparental Markers of Contemporary Italian Population Reveals Details on Its Pre-Roman Heritage, Brisighelli et al. (2012)

  • Uniparental Markers in Italy Reveal a Sex-Biased Genetic Structure and Different Historical Strata, Boattini et al. (2013)

  • Low-Pass DNA Sequencing of 1200 Sardinians Reconstructs European Y-Chromosome Phylogeny, Francalacci et al. (2013)

Dinaric Alps & Balkans

  • Y chromosomal heritage of Croatian population and its island isolates, Barac et al. (2003) (PDF)

  • High-Resolution Phylogenetic Analysis of Southeastern Europe Traces Major Episodes of Paternal Gene Flow Among Slavic Populations, Pericic et al. (2005) (PDF)

  • The Peopling of Modern Bosnia-Herzegovina: Y-chromosome haplogroups of the three main ethnic groups, Marjanovic et al. (2005) (PDF)

  • Y-chromosomal STR haplotypes in Macedonian population samples, Mirko Spiroski et al. (2005)

  • Paternal and maternal lineages in the Balkans show a homogeneous landscape over linguistic barriers, except for the isolated Aromuns, Bosch et al. (2006) (PDF)

  • Population History of the Dniester-Carpathians: Evidence from Alu Insertion and Y-Chromosome Polymorphisms, Alexander Varzari (2006)

  • Allele frequencies and population data for 17 Y-chromosome STR loci in a Serbian population sample from Vojvodina province, Veselinovic et al. (2007)

  • Y-chromosomal evidence of the cultural diffusion of agriculture in southeast Europe, Battaglia et al. (2008) (PDF)

  • Y Chromosome Single Nucleotide Polymorphisms Typing by SNaPshot Minisequencing, Noveski et al. (2009)

  • Human Y-chromosome short tandem repeats: A tale of acculturation and migrations as mechanisms for the diffusion of agriculture in the Balkan Peninsula, Mirabal et al. (2011)

  • Croatian national reference Y-STR haplotype database, Mrsic et al. (2012)

  • High levels of Paleolithic Y-chromosome lineages characterize Serbia, Regueiro et al. (2012)

  • Y-Chromosome Analysis in Individuals Bearing the Basarab Name of the First Dynasty of Wallachian Kings, Martinez-Cruz et al. (2012)

  • Y-Chromosome Diversity in Modern Bulgarians: New Clues about Their Ancestry, Karachanak et al. (2013)

  • Paleo-Balkan and Slavic Contributions to the Genetic Pool of Moldavians: Insights from the Y Chromosome, Varzari et al. (2013)

Russia, Belarus & Ukraine

  • Gene Pool Structure of Eastern Ukrainians as Inferred from the Y-Chromosome Haplogroups, Kharkov et al. (2004)

  • Genetic Evidence for the Mongolian Ancestry of Kalmyks, Nasidze et al. (2005)

  • Gene pool differences between Northern and Southern Altaians inferred from the data on Y-chromosomal haplogroups, Kharkov et al. (2007)

  • Two Sources of the Russian Patrilineal Heritage in Their Eurasian Context, Balanovsky et al. (2008)

  • Y-Chromosome distribution within the geo-linguistic landscape of northwestern Russia, Mirabal et al. (2009)

  • Structure of the Gene Pool of Bashkir Subpopulations, Lobov et al. (2009) (PDF)

  • Uniparental Genetic Heritage of Belarusians: Encounter of Rare Middle Eastern Matrilineages with a Central European Mitochondrial DNA Pool, Alena Kushniarevich (2013)

Caucasus

  • Testing hypotheses of language replacement in the Caucasus: evidence from the Y-chromosome, Nasidze et al. (2003)

  • Mitochondrial DNA and Y-Chromosome Variation in the Caucasus, Nasidze et al. (2004)

  • Culture creates genetic structure in the Caucasus: Autosomal, mitochondrial, and Y-chromosomal variation in Daghestan, Marchani et al. (2008) (PDF)

  • Neolithic patrilineal signals indicate that the Armenian plateau was repopulated by agriculturalists, Herrera et al. (2011)

  • The Caucasus as an asymmetric semipermeable barrier to ancient human migrations, Yunusbayev et al. (2011)

Near East

  • The Y Chromosome Pool of Jews as Part of the Genetic Landscape of the Middle East, Nebel et al. (2001)

  • Excavating Y-chromosome haplotype strata in Anatolia, Cinnioglu et al. (2004)

  • Isolates in a corridor of migrations: a high-resolution analysis of Y-chromosome variation in Jordan, Flores et al. (2005)

  • MtDNA and Y-chromosome Variation in Kurdish Groups, Nasidze et al. (2005)

  • Iran: tricontinental nexus for Y-chromosome driven migration, Regueiro et al. (2006)

  • Y-Chromosomal Diversity in Lebanon Is Structured by Recent Historical Events, Zalloua et al. (2008)

  • Saudi Arabian Y-Chromosome diversity and its relationship with nearby regions, Abu-Amero et al. (2009)

  • Geographical Structure of the Y-chromosomal Genetic Landscape of the Levant: A coastal-inland contrast, El-Sibai et al. (2009)

  • In search of the genetic footprints of Sumerians: a survey of Y-chromosome and mtDNA variation in the Marsh Arabs of Iraq, Al-Zahery et al. (2011)

  • Ancient Migratory Events in the Middle East: New Clues from the Y-Chromosome Variation of Modern Iranians, Grugni et al. (2012)

North Africa

  • Y-chromosome haplotypes in Egypt, G. Lucotte & G. Mercier (2003)

  • The Levant versus the Horn of Africa: Evidence for Bidirectional Corridors of Human Migrations, Luis et al. (2004)

  • Introducing the Algerian Mitochondrial DNA and Y-Chromosome Profiles into the North African Landscape, Bekada et al. (2013)

Y-DNA Projects & Databases

  • FTDNA: Ireland Y-DNA Project

  • FTDNA: Scottish DNA Project

  • FTDNA: Wales Cymru DNA Project

  • FTDNA: German Language Area DNA Research Project

  • FTDNA: French Heritage Project

  • FTDNA: House of Spain DNA Project

  • FTDNA: Italy DNA Project

  • FTDNA: Haplogroup G (Y-DNA) Project

  • FTDNA: Haplogroup I2* Project

  • FTDNA: Haplogroup I2b1 Project

  • FTDNA: Haplogroup I2b2 Project

  • FTDNA: Haplogroup Q Project

  • R1b-S28/U152 database, David Faux

  • Iberian Roots: commercial Y-DNA database of Spanish and Portuguese regions

Multiple regions (Y-DNA)

  • Y-Chromosomal Diversity in Europe Is Clinal and Influenced Primarily by Geography, Rather than by Language, Rosser et al. (PDF)

  • The Genetic Legacy of Paleolithic Homo sapiens sapiens in Extant Europeans: A Y Chromosome Perspective, Semino et al.

  • Where Did European Men Come From, Kalevi Wiik (PDF)

  • Human Y-Chromosome Variation in the Western Mediterranean Area: Implications for the Peopling of the Region, Scozzari et al.

Multiple regions (mtDNA)

  • Geographic Patterns of mtDNA Diversity in Europe, Lucia Simoni et al.

  • Phylogenetic Network for European mtDNA, Saara Finnila et al.

  • High-resolution mtDNA evidence for the late-glacial resettlement of Europe from an Iberian refugium, Luísa Pereira et al.

  • Y-Chromosomal Diversity in Europe Is Clinal and Influenced Primarily by Geography, Rather than by Language, Rosser et al.
