Origins and spread of Indo-European languages: an alternative view

After over 5 years of being away without officially leaving, I’ve finally got around to writing a closing post for this blog. Unsurprisingly, it deals with the Indo-European (IE) question, which has been the main focus of ancient DNA studies and the subject that has attracted the most interest from the people who followed them. I thought I’d never have to write this post, since back when I stopped writing things were already clear enough, and it should have been a matter of months before the mainstream publications wrote what I’m going to write now. However, 5 years down the line this is still pending, and many linguists have been trusting that the current mainstream view is essentially proved and that they have to adapt their theories to those findings. It was when I found this out some 2-3 years ago that I first thought of writing a final post about the subject (though it’s taken a while to actually do it), and that brings me to the main purpose of this post: it’s written mostly for linguists working on the history of languages, not for people interested in ancient DNA studies. This is because the latter can already make up their own minds about the interpretation of the findings, while the former are largely dependent on the interpretation and conclusions that they are given by those writing the studies. And I think they deserve to have an alternative view before they go too far in changing their work just because it may not fit well with what they’ve been given as proven facts about the origin and spread of IE languages.

Since this is quite a big subject, I’ll look at each geographical area (from west to east) and cover the basic evidence we have from each of them before going on to look at the whole picture and a final summary. And ultimately, as has been the case with all the previous posts on this blog, it will be in the comments section where further discussion and details about the things mentioned in the post will appear, so stay tuned for those and feel free to participate with any questions or thoughts so that we can all get a better understanding of the data available and its interpretation.

Europe

We’ll start with this area of the IE-speaking world, which is probably the easiest to understand. Unfortunately, the ancient DNA studies have not explained it in a way that is useful to historians and linguists. Here we’ll try to address that issue and explain the historical reality of the region on a population basis.

The Upper Paleolithic in Europe has been explained well enough, basically showing the discontinuities between different periods (Aurignacian, Gravettian, Magdalenian…). The old idea that the earliest Anatomically Modern Humans (AMH) that populated Europe are the ones from whom modern Europeans mostly descend (with Basques often being cited as the most direct descendants of the Cro-Magnon people) has been thoroughly disproved. All of those early populations went extinct, and the last one to arrive did so probably just before the Last Glacial Maximum (LGM) some 25-22 thousand years ago (likely from Anatolia) and was not associated at the time with any specific culture until the Epi-Gravettian from after the LGM, mostly from Italy. It was this population that inhabited Europe during the Mesolithic period, and it has been named Western Hunter-Gatherers (WHG).

With the advent of the Neolithic a new population started to colonise Europe (again from Anatolia, though this time the origin is proved and not just likely), bringing farming with them. They are known as the Early European Farmers (EEF) or, sometimes, Anatolian Farmers. Their expansion throughout Europe was slow (as these were sedentary populations), but they eventually populated most of Europe, replacing the Mesolithic WHG populations that preceded them. However, throughout this long period and as they ventured deeper into Europe, the farmers did acquire increasing levels of admixture from those WHG populations, reaching up to 25% of WHG genes in their Anatolian Farmer genomes.

However, by the end of the Neolithic another big event happened at the population level in Europe, and it is the most crucial one for the purposes of this post and the one that has not been very well explained so far.

Depopulation and Repopulation of Northern and Western Europe

Some 10 years ago, genetic studies started to appear (Haak et al. 2015 was the first of them) showing a surprisingly large migration from the Eurasian steppe into Europe at the end of the Neolithic (starting c. 3000 BC) that changed the genetics of the European populations to form the basis of what modern Europeans are. These steppe migrations were associated with the Corded Ware Culture (CWC) in Northern to Central Europe, and with the Bell Beaker Culture in Western Europe. It was said that their genetic impact was roughly 50% across Northern Europe, going down to some ~30% towards Iberia. It was also stated that there was a male bias in this genetic impact, given that the Y Chromosome (passed from fathers to sons) of Europeans turned out to be of steppe origin while the mitochondrial DNA (passed from mothers to sons and daughters) was largely of Neolithic origin. Somehow, our (modern Europeans’) fathers came from the steppe and our mothers from Anatolia. It was also speculated that the reason for this could be that the steppe people brought some pathogens with them that may have severely impacted the Neolithic populations they came across, with some strain of Yersinia pestis (the cause of the plague) found in steppe samples being the main suspect (link, link). This was also in line with other genetic studies (link) that had shown a big bottleneck in populations across Europe (particularly all across northern and western Europe, with Italy and the Balkans seeing a smaller one and Greece not seeing it at all, see Figure 2) at the end of the Neolithic, as well as a rapid expansion of a few paternal lineages (link).

Now, this may be correct (or mostly), but it fails to provide an explanation of what happened that can be easily understood by anyone confronting this information. We have to take a step back, focusing not so much on the genes but rather on the people (the communities of people) who carried those genes so that we can make better sense out of it.

During the Neolithic period, communities of people from Anatolia started to settle in Europe, advancing slowly until they occupied the majority of the European territory. They had a distinct genetic profile when compared to the WHG that lived in Europe before their arrival. This applies both to the autosomes (basically their whole genome) and to their uniparental markers (the Y Chromosome for the paternal ones and the Mitochondrial DNA for the maternal ones). The most prevalent paternal lineages were the ones under the G2a branch. WHG, on the other hand, had most of their paternal lineages under the I2a branch. Minor paternal lineages in both populations didn’t overlap either, at least initially. However, slowly over the 4000 years between ~7000 BC and ~3000 BC, the farming communities admixed occasionally with the hunter-gatherers, which resulted in their acquiring genome-wide signatures of WHG (very low in the Balkans, but increasing towards central, northern and western Europe, to around 25%) as well as WHG uniparental markers. Interestingly, once the WHG paternal lineage I2a entered the farmers’ gene pool, it rose in frequency to the point that by the end of the Neolithic it had become the most common one among farmers, relegating their original G2a to second place. This pattern usually points to some sort of selection, though in this case the reason is unclear (and for the purposes of this post, irrelevant anyway).

Then around 3000 BC something happened throughout Europe, affecting especially all of the northern and western parts of it, and causing a big population collapse. The reasons for this are unknown. It could have been a change in the climate (the end of the warm period known as the Holocene Climate Optimum, which triggered another series of events that could include hunger due to crop failures, disease, an increase in violent conflicts, etc.), but once again the reason is not really relevant for the purpose of this post. Suffice it to say that the Neolithic population across Europe was severely decimated, with many areas becoming completely depopulated.

Meanwhile, in the North Pontic steppe small populations of pastoralists had started to thrive with their mobile economy that was not based on crops, but instead on animal husbandry. These types of populations have proven to be more resilient to the sort of changes that greatly affect the larger, more densely packed and sedentary ones that rely heavily on crops. They’re also much more mobile and can occupy the territory much faster if the conditions allow for it. And this is basically what they did when the Neolithic communities from Europe collapsed.

These steppe populations had probably originated on the North Caspian shores (maybe when they started to have domesticates in the mid-late 6th millennium BC, if not earlier), but it was not until they moved west to the North Pontic region and the conditions allowed for it (the invention of wheeled vehicles ca. 3500 BC seems to have been a crucial factor, though pulled by oxen, since they didn’t have horses as once believed) that they started to expand very successfully. The initial separation into the two main groups may have happened around that time (mid 4th mill., from some North Pontic culture like the Lower Mikhaylovka groups), ending up in the formation of the Yamnaya Culture and the Corded Ware Culture (CWC). The former occupied most of the steppe (especially if we include the very closely related Afanasievo Culture), and the latter expanded into what I will refer to as the Corded Ware Horizon (CWH), which would include the Bell Beaker Culture (BBC) to the west and the forest steppe cultures of the time (Fatyanovo-Balanovo, Abashevo, Sintashta, Andronovo) to the east, covering an extremely vast territory that went from Western Europe to Southern Siberia by the end of the 3rd mill. For now we’ll turn our attention to this CWH group.

The CWH people separated from the other main steppe population by heading north and leaving the steppe for the forest steppe. While the exact place and time of their initial steps is not known so far, we do know that they started to expand ca. 3000 BC reaching the Baltic Sea and moving west to Central Europe where they appear around 2800 BC. During this initial expansion, they encountered a few areas where the Neolithic populations had not died out completely. And the reason why we know this is because the steppe populations started to show admixture from those Neolithic Farmers (probably from those left from the Globular Amphora Culture) and we know that this admixture came from incorporating EEF females into their communities. We don’t know the details of how these “foreign” females were incorporated (could be from peaceful agreements, could be by force, we don’t know), nor their exact status in these steppe communities. But we do know that their offspring must have had the exact same status as the rest of the people in the community, since there’s no genetic difference across these communities, where the “foreign” genes were spreading equally among the whole community. Population growth must have been a priority for these small steppe communities, probably because the conditions they were finding allowed (and maybe demanded) such growth. They were successfully populating vast, largely depopulated areas that they could exploit and it seems that whenever they had the chance to incorporate females from the few Neolithic communities they found along the way, they did so in order to increase the growth rate. Males, on the other hand, didn’t seem to have been welcome, probably due to the patriarchal nature of these steppe people that organised themselves in family clans (much like the late Neolithic farmers from Europe did too). 
The evidence for this dynamic of incorporating females but not males is very clearly seen by looking at the uniparental markers, where we do see European Neolithic haplogroups in their mitochondrial DNA, but not a single European Neolithic haplogroup in their Y chromosome, and then at their autosomes, which show the genome-wide admixture they were getting via these females.

By the time they reached Central Europe around 2800 BC, the CWH people had around 30% admixture from the Neolithic farmers. Quite a significant amount, but not surprising given how fast a small population can change genetically when they start incorporating “foreign” genes into their pool. Then during their stay in Central Europe, this admixture increased to around 50% by ca. 2500 BC (which means that they still found some Neolithic communities that survived there and from which they could incorporate females). However, one may wonder what was happening in those Neolithic communities meanwhile. That’s something we don’t really know. We don’t have a single sample in the ancient DNA record from the Neolithic communities from the periods just before, during or after the arrival of the steppe communities. The only evidence we have that some of them survived the collapse comes precisely from the admixture that we see in the steppe communities that were occupying their former territories. So, essentially, a few of the Neolithic communities lived just long enough to see the steppe ones arrive and acquire females from their communities before they died out completely (we don’t know if it was this “borrowing” of females by the steppe groups that precipitated their final extinction, though that’s a possibility even if the “borrowing” of females didn’t involve any violence).
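To illustrate how quickly such admixture can build up in a small community, here is a minimal sketch in Python. It is a toy model, not something taken from the studies: it assumes non-overlapping generations, that in each generation a fixed share of the mothers are incoming farmer women with 100% farmer ancestry, and that all fathers come from the community itself. The function name and the 10% share used below are hypothetical, chosen only for illustration.

```python
def simulate_admixture(foreign_mother_share, generations):
    """Toy model: farmer-derived fraction of autosomes, mtDNA and Y per generation.

    foreign_mother_share is an illustrative assumption (share of mothers in each
    generation who are incoming farmer women); it is not an estimate from data.
    """
    autosomal = 0.0  # farmer fraction of the genome-wide ancestry
    mtdna = 0.0      # farmer fraction of maternal lineages
    y_chrom = 0.0    # farmer fraction of paternal lineages
    history = [(0, autosomal, mtdna, y_chrom)]
    for gen in range(1, generations + 1):
        # Mothers: a share are farmer women, the rest are women of the community.
        mothers = foreign_mother_share + (1 - foreign_mother_share) * autosomal
        # Fathers: always men of the community (no farmer males incorporated).
        fathers = autosomal
        # Each child takes half of its autosomes from each parent.
        autosomal = 0.5 * (mothers + fathers)
        # mtDNA follows the mothers only.
        mtdna = foreign_mother_share + (1 - foreign_mother_share) * mtdna
        # The Y chromosome follows the fathers only, so it never changes here.
        history.append((gen, autosomal, mtdna, y_chrom))
    return history


if __name__ == "__main__":
    for gen, autosomal, mtdna, y_chrom in simulate_admixture(0.10, 12):
        print(f"gen {gen:2d}  autosomal {autosomal:.2f}  mtDNA {mtdna:.2f}  Y {y_chrom:.2f}")
```

Under these assumptions the farmer fraction of the autosomes after n generations is 1 - (1 - f/2)^n, where f is the share of farmer mothers, while the farmer fraction of the Y chromosome stays at zero. The exact figures depend entirely on the assumed share, but the qualitative pattern matches the one described above: substantial genome-wide admixture together with Neolithic mitochondrial haplogroups and no Neolithic paternal lineages.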

Then from Central Europe, around 2500 BC, these steppe communities continued their expansion to Western Europe. We know the communities that did so as the Bell Beaker Culture (BBC), but they were the same people. Curiously, this expansion to Western Europe started from a very small clan within the CWC people. We know this because they had a Y Chromosome haplogroup that was very rare among the CWC (the vast majority of males from the CWC had a subclade of the R1a branch, while the males from the BBC had one from the R1b branch). This also stresses how small the initial population that repopulated Western Europe must have been: essentially a small family clan that, once settled in Central Europe, started to be successful and then went on to occupy the whole of Western Europe. This, again, was facilitated by the fact that most of Western Europe had become almost completely depopulated. For example, the BBC people who colonised the British Isles were genetically identical to what they already were in Central Europe. In other words, on their way to the Isles and on the Isles themselves, they don’t seem to have found any females from surviving Neolithic groups to incorporate into their own communities and grow faster. Here, both the lack of direct evidence of any surviving Neolithic community and the indirect evidence of no traces of admixture in the steppe populations that moved across that territory indicate that it was almost completely (if not completely) depopulated.

On their way to the Iberian peninsula and in the peninsula itself, however, they did find some surviving Neolithic communities, as again we see further admixture coming from the “foreign” females they were incorporating into their own communities. By the time they had settled the Iberian peninsula, this admixture had increased to around 70%. But again, we have no direct evidence of these surviving Neolithic communities from the time when the steppe people arrived. It’s just the indirect evidence (in the form of admixture in steppe populations) that allows us to know that they must have been there, even if only for a short time after the steppe people arrived (whether the arrival of the steppe people is what precipitated their extinction is something that, once more, we don’t really know, though it seems plausible).

What the whole process described in the above paragraphs basically means is that Northern and Western Europe were completely (re)populated by people who came from the steppe. By communities, clans, of people that came from the steppe. This was not a 50% replacement of the previous Neolithic population. It was a 100% replacement. Every single Neolithic community died out before or at the time the steppe communities arrived. The fact that some (or many) of the genes from EEF survived (through those females that were incorporated into the steppe communities and passed their genes along) does not have any historical (and therefore linguistic) relevance. The people, the communities of people with their culture and language, that populated all of these parts of Europe were originally from the steppe. All of them. We don’t have evidence of even a single exception. The paternal lineages from the Neolithic people disappeared simply because the Neolithic communities of people disappeared.

Thus, after this expansion throughout the 3rd mill., we have the CWH people all the way from Western Europe to the Altai Mountains of South Siberia. And they were the sole occupants of all that area. Basically a big family very closely related to each other and without any discontinuity in their occupied territory. Which means, clearly, that they all spoke the same language, and probably that the divergence between the language spoken by someone in Ireland or Iberia (BBC) and someone in Southern Siberia (Sintashta-Andronovo cultures) ca. 2000 BC was not very large. Which takes us to the next question: which language was that?

Mainstream studies have been suggesting that the CWC must have spoken something they called “Indo-Slavic”, i.e., an Indo-European language from which both Balto-Slavic and Indo-Iranian languages descended. But that would imply that such a language was also spoken throughout Western Europe, something that we don’t have any evidence of whatsoever. Moreover, it would imply that Celtic and Italic would be descendants of Indo-Slavic, something that is at odds with basic linguistics.

Therefore, it would be better to suggest that they spoke an older form of Indo-European from which all the other branches descended (except the Anatolian branch, and maybe Tocharian). The only problem is that not only do we not have any evidence to support this, but all the evidence we have contradicts this idea.

To examine this, we should start from the easiest place: the Iberian peninsula, where we have earlier evidence of languages than in the rest of the territory of the CWH, and which, being rather isolated in the far west, is free of confounding factors. And when we look at the earliest languages known from there, we see that the languages that had not been replaced by the recent (at the time of the recorded languages) Celtic expansion were non-Indo-European. I already wrote a few years back some insights about the languages of Iberia, looking at the relationship between Basque and Iberian, as well as at the substrates. There I presented some of the latest linguistic research (which was, and probably still is, only available in Spanish) showing the paradigm shift from considering the relationship between Basque and Iberian a sort of legend to the now most accepted idea that they do belong to the same family. I also explained that one of the obstacles that this possible relationship had to overcome was the belief that the Basque and the Iberian people were completely unrelated, with Basques being descendants of the first European AMHs and Iberians being a Mediterranean population. This problem is not only solved now, but the fact that we now know that Basques and Iberians were the exact same people who arrived shortly after 2500 BC and settled the whole peninsula (without any of the Neolithic populations that lived there before surviving) actually makes it almost impossible to argue that they could have spoken different (unrelated) languages. This is one of those cases where ancient DNA has come at the right time to confirm without a doubt the recent (and at the time slightly controversial) linguistic research. (As a side note, when talking about Iberia I don’t refer to Tartessian because of its unclear classification, with the only possibilities being that it was either a Celtic language or, more likely, a form of Iberian.)

When it came to substrates, I pointed out how the ancient DNA evidence had disproved a line of research that had become very popular and accepted: the Indo-European substrate throughout the Iberian peninsula, especially strong in areas (south, east and even the Basque Country itself) where non-Indo-European languages were spoken at the time of our first records. This theory was championed by the prominent linguist Francisco Villar, who was finding Indo-European substrates everywhere, but who was adamant in pointing out that they were non-Celtic and non-Italic (and obviously not Indo-Slavic either, just “unknown” IE). This was all a way to prove the Paleolithic Continuity Theory, and it had several followers who contributed to it. In that article, I looked at one study (in English) by Leonard A. Curchin where he goes through the substrate in Catalonia (an Iberian-speaking region) and finds that 50% of it comes from that non-Celtic, non-Italic IE branch (in contrast, he finds only 10% of the substrate to be Iberian). The confirmation that this theory cannot be correct has significant implications, since the reason why that 50% substrate was considered IE was none other than the fact that it was found in many other parts of Europe (where Iberian could never have been spoken, according to the tradition). This brings us to the next point, which is the large number of non-IE words incorporated into the reconstructed PIE. This heavily “vasconised” (from Vasconic) reconstruction of PIE has also been found by one researcher based on statistical analysis (I can’t comment on the validity of the method used, but somehow the result seems to be correct, even if by chance):

“The new surprise is that PIE, as usually reconstructed, appears to be a sister-language of Basque, in complete breakaway from Hittite. Amazingly, PIE would be as close to Basque as the North Caucasic languages are close to each other. This clearly shows that PIE, as usually reconstructed, must be seriously erroneous and contains plenty of substratic Paleo-European words, that drag the general picture away from Hittite and closer to Basque.” A lexico-statistical comparison of Basque, Arnaud Fournet (draft, 2018).

A related phenomenon was found by Ranko Matasović when looking at the substrate in Balto-Slavic, noticing a common substrate in Northern and Western European IE languages not present in SE European ones:

“This paper presents an analysis of those words, attested in Balto-Slavic, that do not have a clear Indo-European etymology and that could have been borrowed from some substratum language. It is shown that Balto-Slavic shares most of those words with other Indo-European languages of Northern and Western Europe (especially with Germanic), while lexical parallels in languages of Southern Europe (Greek and Albanian) are much less numerous.” Ranko Matasović, Substratum words in Balto-Slavic, 2013.

When we look at modern Basque, we see that it’s absolutely full of Latin/Romance loanwords, which is expected given the last 2000 years of history, while it has very few Celtic ones (also expected, since their resistance to the Celtic expansion must have made them enemies and limited their contacts during the several centuries of neighbourhood), but there’s no trace of the old IE language that the CWH people would have spoken during the 2000 years before the arrival of Celtic.

Looking outside of Iberia we keep finding problems that can’t be explained if the CWH people had spoken an IE language. A non-Indo-European substrate in insular Celtic (usually considered either Afro-Asiatic, which we now know can’t be correct, or Vasconic) wouldn’t make any sense. Nor would it make any sense for Germanic to be the least Indo-European of all the known IE branches at its core. There is a clear necessity for Northern and Western Europe to have a non-IE substrate, and an even clearer necessity to have a source for the non-IE languages attested. For a substrate, you need longstanding interaction between locals and migrants, with locals (usually the majority of the population) switching gradually to the language of the incoming people, first as a second language and eventually as the only one. This didn’t happen here, since interactions between locals and incoming people ranged from very short to non-existent depending on the place, and no local population switched to the language of the migrating one because, quite simply, no local populations survived.

In summary:

  • Northern and Western Europe experienced a population collapse at the end of the Neolithic (starting around 3000 BC and finishing around 2300 BC in some southern areas of Iberia).
  • Populations from the steppe (CWC and BBC, who were the same people) repopulated all of Northern and Western Europe. A 100% population turnover.
  • These populations from the steppe came from a small group initially, so they all had to share the same language.
  • That language had to be non-IE according to all the evidence we have.

However, the good thing about both IE and whatever language was spoken by the CWH (which, given its geographical and temporal location, I will refer to from now on as the North Eurasian Bronze Age (NEBA) language family) is that the areas covered are very large, so we will go through the rest of them to test what I’ve proposed here against the data we have from those areas.

Italy

In contrast to Northern and Western Europe, Italy didn’t experience a complete collapse of the Neolithic population. It’s likely that several areas were severely decimated or even completely depopulated, but Neolithic communities still persisted during and after the arrival of the people from the steppe. Therefore, the picture we have is quite different, with two populations of different origin inhabiting the area during the Bronze Age.

From a linguistic point of view this would mean that two language families may have been in use throughout the Bronze Age, one from the EEF (unknown family) and the other one from the CWH (NEBA language). The picture we get by the Iron Age, when we start to have evidence of the languages spoken in continental and peninsular Italy, is analogous to what we see in Iberia: all the populations that didn’t switch to the recently arrived Celtic and Italic languages spoke a non-IE one. We don’t have any traces of an Indo-Slavic language or any other old form of IE that could be attributed to an arrival ca. 2500 BC.

Looking at the genetics, we have samples from Etruscan and Italic speakers from Central Italy, and they are more or less identical and both largely descend from the CWH people (not 100% as in Northern and Western Europe, since in Italy they did admix further with the EEF that lived on throughout the Bronze Age). In other words, while no conclusive evidence can be drawn from Italy alone, it’s all compatible with what we’ve seen in the previous section. To clarify, the Etruscan language itself could come either from the CWH (more likely) or from the EEF (less likely, but perfectly possible). This is ultimately a linguistic problem. (NOTE: As I was writing this, a new study with samples from Iron Age Picenes from Novilara and Pesaro, who spoke North Picene, a poorly attested and controversial language, has been published. No surprises, as the samples resemble the above-mentioned ones, being largely of steppe origin.) (UPDATE: some models of Etruscans here and here)

As a side note, and for the sake of completeness, a few words about Sardinia. Modern Sardinians are outliers among the European populations in that they derive most of their ancestry from the EEF that colonised Europe from Anatolia during the Neolithic. However, ancient DNA does not show complete continuity since the Neolithic. We have samples from the Bronze Age that have steppe origins. The contacts between Sardinia and the Mediterranean coasts of Iberia, France and Italy are thus proved by these samples, though even without them it would still be reasonable to think that there were longstanding contacts between Sardinia and those other areas that were inhabited by CWH people. Therefore, it would be a mistake to assume, based on the modern DNA, that Paleo-Sardinian must be a language that came from EEF. It may well be from that source, but it may just as well be a NEBA language borrowed from the neighbouring regions of mainland Europe. Once more, this is just a linguistic problem, since DNA allows for both options.

South Eastern Europe

Unlike the rest of Europe, the Balkans didn’t see any migration from the CWH people. Instead, it was the sister branch, the Yamnaya people, who moved into the Balkans in the period from ca. 3200 BC to 2500 BC. As in Italy, the Balkans didn’t see a full collapse of the Neolithic populations, but probably the northern parts of it did see a significant decimation in the Neolithic people that facilitated the arrival of steppe populations. The southern parts (modern day Greece) remained fully populated by its Neolithic inhabitants along the 3rd millennium.

It’s hard to estimate accurately the impact of the steppe migrations in the Balkans due to not having enough samples so far, but in general we can say that it was significant but relatively modest compared to the rest of Europe. After 2500 BC, it’s likely that no new migrations occurred from the steppe, and the steppe people who were already in the Balkans must have started to mix with the local populations (more on this later).

From a linguistic point of view, what is remarkable at first sight is that we don’t have any surviving non-IE language in mainland South Eastern Europe (SEE), even though it’s the area where languages are attested earliest compared to the rest of Europe. And the best explanation for this is the fact that Indo-European speakers entered SEE at an earlier date, replacing the languages of both the EEF and the Yamnaya people before the Iron Age.

We are still missing the direct evidence from the critical samples, but we’ve had the indirect evidence for quite a while. Let’s look at the details.

Indo-European populations started to enter SEE during the period from 2400-2000 BC. They came from West Asia (North West Anatolia was the immediate origin, but ultimately their origin had to be deeper into West Asia, around the South Caucasus) and settled the area of Thrace during this period. We don’t have direct genetic evidence of this, since we simply lack any samples from this place and time, so I’ll quote from a relevant paper about the archaeological side of it:

“So, while the first half of the 3rd millennium BC in Thrace is characterised by a (comparatively) moderate level of social and economic complexity and the ideological dominance of pastoral tribes of a north-Pontic origin, there is a real explosion in complexity in the period between 2400 and 2000 BC and the region becomes increasingly included within a much wider network that is now dominated by frequent and highly visible exchange and trade, and new forms of prestige and status expression”

“The same conclusion of the existence of foreigners is also indicated by the use of many exotic and prestigious objects, often made of silver. This metal was not readily available in EBA Thrace. We can also note that tin-bronzes may have arrived into this region via Anatolia rather than Europe […] and it is difficult to imagine how such a quantity and quality, and the imaginations and customs behind these, can be transferred to Europe without having individuals or groups of people carrying them, and the infrastructure to organise their transport and wider distribution”

“There can be no doubt that the driving force behind this influx of goods and people is enhanced exchange and organised trade, and it is in no way an accident that concurrently the largest exchange network the world had seen up until then arrived at its peak. This network was centred in southern Mesopotamia, a region that had been fully urbanised for at least a millennium, and it stretched from as far away as western India on one side to southeast Europe on the other, and it also incorporated large parts of Central Asia”

Kanlıgeçit – Selimpaşa – Mikhalich and the Question of Anatolian Colonies in Early Bronze Age Southeast Europe, Heyd et al. 2016.

Now we’ll have to look at some genetic details from Greece in order to see how this may be reflected in the ancient DNA that we have available. As mentioned earlier, Greece didn’t see a population collapse in the period around 3000-2500 BC. There was continuity from the early neolithic until after 2500 BC (just small amounts of ongoing genetic exchange with neighbouring regions, but nothing remarkable about it). The steppe population that moved through the Balkans during the EBA didn’t reach Greece during that period. It was only after they had settled and admixed with local populations from the Balkans that we first see an intrusion into Greek territory, in the last part of the 3rd mill. To see the sequence of events, we’ll start by looking at 4 samples labelled as Greece_Perachora_BA (G31, G62, G65 and G76a) dated 2700-2200 BC:

To understand what this shows: In the columns there are sampled populations from different locations and periods. In this case the first two columns (after the initial one with the target samples from Greece mentioned above) represent samples from Bulgaria Chalcolithic (BGR_C) and from Greece Neolithic (GRC_Peloponnese_N), and they are supposed to represent the Neolithic/Chalcolithic population from the Balkans. The next three columns represent West Asian populations (the Kura-Araxes Bronze Age culture from the South Caucasus with samples from what is today Armenia, then samples from the Levant Early Neolithic, if I remember correctly from what is today Israel, and finally samples from Central Anatolia Chalcolithic). The last column contains samples from the Yamnaya culture from the steppe, from around 3000-2500 BC.

In the rows we have the four samples from Greece (Perachora, Bronze Age) mentioned above. And what we see is that they can be mostly modelled (97.2% on average) with the first two columns representing local populations from the Balkans Neolithic/Chalcolithic. There’s only 2.5% of West Asian admixture over whatever was already there in the Neolithic/Chalcolithic (which wasn’t much) and the 0.4% from the steppe is within the noise levels, so basically nothing at all.
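For readers less used to this kind of table, here is a minimal sketch in Python of how the grouped averages quoted above are obtained from per-sample proportions. The sample names and all the numbers are invented placeholders (they are not the values from the figure), and the column labels other than BGR_C and GRC_Peloponnese_N are shorthand I've made up for the sources described above.

```python
# Illustrative only: the proportions below are invented placeholders,
# NOT the values from the figure discussed in the post.

# Source columns, in the order described in the text. Labels other than
# BGR_C and GRC_Peloponnese_N are hypothetical shorthand.
SOURCES = [
    "BGR_C", "GRC_Peloponnese_N",              # Balkans Neolithic/Chalcolithic
    "Kura_Araxes", "Levant_N", "Anatolia_C",   # West Asia
    "Yamnaya",                                 # Steppe
]

# Which source columns are summed into each broad ancestry group.
GROUPS = {
    "Balkans Neolithic/Chalcolithic": ["BGR_C", "GRC_Peloponnese_N"],
    "West Asia": ["Kura_Araxes", "Levant_N", "Anatolia_C"],
    "Steppe": ["Yamnaya"],
}

# One row per target sample; each row of proportions sums to 1.
samples = {
    "sample_1": [0.50, 0.48, 0.01, 0.00, 0.01, 0.00],
    "sample_2": [0.40, 0.56, 0.02, 0.01, 0.00, 0.01],
}


def group_averages(samples, sources, groups):
    """Sum each sample's proportions per group, then average over samples."""
    totals = {name: 0.0 for name in groups}
    for props in samples.values():
        by_source = dict(zip(sources, props))
        for name, members in groups.items():
            totals[name] += sum(by_source[m] for m in members)
    n = len(samples)
    return {name: total / n for name, total in totals.items()}


averages = group_averages(samples, SOURCES, GROUPS)
for name, value in averages.items():
    print(f"{name}: {value:.1%}")
```

The point is only that a figure like "97.2% average" is the per-sample sum of the local columns, averaged over the rows. Because each value is rounded separately, the grouped averages in a real table may not add up to exactly 100%.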

However, during the period from 2300-1900 BC we have a few samples that are clearly different:

These samples derive two thirds of their ancestry from the Balkans Neolithic/Chalcolithic, and the other third from the steppe. We don’t know where these samples may have come from, but probably from the Western Balkans, where steppe admixture was higher.

However, this was not the last movement of populations into Greece. Here we have some groups of Mycenaean samples from 1600-1200 BC:

Here we see that Mycenaean Greeks have 20% ancestry from West Asia that was not present before their arrival, indicating a very significant change in the population somewhere between 1900 BC and 1600 BC. This Mycenaean type of ancestry is the one that persisted during the classical period, as we can see from these other two samples from the Greek colony in North East Iberia of Empuries, dating one to around 750-400 BC and the other one around 350-200 BC:

Note that the above samples were outliers among the ones from that colony, where the others were local Iberians who are very different, as can be seen below:

Since we are missing samples from South East Europe from the period around 2400-2000 BC it’s difficult to pinpoint the exact origin of the Mycenaean people, but it had to be somewhere around Thrace or North West Anatolia. Once we get samples from that time and place, we’ll also be able to better assess their origin within West Asia. But since we know that the largest part of Anatolia was settled by speakers of the Anatolian branch of IE languages, it seems necessary that the origin was beyond Anatolia, with the South Caucasus being the most likely place.

A last note for completeness about Crete. There the West Asian admixture arrived earlier than in mainland Greece, and its likely source was South East Anatolia. This leaves us with two options about the affiliation of the Minoan language: it could either come from the local Neolithic inhabitants (EEF) which would basically make it an isolated language, or it could come from the Anatolian side and be an IE language of the Anatolian branch. There’s no evidence that it could be related to Greek itself. For a reference, here’s how they look:

And with this we’ll leave Europe for now (more later) and move on to Asia.

Asia

Anatolia

Anatolia was the origin of the Neolithic population of Europe, as mentioned. In the early neolithic, they had their characteristic genetic signature, but as time passed there was a significant mixing among West Asian populations that made all of them get admixture from the others. Since Anatolia is at the west end, that admixture was mainly from the east (South Caucasus/North Mesopotamia and beyond), and from the south (Levant). This makes it a bit more difficult to distinguish migrations between these areas, since we need enough resolution to see a significant change in a short period of time in a specific place to know that there was a migration and not just the ongoing general admixture that was happening all the time. The increase in admixture from the South Caucasus from the Neolithic to the Chalcolithic is evident and can perfectly justify the arrival of IE languages from the east (though let’s remind ourselves that a migration is not always necessary for the spread of a language, nor does a migration guarantee a language shift unless it’s a complete replacement as seen in Europe). We’d just need a higher resolution to find the specifics that might have brought the IE language from the Caucasus to Anatolia in the period around 4000-3500 BC.

Above, two Neolithic populations from around Central Anatolia. Below two Late Chalcolithic ones:

The shift to the “east” (more admixture from South Caucasus, less from Western Anatolia) is very clear, but this is very general and we’d need more detailed data to pinpoint a putative IE arrival.

In any case, the latest publication (The genetic origin of the Indo-Europeans) from one of the main teams doing this research already went with the hypothesis that the IE languages arrived in Anatolia from the South Caucasus, which should be correct, so I don’t think I need to go any further into this point.

South Caucasus

Here is where my views diverge from the above mentioned study. The reasons should be obvious already, since in that paper they argue that PIE (what they call Indo-Anatolian) originated in the North Caucasus/Lower Volga area, and from there it crossed to the South Caucasus from where it went to Anatolia. They need this scenario because they still argue that the steppe populations (Yamnaya and CWH) were the ones that spread the rest of the IE languages (all except the Anatolian branch), while I’ve been arguing so far that those steppe populations spread non-IE languages that I’ve referred to as NEBA languages. Apart from the fact that the European linguistic reality requires a non-IE substrate, not to mention a source for the known non-IE languages, it is not very plausible that the Chalcolithic societies from the South Caucasus would have adopted the language of the incipient pastoralists of the steppe Eneolithic. It would have been much more likely to go the other way, but from what we know it didn’t, and the steppe pastoralists kept their original language (at least at this stage – more on this later).

My preferred view about the arrival of IE languages to the South Caucasus is that they did so from the east. Reading several papers about the archaeology of the South Caucasus some years back, there was a clear suggestion that new people started to arrive there around 4200 BC, and these people were the ones who later formed the Kura-Araxes Culture (which is more commonly dated to start around 3700 BC, probably because this was a slow migration that lasted a few centuries). The origin was unknown. However, we’ve been lucky to get some of those early samples from around 4200-4000 BC from Armenia (Areni Cave) and they are in fact considered as part of the Kura-Araxes Culture despite their early dates. Coincidentally, it’s those same samples that the latest study mentioned above chose as the earliest IE speakers in the South Caucasus, arguing that they came from North of the Caucasus since they have steppe admixture. However, those samples also have admixture from the east (though I’d also say those samples are quite strange in their genetic profile and difficult to analyse), and crucially they happen to carry a strange male lineage (the Y chromosome haplogroup L1a) which is quite rare, but clearly came from much further east and not from the steppe. Later samples are clearer in their autosomal profile, so as an illustration here are the oldest 3 samples (other than those from the 5th mill. from the Areni Cave) from the Kura-Araxes Culture, dated to the late 4th mill. (3350-3000 BC) as well as the 3 oldest from the Maykop Culture from the North West Caucasus, also from the 4th mill. (3375-3500 BC):

As seen the largest part is still local, but there are some significant contributions from the north (represented by some samples from the steppe north of the Caucasus mountains from around 4200 BC) and from the east (represented by some samples from Turkmenistan -Geoksiur- Neolithic).

I said above that an arrival (of IE languages to the Caucasus) from the east would be my preferred scenario because I don’t consider it completely necessary. The alternative would be that the South Caucasus was already part of the pre-IE speaking area since the Neolithic, but that would make for a larger PIE homeland which is less parsimonious from a linguistic point of view.

The second matter I want to examine from this area is a hypothesis that, if correct, would be important, not so much for the IE languages (though it would help convince some sceptics), but mostly for the NEBA languages. It’s the origin of the Hurrians.

Hurrians from the steppe?

I’ll start by looking at some linguistic considerations that first brought my attention to this topic. For a long time, linguists have tried to find the origin of the Basque language, or at least to find some other language related to it. The most recurring suggestions have always linked it to the Caucasus languages, and more specifically to the North East Caucasus ones. This, of course, was a very controversial hypothesis, given the distance between the Basque Country and the Caucasus, together with the lack of any plausible connection at a cultural or population level. As an example of this hypothesis, here’s a quote from one of its more recent and prominent proponents, John D. Bengtson, from his book “Basque and its closest relatives: A new paradigm”:

“In direct contradiction of these kinds of statements [the uniqueness of Basque], the thesis of this book is that Basque is demonstrably related to other languages, i.e., that a scientific analysis of the evidence leads to the most probable conclusion that Basque is, at first remove, most closely related to the North Caucasian language family.”

However, with all the data that we now have, a connection between the Basque Country and the North Caucasus has become much easier to explain, given that Basques, just like all the rest of Western and North Europeans, came from the steppe and that the North Caucasus borders the steppe from which they came. Everything indicates too that Basque is indeed a relict of the languages spoken by the CWH people who settled most of Europe around 3000-2500 BC, and North Caucasians (and especially NE Caucasians) are the modern population that’s genetically closest to the original steppe people (like the Yamnaya people), while the Caucasus mountains are an area where their language could have survived more easily once the IE languages replaced it throughout the steppe.

The next link in this chain is the fact that those looking for the origin of North (especially NE) Caucasian languages have found Hurrian and Urartian as the most likely ancestors. While I can’t assess any of this from a linguistic point of view, I’d like to look at the genetic evidence that we have and that could help solve these questions.

We know more or less (indirectly) that people from the steppe started to cross the Caucasus around the second half of the 3rd mill. during the late Yamnaya period or early Catacomb one (the Catacomb Culture people were a continuation of the Yamnaya people). We more or less know that horses were domesticated around the middle of the 3rd mill. in the steppe, somewhere between the Caspian and the Black Sea. And these horses must have started to be traded across the Caucasus shortly after (the earliest sample of a domestic horse of the modern type that we have comes from Anatolia ca. 2100 BC). Whether the domestication of the horse and its trade was the reason why people from the steppe started to venture into West Asia is unclear, but it probably helped that the trade was established.

The oldest references to Hurrians that we have date to around that period (they were established in North Mesopotamia around 2250 BC). Their strong connection to horses is well known:

It seems that one of the first important results of the Mozan/Urkesh excavations, at least from the point of view of Indo-European studies, was the discovery of a beautiful sculptural image of a horse head dating from the middle of the third millenium B.C. From much later representations of horses, possibly continuing the same Hurro-Urartian tradition, one may particularly compare a bronze horse head from Karmir-Blur (VIII c. B.C.). Subsequent findings in Mozan/Urkesh have shown a number of horse figurines coming from the storeroom of Tupkish’s palace (about 2200 B.C.), some of which represent the domesticated animal. These numerous figurines, which belong to the following period of the history of Urkesh in the last quarter of the III mil. B.C., make it clear that the horse was extremely important in the life of the society. Particularly interesting seem horse figurines showing the harness, thus documenting the use of horses in transportation.

Horse Symbols and the Name of the Horse in Hurrian, Vyacheslav V. Ivanov, 1998.

From the point of view of ancient DNA, we have some interesting clues so far. The first one comes from a site in the Levant, Tel Megiddo, in modern day Israel. During the mid 2nd mill. this area is said to have had a significant Hurrian population, and apparently Tel Megiddo itself had a king with a Hurrian name. We have many samples from this site, and all of them are of local origin except 3 outliers (two of them are brother and sister, so grouped as one, and dated to 1600-1500 BC, while the third one is dated to 1688-1535 cal BCE). This is how the local samples from the same period look:

And this is how the outliers look:

Clearly, these outliers had steppe origins, with the brother (the only male) probably having the typical paternal lineage of the Yamnaya people (but due to low resolution in the Y chromosome we don’t know for sure since it’s just labelled as R without the subclade). Of course, we don’t know if these outliers were Hurrians or not, but given the historical knowledge it seems more likely that they were indeed Hurrians rather than some random travellers.

The second clue comes from later Hurrian and Urartian samples, which are already from ca. 1000 BC and later. Their steppe ancestry is greatly diluted, but the males still largely carry the Yamnaya paternal lineage.

None of these clues alone can tell us if Hurrians came from the steppe, but together they do make for a compelling case. Ultimately, we’ll need to wait for samples from early Hurrians (pre-2000 BC ideally) to know with certainty. However, things may become a bit more complicated when we take a look again at a possible role of the Yamnaya population from the steppe when we get back to Europe.

Central Asia and North India

Finally we get to the last area that is relevant for the IE question. When it comes to Central Asia, we have to divide it into three parts: the North (mostly Kazakhstan), which was part of the steppe and was settled by the CWH people around 2000-1400 BC with the Andronovo Culture; the South (Turkmenistan, Uzbekistan and Tajikistan, which we will refer to as Turan, following the literature published about it), which had a local population dating back to the early neolithic period; and the eastern edge (Tajikistan, Kyrgyzstan and SE Kazakhstan, up to the Altai Mountains), which we will refer to as the Inner Asia Mountain Corridor (IAMC) and which has had its own distinct population since the Paleo-Mesolithic period.

The period between 2000 and 1500 BC is the critical one when it comes to assessing the linguistic side of things, since during that period we have the different populations from Central Asia, plus the population of North India, plus a population that reached the Near East (the Mitanni), all speaking the same language: an early form of Indo-Iranian that was close to Sanskrit (Sanskrit itself being the form spoken in North India at the time. For example, about the dating of the Rig Veda, David Anthony’s “The Horse, the Wheel and Language” (2007) states: “The oldest texts in Old Indic are the “family books,” books 2 through 7, of the Rig Veda (RV). These hymns and prayers were compiled into “books” or mandalas about 1500-1300 BCE, but many had been composed earlier.”). This means that, since the population from the steppe had just arrived in the area from the west, either they switched to the language spoken in those other places during the 2000-1500 BC period, or they managed to spread their own language to all of those places during that same period of time. The most accepted traditional view has been that the latter is what happened. Here instead, we will explain that the former is the scenario that is compatible with all the data that we have.

With regard to genetics, it’s relatively simple. What we see is that during that period of 2000-1500 BC there is a low level of admixture in both populations, north (steppe) and south (Turan), from each other. This was largely mediated via females, since the male lineages largely remain unchanged in both of them. Basically, there’s really not much in the genetics that would suggest a language shift in either of them, though there is enough to see that they were in contact and therefore a language transfer is compatible with the data. But this had to be due more to cultural exchange than to actual migrations. Here are the samples we have from Turan from that period (minus two outliers from Bustan that look to have come from the South Caucasus). The earliest we have from after 2000 BC is dated to 1650 BC and they go down to 1250 BC:

Meanwhile, the steppe populations during that same period were much more diverse (it’s a much larger area too), with some complex admixture in many individuals, while others stayed much more unadmixed as seen in the two figures below:

The archaeology on which the traditional view of Indo-Iranians being originally from the steppe is based is now mostly outdated. For example, Elena Kuzmina considered that “The Andronovo provenance of the fire-cult and the cremation rite is beyond dispute” (The Origin of Indo-Iranians, 2007). She goes on to remark on its importance for the spread of Indo-Iranian from the steppes to the south:

Northern Bactria provides a unique opportunity to trace the southward migrational process of the Andronovo population and its assimilation with the locals. Since the material culture of the aborigines was highly developed and adapted to the ecological environment, the newcomers adopted in its entirety the complex of their material culture, while retaining their ethnical distinction in the most important sphere—ideology: in the cults and burial rite. As is well known, the principle condition for maintaining ideology in traditional culture is the preservation of the language which conveys mythological concepts and ritual texts. […] Since in the assimilation process in northern Bactria it was the ideological concepts of the Andronovans that took the upper hand, it means that their language conveying ideology and ritual activity became the winner too.

However, since then, it has been found that the cremation and fire cult have clear antecedents in the population from the IAMC, at sites like Begash and Tasbas, as David Anthony has already pointed out:

“The pre-Andronovo mortuary custom of cremation documented at Tasbas and Begash continued into the Andronovo period as a distinctive trait of Fedorovo mortuary rituals in the Tien Shan region but with the addition of a kurgan, stone fences, and other Andronovo traits absent from the Begash Ia and Tasbas level 1 mortuary customs.”

Samara Valley Project and evolution of pastoral economies in Eurasian steppe (2016).

A recent paper with new dates from the Tulkhar necropolis (Bishkent Culture, now dated to 2800-2400 BC) confirms the same:

“The new materials and the new calibrated radiocarbon dates significantly amend the understanding of many processes that took place during the Bronze Age both in Central Asia and far outside of it. Materials of the Early Tulkhar Necropolis (South Tajikistan) are often used to prove active contacts between the steppe livestock-farming Andronovo people and the settled crop-farming Central Asia people. Andronovo influences in the first place are found in the cremated burials of this necropolis. E.E. Kuzmina considers these burials archaeological evidence of her hypothesis about the Andronovo people (Indo-Aryans) migrating across Central Asia (Bishkent culture) to the North-West Pakistan (Swat culture) and North India. The new materials and the new calibrated radiocarbon dates recently appeared. They prevent relating the Andronovo people and the cremated burials in the Early Tulkhar Necropolis. The South Urals Fedorovo culture stands out with cremated burials and dates back to 1742–1451 calBC according to the latest data. The Tulkhar cremated burials appeared a lot earlier, namely no later than in the early 3rd millennium BC.”

Materials of the Early Tulkhar necropolis in the light of the hypothesis of Andronovo population migration to the south: problems of chronology, Svetlana V. Sotnikova, 2024.

For reference, here’s what Elena E. Kuz’mina (The Origin of the Indo-Iranians, 2007) thought about the Bishkent Culture:

“The origin of the culture is open to debate. B. Litvinsky and L. P’yankova believe that the culture is genetically related to the BMAC and reflects a change-over of a part of the farming population to pastoral stockkeeping. A. Mandel’shtam and E. Kuz’mina, on the other hand, hold that it was created by Andronovo pastoralists and, possibly, representatives of the Zaman-Baba culture. They came to use the ceramics of the neighboring farmers and also began making hand-made pottery, which imitated in shape that produced on the potter’s wheel.

Of decisive importance is the evidence concerning the burial rite. The early monuments of the Bishkent culture maintain the characteristic features of the Andronovo Fedorovo burial tradition: burial mound, enclosure, stone cist, cremation, swastika, and the hand-made ceramics. Later there appeared graves with a downward passage and catacombs. The origin of this rite in Central Asia remains debatable. It is known both in the Bactria-Margiana culture, but its genesis there is unclear, and in the Zaman-Baba culture where it may be a heritage of the Catacomb culture of the European steppe. In types II and III of the Bishkent burials Andronovo features are preserved: burial mounds and stone enclosures, small cists, the position of the deceased and the custom of double-burial, the round and rectangular shape of the sacrificial hearths, the vivid manifestations of the fire cult. As long as the burial rite is an ethnic indicator of a culture, which is upheld even during long-distance migrations to another ecological niche, and wheel-made ceramics are quickly borrowed by new-comers, there are serious grounds to believe that the creators of the Bishkent culture were by origin Andronovo pastoralists, who came into contact with representatives of the BMAC, which is also expressly indicated in the farming culture of Tadzhikistan and Uzbekistan.”

The Bishkent related Vakhsh Culture also predates Andronovo:

Recent discoveries and radiocarbon dates provide good evidence to consider anew the Vakhsh culture of southern Tajikistan. This “culture” is almost exclusively identified by its burials under kurgans (“classical Vakhsh culture”) except for one settlement, and by its handmade pottery. A detailed classification of the pottery coupled with the available dates or comparisons is presented here. It can now safely be dated between the second half of the 3rd millennium and the 17th century BC as shown by radiocarbon dates and is thus contemporary with the Bactria-Margiana Archaeological Complex (BMAC). A few Vakhsh pots have been found in southern Bactria up to Herat and parallels can also be found in graves from Gonur Depe. It has no connection with the Andronovo culture but presents affinities with communities of the Altai-Xinjiang area.

The “classical Vakhsh culture”, Mike Teufer, 2020.

Again E. E. Kuz’mina on the Bishkent-Vakhsh Culture (which she considered together):

“A. M. Mandel’shtam (1968: 131-141) conducted a systematic analysis of the funeral practice of the Bishkent (Vakhsh) culture and demonstrated specific correspondences with Indo-Aryan practices. He viewed the Bishkent culture as cattle raising, coming from the north-west in transit to India, and he noted its similarity to the Andronovo culture. B. A. Litvinsky (1964: 158; 1967: 122-126) connected this culture with the Nuristani languages and showed its analogies in Swat. E. E. Kuz’mina (1972 a: 134-143; 1972b: 116-121; 1974: 188-193; 1975: 64-7) emphasized the Indo-Iranian attribution of the culture, its connection with Swat and Gomal and the participation of the Zamanbaba and Andronovo components in its formation.”

Moreover, the archaeology that Kuzmina cites for the expansion of the Indo-Iranians to the south (from the steppe) is dated to very late layers of the sites she mentions, like Bustan or Dzharkutan, where steppe finds are in the layers from around 1000 BC, which is 1000 years too late for the spread of Indo-Iranian (the samples we have from those sites that date to the period from 1650-1250 BC are local people, with the slight steppe admixture as seen above). She also refers to the light skin and eyes of some modern populations of North India/Pakistan as proof of the steppe origin of their language, which is irrelevant for many reasons that I won’t go into here.

Basically, there is no evidence at all for the sort of huge events that should have happened in order for the Indo-Iranian languages to spread from the steppe to such a big area in such a short period of time. Nor is there any evidence that the people from the steppe could have spoken an IE language in the first place (quite the contrary, as already seen from other areas). Instead, we have a much easier explanation: the steppe populations acquired the Indo-Iranian language from their southern neighbours, along with much of the culture, technology, rituals and economy. For the change in the economy of the steppe population before and after the contact with the populations of Turan and IAMC, a graph (figure 16.12) from David Anthony’s “The Horse, the Wheel and Language” (2007) is quite revealing, showing the change from an animal-based diet to a mixed one.

When it comes to India, unfortunately the ancient DNA record is almost completely missing. Very few samples (to my knowledge) have been analysed so far and none of them have been published. But the DNA we have from the surrounding areas already tells us with high confidence what the early Vedic people should look like: basically just like their predecessors from the Indus Valley Civilization. We don’t have direct samples from the latter either (except one of very low quality that was published years ago), but we have outliers from the surroundings that clearly had an Indian origin (known as Indus Periphery samples). The ones from the Indus Valley itself should look similar but with a significantly higher proportion of the specific Indian signature, usually referred to as Ancestral South Indian (ASI or AASI). And indeed, the unpublished samples from the core Vedic area dating to the mid 2nd mill. (late Rigvedic period) are, as far as I know, exactly like that. But we still have to wait for samples to be published in order to be certain about it.

Some of the genetic remarks in the literature that suggest that Indic speakers came from the steppe are based on modern DNA, and as in the case of Kuzmina’s mention of the light skin and eyes of modern Dardic and Nuristani people I won’t comment on the details of why they are irrelevant. Overall, the hypothesis of Indo-Iranian languages reaching India from the steppe is simply not possible with the current data available. If some surprising evidence emerges at some point we could revisit the subject, but for now there’s not much more to say about it.

Now let’s briefly mention the Mitanni people that moved to the Near East in the 2nd mill. BC. They have usually been considered an Indo-Aryan population (rather than Iranian), but that’s just because at the time they started to move to the west (likely around 1900 BC or slightly later), Proto-Indo-Iranian (PII) was just starting to break up and all the dialects from that time are similar to Sanskrit. The Mitanni Kingdom itself is first mentioned around 1550 BC, but the people must have started to arrive (from Turan) quite a bit earlier. We lack Mitanni samples so far, and the closest we have is an outlier from the site of Alalakh, in the Levant, dating to ca. 1550 BC, which has a clear origin in Turan. But of course, we don’t know if it’s a Mitanni sample or not. However, given the origin of the Mitanni and their language, they should all look the same as that sample, i.e., like all other samples from Turan (though with increasing local admixture as time passes, obviously, like the one shown by the later Iron Age samples from Ascalon in the Southern Levant included below too, dating to around 1200-1100 BC). Once more, we’ll have to wait for more relevant samples to confirm this.

With the above said, the question still remains as to where the origin of Indo-Iranian was. And in my opinion the only way to explain the successful spread of the language is to consider that Proto-Indo-Iranian became a prestige language and eventually a lingua franca during the mature period of the IVC and BMAC, which would be around 2500-2000 BC. There seems to be no other way that can easily explain the fact that this language was spoken in both places at the same time. We do know that these two civilizations had intensive contacts, so it seems reasonable to think that during the peak of their development and trade, they established a common language that became the language of all the people in those areas, as well as those in contact with them. Whether the original pre-Indo-Iranian was spoken in one place or the other is something that would be much more difficult to assess, so I won’t get into it. After the collapse of these two civilizations, the language must have started to break up, but we still know that during the period immediately after, 2000-1500 BC, its varieties must all have been quite similar (Sanskrit in North India, Mitanni in the Near East and the language of the early Scythians on the steppe).

The big picture

The devil is in the details, they say, so we’ve first gone through the most important ones of each area. Now it’s time to step back and look at the big picture:

Approximate extension of steppe populations and Indo-European languages c. 2000 BC

Notice that the above map is not intended to be accurate in the details, but just to give a broad approximation. For the steppe populations, the dotted areas represent where they were alongside local populations, while the solid area is where they were the only population living in that huge area and speaking their native (NEBA) language.

The PIE homeland

From all of what we’ve commented so far, as well as from the map above, it may be clear to anyone that’s gotten this far that the PIE homeland must be placed in North Iran and Turan. The two main factors that make it necessary to place it there are the presence of IE languages in India and the Tocharian language in Xinjiang (China). From further west, those two things would be too difficult to explain.

The origin of the language must have been in the South Caspian area, from where it went with the Neolithic to Turan. These areas must have spoken pre-IE from the early Neolithic. PIE would be the phase prior to its expansion outside of that homeland, which would be close to 4500 BC. We lack ancient DNA from India to know the date of arrival of the population that formed the North Indian one, but certain anthropological studies suggest that there was a change around 4500 BC. And from the samples that we have from later dates, we know that North Indians can be modelled as a mixture of populations from Turan and ASI. What we don’t really know is whether this possible migration to India meant a split in the PIE language or whether it stayed as a language continuum due to the continuous contacts. Regardless of the level of divergence that may have existed, it was later erased when Proto-Indo-Iranian became the common language in the 3rd mill.

The first known split, then, should be the one that led to the Anatolian branch, which as mentioned before must have happened when people from North Iran moved to the South Caucasus ca. 4200 BC. The divergence, though, didn’t happen in the South Caucasus, where the language stayed close to the core area, but rather when it went from the South Caucasus to Anatolia somewhere around 4000-3500 BC. It must have been in the southern parts of Anatolia where the language stayed more isolated from the rest and diverged from the other branches.

The next split had to be the one that led to Tocharian, and for this we’ll have to look a bit closer at the IAMC.

The Inner Asia Mountain Corridor

This corridor at the eastern edge of Central Asia had a native population that was genetically what has been called Ancient North Eurasian (ANE). This genetic profile was also found throughout Siberia in the Paleolithic, and forms part of the ancestry of Native American populations (admixed with East Asian). In its pure form, it survived from the South Urals to the Altai and through this IAMC well into the Holocene. We have a Mesolithic sample from the site of Tutkaul (Tajikistan) dated to around 6200 BC, a time corresponding with the Hissar Culture, which probably started to have contacts with the neighbouring Neolithic regions, eventually leading this population of the IAMC to adopt pastoralism during the 6th mill. (link). We have evidence (though indirect) that this population was moving between Central Asia and China, since they’ve been found to have seeds that originate from both places (see, for example, Frachetti et al. 2014). Some indirect evidence also comes from faunal remains in Inner Mongolia (China), where domestic sheep of the West Asian type have been found and were probably there since the mid 5th mill. (link). In the Altai, the earliest evidence also comes from seeds, dating to the end of the 4th mill. (link), though it’s probable that they were there earlier.

What the evidence suggests is that this population adopted an IE language from their southern neighbours (from Turan) at an early date, probably before 4000 BC. It may have been around 3500-3000 BC when part of this population settled in a more permanent way in Xinjiang, which led to the partial isolation of their language, which would evolve into Tocharian (while those who stayed along the IAMC would have continued to evolve their language in conjunction with that of Turan, becoming speakers of Proto-Indo-Iranian when it became the language of BMAC). From a genetic point of view, we can look at a few samples that would support this idea.

A sample from the site of Dali (Kazakhstan), part of the IAMC, dated to 2700 BC already shows some admixture from both the southern neighbours of Turan and from the steppe population that arrived in the Altai region ca. 3000 BC, Afanasievo, which shows how these people were moving along the IAMC from north to south. However, samples from a later date (c. 2000-1800 BC) from the Tarim Basin in Xinjiang show them to be unadmixed, suggesting a larger degree of isolation from before the date of the Dali sample:

Archaeological and genetic evidence already provide a good basis for the idea that Tocharian must have come from this population (the idea that Tocharian may have come from the Afanasievo people lacks both types of evidence, for example), while it also avoids the linguistic problems that were always found in the alternative Afanasievo hypothesis.

The importance of this population when it comes to the spread of IE languages doesn’t end there, since as we’ve seen before they may have been the first ones responsible for introducing the Indo-Iranian language to the steppe people around the eastern Altai region, where their influence is visible in the Fedorovo burial rites.

Back to Europe

We’ve seen so far a probable way in which IE languages must have reached South East Europe, but now we’ll have a look at how they spread to the rest of Europe. The details are still fuzzy and not too important for the purpose of this post. The Steppe Hypothesis required a more detailed explanation, given that not much else other than the languages would have come from the steppe (obviously, the discovery that the people of most of Europe came from the steppe gave the theory a perfect basis; it’s just that the times and places of their expansions do not match those of the IE languages, and that’s now its main problem). By contrast, when it comes to something going from West Asia to SEE and then spreading to the rest of Europe, there’s nothing controversial about it. Basically, everything came from West Asia to SEE and then spread throughout the continent, whether it was farming, innovations like metal working in its different varieties, writing systems, coinage, civilizations themselves or even Christianity. That’s just the natural way things went in ancient Europe.

We first have to look at the possibilities of how IE languages spread throughout the Balkans. We’ve provided a credible scenario for Greek, but is that scenario valid for the rest of the Balkans? Let’s look at this problematic question (indeed, the most problematic one). The first solution would be that the cultural package from Thrace and Greece was largely responsible for spreading the IE language throughout the Balkans, since outside Bulgaria and Greece (maybe Romania to some extent), there doesn’t seem to be any West Asian ancestry between 2000-1500 BC, which would be the time when we’d need IE languages to have spread throughout the Balkans. In any case, let’s take the chance to have a closer look at the population dynamics that took place in the Balkans during the Bronze Age, which will also show the big difference with Northern and Eastern Europe. We have just enough samples from Bulgaria to show this process:

People from the steppe (Yamnaya Culture) started to move to the Balkans c. 3200 BC and this is what steppe communities from Bulgaria c. 3000-2800 BC looked like:

And here are contemporary local communities from that early period:

As can be seen, there’s a stark difference between them, with the steppe communities having very little admixture from locals, and local communities having very little admixture from the steppe ones (with two outliers at the bottom, from a first and second generation admixture event presumably).

After a few centuries, this is what a steppe community would look like (c. 2400 BC):

And what a local community looked like after a few centuries too (2800-2500 BC):

The steppe community had only one third of its steppe ancestry left, while the local community had some 10% admixture from the steppe. By the time the communities finished admixing, this is what they looked like (samples from the Early Iron Age, c. 1000-500 BC, since we lack samples from the Late Bronze Age, but they should be about the same):

Once the communities from both sides fused, their paternal and maternal lineages should more or less correspond with the amount of admixture contributed by each community. For example, in the samples above, there are 6 males: 5 of them have local paternal lineages and 1 has a steppe paternal lineage.
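
To make the arithmetic behind that example explicit, here is a minimal sketch in Python. The counts (6 males, 1 steppe paternal lineage) come from the paragraph above. The 15% steppe share is only an illustrative assumption for the fused community, not a value measured from these samples, and the function names are hypothetical. The sketch treats each male as an independent draw, which is a simplification.

```python
from math import comb


def observed_share(steppe_males: int, total_males: int) -> float:
    """Share of males carrying a steppe paternal lineage."""
    return steppe_males / total_males


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k steppe lineages among n males,
    if each male independently carries one with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Counts taken from the text: 6 males, 1 with a steppe paternal lineage.
N_MALES = 6
N_STEPPE = 1

# ASSUMPTION: illustrative steppe ancestry share of the fused community.
# It is not a measured value from the samples discussed here.
ASSUMED_STEPPE_ANCESTRY = 0.15

if __name__ == "__main__":
    share = observed_share(N_STEPPE, N_MALES)
    print(f"Observed steppe paternal lineages: {share:.1%}")
    for k in range(N_MALES + 1):
        prob = binomial_pmf(k, N_MALES, ASSUMED_STEPPE_ANCESTRY)
        print(f"P({k} steppe lineages out of {N_MALES}) = {prob:.3f}")
```

Under that assumed share, one steppe lineage out of six is the single most likely outcome, which is all the example in the text claims. With only six males, though, neighbouring outcomes (zero or two) remain quite probable, so the count is consistent with the expectation rather than a proof of it.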

In the Western Balkans it seems like steppe communities represented a higher percentage of the population, since we see from 2000 BC and later some 30% steppe in the mixed communities (with paternal and maternal lineages from the steppe also being at around that level). Here are some Late Bronze Age samples (c. 1200-1100 BC) from Montenegro:

Clearly more steppe admixture and no West Asian admixture.

To reiterate what was said in the first part of this post, the sort of evidence shown here is what we lack from Northern and Western Europe. Not because we lack samples (we actually have a lot more) but because at the time the steppe communities started to arrive, the Neolithic ones were mostly gone, and where they still lived it was just long enough for the steppe communities to take females from them before they died out. So not only do we lack direct evidence of any of those few communities that survived until the arrival of the steppe people, we also can’t show what a fused community of steppe and local people would look like after several centuries, because such a thing never happened. There were no mixed communities. The only ones that existed were the ones from the steppe, with 100% of the paternal lineages being from the steppe.

Back to the problem of the Balkans. We need the Indo-European languages to be all over the Balkans by 1500 BC or shortly after, but we don’t have any clear evidence of how this may have happened. Genetics doesn’t give us any solution, so only archaeology can help here. The spread of IE languages had to have been mostly a cultural transmission, but I will leave this for people with more expertise in the archaeology of the Balkans and meanwhile offer a possible alternative to this cultural transmission.

We could speculate that the Yamnaya people had already shifted to an IE language from the Caucasus in the period from 3500-3300 BC (i.e., after the CWH people had separated, and probably Afanasievo too). A language transfer across the Caucasus is much more plausible during this Maykop-Novosvobodnaya phase than anything related to the preceding Darkveti-Meshoko period. And the language transfer would go the natural way, from the more settled, higher culture society to the more mobile, pastoralist one. In this scenario, Yamnaya would have spread the ancestor of Italo-Celtic, Germanic and pre-Balto-Slavic to the Balkans before being replaced in the steppe by the Srubnaya Culture (c. 1800 BC), which would have brought a non-IE language again until the arrival of the Scythians. However, while having evidence of actual people from the steppe moving around the Balkans seems better than having no evidence of West Asian admixed populations, it’s still true that they were the minority, lived separately from the locals for quite a while and didn’t have a superior culture that would attract the locals to it. Rather the contrary. So it’s up to each reader to decide if this scenario really improves things over the first one, where cultural transmission would be the basic reason for the adoption of IE languages. Lastly, this alternative scenario would be incompatible with Hurrians being from the steppe, so if the latter is confirmed it would invalidate this possibility.

From the Balkans to the rest of Europe, ancient DNA won’t be of much help because people within Europe were already very similar to each other, and it’s difficult to detect movements of people from genetics unless we have a very high resolution. From what we know about Celtic or Italic, we shouldn’t expect a large number of people to have migrated with the languages as they expanded (so the genetic impact would be very small). Following the spread of innovations like iron or war chariots pulled by domestic horses may be a better way to track the spread of IE languages throughout the rest of Europe (war chariots may have already played a role in their spread through the Balkans, at least in Greece). I will be very brief about this, since the details are beyond the scope of this post.

For the Balto-Slavic languages we have some constraints that allow us to know the approximate place and time where they formed, since their formation was strongly influenced by Indo-Iranian. We know that Indo-Iranian started to be adopted on the steppe at its eastern edge around the Altai region of South Siberia shortly after 2000 BC. These early adopters could be considered Proto-Scythians, and the genesis of the Scythian culture throughout Central Asia would continue till around 1200 BC. At that point, they started to move to the west through the steppe, replacing the preceding Srubnaya Culture (which like the Andronovo Culture was a descendant of the Sintashta Culture, but didn’t go through the language shift that happened in the Andronovo one and therefore would have still spoken its original NEBA language). The Scythians may have arrived at the western edge of the steppe around 1200-1000 BC, which would be the earliest date for starting contacts with IE speakers from the adjacent area in Europe.

The population that would become Proto-Balto-Slavic must have already spoken an IE language by then, but in an older centum form. This would mean that IE had already spread by then to the north of the Balkans, as was said earlier to be required. A good candidate, given its time and location, for being the culture where Proto-Balto-Slavic formed would be the Chernoles Culture, which started at the end of the 2nd mill. and continued until 500-200 BC. If these were Herodotus’ Scythian ploughmen, as speculated (no reference there as to by whom or why), it would align very well with this possibility, since we should be looking for a population native to Europe made up of sedentary farmers, not nomads, but who shared several cultural traits with the Scythians, which would easily explain the influence on their language too.

Notice that dating Proto-Balto-Slavic to around 1000-500 BC and to that approximate area is necessary due to the clear Indo-Iranian influence, which cannot be explained in any other way. After that formation period, we’d have the Baltic branch separating around the latter stage and expanding to the north. The details of this are something I’ve not tried to figure out and they are not relevant for the purpose of this post. The important thing is that the case of Balto-Slavic formation, which can be located and dated with significant accuracy, should serve as an example of how IE languages formed and spread to the rest of Europe from the Balkans. Baltic languages (considered to be a very old form of IE) dating to around 500 BC, when they started to expand to the north, should also help to put into perspective the age of IE languages in Europe.

The details of other language branches should be somewhat analogous. The Italic and Celtic proto-languages (whether one prefers to consider Italo-Celtic a proto-language or just some areal features that defined an Italo-Celtic sprachbund area) would have formed in the north-western parts of the Balkans and adjacent areas, with Italic then separating and moving into the Italian peninsula while Celtic expanded to the west from around the eastern part of the Alps, mostly as always proposed.

Germanic is the least clear one, but it should have been a similar process. If we consider that the Chernoles Culture was roughly preceded in the area by the eastern part of the Trzciniec Culture, and that preceding Trzciniec culture had already become IE (somewhere around 1800-1500 BC), then the western part of it would have become IE too and would already be in the right place and time to be ancestral to Proto-Germanic. I’m not specifically proposing that scenario, it’s just for the sake of giving an example.

UPDATE: Checking the available samples, I noticed we have a good sequence from Czechia from the Bronze Age to the Iron Age. Looking at them, I see a significant change between 1600 BC and 1500 BC approx. coinciding with the end of the Únětice Culture and the beginning of the Tumulus Culture:

The samples from the Tumulus culture are dated to 1500-1250 BC, without C14 dating and there are only 4 of them. But the change is persistent through the Late Bronze Age (samples from Knoviz, c. 1100 BC, not shown) and into the Early Iron Age samples from Hallstatt period below:

The Bulgaria EBA samples are a relatively distant source, so the significant 27.5% impact is underestimated. With a more proximate source (like samples from Mokrin, on the Serbian border with Hungary, dating to c. 2000 BC) the impact is around 40%. Quite a big change in a short period between two consecutive cultures.

I leave here some references to some interesting papers related to the formation of the Nordic Bronze Age (c. 1600 BC) and its connections to the Carpathian Basin and ultimately the Aegean world (thanks to Jaydeepsihn Rathod for pointing this out in the comments):

  •  Issues with the steppe hypothesis: An archaeological perspective Iconography, mythology and language in Neolithic and Early Bronze Age southern Scandinavia. Rune Iversen, 2024.
  • “It is therefore not surprising that Europe and the Aegean during the 15th–14th centuries bc shared the use of similar efficient warrior swords of the flange-hilted type, as well as select elements of shared lifestyle, such as campstools. Linked to this are also tools for body care, such as razors and tweezers. This whole Mycenaean package, including spiral decoration, was most directly adopted in South Scandinavia after 1500 bc, creating a specific and selective Nordic variety of Mycenaean high culture that was not adopted in the intermediate region (Kaul 2013). This could hardly have come about without intense communication and practice by travelling warriors or mercenaries. Swords come in different types and have different fighting styles (Kristiansen 2002; Molloy 2010). Therefore they are not easily adapted: they are part of a system of warfare and skills that demand long-term training. Furthermore they demand changes in social organisation in order to sustain the new role of warriors. It therefore seems likely that warriors were at the same time also traders, or they accompanied traders to protect them. We may therefore accept that the shared use of sword types among Scandinavia, Central Europe, and the Aegean during this period would also lead to similarities in the social institutions linked to warriors. This seems indeed to be the case: the dual organisation of leadership between a Wanax and a Lawagetas in the Mycenaean realm is replicated in the Nordic realm, which also copied Mycenaean material culture closely (Kristiansen & Larsson 2005, chaps 5.4 & 6.5).” Kristiansen K, Suchowska-Ducke P. Connected Histories: the Dynamics of Bronze Age Interaction and Trade 1500–1100 bc. Proceedings of the Prehistoric Society. 2015;81:361-392. doi:10.1017/ppr.2015.17
  • “A suite of linked histories across Europe transpires, when attaching importance to the fact that the time period in which it all began, c. 1600 BC, was a turning point on a European scale. The precise timing may be debated, but it is here suggested that the link of change could ultimately have emanated from the early post-eruption Aegean with embryonic Mycenaean hegemonies. […] While commencing c. 1600 BC, NBA IB, in a manner of speaking, did not come into full fruition until c. 1500/1465 BC in NBA II, which is therefore justifiable as the first true highlight of the southern Scandinavian Bronze Age (Kristiansen, 1998; Kristiansen & Larsson, 2005). In NBA II, however, the Carpathian connection is no longer culturally visible but rather completely absorbed in the now uniform Nordic koiné. Instead, clearer glimpses of Mycenaean cultural impact occur in Scandinavia (Kristiansen & Larsson, 2005). This is now sustained by the testimony of lead isotope analyses (Ling et al., 2014). The Aegean seems from 1500 BC directly included in the Nordic sphere of interaction.” Vandkilde H. Breakthrough of the Nordic Bronze Age: Transcultural Warriorhood and a Carpathian Crossroad in the Sixteenth Century BC. European Journal of Archaeology. 2014;17(4):602-633. doi:10.1179/1461957114Y.0000000064

What’s missing and perspectives

There are many details missing in this brief overview, but I want to point out the ones that are technically missing in order to confirm (or deny) the basics of what I have explained here:

  • Getting samples from North India dating to the early Vedic culture (2000-1500 BC) to confirm (or deny) that they were local people. I’d give this a probability of > 95%.
  • Getting samples from North India dating to the period of 5000-4000 BC to see if there was a big change in the population at that time which could correspond with the arrival of IE speakers to the subcontinent. The probability of this I’ll leave it as “unknown”.
  • Getting samples from early Hurrians (2300-1800 BC) to know if they came from the steppe. This one is not too important for IE questions (except for that alternative possibility of Yamnaya-Catacomb cultures being IE), but if Hurrian could be confirmed to be a steppe (NEBA) language, it would be the key to investigating the whole language family. I’d give this a 60-70% chance of being correct.

There are many other samples that we are missing and would help for better knowing all the details, but I’ve listed the most important ones for the purposes of this post. Let’s hope that we don’t have to wait too long to get the answers.

Now I’d like to summarise the languages that may have come from the steppe (those I’ve been referring to here as NEBA languages). If the existence of this language family can be confirmed, it would become a very interesting and important subject for the study of European linguistic (pre)history. The fact that linguists can now know with certainty that all of northern and western Europe was repopulated by newcomers from the steppe between 3000-2300 BC, and therefore that all of that area can have one and only one substrate, common to the whole area, is an amazing step towards finally being able to study the substrates of Europe in a scientific and coherent way. Here is a list of the more likely candidates to be part of this proposed NEBA language family:

  • Basque/Aquitanian (> 95% probability).
  • Iberian (> 95% probability).
  • Tartessian (if not a Celtic language, > 95% probability. If Celtic, then it’s Celtic. I could mention Pictish in this same category, though Pictish has a much higher chance of being a Celtic language).
  • Etruscan (> 60% probability. Further aDNA samples won’t tell us more than what we already know, so it’s essentially a linguistic issue).
  • Hurrian (60-70% as it stands now, but getting the right samples from ancient DNA would confirm it or deny it with almost 100% certainty either way). Urartian would be linked to the outcome of Hurrian.
  • North East Caucasian (~50% chance. It depends a lot on the outcome of Hurrian).
  • North West Caucasian (Unknown probability. It’s strictly a linguistic issue, largely about it being related to NE Caucasian or not).
  • Uralic languages (~50% chance. See Appendix I for some insights into the matter).
  • Paleo-Sardinian (poorly attested, it’s again a linguistic issue where aDNA has already told us it’s at least possible. Probability around 30%?).

Finally, I’d like to stress that this post is in no way intended to be particularly complete (that would require to write a book with a lot of research) nor a definitive solution to the problems it tries to address. As the title says, this is an alternative view (interpretation) of the evidence we have so far, and my hope is that it can serve as a framework for linguists interested in IE languages or Old European languages to better understand the data that is available and decide to what extent they agree with one view or another, as well as a way to assess future ancient DNA studies against some of the ideas and predictions contained here to see how they fit with either view.

142 thoughts on “Origins and spread of Indo-European languages: an alternative view”

  1. -Basques and Iberians 

    I think more people are coming to terms with the idea that BB was not IE. However, the scenario they came up with is the following: a small CWC clan migrates to the Netherlands and picks up a non-IE language from there then explodes all over western Europe. 

    “Indo-European populations started to enter SEE Europe during the period from 2400-2000 BC.”

    In my opinion the only way I see this working is if the Alaca Höyük royal graves were actually from a Maykop related group and were replaced by local non-IEs as was previously believed. However, there’s no consensus amongst archaeologists on that anymore.
    If true then there were 2 separate migrations to Anatolia one southern bringing Proto-Anatolian through Adaniya (Danu?)> Proto-Hittite goes to Cappadocia (Kussara and Kanesh)/ Luwic splits in Konya. I followed a Yakubovich route for Lydian but after reading some recent stuff on Phrygian I’m not sure. The other later migration would cross the Hattian and Kaskian territory and end up being defeated and migrating across northwestern Anatolia through Thrace. There’s so much supporting the first one and not so much the second at the moment. Hopefully, we’ll get some new samples soon.
    As for Greek, Orpheus has put out this idea that the Graeco-Phrygian homeland is NW Anatolia and the supposed Phrygian migration is a Greek historian myth. Funny enough, according to the same legends the house of Atreus itself descends from an Anatolian.

    Also, there’s something pretty interesting: 
    “It is quite probable that the Greek-speaking communities were also present in the Late Bronze Age Troy (Wilusa). However, there are strong doubts that Troy, as well as the whole north-western part of Anatolia (the Troad and Mysia), can be properly defined as ‘Anatolian’ in an ethnolinguistic sense (i.e. speaking one of the languages belonging to the Anatolian branch of the Indo-European languages)”
    https://www.academia.edu/37372307/Anatolian_linguistic_influences_in_Early_Greek_1500_800_BC_Critical_observations_against_sociolinguistic_and_areal_background
    Have you seen Moldova – Zhyvotylivka (I17973) –  J2b2b2~ (J-Z42942) BTW? 

    -Hurrian
    This one is tough. We do have samples from the east Hurrian confederacy of Turrukum and Itabalhum (Urmia basin) and they do show Yamnaya ancestry. The issue is that as with all confederacies there’s no proof they spoke a single language and that language is Yamnaya derived. Though, Hurrians being the “horse people” of the Near East is pretty much confirmed.
    BTW, the Megiddo outliers are frequently thrown around as proof of Mitanni warriors from Andronovo even though it’s pretty obvious they harbor Yamnaya related ancestry.

    “They have been usually considered an Indo-Aryan population (rather than Iranian), but that’s just because at the time they started to move to the west (likely around 1900 BC or slightly later), Proto-Indo-Iranian (PII) was just starting to break up and all the dialects from that time are similar to Sanskrit”

    Well, you know my take on this. I don’t think it’s likely that I-I splits and suddenly in a century or two these Indo-Iranian groups end up with pretty diverse religious beliefs. With *Dyeus based gods for the Iranians and him retiring in the Vedas with Indra(+Mitra/Varuna) taking over. One thing is certain though, the Mitanni elites brought Vedic deities and a BMAC cultural package. Luckily for me, no one solved that yet.

  2. @Vara

    Thanks for chiming in! Alaca Hoyuk was part of the Anatolian Trade Network, and the Royal Tombs are from around the early period when IE would have been moving west from the Caucasus to Europe. But whether they belong to those IE speakers is hard to tell.

    But yes, there would have been two movements of IE speakers from the Caucasus to Anatolia, but the second one would have been more of a movement through Anatolia rather than to Anatolia (except for the NW edge of it, which I would agree was not part of the Anatolian languages territory).

    That sample you mention from Moldova carrying a Caucasus lineage I guess is about the possibility of Yamnaya being IE? It could be, as I suggest in the post, but largely wouldn’t have been mediated by genes (nor some genes would prove the point made). I think the cultural interactions in the Maykop period (with the Mikhaylovka culture and related ones) would be enough to justify a language shift. The Yamnaya people in the Balkans were, however, a minority (some 15% on the eastern parts, 30% on the west) and it’s hard to say that they were the more advanced culture (or that they were the ruling class) that would have made the locals shift to their language. So while the whole thing is possible, I wouldn’t argue too strongly for it unless it becomes something necessary.

    For Hurrians let’s see. We need samples from the most relevant areas, but those have been at war for many years, so who knows when we will be able to get samples. I suppose that many of the people from the steppe that moved into the Near East (especially in the highlands) integrated with other groups and didn’t keep their own ethnic identity. So I don’t expect that having steppe ancestry means being Hurrian, but I do expect (to some extent) that Hurrians did come from the steppe, which would mean that they should largely have the Yamnaya paternal lineages during the first few centuries. This would be easy to verify or falsify, but the samples may not come anytime soon.

    The question about Iranian and Indo-Aryan is something we’ll hopefully be able to discuss when the time comes. I don’t have any strong opinion about it and my comments are very generic.

  3. -PIE

    What does the Kurgan Hypothesis have going for it now as in what are the supposed “IE ethnic indicators” now? The horse nonsense is disproven now even though it was obvious that the Divine Twins were not a part of the core IE belief as I used to argue with Davidski. Nomads imposing their language everywhere they went is so unlikely. A quick glance at the later Central Asian nomads and “Huns” and how often they changed languages and identities should be enough.

    The way I’ve always seen it, actual IE societies were run by warrior elites and priests with highly advanced metallurgy. The last part is why I don’t agree with Heggarty’s model. If PIE was actually a thing it should’ve been spoken in the Copper Age (5200-4700 BCE at the earliest) and not in the Neolithic. PIEs being from the northern parts of Iran, where they controlled the Greater Khorasan road, is more likely than the Zagros. I’m not sure if Turan was IE before 3500-3000 BCE, since it’s most likely a dead end as it was conquered by a group from the collapsing Iranian network that later formed the Geoksyur horizon and probably went on destroying some other stuff in the south.

    I also think the Caucasus is not likely to be the PIE homeland. Archaeology describes 3 different traditions/networks intersecting in the Caucasus: the native one, i.e. the primitive one, the Mesopotamian one, and the Iranian highlander one with the advanced metallurgy. The Maykop/Novosvobodnaya elites had clear links to the Greater Khorasan Road. For whatever reason, both Maykop and this Iranian network collapsed around the same time. Also, even though KA samples look pretty close to the Maykop ones, KAC is an entirely unrelated phenomenon and pretty much started out as one of the native south Caucasian traditions.

    The Zhyvotylivka sample shows that there were straight up Caucasians moving to the western parts of the steppe. But yeah, according to D. Anthony, Yamnaya = Mikhailovka + Novosvobodnaya anyways. Novosvobodnaya was light years more advanced than anything on the steppe, and suddenly Yamnaya also inherited metallurgical traditions, which are usually monopolized and can easily go extinct. Yamnaya is pretty much the first steppe culture with advanced metallurgy and special smith burials. I find it unlikely that they picked up all these traditions, from what might have been the most advanced culture in the region if not the world, without language change. As of now the only issue with Yamnaya is the lack of proper settlements and agriculture. Whether or not it Indo-Europeanized the Balkans is a different story though. CWC being LPIE is the latest goalpost-shifting nonsense.

    “that would require to write a book with a lot of research”

    Sooner or later someone has to reconstruct PIE culture again. Too bad I don’t have the liberal arts degree for it.

  4. @Vara

    Yes, North Iran has to be the origin of IE, and PIE should not be dated to before 5000 BC. The question is whether Turan could have spoken a different language at that time. If the Neolithic came from North Iran, then it’s likely that they spoke the same language regardless of later movements of people around the area. There’s the constraint of Tocharian too, which should not have split after 3000 BC (probably closer to 3500 BC), before I-Ir took over. And then there’s India. We lack data from the area, so for now the best clue we have is that anthropology proposes a West Eurasian migration around 4500 BC. IE could have arrived later, but without more data it’s hard to find a better opportunity. That’s why I include Turan in the PIE homeland at that time.

    For the Caucasus I favour an arrival by the end of the 5th mill., and that would be the first real split that led to the Anatolian languages.

    As for Yamnaya, I’m undecided. If we could solve the Hurrian problem it may push me one way or another. But the cultural links are there in any case.

    The post is intentionally generic, because getting into too many details would have been difficult both for me and for the target audience. So there’s ample space for debating all of those details.

  5. I’m not sure if Tocharian should be a constraint to any theory, seeing as there are many alternative theories: Hamp’s NWIE, an extremely divergent Iranian, and even a hoax.

    I think there’s a conquest scenario in India much like Turan. The IVC starts with the massive destruction and abandonment of Kot Diji. I’d say this could be the most destructive transition in the history of India. So even if the Vedic Aryans of 1500 BCE look exactly like the earlier Chalcolithic population of India, such destruction should’ve initiated a language shift.

    I agree with the Caucasus. I think 4200ish BCE is when Indo-Europeans showed up.

    Yeah, I think there are many potential posts left.

  6. Greetings Alberto, it’s a pleasure to reconnect with you; I hope all is well. Since I see you have my email address, you can get in touch whenever you like. I haven’t had enough time to read your post yet, but I will over the next few days. Did I understand correctly that you agree the Bell Beaker culture did NOT speak IE?

    For me the genetic data are clear and R1b-P312 did not speak IE.
    Have you seen the results of the latest paper Kroonen collaborated on? U106 at Motilla del Azuer and U152 at the El Argar site, in addition to the usual sea of DF27.

    By the way, there is a very good linguistics blog called Trifinium, run by Joseba Abaitua; you may be interested in taking part in it. I have commented there on genetic topics, since I’m not a linguist, but it is very interesting, especially the latest threads on the Hand of Irulegui and the evident relationship between Basque and Iberian. They are very good linguists and act impartially (free of nationalist whims).

    Anyway, my next comment, if you keep the blog going, will be in English; I didn’t do so this time because I’m tired of doing it on Eurogenes.

    Regards

  7. Hi Gaska, thanks for stopping by and leaving a comment. It’s good to see you again and to know you’re still active.

    Yes, I still maintain that the Bell Beaker culture cannot be IE. There is no evidence suggesting that, whereas there is clear evidence to the contrary. As I recall, what we disagreed on was its origin: I see it as an extension of the Corded Ware culture and, if I remember correctly, you saw it as something native to Western Europe (though that was many years ago, of course, and I don’t know what you think now).

    I’ll look for that blog, which I’m sure is interesting, although the truth is that I published this last article to give the blog a more fitting close, summarising my view of the problem. So I don’t think I’ll keep publishing, but I try to find some time now and then to see what gets published that might be interesting.

    I’m not sure I’ve seen the study you mention that Kroonen collaborated on, so I’ll look for it too and add a note here if I think it may be relevant.

    Regards!

  8. @Vara

    I wouldn’t necessarily associate destruction with an invasion or a language shift, unless we either know who was who from genetics or we can deduce it from archaeology by seeing a new culture that can be linked to a different area. Collapse is a recurrent theme in societies, and more so in prehistoric ones, and it’s as often an internal process as it is due to external interference.

    Burned settlements by the end of the Early Harappan period followed (after a “dark” or “recovery” period) by Mature Harappan Culture don’t seem to imply any change in the language by themselves. I would need to dive much deeper into the specific case to have an opinion about it, but again that would be going into details that would detract from the main points the post tries to make.

    Re: Tocharian, it’s still kind of controversial in some ways, but I don’t think that anyone still holds that it may be a hoax. And as long as we have a language there with a plausible explanation, I think it’s an important point to take into account.

  9. There isn’t much on the collapse of Kot Diji, and frankly there are some really asinine takes and interpretations out there when it comes to the IVC in general, unfortunately all thanks to the PIE debate.
    True, destruction doesn’t always imply a conquest or language change. However, looking at the neighbors of the IVC around the Helmand, we also find destruction and clear signs of Geoksyur influence around 2800 BCE. Knowing the relationship between Mundigak/SiS and the IVC, I’d say it’s pretty likely a related group was responsible, but that’s just my interpretation since I’m mostly arguing about details.
    I’m really more interested in samples from 2200 – 1700 BCE, which should be the most important for Indo-Aryan.

  10. Well, it’s a pity you’re leaving it; with you we at least have a Spaniard who can model ancient samples independently. In any case, we’ll stay in touch.

    Indeed, I believe the Bell Beaker culture originated on the Portuguese Atlantic coast, not only because of the early dates provided by Cardoso and other Portuguese archaeologists, but also because of its singularity compared with other variants of this culture in Europe. Besides, look at these data on Iberian Bell Beaker lineages moving north along the Tagus; for me they are further proof that the Bell Beaker culture did NOT speak IE. Nobody takes them into account, and their origin is certainly not the CWC.

    I6601 (2700 BC)-Bolores, Torres Vedras, Iberia-I2a1a/1a1-L158>Y3992
    I11592 (2700 BC)-Hipogeo de Bolores, Torres Vedras-I2a1a/2-Y3104>L161
    I0826 (2656 BC)-Cerdañola del Vallés-I2a1b/1-L460>M436>M223>Y3259
    I1970 (2500 BC)-Cueva Verdelha, Lisboa-I2a1b/1b-Y6098>S23680>PF692
    I1976 (2459 BC)-Dolmen del Sotillo-I2a1b/2a-S2555>S2524>L38
    NEO609 (2381 BC)-Hipogeo de Sao Paulo2, Almada, Lisboa-I2a1a-CTS595
    I6587 (2350 BC)-Humanejos, Bell Beaker-I2a1b/1-M223>Y3259
    I4229 (2335 BC)-Cueva da Moura, Torres Vedras-I2a1a/1a1a/1-Y3992>L160
    I0460 (2335 BC)-Dolmen del Arroyal-I2a1b/1b-L460>M223>Y3259>PF692
    I0458 (2332 BC)-Dolmen del Arroyal-I2a1b/1b-L460>M223>Y3259>PF692
    I2467 (2315 BC)-Dolmen del Sotillo, Bell Beaker-I2a1b/1-M223>Y3259
    CDM264 (2250 BC)-Cueva da Moura, Iberia-I2a1a/2-Y3104>L161
    I6543 (2212 BC)-Camino de las Yeseras 13a, Area10-I2a1a/2-P37>M423

    R1b-P312 brought or created the Ciempozuelos style in the peninsula, and of course there may have been reflux movements, as Sangmeister always argued. The only way it could have changed its native language (theoretically IE, for the Kurganists) is if very few individuals entered and immediately mixed with Iberian women. This would also have had to happen in Turdetania (Tartessian), Etruria, Raetia, Aquitania and Occitania, where non-IE languages were spoken when the Romans arrived, which seems very unlikely to me.

    After that, absolute genetic continuity in the uniparental markers (with some exceptions thanks to exogamy) until the Iron Age. We have 105 good-quality male genomes from all the Iberian Bronze Age cultures: 99 are R1b-M269 (94.3%), 4 are I2a1a-P37 (3.8%) and 2 are G2a2b-P303 (1.9%), so Iberia is the key to understanding the linguistic question.
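
    As a quick check on the shares quoted above, the percentages can be recomputed from the raw counts. This is only a sketch: the counts are simply those given in the comment, and the names `counts` and `shares` are illustrative, not taken from any study.

```python
# Counts as quoted in the comment above (male Bronze Age Iberian genomes).
counts = {"R1b-M269": 99, "I2a1a-P37": 4, "G2a2b-P303": 2}

def shares(counts):
    """Return each lineage's share of the total as a percentage (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

total = sum(counts.values())
result = shares(counts)
for name, pct in result.items():
    print(f"{name}: {counts[name]}/{total} = {pct}%")
```

    The three counts do add up to the 105 genomes mentioned, so the breakdown is internally consistent.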

    As for Yamnaya, I have reached the same conclusion as you, namely that the Maykop culture must have changed the language spoken on the steppes. I think so because all the male lineages of the Yamnaya culture, i.e. R1b-Z2103, R1b-V1636, R1b-PF7562, R1b-Y13200 & I2a-L699, ultimately originate in the WHG or, if you prefer, in the Balkan and Baltic hunter-gatherers. That non-IE language of the WHG is the one their R1b-M269>P312 descendants kept speaking in central and western Europe until the arrival of the Romans.

    In my opinion, the Yamnaya culture and its descendants Indo-Europeanized the Balkans, but I think the Mycenaean language has an Anatolian origin (75% of Mycenaean male markers between 1400 and 1200 BC are local or of Anatolian origin).

    As for Italy, I think the Italic languages entered the Italian peninsula from the Balkans at the beginning of the Iron Age thanks to Z2103, J2b-L283 and R1b-Z118.

    Celtic is a matter of central Europe and even the northern Balkans, and it only expanded thanks to the Urnfield culture at the end of the Bronze Age.

    Ten years on, and after analysing thousands of genomes, Harvard still hasn’t found R1b-L151 on the steppes, and until it does, linking this marker with the IE languages is simply a chimera.

    Regards

  11. Very interesting reading, Alberto. You make some very compelling arguments.

    Have you looked at https://pmc.ncbi.nlm.nih.gov/articles/PMC8059681/, which tested for deeper historical relatedness between various language families?

    The abstract contains “Controversial clusters such as e.g. Altaic and Uralo-Altaic are significantly supported by our test, while other possible macro-groupings, e.g. Indo-Uralic or Basque-(Northeast) Caucasian, prove to be indistinguishable from a randomly generated distribution of language distances.” Their test does have a special focus on Northeastern Caucasian languages: “Other groups more recently and occasionally suggested in the literature [81–88] also test negatively”, listing among these “Basque/NE Caucasian, d = 0.544, p = 0.687”.

    Does their Basque-NE Caucasian conclusion have decisive bearing on Hurrian’s potential relatedness to NEBA?
    And, after reading your article, would it be meaningful to account for the “Vasconising” effect on PIE reconstruction that you spoke of, shift all the Vasconic elements into a NEBA group (one that includes Basque, Iberian languages and any others implicated), and then test that larger group against NE Caucasian? The reason I suggest this is that better constructed trees gave more accurate results, as seen in section 4 Results (c) paragraph “Three out of the five groups…” versus (f) starting “At first glance, this result appears”.

    On another note, Figure 1 of the same study is interesting in that it isn’t apparent that there is a peculiarly close relationship between Indo-Iranian and Balto-Slavic within Indo-European.

  12. “When it comes to India, unfortunately the ancient DNA record is almost completely missing. Very few samples (to my knowledge) have been analysed so far and none of them published. But the DNA we have from the surrounding areas already tells us with high confidence how the early Vedic people should look like: Basically just like their predecessors from the Indus Valley Civilization. We don’t have direct samples from the latter either (except one of very low quality that was published years ago), but we have outliers from the surroundings that clearly had an Indian origin (known as Indus Periphery samples). The ones from the Indus Valley itself should look similar but with a significantly higher proportion of the specific Indian signature, usually referred to as Ancient South Indian (ASI or AASI). And indeed, the unpublished samples from the core Vedic area dating to the mid 2nd mill. (late Rigvedic period) are, as far as I know, exactly like that. But we still have to wait for samples to be published in order to be certain about it.”

    There is one recently published sample which conforms to your expectation for the region. The recent https://www.cell.com/current-biology/fulltext/S0960-9822(24)00581-5 includes a Western Tibetan sample (SDLG_o) dated to only 1900 years ago, which is an outlier for being the earliest in which South Asian ancestry was detected among Western Tibetans. The outlier sample was successfully modelled as a two-way admixture of local ancestry and Indus Periphery admixture of the Shahr_I_Sokhta_BA2 variety:

    “We observed several two-way admixture models for SDLG_o including Shahr_I_Sokhata_BA2 and another Tibetan Plateau population. However, Shahr_I_Sokhata_BA2 harbored elevated proportions of AHG-related ancestry and the remainder from a distinctive mixture of Iranian farmer- and WSHG-related ancestry. To exclude more complex admixture models, we used the D(SDLG_o, Model; X, Mbuti) to evaluate whether SDLG_o carried additional Central Asia, South Asia or Steppe ancestries than the combinations of two sources. … These patterns supported that adding one of these ancestor populations of Shar_I_Sokhta_BA2 as the third source was not necessary for SDLG_o’s modelling.”

  13. @Gaska

    Yes, we agree that it is impossible that the peoples carrying R1b-L151 spoke Indo-European. I extend that to R1a-M420 as well, since, among other reasons, the areas where Germanic or Balto-Slavic are spoken today also require a non-Indo-European substrate.

    On other things I suppose we differ, but that is always healthy for debate. In any case, I would suggest that future comments be in English so that the rest of the readers don’t have to use a translator.

  14. @ak2014b

    Good to see you around and still keeping up to date with the studies!

    I’ve been reading the paper about language families, but it’s difficult to know what to make of these types of analyses. Sometimes they agree with reality and sometimes they show really odd things. In any case, both Basque and NE Caucasian are long isolated and drifted modern languages, which makes them very difficult to analyse. That’s why I mention the importance that Hurrian would have if it turns out to be a language from the steppe, since it would be the closest (by far) that we would have in order to compare all the other possible ones. But we’ll have to wait for good samples to really know.

    Thanks for that other study about Western Tibet. It’s another small piece of evidence that North India had genetic continuity since the IVC. There really is nothing going for the steppe hypothesis when it comes to India and that debate should have been closed years ago. I don’t know why they still couldn’t publish a few relevant samples and put it to rest.

  15. @ak2014b

    I forgot about Indo-Iranian and Balto-Slavic. If anything, that lack of significant closeness between both could relate to the fact that they don’t share a common origin, though the results are difficult to trust blindly. The similarities are obvious to any casual observer, so the influence is very clear and necessary. And it’s very helpful for us to be able to place Proto-Balto-Slavic in a specific area at a specific time.

  16. “Sometimes they agree with reality and sometimes they show really odd things.”

    Fair point. Sometimes, computational analyses do get superseded by newer studies that have very different conclusions. It’s hard to know which such studies’ results will hold consistently.

    “If anything, that lack of significant closeness between both could relate to the fact that they don’t share a common origin, though the results are difficult to trust blindly.”

    A disconnect between Balto-Slavic and Indo-Iranian was also apparent from last year’s paper by Heggarty and others https://www.academia.edu/105010777/Language_trees_with_sampled_ancestors_support_a_hybrid_model_for_the_origin_of_Indo_European_languages
    The authors there mention a download link is available from their page at https://iecor.clld.org/. Then for instance refer to Table 1 in their paper, which also interestingly groups Greco-Armenian together.

  17. @ak2014b

    I did know that paper by Heggarty and others. It’s a great effort. But I still see important shortcomings in using those methods. In general, it would be like generating an ancestry tree without being able to take admixture into account. It would work relatively well for Paleolithic Europe, but it would start to give very strange results after the Neolithic. I wrote about this in a short post several years ago:

    https://adnaera.com/2018/10/01/ancient-dna-and-linguistics-an-introduction/

    When I was reading papers about Armenian back then I remember different authors coming to the conclusion that it must have evolved in contact with Greek and Indo-Iranian. I guess that this would favour the scenario in which Yamnaya was Indo-European(ised) and took the other European branches to the Balkans. I’m open to both scenarios that I suggested: whatever turns out to have better support by the data.

  18. Long post, Alberto, with lots of good points, and lots to comment about.

    Let’s start with the strong point:
    You provide a good summary of where and why the simple concept of “language shift/ dispersal by genetic takeover” doesn’t work for IE and the “steppe hypothesis”: Indo-Iranian, Tocharian, Greek, also the Balkans. The obvious conclusion is that we must look for other mechanisms, at least in parts of the IE-speaking area. You propose a “lingua franca” concept, with obvious difficulties still here and there (actually across most of Europe).

    Let me first add that a more radical version of the “lingua franca” concept would be that of a colonial language. Latin America is a good example: There, you find comparatively low “European” ancestry, not even speaking of the “Steppe” element in it. Still, it is now predominantly IE-speaking. I don’t need to go into detail – and actually, you seem better positioned to figure out the respective mechanisms, if you haven’t already done so.

    I have elaborated on another issue here: https://adnaera.com/2018/10/18/is-male-driven-genetic-replacement-always-meaning-language-shift/ . Resilience to possible male-effected language shift appears to be particularly strong in matriarchal cultures, which in turn appear to be frequent where there is a strong seafaring tradition. Of course, seafaring males may not return (and often don’t even intend to, cf. settlement of the South Pacific, also to some extent Vikings), so there are good arguments for keeping the (on land) property in the female line, and also for being open to interacting with foreign seafarers passing by.
    I understand that there is discussion whether Basque society was originally matriarchal. Without being able to comment on that discussion, I may say that the Basque seafaring (whaling) tradition, apparently going back to at least Roman times, should have been supportive of a matriarchal system. Also, El Argar seems to have had at least strong matriarchal tendencies, as demonstrated in their burials. Which means that, when questioning any automatism between (male) aDNA shift and language shift, there is also good reason to question the impact that “Steppe”/ yDNA R1b introgression may have had on coastal Iberian communities. Incoming foreign merchants may have left their aDNA, and also their merchandise including amber etc., but that may have been it – the mother tongue could still have prevailed, unaffected.

    t.b.c

  19. @ ak2014b: Your linguistic link (excellent read) brings me to a couple of other points, that are partly axiomatic in the sense that they guide my approach to the whole language family / IE (and also Hurrian) issue:

    1. Language families are defined across (at least) 3 dimensions, namely (i) lexicon/ vocabulary, (ii) morphology/ grammar, and (iii) phonology. The traditional focus of historical linguists has been the lexicon, via Swadesh lists etc. Your link addresses the morphology, with findings that are sometimes at odds with traditional, lexicon-based phylogenies. There may also be attempts to look more closely at phonology, however, I am not aware of any recent study in this respect.

    All three dimensions may change over time: Vocabulary may be borrowed from neighbouring families (“Sprachbund” etc.), or travel as “Wanderwort” around the globe (c.f. Lat “canis” vs. Nahuatl “quintl”, both meaning “dog”). “Sprachbund” may also affect morphology, as is well documented a/o in India, where the grammar of some IE and Dravidian languages has been converging, and a more recent example would be “Spanglish”. Most resilient appears to be the phonology, especially when it comes to acquiring “foreign” sounds, which is why sound shifts are typically taken as an indication of language shift, i.e. population A adopting the language of population B, in the process transforming “unpronounceable” sounds into their nearest approximation within the old language. However, we also have “natural” sound shifts, as e.g. visible in the Satemisation of modern French: Just say “cent” to see French isn’t a Centum language anymore. The example of French, b.t.w., makes the whole Satem-Centum stuff pretty useless when it comes to historical linguistics.

    2. With all three dimensions being somewhat fluid over time, it is hard to pin-point a proto-language to even all three of them, even more to just one. Nevertheless, I postulate that PIE is technically a hybrid language:

    a.) The lexicon is strongly influenced by Uralic, hence the Indo-Uralic hypothesis. Actually, I found even more lexical parallels to S. Nikolaev’s proposed Nivkh-Algic-Wakashan family (intriguing a/o PIE *wik, as in Latin “Vicus”, Indic “wikipotis” [mayor], Germ. -wik, -wich, -weig settlement names [Narvik, Norwich, Bunswig etc.] – vs. the Wig Wam). So the PIE lexicon appears to have substantial East Eurasian influence.
    https://www.academia.edu/28569450/S_L_Nikolaev_2016_Toward_the_reconstruction_of_Proto_Algonquian_Wakashan_Part_2_Algonquian_Wakashan_sound_correspondences

    b.) Morphologically, the closest neighbour to PIE seems to be Semitic (presence of grammatical gender, synthetic, consonant-based roots). My assessment is confirmed by the paper linked by ak, which has IE and Semitic pretty close, at d=0.398 (and much closer than IE and Uralic).

    c.) As concerns phonetics, PIE is characterised by a very high number of consonants, including all the kw, bh etc. stuff. All in all, PIE is believed to have 25 consonants in its inventory. The only living languages I am aware of which get to this or higher numbers are Semitic languages (28 consonants in Arabic), Georgian (28 consonants), and of course NE Caucasian languages, with up to 70 consonants – albeit Basque, with 24 consonants, isn’t bad in this respect. Finnish, OTOH, just has 13 consonants in its inventory.

    Therefore, I postulate that pre-PIE has morphologically/ phonetically its roots somewhere not too far from the Caucasus and Semitic-speaking areas. To become PIE, it however required lexical overforming by some East Eurasian language. I intend to revisit the question of “where and when” later. For the time being, let me say that “Steppe ancestry”, in its almost 50:50 mix of CHG and EHG genes, aligns well with that postulated linguistic hybridization.

    t.b.c.

  20. 3. A key question to be answered for me is: How can language families differentiate so strongly from each other, when language contact provides strong incentives/mechanisms for convergence, at least into a “Sprachbund”? The obvious answer is: There wasn’t such language contact for a long time, so the families evolved in isolation, thereby acquiring their distinct features.

    The reason for such isolation over millennia was of course, at least across most of Eurasia, the LGM. Hence, the search for homelands of any Eurasian proto-language, not just PIE, needs to start with a look at glacial refugia. I have started applying that approach here: https://adnaera.com/2018/12/10/how-did-chg-get-into-steppe_emba-part-1-lgm-to-early-holocene/, and I invite you to revisit the maps of West Eurasian glacial refugia posted there. What becomes clear:

    a.) The NE Caucasus was uninhabitable then, therefore not qualifying as homeland for any language family, be it PIE or Hurro-Urartian. And actually, there is hardly any archeological evidence of human activity there before the early Neolithic.

    b.) I already made my point in the linked post that both Colchis (West Georgia and the Eastern Turkish Black Sea coast) and the Southern Caspian were in principle inhabitable by Humans (and might as such each have nurtured a different language family);

    c.) Aside from the Black Sea coast, most of Anatolia should have been uninhabitable during the LGM. Refugia, however, existed (i) in the Aegean including the Bosporus area (which didn’t exist then), (ii) on the Gulf of Iskenderun, (iii) in the Northern Zagros / Lake Urmia area (albeit that one appears to have been inhabited by Neanderthals, not AMHs), and, of course, (iv) in the Jordan Valley (Natufians).

    4. When it comes to placing pre-PIE and pre-Hurro-Urartian, we need to consider a couple more languages/ language families that also emerge from that area, namely (i) Semitic, (ii) Kartvelian, (iii) Hattic, and (iv) Sumerian. Semitic may be placed in the Jordan valley, and Sumerian may have originated in the Persian Gulf Oasis that was flooded sometime around the 6th millennium BC. This leaves us with four languages/ families, and four refugia. Two of them, namely Kartvelian and Hattic, are only reported to have been present on/ near the shores of the Black Sea, so I assume their homeland there. Which leaves us with the South Caspian, and the Gulf of Iskenderun, for either pre-PIE or pre-Hurro-Urartian.
    My guess is for pre-PIE on the South Caspian – and actually also yours, Alberto, if I understand your argumentation on the eastward spread of IE languages correctly. This would place the pre-Hurro-Urartian homeland on the Gulf of Iskenderun.

    t.b.c.

  21. Now, let’s move to ANFs. Obviously, they must have weathered the LGM on the Gulf of Iskenderun, before a more favourable climate allowed them to move inland and settle the northern part of the Fertile Crescent. And, as has become clear above, I postulate they were speaking some kind of pre-Hurrian. The earliest historical attestation of Hurrians is from the northern Fertile Crescent, so we might see historical continuity there. Which is plausible, because agriculture gave ANFs a demographic advantage, ultimately allowing for settlement of much of Europe.

    The route from there to East Caucasia is clear. Actually, there is a lot of aDNA evidence for ongoing genetic exchange between ANF and CHG, and the link via the Obsidian Route is well documented archeologically. Introduction of agriculture into SE Caucasia (E. Georgia and Azerbaijan) seems to have been accompanied by a demographic shift, i.e. immigration of (pre-Hurrian-speaking) ANF. And once NE Caucasia became inhabitable again (the reasons for the pre-Neolithic hiatus are still unclear, they may have to do with volcanic activity of Mt. Elbrus), they would have been a prime candidate to move in and mix there with the Yamnaya-like population incoming from the North.

    And then, of course, we have the maritime, “island hopping” colonisation of the Mediterranean, which started from the Gulf of Iskenderun or nearby, and should also have been essentially pre-Hurrian-speaking. That would make Iberia speak some kind of pre-Hurrian, possibly enriched by WHG substrate, before the arrival of IE. And the same applies to Sardinia (Paleo-Sardinian) and Italy (Etruscan). The latter is an obvious example of Steppe ancestry not effecting language change – the same may apply for Iberian/ Basque.

    When it then comes to linguistic proximity, or lack thereof, between NE Caucasian and Basque, we need to consider that both languages should have separated some 8,000 years ago, and absorbed quite different influences in the meantime.

    Btw, on your NEBA proposal: We know how a Hurro-Urartian language that has been overformed by IE during the Iron Age looks/ sounds – namely like Armenian. If your NEBA hypothesis is true, shouldn’t Armenian be quite close to Western IE languages? Actually, from all I have read, it isn’t.

    t.b.c.

  22. Let me stay a bit with the ANF issue: We have another language that in all likelihood emerged from Anatolia, namely Hattic. Hattic is poorly documented, but sufficiently to make clear that it isn’t morphologically related to Hurro-Urartian. While the latter was exclusively suffixing, Hattic had a strong prefixing element, e.g. marking the plural with the prefix “fa-” (“ashaf”= god, “fashaf” = gods). Prefixing is a prominent feature of NW Caucasian languages, and to some extent also found in Kartvelian, which has led some linguists to propose a relation of Hattic to both. This also makes sense aDNA-wise, as long as we suppose that Hattians were also gene-wise primarily ANFs. In that case, the ANF element in the Meshoko samples would have come from (pre-)Hattic immigration along the Black Sea coast.

    Since the southern Black Sea coast was inhabitable during the LGM, a (pre-) Hattic homeland there is generally plausible – and it might theoretically have extended towards the Bosporus area. The problem, of course, would then be having two distinct languages, from different LGM refugia, with a largely identical genetic signature, namely ANF. Having said that: There are indications of a substantial prefixing element in the pre-IE substrate, as outlined by Schrijver for continental Celtic. And the replacement of gender suffixes by articles in modern Germanic and Romance languages may be interpreted as a shift from suffixing to prefixing. For Spanish and Italian, it may of course also be interpreted as Arabic influence, but that explanation fails when it comes to Germanic languages.

    In short: The thought I am entertaining is whether there might have been two ENF languages – one related to Hurrian, distributed along the “island-hopping” Mediterranean route, and a second one related to Hattic, which expanded overland through the Balkans and ultimately formed the Linear Pottery Culture.

    This takes me to the issue of non-IE substrates in Western IE – something your NEBA proposal fails to explain sufficiently (especially since Vennemann’s “Vasconic” theory seems to have been completely debunked by now). Before I continue that point, however, I need to make a couple of other notes.

    t.b.c.

  23. You write: “Northern and Western Europe were completely (re)populated by people who came from the steppe.”

    Actually, this is unproven. Even for Britain, where aDNA evidence seems to suggest it, the Reich Lab concedes insufficient coverage. They point to a lack of sampling for, among others, NE Scotland and East Anglia (the latter being Britain’s most densely populated part during the medieval period, prior to the Black Death), and Wales also seems hardly covered.
    https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/2021_Armit_Reich_Beaker_Antiquity.pdf

    And, of course, your thesis would imply yDNA I2 virtually dying out across Northern and Western Europe. Well, it hasn’t. It even dominated, among other places, in the Lichtenstein Cave, an elite burial (whole oxen were buried there alongside the humans) from the Urnfield Culture.

    You then write: “We don’t have a single sample in the ancient DNA record from the Neolithic communities from the periods just before, during or after the arrival of the steppe communities.”

    In fact, we have lots of them. Here is France (2 times, Paris Basin & Languedoc):
    https://www.cell.com/current-biology/fulltext/S0960-9822(20)31835-2?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0960982220318352%3Fshowall%3Dtrue#mmc2

    Then Wallonia (South Belgium): https://eprints.hud.ac.uk/id/eprint/35254/1/FINAL%20THESIS%20-%20FICHERA.pdf
    Note also that the Spiennes flint mine has been C14-dated to have been in operation until 2,200 BC, with no archeological evidence of BB/ Single Grave presence, so it is assumed to represent continuity of the previous SOM culture. Some archeological background is here:
    https://biblio.naturalsciences.be/associated_publications/anthropologica-prehistorica/anthropologica-et-praehistorica/ap-112/ap112_77-89.pdf

    Over to their post-Michelsberg cousins of the Wartberg Culture in east Westphalia/ North Hesse:
    https://www.nature.com/articles/s42003-024-06676-7#MOESM4
    Dated to after, or just before, SGC arrival to the north: typical Wartberg aDNA (i.e. a strong WHG element, albeit lower than in Belgium, and intriguingly pointing to Körös/HU as the nearest WHG sample, while in Belgium it is Loschbour/ Bichon). It shows btw also that the Wartberg people weren’t completely immune to the Plague, but their decimation, if it ever took place, wasn’t caused by it.
    Older, but still valid read on the Wartberg/ Single Grave relation, and their co-existence for at least some 300 years:
    https://www.jungsteinsite.uni-kiel.de/pdf/2002_2_fabian.pdf

    Up north, NE Jutland, with some “shortly after” aDNA, showing in the PCA the admixture just where you would expect it, namely at the interface of CW and Pitted Ware. The history of 3-4 centuries of coexistence between SGC (Western Jutland), late TRB (SE Jutland and Danish Isles) and PWC (NE Jutland) is briefly addressed in the paper; more on that can be found via the references.
    https://pmc.ncbi.nlm.nih.gov/articles/PMC7808695/

    Lastly, let me mention the Schönfeld Culture. It was once believed to be just a local phenomenon, but recent excavations have extended its primary area to approximately the triangle between Lüneburg, Berlin and Goslar, with outposts from the mouth of the Elbe down to Lower Austria. They apparently not only withstood CW/BB influence, but survived both, until becoming part of Unetice by 2,200 BC. Unfortunately they cremated, so no aDNA.
    Here is a settlement map (which doesn’t seem to fully capture recent excavations around Berlin): https://st.museum-digital.de/object/36935
    A reconstructed house, at its original find spot, may be seen here (scroll down, third picture): https://www.steinzeitdorf-randau.de/

    IIRC, that NW Switzerland study from a few years ago had also some pre-BB, with-BB and “contemporary, but w/o BB ancestry” aDNA.

    In short: There are lots of areas in Central and Western Europe that weren’t depopulated. In some of them, mixing with the CWH took place. Elsewhere, e.g. parts of Wallonia and the Schönfeld Culture, pre-CWH cultures continued to exist until the onset of the Bronze Age.

  24. There is more to comment, e.g. on the issue of pre-IE linguistic substrate. Your Matasović link is a great, and quite telling link, because he shows the proto-Slavic substrate to mostly consist of typical HG terms, including hazel(nut) (prime HG staple food in Northern Europe), nettle (main textile fibre), and a couple of medical herbs. No animal husbandry terms, and just a bit of farming (barley, already main crop of EEF). In short: Just what you would expect from a population that has a quite high WHG share (as SOM or Wartberg), but very much at odds with representing a NEBA substrate.
    Moreover, unlike the Leiden School for Germanic, Matasović only identified sparse common structural patterns in the Balto-Slavic substrate, which seems to indicate that it originated from several different sources – your NEBA hypothesis would instead call for just one, pretty homogeneous source.

    So, here is my “big picture”:

    1. I am with you when locating the pre-PIE homeland in the South Caspian. And much of the stuff you write about the eastward expansion (India, Tocharian) sounds good to me.

    2. That would, however, still have been pre-IE, i.e. before being lexically overformed by some EHG language. Nevertheless, sharing broadly the same morphology and phonology would make adaptation of PIE, which just differed in some lexical aspects, not too hard.

    3. The most likely place where EHG genetic and linguistic influence was absorbed by pre-IE speakers, essentially genetically CHG, was the Prikaspiiskaya culture on the Lower Volga. [I had actually intended to explore this in depth in my part 3 of the “How did CHG into the Steppe” series, but lost an already 50% finished article to a hard disk crash, and somehow never restarted writing]. Some reading on the Prikaspiiskaya culture is in the link – note especially how the author connects it to the timewise later Khvalynsk Culture. Time, place, and archeological features fit, also for Yamnaya, and the northern Caucasus foothills.
    https://journals.uni-lj.si/DocumentaPraehistorica/article/view/43.7/6975

    4. The remainder for Western Europe is the traditional CWC story, i.e. CWC as vector for the spread of IE. With the caveat that absorbing Steppe/ CWC-mediated genes need not have meant a language switch to IE. That, OTOH, would make it easier to explain Indo-Europeanisation of the Balkans, Italy etc. IE was already present in some areas, so non-IE-speaking communities had exposure, which facilitated a language switch when further “pushes” travelled up the Danube (or through the Mediterranean). These impulses, btw, do not necessarily need to have been demographic. Bronze Age trade with Greece (British tin and Baltic amber shipped down the Danube) might have sufficed, especially with two IE-speaking cultures, namely Unetice and Mycenae, running much of the show.

    5. What still remains unclear to me is the path of IE towards Anatolia. My guess is on the Kura-Araxes Culture, but I’ll need to do more reading. Including giving your respective sections another look, or two. I may come back to them later.

    That’s it with my comments for the time being. Long post of yours, certainly inspiring, lots of comments.

  25. @FrankN

    Good to see you after so long. Quite some long comments from you too, so I’ll need more time to check all the references and give more detailed answers, but a few comments for now:

    About the initial formation of language families I basically agree with the idea that they formed during the Paleolithic (and the LGM would have selected the few surviving ones while giving them higher coherency). This dynamic of languages diverging during the Paleolithic but converging from the Neolithic onwards is something I already wrote about as mentioned in the comment just above yours.

    Whether the pre-Hurro-Urartian family would have been the one spoken by ANF is something I wouldn’t disagree with in principle, but I’ll wait for further clarification of whether Hurrians came from the steppe. We have some clues (especially from genetics) that are compelling enough for me to want to wait for relevant samples to confirm or deny.

    Re: what you say about the substrates in Northern Europe (referring to the Matasović paper I quoted) I would say that the bias towards mostly pre-Neolithic words is expected given the phenomenon I have explained in the post: It’s been assumed (just because it was the easiest option at the time) that everything that couldn’t be linked to non-IE must be IE. That’s how you can get 50% of that “unknown” IE substrate in Catalonia (+10% Greek, +23% Latin) while only 10% is Iberian. This is an issue that must be addressed before we can actually evaluate the substrates of Europe with much better accuracy.

    I think that my biggest disagreement with you is still about the CWC-BBC problem. I haven’t yet been able to go through all the references, but from first impressions:

    – The paper about France: Most samples are Neolithic. There are just five individuals that are from the relevant period. Two of them (TORTC and GBVPL) are dated to 2527 and 2525 calBC (average), and they are regular Neolithic farmers. One (GBVPK) is dated to 2387 calBC, and this one is already a typical BBC male with the typical R1b Y chromosome. Then there are two more from after 2000 BC which are also typical BBC, but females. So these samples are exactly like all the others we have.

    – The paper about Belgium: The samples themselves show the same pattern, with Neolithic-type ones having the I2a lineage while BBC-type ones have the R1b lineage. I didn’t find exact dates for each sample, but nothing stands out about them.

    – The paper about Warburg: The samples are clearly Neolithic, dated to 3300-2900 BC. So nothing special about them.

    – From the one about Denmark: “Our genetic data document a female (Gjerrild 1) and two males (Gjerrild 5 + 8), harbouring typical Neolithic K2a and HV0 mtDNA haplogroups, but also a rare basal variant of the R1b1 Y-chromosomal haplogroup. Genome-wide analyses demonstrate that these people had a significant Yamnaya-derived (i.e. steppe) ancestry component and a close genetic resemblance to the Corded Ware (and related) groups that were present in large parts of Northern and Central Europe at the time. Assuming that the Gjerrild skeletons are genetically representative of the population of the SGC in broader terms, the transition from the local Neolithic Funnel Beaker Culture (TRB) to SGC is not characterized by demographic continuity.”

    Now, if we get enough resolution we will find a Neolithic community contemporary with a steppe community. This must have happened, since otherwise the steppe communities could not have gotten EEF admixture. But those Neolithic communities died out soon after. We may even find a surviving Neolithic community that post-dates the arrival of steppe ones, if it was in an isolated enough place. But none of this has any significant relevance from a linguistic point of view.

    Though clearly the area around present-day south Poland, Slovakia, Czechia and Southern Germany was where the greatest interaction between both communities took place. After all, that’s where the steppe communities got 50% of EEF admixture. So any linguistic consequence of those interactions (if it exists) had to be in that area. Not in Western Europe, since in Northern France and Great Britain they got 0% admixture and in Southern France some 5-10%, while in Iberia the remaining 10-15%. Sure, that makes Iberia the place with the most EEF admixture, but that’s just because of the cumulative nature of it (I mention this because the “mainstream” view just did some hand-waving saying that non-IE languages may have survived in Iberia due to the higher EEF admixture).

    Regarding Iberia itself, I don’t think that a sea colonization is very plausible. But even if many came by sea, it was not a male migration. These were steppe communities, with men and women. And incorporating a small amount of local females into those communities wouldn’t make the community as a whole shift their language. Especially if all the other communities were also from the steppe and spoke the native steppe language.

    If instead you argued for a language shift in Central-Eastern Europe, that would be slightly more plausible (though really very, very strange). But in that case the end result would be the same, CWC-BBC being non-IE.

  26. Hi, Alberto,

    I agree that my links posted for the Central European EN (counting France as CE) could do with clearer data. Specifically:

    – My impression for the France data is that they have found some unadmixed samples from the BB/CWC period. But unfortunately (or not), all their male samples from that period have come out as R1b.

    – Belgium has only three dated samples (apparently from other sources, the author/ Reich Lab missed out on C14 analyses). One is Iron Age, i.e. should be discarded completely. Of the other two, one is pre-BB (and unadmixed), the other one BB (admixed, and yDNA R1b). The author apparently hoped to be able to address dating via the stratigraphy. However, the C14 dates have shown the stratigraphy to be disturbed (BB/IA samples from lower layers than the SOM sample). We need to hope that someone (Reich Lab?) will do C14 analyses also for the other samples.

    – Warburg: The second sample is some 200 years younger than the first one, and should fall into the period when SGC was already present in the North (and after CWC had southerly bypassed Warburg on their way from Thuringia to the Rhine-Main area, and ultimately to N. Switzerland). Unfortunately, wiggles in the calibration curve don’t allow for precise dating during the first half of the 3rd millennium BC.

    – Denmark: Yes, the data shows a shift. But also (check out their PCAs) signs of something like a 50:50 admix of Pitted Ware and Single Grave DNA.

    My point was mostly about your statement that we don’t have samples from directly before and after appearance of CWH.

    A more fundamental point is: If your theory of an essentially linguistically homogeneous NEBA sphere across most of Northern and Central Europe from around 2,500 BC is correct, and it was IE-ized just some thousand years later, we should have ended up with something similar to Armenian. Or at least one homogeneous form of Western IE. I find it very hard to explain the split between Germanic and Celtic from your theory. And even when allowing for substantial Indo-Iranic (Scythian) influence on Balto-Slavic, the latter should be much closer to Germanic and Celtic than it actually is. Your NEBA theory anyway fails, for the time being, to explain the emergence of Germanic and Celtic.
    If, OTOH, both go back to CWH overforming previous MN languages, thereby absorbing a strong substrate from them, the emergence of distinct sub-families within Western IE should be expected:

    – Pre-Germanic would have absorbed substantial Post-Michelsberg substrate. This, in turn, should have been a mixed language based on the EN/ early MN farmer language spoken in the Paris Basin (and possibly Brittany), i.e. substantially influenced by Cardial Pottery EEF language, with strong influence from the WHG languages absorbed during the late MN. The latter would have provided the HG substrate that can still be identified. And, as Michelsberg apparently absorbed different kinds of HGs – more Loschbour-like west of the Rhine, more Körös-like, partly also Baltic-HG-like east of the Rhine – we should expect diversity, i.e. absence of common morphological features, in that substrate.
    Intriguing in this respect is also the current diversity of Danish, with four clearly distinguishable dialects, which in their geographic distribution mirror the cultural split during the early EN between CWH, late TRB and Pitted Ware. Given the very small geographic distance between these dialects, I find it hard to explain them as a recent phenomenon. Having emerged as IE absorbing very different linguistic substrates (or, in the case of West Jutland, no substrate at all), however, makes sense to me.

    – (Pre-)Celtic would reflect IE with mostly “Danubian” EEF substrate absorbed. If my theory of that substrate being strongly derived from Hattic is correct, it would explain P. Schrijver’s observation of Continental Celtic (Gaulish, but not Celtiberian) being characterized by what he calls a “verbal complex”, with a total of 10 possible grammatical “slots” being arranged around the verb. And, actually, modern French with its particular “Il n’y en a pas eu du tout” constructions seems to continue such “verbal complexes” until today – they certainly can’t be explained from Romanization or Germanization.

    – Balto-Slavic is complex, as you have already written, because it has certainly received strong East Iranic (Scythian) influence. [Btw., I think that West Germanic (but not North Germanic!) also has received such influence, in that case from the Alans (Ossetians) after the fall of the Roman Empire.] Let me add in this respect that some linguists have started to question the theory of a Balto-Slavic family, ascribing similarities instead to “Sprachbund” phenomena over the last two millennia or so.
    Unlike Slavic, Baltic seems to be related to Dacian (or Daco-Thracian) – difficult to prove or disprove, since Daco-Thracian is only poorly attested. For what it is worth, Dacian town names ending in “-dava” are well attested, and can actually be found as far north as Central Poland (e.g. Wlodawa on the border to Belarus, some 80 km ENE from Lublin). Dacian cultural influence, possibly including migration, on SE Poland during the late IA is archaeologically well attested. Moreover, Vennemann and Schrijver have postulated that Verner’s Law, a key component in the shift from pre- to proto-Germanic, arose in language contact with speakers of Finnic, and more specifically reflects how original Finnic speakers would have pronounced Germanic when switching to it. If so, Verner’s law, and ultimately the shift from Pre- to Proto-Germanic, must have arisen in a contact zone between both families, which can only have been the Baltics. This precludes the presence of Baltic languages at their current location during the late IA – they would instead only have arrived there during the Migration Period, i.e. pushed north by Huns and Goths.
    https://en.wikipedia.org/wiki/Verner%27s_law

    As to Slavic: Proto-Slavic (but much less so Baltic!) has quite some Germanic adstrate, e.g. “hleb” (bread), borrowed from Gothic “hlaifs” (loaf [of bread]). The Slavic conservation of the initial “h”, which all Germanic languages except for Icelandic have lost in the meantime, attests that the borrowing is ancient. When looking for the Proto-Slavic homeland, aside from Scythian, contact with Gothic should thus also be considered. So, Proto-Slavic most likely arose within or immediately next to the Chernyakhov Culture. Last but not least, I see some Italic, more specifically Venetic influence on at least parts of Slavic, namely the “g”->”h” sound shift in modern Czech, Slovakian, Ukrainian and Belorussian, and before them in Old Ruthenian.

    Otherwise, https://www.quora.com/profile/Thomas-Wier provides for interesting reading. Wier is an American linguist currently lecturing in Tbilisi, and has been diving a bit into the relation of North Caucasian languages to other families, including Hurrian, Hattic and Basque (scroll through his posts). One of his key arguments against a genetic relationship is the abundance of consonants in North Caucasian, which is neither found in Hurrian nor in Basque. I personally think that such “unspeakable” consonants are the first thing that gets lost when non-native speakers switch language – but may be preserved when speakers of a language having them switch to another language. A case in point is the “click consonants” in SW Africa, typical for Khoisan languages, but also present in some neighboring Bantu languages including Zulu and Xhosa.
    This indicates to me that NW and E Caucasian, for all their fundamental morphological difference, share a common substrate (phonetic, possibly also lexical) – in all likelihood from CHG. In terms of morphology, however, they have absorbed very different influences – NW Caucasian rather from Hattic, NE Caucasian more from Hurrian (with maybe, as Ceolin et al. suggest, also some Dravidian influence present – Maykop aDNA might deserve a closer look in that respect). This doesn’t necessarily preclude NW Caucasian being ancestral to Basque – “unspeakable” consonants are likely to get lost during language change. But in your NEBA horizon, especially if it emerged as radically as you suggest, I would expect more traces of NW Caucasian phonology than are identifiable (which is zero). From a phonological point of view, an ANF-Hurrian relation to Basque seems easier to defend than a direct relation to East Caucasian.

  27. @FrankN

    Certainly some very interesting thoughts regarding linguistics there. But I do find that certain constraints and problems of time depth make the specific links complicated.

    To clarify my own view a bit better: Language differentiation between the different IE language families from Europe would have started in the Balkans itself, which partially addresses your concerns about how they would have differentiated under a homogeneous substrate. Then there’s the fact that there would easily be 1000 years (closer to 2000 in Western Europe) between the arrival of the CWH and the arrival of IE languages.

    That’s regarding the possible effect of substrates, which is a complicated linguistic issue where many opinions can be found (as an example, I’ll quote the opening sentence of Kortlandt’s “An outline of Proto-Indo-European” (https://www.academia.edu/29613427/An_outline_of_Proto_Indo_European): “Indo-European is a branch of Indo-Uralic which was radically transformed under the influence of a North Caucasian substratum when its speakers moved from the area north of the Caspian Sea to the area north of the Black Sea”).

    Then we have the problem of the shallowness of the IE language families from Europe. From Chang, Garrett et al. 2015, they get something like this:

    https://cdn.sci.news/images/enlarge/image_2516_2e-Indo-European-Languages.jpg

    Or if you look at the newer revised version by Heggarty et al. linked above by ak2014b which has pushed back the dates to fit a different hypothesis, it’s still not that much different. We can’t link the expansion of these families to the CWH. And we know at least something about the case of Celtic, being a very late arrival to Western Europe. For Balto-Slavic we have the constraint of the Indo-Iranian influence from the Scythians. (BTW, there’s a recent paper about Germanic proposing an expansion initially from the East Baltic to Sweden: https://www.biorxiv.org/content/10.1101/2024.03.13.584607v1 ).

    But finally, we have the biggest problem which is the actual languages found in the areas where the CWH expanded, basically in Iberia and to some degree in Italy. As I said above, Iberia is really not the right place to look for an in situ language shift from the CWH people (or BBC people if you prefer). Gaska posted the statistics above from the male lineages we have from the Bronze Age too.

    I forgot to respond to the point from your earlier posts about the possibility of an early matriarchal Basque society. First, I’m not aware of any particular difference in the BBC from 2400 BC and after in the area of Aquitaine and northern Spain relative to other places. But I’d point out that the Megalithic cultures of the Neolithic were extremely patriarchal too (or patrilocal, if you prefer) as we’ve seen from Dolmen burials. More important to me is the simpler fact that this was not any sort of male-driven expansion, but a migration of communities into mostly depopulated areas, which doesn’t leave much room for debating what didn’t happen.

    BTW, from Ceolin et al. linked above by ak2014b too, here’s the full chart with language distances: https://github.com/AndreaCeolin/Boundaries/blob/main/Supplementary_Information/FigS1/FigS1.png . A few strange things, like NE Caucasian and Dravidian getting along surprisingly well with too many other families, which probably explains why when measured against each other they get such a low distance in spite of the almost impossible real connection.

  28. An interesting paper I’ve come across these days relevant to the possible NE Caucasian (and Hurro-Urartian) connection with, in this case, Etruscan:

    “Etruscan’s genealogical linguistic relationship with Nakh-Daghestanian: a preliminary evaluation” by Ed Robertson: https://www.theelen.info/%5B20151101%5D%20Etruscan%20numerals%20-2-.pdf

    Yes, there are so many theories that anything can be found to support any theory or the opposite one. I just found this to be a more serious effort compared to some other ones and worth a look for those interested in the subject.

  29. @Vara

    To be honest, I haven’t read about it in any detail. It’s really difficult to know what to make of any of these sorts of complicated linguistic theories about isolated languages where any comparison is based on very few details.

    In general, pretty much any other West Eurasian language, and even some (North-)East Eurasian ones, has been found to have resemblances with Indo-European. I guess that’s because IE has such a large corpus that it’s likely to always find correspondences if you look for them. Many languages (including Burushaski, but also Etruscan or Basque) have been proposed to be para-IE (belonging to a sister branch of IE), just like the whole Uralic family has, or even Chukotko-Kamchatkan (which we know is impossible). The corollary of all these hypotheses would be the Nostratic language macro-family.

    Burushaski as an IE language was also proposed by Ilija Čašule more recently, and John Bengtson, who is clearly used to examining these sorts of very low-evidence connections, reviewed the findings and didn’t find them very convincing (https://www.degruyter.com/document/doi/10.31826/jlr-2011-060108/html ). (BTW, for anyone wanting to check out Bengtson’s summary about the Basque-Caucasian hypothesis, here’s a link: https://www.academia.edu/31720885/Euskaro_Caucasian_Hypothesis_Current_model_2017_ . You’ll probably find the evidence to be weak, but it’s up to anyone’s criteria if it reaches an expected threshold or not).

    One of the main purposes of this post was to explain the data that we have from ancient DNA in a way that can make sense for historians and linguists, because only knowing what is possible, impossible, likely or unlikely is how they are going to make some real progress instead of looking at random connections. Many linguists have started to take into account the genetic evidence and adapt their theories, but unfortunately based on poorly explained or outright wrong conclusions. My hope is that things improve in the near future, so I put here my two cents.

  30. @FrankN

    I realised that Gaska’s comment was in Spanish, so you may have missed those stats I mentioned about Bronze Age male lineages in Iberia. Here’s what he posted about it (translated):

    “We have 105 quality male genomes from all the Bronze Age Iberian cultures, 99 are R1b-M269 (94.3 %) , 4 are I2a1a-P37 (3.80%) and 2 G2a2b-P303 (1.90%)”
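
    For anyone who wants to re-check the percentages in that quote, here is a minimal sketch in Python. The counts are taken straight from the quote; the names `counts` and `shares` are just illustrative, and rounding to two decimals is my own choice, so the last digit of a figure can differ slightly from the quoted one.

```python
# Haplogroup counts exactly as quoted above (105 genomes in total).
counts = {"R1b-M269": 99, "I2a1a-P37": 4, "G2a2b-P303": 2}


def shares(counts):
    """Return each lineage's share of the total in percent, rounded to two decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in shares(counts).items():
        print(f"{name}: {pct}%")
```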

  31. @Alberto

    The evidence isn’t strong on PIE’s relationship with other families. There are so many theories out there, including one from Fournet with Hurrian being the earliest split, and another one with Dravidian being the closest language to PIE. I think it’s all nonsense.

    IMO, Burushaski being a sister language to PIE fits with your Iran-Turan hypothesis. Witzel claims the Vedic Aryans interacted with Burushaski speakers at some point but I’m pretty doubtful of that. I think it’s more likely that this language was brought across the IAMC from the many later migrations.

  32. I found a preprint from earlier this year that can be interesting for starting to look closer at the genesis of Balto-Slavic:

    North Pontic crossroads: Mobility in Ukraine from the Bronze Age to the early modern period

    In Figure 2 you can see the location and dates of the samples. They have two from the Vysotska Culture (somehow related to the Chernoles Culture, though also to Urnfield) dating from around 1100 BC. On the PCA (Fig 3. A) they plot where European Bronze Age ones would (CWC, Unetice,…) so they are probably the European clade of R1a, but they don’t have enough resolution to know (Table 1). Then there’s a Cimmerian from around 1000 BC which is clearly of Central Asian origin (from the PCA) and has Y haplogroup Q1b. It was the interactions between these two populations that produced the Balto-Slavic languages during the next few centuries.

    Then we have a group from Central Ukraine labelled as “Scythian right bank of Dnipro Illirian-Thracian basis” that date to around 700-600 BCE and are genetically also of European origin (Fig 3. B) and one of the samples has enough resolution to know it has the European type of R1a (R1a-Z283, Table 1). So these are “European” Scythians, and in the archaeological notes at the end of the paper they say:

    “This necropolis [Medvyn] belonged to the forest-steppe agricultural population, which preserved archaic burial traditions (decarnation through exposure to the elements and scavangers). […] Burials with a similar set of artefacts are found in the earlier dated kurgans of Saharna-1 burial ground (Cigleu) in forest-steppe Moldova. These facts allow to assume the movement of the population from Middle Transnistria (the oldest complexes) through Pobuzhzhia (Tyutky, Nemyriv, Vyshenka-2) to Porossia in the early Zhabotyn period. The migrants moved into regions sparsely populated by people of the late Chernolis culture, where mixing of different ethno-cultural groups occurred. The funeral rite and the set of moulded dishes indicate either the participation of the Chornolis-Zhabotyn population of Porossia in the genesis of this population, or the influence of migrants on the material culture.”

    So it seems these samples come from a population that moved there from further South-East (near Moldova) which was a more Scythian area. Still, they don’t seem to show genetic impact of Scythians, though the culture must have been strongly influenced by Scythians. It may be that these samples just spoke the same Indo-Iranian language as the Scythians, or maybe it was Balto-Slavic, who knows. But all these populations at this place and time are where one has to look for the origin of Balto-Slavic languages.

  33. Another detail from that same paper. They have some samples labelled as Thracian-Hallstatt dating to as early as before 900 BCE (the non-outlier radiocarbon-dated one in Table 1, carrying a Y chromosome E1b-V13). From the archaeological notes:

    “The Thraco-Cimmerian culture still does not have the status of a distinct archaeological culture in historiography. It was described in 1920–1930 based on horse ammunition items from hoards belonging to the late Urnfield and early Hallstatt period in Central and Eastern Europe.”

    Genetically they are all quite “southern” for that region (border between Ukraine and Romania), quite clearly from further south (Fig 3. A).

  34. I made an update (just before the conclusions) about a significant change I saw in the sequence of samples that we have from Czechia. Between the EBA and the MBA, specifically c. 1600-1500 BC, between the Únětice Culture and the Tumulus Culture, there’s a significant genetic impact from the Balkans (estimated around 40% maybe, depending on the source) that persists through the IA. I had not noticed that before and didn’t expect to see such a clear change in such a brief period between the two cultures.

  35. Hmmm, there might be something going on in the Balkans.

    I remember there was someone claiming that there would be a few -2000BCE J2a samples with extra Anatolian ancestry in Eastern Hungary. There is still that BR2 sample with an Arslantepe-related marker, but it’s nothing out of the ordinary in terms of the autosomal profile.

  36. @Al Bundy

    Well, I guess that if we refer to the linguistic role that the Andronovo people played in this, the most important thing would be that they were the ones who introduced the Indo-Iranian language to the steppe, which was spoken there for over 2000 years afterwards, and the one that helped shape the Balto-Slavic languages as we know them.

    I don’t know if that answers your question or you were referring to something else.

  37. Alberto,

    thx for the links to the Etruscan and the Bengtson paper. The Bengtson paper requires registration, so I haven’t read it. But I came across a more recent paper that he co-authored, which extends the family even a bit further: “Notes on some Pre-Greek words in relation to Euskaro-Caucasian (North Caucasian + Basque)” (including a couple of parallels also to Burushaski):
    https://www.degruyter.com/document/doi/10.1515/jlr-2021-191-210/html

    While some of the stuff in both papers clearly exceeds my linguistic competence, I find the evidence quite compelling. But note, however, that both papers suggest a relation that differs from your proposal (and that both also differ in their assessment of the role of Hurro-Urartian):

    1. Robertson [Etruscan] includes Hurro-Urartian in his family, but restricts the Caucasian part to East Caucasian. This makes sense to me from a morphological point of view, but differs of course from the Moscow School, which sees East and West Caucasian united, at least as concerns phonology and some basic vocabulary. Both, however, can IMO also be explained from a shared substrate, and/or language contact/ Sprachbund phenomena.
    He doesn’t regard Hurrian, East Caucasian and Tyrrhenian [Etruscan] as directly descending from each other, but rather as “cousins”, which all go back to an undocumented forefather spoken somewhere in Anatolia: “The relationship between Etruscan and the modern Nakh-Daghestanian languages is, while relatively distant, not a “remote” or “long-range” one, and might be compared in degree to the relationship between Latin and the modern Celtic languages. Just as Latin and Celtic had a common ancestor at a time depth of the order of 4000-5000 years before present (or rather, were at least adjacent, closely-related dialects of their earlier common ancestor), Etruscan and Nakh-Daghestanian became separated at about the same sort of time depth, or slightly more than 2000 years before Etruscan is first attested. The closeness of this latter relationship would be consistent with Proto-Tyrrhenian having separated from the rest of East Caucasian during the east to west wave of settlement across Anatolia which occurred as a consequence of a period of economic prosperity between the 4th and 3rd millennia BCE.” (P.2) O.k – I am unsure whether that theory can be aligned with the available East Caucasian, Anatolian and Italian aDNA from the EMBA, but that would be a secondary point.
    More important to me is his reasoning for the merely indirect relation (p. 27). One point relates to the unusually high number of consonants in East (and also West) Caucasian, which isn’t found in Hurro-Urartian and Etruscan (and also not Basque, for that matter). Then, he points out: “The most important key feature of Nakh-Daghestanian grammar which is not shared by Etruscan is class marking, and specialists in ND have traditionally regarded items of vocabulary which show class marking to be among the more ancient members of the ND lexicon. However, we have seen above that at least some instances of class marking show signs of having occurred as innovations. (..) It is completely lacking in Etruscan and its closest relatives and in Hurro-Urartian, and, unlike those Lezgian languages which do not now have class marking, neither Etruscan nor Hurrian/Urartian show signs of ever having had it. It is reasonable, on the basis of the balance of evidence so far, to suppose that Etruscan is more closely related to Nakh-Daghestanian than either of them are to Hurro-Urartian, and that hence the ancestor of Hurro-Urartian was the first to split from the common ancestor of Etruscan, Hurro-Urartian and Nakh-Daghestanian (which we could refer to as Proto-Alarodian for want of a better suggestion), followed by Proto-Tyrrhenian at some point thereafter. Proto-Tyrrhenian and Proto-Hurro-Urartian thus never formed a separate clade of their own. Proto-Alarodian also did not have class marking, and Proto-Nakh-Daghestanian, or all of its daughters, acquired it as an innovation at some later date, perhaps due to the influence of West Caucasian object markers or Akkadian personal pronouns.”

    Now, my understanding is that Basque also lacks class marking (by prefixes), but is instead exclusively suffixing, in a quite complex way – a feature it shares a/o with Hurrian. If so, the above argument possibly precludes your NEBA hypothesis. Albeit one may argue for a relatively late adoption of class marking in NEC, after NEBA languages had already split. Which still doesn’t explain the loss of various Caucasian consonants.
    [IMO, Robertson’s suggestion of class marking in NEC being adopted from NWC – and ultimately Hattic – makes quite some sense. Intriguing in this respect: https://www.quora.com/Is-there-any-evidence-to-the-claim-that-French-is-becoming-polysynthetic?no_redirect=1, comparing modern French grammar to West Caucasian Ubykh, thereby strengthening my point of a pre-Hattic role in EEF languages (continental route, ie. LBK etc.)]

    t.b.c: My Browser has stability problems. I post this before it gets lost.

  38. 2. Bengtson/ Leschber draw the relation somewhat differently. They include NWC in the family (of course, Bengtson has worked intensively together with Starostin and other members of the Moscow School), even though my impression is that the lexical parallels given relate almost exclusively to NEC languages. While occasionally also including Hurro-Urartian examples, they do not explicitly spell out that family’s relation to their proposal – maybe, because it wasn’t deemed relevant for their topic.
    On the genesis of the relation, FN2 spells out “I think the ancestors of the Basque people were the first European farmers, bringing agriculture from Asia Minor. The first wave went along the north Mediterranean coast and I would seek its traces in Greece and Italy, plus adjacent islands. The northernmost part of this wave was perhaps the Alpine region, where the tribal languages Rhaetic and Camunic were located, probably related with Etruscan. Till the present time there are traces of Basque-like toponyms and dialect words in Sardinia.”

    Among the arguments he has provided in earlier papers is that he believes he has identified multiple lexical parallels related to agriculture and animal husbandry, but none as concerns metallurgy, pointing to a split still in the LN. And, of course, we should then expect to find linguistic traces of that maritime (Cardial Pottery) EEF language not only in Basque, but also elsewhere in the Mediterranean. Etruscan (Robertson) would be one of those remnants. Another one, which the linked paper sets out on, is pre-Greek substrate.

    The paper concludes (p. 94): “It is important to emphasize that authentic Pre-Greek words, if they are of a more or less ‘basic’ nature, are not loans directly from North Caucasian (as framed by Nikolaev), but instead substratal remnants of a Euskaro-Caucasian language related to (Proto-)North Caucasian, but surely not identical with it.” In that sense, he agrees with Robertson, whereby Robertson provides the clearer linguistic reasoning. Bengtson/ Leschber don’t need that reasoning, because their theory of a relation dating back to the LN precludes a direct link to the Caucasus.

    Intriguing here, OTOH, is that they find fossilized traces of [Caucasian] class markers in the Pre-Greek substrate, also in Basque, and the Pre-IE substrate in other Western IE languages (p. 87 ff). The latter includes most notably the ominous “a-mobile” described by Iversen & Kroonen, and also for the pre-IE substrate in Slavic alluded to in your linked Matasovic paper. E.g. (p. 88):

    “Latin merula ‘blackbird’ (< *mesl-) : Old High German amsala id. (< *a-msl-) : cf. (without a prefix) Basque *mosolo ‘(small) owl; buho, mochuelo’: mozolu, mozoilo, mosolo, (expressive) moxolo, motzollo id.; NC: Archi mus:al ‘wild turkey’, Chamali (dial.) mus:iya”.

    If such traces were found only in your NEBA horizon, your theory would be strengthened. But their presence also in Pre-Greek, and the generally strong relation of Pre-Greek to “Euskaro-Caucasian”, puts your theory in question. As I have said: I see [Pre-]Hattic at work here, contributing to the EEF language, and, via NWC, to NEC.

    As to Burushaski: Bengtson/ Leschber extend a couple of their Pre-Greek – Euskaro-Caucasian lexical parallels also to Burushaski. My rudimentary understanding, based on https://en.wikipedia.org/wiki/Burushaski is that it is morphologically quite different from IE. What stands out to me are the verbal complexes, with up to 11 slots, reminiscent of the more than 20 slots in NW Caucasian languages, or the 10 slots identified by Schrijver for Continental Celtic. In Burushaski, 4 slots come before, 6 after the verb. The most complex IE language verbal construction that I have an idea of (there may be other, more complex ones) is Latin, with 2 optional slots (negation, preposition/ preverb) before, and up to 3 slots (tense/mood, person/number, passive [optional]) after the verb stem. [Western] IE may have seen a reduction in verbal complexity. However, Schrijver rather argues for at least the Continental Gaulish verbal complex to reflect the influence of a Pre-IE substrate. Which – I am repeating myself – could be related to [Pre-]Hattic.

  39. Namaskar Alberto,

    Nice to see you posting after a long time. And what a wonderful invigorating read it was. I would request you to keep writing new posts. Furthermore I would request you to try and get an article published in some decent journal. You have a very important & valid point to make.

    I hadn’t really given much thought to your argument that Indo-European languages couldn’t possibly have come to Europe from the steppe. I wasn’t really sure what to make of it and had quickly forgotten all about it.

    Since then I have myself come to realise that Indo-Europeans couldn’t possibly have come to Europe from the steppe. It is only in the early 2nd millennium BCE, after intense interaction with and influence from the Indo-European Near East, that Indo-European elements seem to become visible in Europe. The influence of Minoans & Mycenaeans on the Carpathian Basin & Nordic Bronze Age is particularly stressed in this regard. Kristian Kristiansen has published stellar articles on this phenomenon. There was this one recent article from Rune Iversen that is also worth reading –

    https://www.researchgate.net/publication/381364698_Issues_with_the_steppe_hypothesis_An_archaeological_perspective_Iconography_mythology_and_language_in_Neolithic_and_Early_Bronze_Age_southern_Scandinavia

    It appears to be the case that even the 3rd millennium BCE steppe migration is too early for the spread of Indo-European languages in Europe. The IE phenomenon most likely spread to Europe from the Near East, and it is post 2000 BCE. Robert Drews also has an excellent book on the subject –

    https://www.routledge.com/Militarism-and-the-Indo-Europeanizing-of-Europe/Drews/p/book/9780367886004?srsltid=AfmBOopOtHgCUuxlIv1NbtGGwWzFLx918uSPLK88yDBMguRcuiQshL4h

    It appears that the Nordic Bronze Age was the incubator of Proto-Germanic while the Carpathian Basin may have been the incubator of the Celtic & Italic branches, and both these places were Indo-Europeanised under massive Near Eastern influence. Here is one important paper by Kristiansen on how Near Eastern warrior societies influenced & transformed ‘backwater’ Europe –

    https://www.academia.edu/124784901/The_Rise_of_Bronze_Age_Peripheries_and_the_Expansion_of_International_Trade_1950_1100_BC

    Another one by Vandkilde tries to explain how globalised the Eurasian trade network was during this period –

    https://www.academia.edu/35021739/Bronzization_The_Bronze_Age_as_pre_Modern_Globalization

  40. Frank,

    Yes, the possible relationship between any of those languages is a very difficult subject. If it’s difficult to say whether NE Caucasian and NW Caucasian are related or not, just imagine how complicated it is to relate any of them to Basque or Etruscan. Even in the case of Hurro-Urartian (and those are two languages already to look at and compare, spanning a significant period) it is difficult to say if it’s related to NE Caucasian or not. That’s why I said that the evidence for Bengtson’s Basque-N. Caucasian hypothesis was weak. Even if the hypothesis is correct and the languages are indeed related, it would just be too difficult to know it with certainty at this point.

    That’s why I think it’s better to start from where we have the most solid evidence first, which in this case comes from ancient DNA, and work from there already knowing what you’re looking for, what is possible and what isn’t.

    Your linguistic insights are certainly very interesting and they have a value on their own. But for me the main problem comes from this “linguistic first” approach. You find yourself attributing a language from the Cardial Pottery people to a completely different population that came from the steppe and occupied the former territory of those Cardial Pottery populations.

    Given that Basques descend from a population that arrived at their approximate current location ca. 2400 BC and that they came from Central Europe (and ultimately from the steppe), how could their language come from the Neolithic farmers of Southern Europe? The Bell Beakers that settled the area didn’t get much admixture from the Neolithic communities around SW France and N Iberia. Maybe 5% from incorporating some females from those communities that, I’ll stress it again, disappeared shortly after.

    If the Neolithic communities of Europe that had survived the collapse and were still there by the time steppe communities arrived had been able to establish a good relationship with these incoming communities by trading, exchanging wives, cooperating with each other, etc… why would they have disappeared almost immediately after when the steppe communities themselves thrived at the same time and in the same places? The evidence clearly shows that whatever the way in which the steppe communities incorporated women from the Neolithic ones that they met as they repopulated N and W Europe, they didn’t establish any good relationships with the Neolithic communities where those women came from. At the least, they just excluded them from their own networks, at which point those isolated Neolithic communities would have small chances of surviving.

    So in this context, it seems difficult if not impossible that there was any linguistic influence on the language of the steppe people from their predecessors in the area, so to expect a complete language shift is completely unrealistic IMO. My NEBA languages hypothesis is indeed a hypothesis (obviously), but not like the older ones before ancient DNA that were mostly a “best guess” based on very inconclusive evidence. It’s one based on very clear and unambiguous evidence. Evidence that makes it very, very difficult to argue that the CWH (in which I include the BBC to the west and all the other forest steppe cultures to the east) didn’t speak the same language. They came from a small core population and occupied a very large and mostly empty space. And due to their mobility they kept stronger network interactions than the previous Neolithic people did. So any of them shifting to the language of a foreign, small population that went extinct immediately after (and keeping it for millennia) is really hard to explain.

    Besides, if one wants to propose that the steppe people of the CWH spoke an IE language, and that some of them shifted to a non-IE one, there’s also the problem of the complete absence of those IE languages. They all disappeared without traces, apparently? Italo-Celtic would anyway come ultimately from the Balkans (or at best Central-Eastern Europe) and reach W Europe in the IA. Balto-Slavic formed necessarily in the IA. So other than possibly Germanic all the other ones disappeared? And without leaving traces in the non-IE ones too?

    So to reiterate, I have to insist on the “ancient DNA first” approach, and then let’s look at the languages, because in this case this equates to an “evidence first” approach (given that genetic evidence is very strong and linguistic evidence is very weak).

  41. Hi Jaydeep, good to see you too after so long. Glad you’re still around following the developments of this fascinating topic.

    Indeed, the evidence is quite clear about IE languages in Europe being a very late introduction. And the evidence for the steppe people having been PIE speakers is not only non-existent; the idea is really incompatible with the evidence we have. Sadly, years pass and we keep seeing the same thing repeated without much thought being put into it, other than trying harder each time to solve the problems in more complicated ways. It’s also sad to see the delay in the publication of ancient DNA from North India, since that would force many people to rethink the whole thing again and maybe they’d finally come to terms with the actual evidence.

    Thanks for the links. I’ll read them as soon as I can and comment about them here.

  42. Alberto:

    1. “Given that Basques descend from a population that arrived to their approximate current location ca. 2400 BC..” Is that actually so? I mean, for the males you can probably say so. But from what I remember (correct me if I am wrong), Basques and Sardinians are still the most EEF-like populations in modern Europe.

    Which takes us back to the question of whether and under which conditions newly arriving males can effect language shift. And we anyway need to deal with non-demographic/ DNA-based explanations for the shifting (or non-shifting) to IE (lingua franca, political/ cultural dominance etc.). Including in the case of Etruscan, which obviously could withstand demographic pressure for language shift, until the Romans broke their political dominance. [You could o/c argue for Etruscan being a NEBA phenomenon. But in that case I would expect it to be closer to Aquitanian/ Basque, a/o because both had experienced similar language contacts (Continental Celtic, Latin, Punic/ Arabic). Besides, Etruscan aDNA doesn’t have enough “Steppe” to qualify as a convincing case for a demographically effected language shift.]

    There are various more recent European examples where language and genes don’t match: Hungarian, the spread of Slavic across the Balkans and also parts of Russia, or the medieval Germanisation of lands east of Elbe/ Saale, including formation of Yiddish as West Germanic language. Even for SW Germany (your link above, good read btw.), Germanic immigration alone after the collapse of the Roman empire doesn’t suffice to explain the language shift. It needed a political factor, in the form of Frankish control, to make it happen. In Lombardy, with a comparable demographic shift, OTOH, Frankish control was too weak, so it remained Romance-speaking.
    Certainly, none of the above examples provides an immediately applicable model of why Basque might have withstood “Steppization”. But we are talking about a timescale (Copper Age/ EBA) when long-range trade networks and centrally-controlled structures emerged across Europe, including Iberia (El Argar). And the Basque Country sits on one of the land passages across which Cornish tin was transported to avoid ship passage around the Iberian peninsula (which derives its name from the Ebro, used for that trade). I don’t know enough about the EBA on the Gulf of Biscay for any qualified statement, but aside from matrilocality, issues related to controlling major trade lines, and possibly using language as an advantage against competing networks (the Loire/ Rhone, Rhine/Rhone, Elbe/ Danube [Unetice] passages to the Mediterranean and Black Seas, respectively) might need to be considered as well.
    Last but not least, there has been an obvious founder effect for R1b males. This is traditionally explained by social factors (crowding-out of other males, e.g. via killing, raping, disenfranchising), but may as well relate to epidemics, e.g. the Plague. We start to get a better understanding of the role of the HLA genes in fighting various infections, including CoViD, albeit that understanding is far from being perfect. In any case, HLA genes are strongly homozygous, a/o regularly used for paternity tests. So, the “founder males” may just have had a genetic advantage against other males, and their current dominance has to do with genetic selection rather than initially “overpowering” (also linguistically) the “native” population.
    [Intriguing here: There appears to be a strongly negative correlation between genetic resistance against the Plague and against CoViD. Regions little affected by the Black Death stand out negatively when it comes to CoViD, and vice versa. Cases in point are Lombardy vs. Tuscany (the former with little documented death toll from the Black Death, but a high one from CoViD, the latter the opposite), Franconia vs NW Germany (ditto; Tirschenreuth in E. Franconia was the second most heavily affected county in Germany by CoViD, while historic reports of Black Death victims are lacking between Nuremberg and Prague, both major economic and political centers of that time). The Basque Country also seems to have weathered the Black Death quite well – CoViD obviously not so.]

    2. “why would they [the Neolithic communities] have disappeared almost immediately after when the steppe communities themselves thrived (..)?”. Well, they didn’t. They just remained invisible to the aDNA record, for whatever reasons. I have provided examples above, including the Schönfelder Culture on the Upper Middle Elbe, which survived CWC & BB, towards the beginning of the EBA expanded a/o into Bohemia and Austria, and seems to have been one (of several) formative elements in Unetice. Schönfelder was cremating, so no aDNA.

    For Unetice see also https://www.science.org/doi/10.1126/sciadv.abi6941 (Bohemia time transect -have you considered their data in your Unetice analysis?). Interesting there a/o, with respect to yDNA founder effects: “In addition to autosomal genetic changes through time, we observe a sharp reduction in Y-chromosomal diversity going from five different lineages in early CW to a dominant (single) lineage in late CW” – in a process of some 300-400 years. Might have been social, might also have been biological (disease-related) selection. In any case, the process was longer and possibly more complex than you seem to suggest.
    BB, btw, for all their genetic similarity to CW, subsequently effected the next complete shift in yDNA (from R1a to R1b), before early Unetice re-introduced Mesolithic yDNA (I, C) in substantial proportions.

    Here is another example, which I just came across by chance when trying to learn a bit more about the EBA in SEE – if you don’t know it yet, it is anyway good background for contemplating linguistic developments there:
    https://academic.oup.com/mbe/article/40/9/msad182/7240678

    “We report 21 ancient shotgun genomes from present-day Western Hungary, from previously understudied Late Copper Age Baden, and Bronze Age Somogyvár–Vinkovci, Kisapostag, and Encrusted Pottery archeological cultures (3,530–1,620 cal BCE). Our results indicate the presence of high steppe ancestry in the Somogyvár–Vinkovci culture. They were then replaced by the Kisapostag group, who exhibit an outstandingly high (up to ∼47%) Mesolithic hunter–gatherer ancestry, despite this component being thought to be highly diluted by the time of the Early Bronze Age. The Kisapostag population contributed the genetic basis for the succeeding community of the Encrusted Pottery culture.”

    Intriguing is not only the fact that by ca. 2.200 BC there had been an extremely HG-rich, apparently hardly Steppe-affected population around in CE that had been able to genetically “overpower” the preceding, Steppe-rich Somogyvár–Vinkovci culture. They also found that population to be a WHG/ EHG mix, with slightly more EHG, whereby the WHG component was reminiscent of the one found in FBC/GAC (->Schönfelder?), the EHG one of that in Ukraine EN. However, they state that this particular mix is so far undocumented in the aDNA record. Still: “Individuals with this ancestry predating Bk-II by only a few generations appeared in Czechia, Northern Hungary, Eastern Germany, and Western Poland, indicating that the Kisapostag-associated population probably came to Transdanubia via a northern route” (good map in the article). [They didn’t check the Wartberg samples in that respect – my hunch is that these would also have provided a decent fit, given that their WHG ancestry was quite “eastern” (Korös/ Iron Gates-like)]. Moreover, the HG profile (somewhat diluted, o/c) made its re-appearance in the LBA Tollense samples.

    The subsequent MBA Encrusted Pottery Culture then is genetically described as “dilution (..) driven by contact with various local populations, genetically best represented by later Transdanubian Hungary_LBA or Serbia_Mokrin_EBA_Maros”. The dilution is in the range of 25-30%, re-introducing yDNA R1b-ZZ103 (completely missing in the Kisapostag samples), but 3 of the 5 male Encrusted Pottery samples still have yDNA I2a-L1229.

    Here you go with your (interesting) observation of Balkans (Bulgarian) aDNA having introgressed into Czechia. They can’t have arrived along the Danube. In fact, as per https://en.wikipedia.org/wiki/Encrusted_Pottery_culture: “The Encrusted Pottery culture expanded eastwards and southwards along the Danube into parts of Croatia, Serbia, Romania and Bulgaria in response to migrations from the northwest by the Tumulus culture”. The path must have led north of the Carpathians – and the Balkans remain puzzling to me, also linguistically.

  43. On a more general note: There isn’t any standard linguistic “birth rate” that universally defines the time after which a proto-language produces offspring, in the sense of individual daughter languages, and ultimately sub-families. Beyond the proverbial saying “A language is a dialect with a flag and a navy”, pointing out the political dimension, there even seems to be no widespread consensus on where a dialect continuum ends and a language family starts [Sorry, Hungary – no navy! And greetings to Luxemburg – I find Letzebüergisch pretty hard to understand. The same applies to Schwyzerdütsch…].

    At least two factors seem to play a role when it comes to forming daughter languages/ families, and I have looked around a bit for respective benchmarks.

    The first factor relates to morphology: Semitic languages, possibly Afro-Asiatic as a whole (estimated time depth of 16.000 years, far longer than any other language family I am aware of), are known to be especially resilient to change, owing to their focus on 3-4 consonant roots. A case in point is Arabic, first documented around 400 BC, so by now some 2.400 years old, w/o having produced daughter languages. Well, in fact, there are now at least four distinct regional dialects, different enough from each other that Algerian films require subtitling to be shown in the Gulf states. While still being called “dialects”, technically we might regard them as daughter languages. Still, this would give us just one linguistic generation over 2.400 years, as a first “upper limit” benchmark.

    The second factor is the intensity of foreign language contact. Again, for an upper benchmark, take Hawaiian, as an example for Polynesian languages as a whole, with very little foreign language contact prior to European exploration/ colonisation. It is believed to go back to Marquesan settlement maybe in the 4th century C.E., latest in the 6th century, with later settlement (9th cent. CE) from a/o Samoa. Acc. to https://en.wikipedia.org/wiki/Hawaiian_language: “Jack H. Ward (1962) conducted a study using basic words and short utterances to determine the level of comprehension between different Polynesian languages. The mutual intelligibility of Hawaiian was found to be 41.2% with Marquesan, 37.5% with Tahitian, 25.5% with Samoan and 6.4% with Tongan.” 41.2% mutual intelligibility with Marquesan is probably beyond the “dialect” stage, so Hawaiian is clearly a language in its own right. But from ca. 600 CE to 1896 (when English became the official language), we are talking 1.300 years. During that period Hawaiian has produced differing dialects on the various islands, but nothing that any linguist seems to regard as a daughter language.

    On to IE: The classical benchmark, for a medium to high foreign contact scenario, is Old French. It was first documented in the Oaths of Strasbourg in 842. This was exactly 900 years after Caesar started the conquest of Gaul, so a nice and round benchmark. I call it a “medium to high contact” scenario, because there had obviously been Continental Celtic substrate involved, Aquitanian/ Basque, the usual Roman mixing of populations, including oriental Jews, Germanic (Franks, Visigoths, coastal contact with Vikings/ Normans), remigrating Insular Celts into Brittany, plus other migrating groups at least passing through, partly also settling (Alans/ Ossetians, e.g. around Tours, possibly a few more Caucasians migrating with them).

    Lower Benchmark: Latin America, more than 500 years after Columbus, w/o forming Romance languages of its own. Certainly also medium to high foreign contact, that, in addition to native languages, may include West African slaves, and immigrants from outside Iberia – on top of Spain’s linguistic diversity exported there.

    Under these considerations, I am generally fine with Germanic having a 3-tier structure (family, sub-families, individual languages). The fourth tier, as now becoming apparent under High German with Schwyzerdütsch, Letzebüergisch and Modern High German, is one too many. But it is perfectly explainable, because High German actually split too early from West Germanic. And did so for a good reason, namely Romance speakers shifting to West Germanic (Schrijver has discussed this in great detail).
    With the common dating of Proto-Germanic to ca. 100 BC, the separation of North and West Germanic was actually also too early (and to me, North Germanic feels quite more separated from German than, e.g., Italian is from Spanish). The underlying reason should equally have been substantial language shift / acquiring substrate. The process has been ongoing in Norway when it comes to Saami, albeit the originally absorbed substrate might have been a completely different language.

    Then – Italic: The linguistic diversity is documented from the late IA / Roman republican period, and certainly goes beyond dialects. By the 6th cBC, there existed 3 clearly distinguishable families: Latino-Faliscan, Osco-Umbrian, and Venetic (sometimes considered a separate family inside IE). They have shared certain defining sound shifts, especially PIE “Gh”->“h” (“hortus” – “garden”) and “Bh”->“f/v” (*bhergh->Lat “for(h)tis”, German “Burg”; “frater” – “brother”), possibly under Etruscan influence. They also share the reversed sequence of qualifier (noun) and specifier (adjective), e.g. saying Villa Nova instead of New Town. The latter is ascribed to Semitic (Punic) influence, and might have occurred only by the 7th or 6th cBC. Latin “hospes” (host, lit. “guest master”) preserved the standard IE sequence in fossilised form and attests to the recency of the shift.
    Although it cannot be excluded that language contact with Etruscan and/or Punic introduced specific changes in just one or two of the sub-families, the fact that all three have been affected by similar changes suggests that the internal differentiation has other reasons, reaching further back. Some glotto-chronologists think of as far back as 4.500 years, which seems too long to me. But the Heggarty e.a. 2023 paper places Proto-Italic some time around 2.000 BC, which could make sense to me. If we assume development outside Italy, e.g. in the Balkans, and entry around 1.200 BC, with the transition from the Terramare to the Villanova Culture, the three groups/ families should already have entered fairly differentiated, at the transition from “somewhat, but hardly, mutually understandable dialects” to separate languages. Which seems implausible to me. Alternatively, several authors consider two or more waves of Indo-Europeanisation, with Venetic possibly being the offspring of the latest (Villanova) wave. That doesn’t rule out entrance from the Balkans, with in that case another wave crossing over from Albania. Still, Italic remains a specific puzzle that seems far from being solved to me.

    On to Celtic. First, two general remarks:
    1. Hallstatt was most certainly not Celtic, at least not exclusively. In all likelihood, Celtic wasn’t even the dominant language. The name-giving village of Hallstatt is surrounded by Venetic inscriptions to the South (Gailtal, Carinthia) and the West (Ampass/ Tyrol, a bit W of Innsbruck), and the antique Tergolape, a clearly Venetic name (cf. Tergeste->Trieste, Opitergium->Oderzo), is currently associated with Schwanenstadt/ AT, 70 km to the north of Hallstatt. So, the chance is high that the people from Hallstatt proper spoke Venetic or something closely related to it. Hallstatt had substantial cultural influence from the NW Adriatic Sea, most likely transmitted via, and by, Venetic speakers. Intriguing in this respect is also how the Czech/ Slovak/ Ukrainian “G”->“H” sound shift (“gora”->“hora”) mirrors the Italic shift described above. If you apply Venetic sound shifts to PIE *bʰérǵʰos (hill, mountain), you arrive at something like vrch[s] – which is a Czech word for hill or mountain (Proto-Slavic *verh, w/o accepted IE etymology, if one excludes borrowing from Venetic).

    Having said that, modern archeology distinguishes between East and West Hallstatt. For good reasons – there is quite some evidence of military conflict. And the Ehrenbürg in Franconia, a densely settled plateau with some 3.000 inhabitants, believed to have been a major centre of East Hallstatt, was completely destroyed and not resettled at the transition between (East) Hallstatt and La Tène.
    West Hallstatt, OTOH, may well have been Celtic. Unlike Bavaria, SW Germany is also where we find multiple Celtic toponyms ending in -dunum (“enclosure”, e.g. Cambodunum -> Kempten, Tarodunum -> Zarten n. Freiburg), or -briga (“hill, city”, Sarabriga -> Sarrebruck, Bregenz [<-Brigantium]). Hence, your linked article, dealing with SW Germany (West Hallstatt), may well be correct when addressing the inhabitants as Celts [but note how careful they are to speak just about West Hallstatt, not Hallstatt in total].

    2. Some Greek historians (Herodotus?) stated that the Celtic homeland was near the source of the Danube, which has led many historians to locate it in SW Germany. However, antique authors often selected different source rivers. Ptolemy, e.g., seems to have taken, orographically correctly, the Vltava as the main source of the Elbe (and possibly the Oker as the source of the Weser). If we take the Greek statement as meaning "near the source of the Inn", we end up near the lakes of Como and Lugano, where the oldest Celtic (Lepontic) inscriptions have been found.

    Homeland aside: By the 6th cBC, [Continental] Celtic was also already quite differentiated into at least Celtiberian, Gaulish, and possibly Lepontic (arguably just a dialect of Gaulish). The differentiation is explainable from different language contact, acc. to Schrijver in the case of Celtiberian, e.g., with Iberian (and Lepontic with Raetic/ Etruscan). Therefore, Proto-Celtic doesn't require such a time depth as does Italic. Nevertheless, developing into a distinct family requires a reasonably early differentiation, also geographically, from whichever relative was closest (Italic, as per Schrijver, or Germanic, as per Heggarty e.a.). Heggarty e.a. place the split between Pre-Germanic and Pre-Celtic in the third millennium BC.

    This works with the "IE from the Steppe" theory, especially if you assume Pre-Germanic as Single Grave, Pre-Celtic as BB-derived. And the time depth might be shortened under the assumption that both absorbed a fairly different linguistic substrate, which would have been a relatively "classic" continental EEF language in the case of BB, but a heavily WHG-overformed (maybe even WHG-EHG-dominated) TRB/ GAC language for SGC. However, I fail to see how the Balkans could have brought forward such differentiation.

    I'll leave it here with IE for the time being. Proto-Balto-Slavic is a special case anyway, Proto-Greek is complex because of its interaction with several Anatolian languages, and there seem to be other readers who are far better acquainted with Indo-Iranian than I am and can take up my lines of thought in that area.

    But the key problem is: To explain the diversity of Western IE, you need quite some time depth, and the presence of fairly different linguistic substrates that are being absorbed. Ideally both, because language contact/ shift substantially reduces the time depth required for linguistic differentiation. Some 2-3 millennia just in the Balkans seem too short for that – and we need to look at ca. 500 BC as the period when Italic, Celtic and Pre-Germanic were already distinct sub-families, not just sister languages.
    Moreover, your NEBA hypothesis, as it is presented, would lead to incoming IEs encountering a linguistically fairly homogeneous population. You are talking about some 1.500 years between CWH/NEBA arriving and then switching to IE. Without intensive language contact (depopulated, females gradually absorbed, no surviving LN population around), we might at best assume one linguistic NEBA generation, i.e. a split into a number of languages that are still pretty close to each other, partly still mutually understandable (think of South Slavic and a comparable time scale, for that matter). I can't see such a linguistically homogeneous population bringing forward the differentiation between Celtic and Germanic – not to speak of the specific features of Gaulish, which appear to live on in modern French and make it such an outlier among Romance languages.

    You have addressed that problem yourself in a comment above, as "shallowness of the IE family in Europe". The only plausible solution to this problem IMO is that incoming IEs encountered fairly different substrates to interact with and ultimately absorb, including by language shift of a sizeable portion of "natives".

    Which calls, first of all, for the LN population to have already spoken fairly different languages. Various degrees of interaction with HGs (and, apparently, different kinds of HGs, from Spain-like to substantially EHG-enriched), plus linguistic differentiation already between incoming "island hopping" and "Danubian" ANFs, would have provided for such substantial differentiation of LN languages. But secondly, this differentiation needs to have survived until IEs started coming in (or effecting language shift in other ways). This precludes any "NEBA reset" – at least in the radical form proposed by you.

  44. Frank,

    You do raise a lot of interesting questions there (which, needless to say, is what I want and expect from the comments here), and it will take me a while to address all of them (and for some I won’t have any clear answer; I wish I could have answers for everything). So for now I’ll start from the beginning, with your first point, which goes to the core of this debate:

    “1. “Given that Basques descend from a population that arrived to their approximate current location ca. 2400 BC..” Is that actually so? I mean, for the males you can probably say so. But from what I remember (correct me, when I am wrong), Basques and Sardinians are still the most EEF-like populations in modern Europe.

    Which takes us back to the point of if & under which conditions newly arriving males can effect language shift.”

    There is a reason why I started this post with Western Europe, claiming it to be the easiest part: it was at the west end of the world back then and therefore free of the complexities present in other places closer to the centre, and the sampling available for the relevant periods is quite extensive.

    Since you refer there to the males as the main drivers of the CWH/BBC migration, I really have to correct you about this. This was a migration of communities of people, LN/EBA people, shepherds, with children and grandparents, husbands and wives. Not just because we were still far away from the time when armies existed, but more importantly because the data shows us this. If the people from the steppe had been migrating as groups of men and acquiring women from Neolithic (EEF) communities, their steppe ancestry would have been diluted to almost nothing in a few generations (by 2500 BC they would be indistinguishable from EEFs except for their Y chromosome).
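
    A minimal sketch of the dilution argument above, under a deliberately crude assumption that is not taken from any of the cited papers: in every generation all fathers come from the migrant line and all mothers carry 0% steppe ancestry, so the steppe share simply halves. The function names and the 1% "almost nothing" threshold are illustrative choices, not established figures.

```python
def steppe_fraction(start: float, generations: int) -> float:
    """Steppe share after `generations` of fathers from the migrant line
    mating only with mothers who carry no steppe ancestry (halving model)."""
    return start * 0.5 ** generations


def generations_until_below(start: float, threshold: float) -> int:
    """Number of generations until the steppe share drops below `threshold`."""
    generations = 0
    share = start
    while share >= threshold:
        share /= 2  # each child averages the father's share with a 0% mother
        generations += 1
    return generations


if __name__ == "__main__":
    # Hypothetical starting points: unadmixed migrants (100%) and
    # already half-admixed migrants (50%).
    for start in (1.0, 0.5):
        n = generations_until_below(start, 0.01)
        print(f"start={start:.0%}: below 1% after {n} generations")
```

    With an assumed generation length of roughly 25 years, a handful of generations corresponds to well under two centuries, which is the order of magnitude the argument relies on. Real populations are of course not this uniform; the sketch only shows how fast halving works.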

    As for Basques being (other than Sardinians) the most EEF-like population in Europe (in the autosomes), well, that would be more or less correct. Not by much, but yes. However, it’s important to understand the process as it happened. Olalde et al. 2019 put it this way:

    “We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry.”

    But note that the steppe populations that reached Iberia didn’t come from the Lower Don directly. They had to cross Europe over a few centuries. Before they left Central Europe and headed to Iberia they had already replaced 50% of Central Europe’s ancestry (here taken as the same thing as Iberian ancestry, i.e., EEF ancestry). All this 50% Central European ancestry present in those steppe populations from Central Europe was of Danubian origin, not Cardial. It was then, when crossing France and reaching Iberia, that they acquired an extra 10% of EEF admixture (this time of Cardial origin). Which would make the Iberian BBs ~40% Yamnaya (that’s the 40% replacement in the study’s quote above) plus ~50% Danubian EEF (well, late European Farmer would be more appropriate, since they had more WHG ancestry by then) plus ~10% Cardial EEF (idem). This is why Basques are closer to EEF: the small amount of admixture they got from local communities was added on top of the large amount of admixture already acquired in Central-Eastern Europe. Even a 1% addition from local sources would make them closer to EEF. But that “added total” is not a measure of the likelihood of switching to an EEF language.

    This is why I already mentioned above that Iberia is not the right place to look for a language shift for the steppe populations. If they were going to shift to some EEF language, they would have done so already in Central Europe, where they had 50% EEF ancestry. Not from the 10% between France and Iberia. Whatever language those BBC folks spoke in Central Europe is the language they brought to Iberia, France and Britain. The case of Britain is even clearer than that of Iberia, since we have no evidence, not even indirect, of any Neolithic communities surviving by the time the BB folk arrived. The reason why I don’t use Britain as an example instead is that there we don’t have attested languages until much later. We just have the specific substrate in Insular Celtic as a good clue, but that’s never going to be as good as actual written evidence of a language. And needless to say, if BB males had migrated to Britain alone, they would have gone extinct in one generation (and if the island was not deserted and they were getting EEF females for reproduction, their 50% steppe admixture at arrival would have disappeared in less than two centuries).

    Then you question whether (or rather deny that) the Neolithic communities actually disappeared, saying they were just invisible to the ancient DNA record. But that’s clearly not correct, at least for the most part (in most of the areas occupied by the CWH). In the case of Iberia and Britain, we have extensive sampling not just from the time of arrival of the BBC, but for the next 2000 years. The Neolithic communities were not invisible to the aDNA record, or we would have seen them eventually. Or at least we would have seen them indirectly in the genes of the descendants of the steppe people. And why would settled agricultural communities of the late Neolithic, with dwellings, tools, pottery, etc., become invisible in the first place? The fact is that they indeed disappeared, and this point is crucial for what I’m trying to explain in the post.

    “Last but not least, there has been an obvious founder effect for R1b males.”

    Yes, but founder effects within different subclades of R1b only (in most of Western Europe). Since pretty much everyone was already R1b, any founder effect of any lineage would still be under the same typical western European R1b branch. IOW, the almost 100% R1b in Bronze Age Western Europe is not due to any founder effect. It’s just due to the people who repopulated the area being almost 100% R1b since they arrived.

    Yes, other places closer to contact zone areas (Carpathian Basin-Moravia/Bohemia, for example) are more complicated. Maybe other places did see a survival in exceptional cases of some Neolithic communities (you mention the Schönfeld Culture, which I don’t know any details about so I can’t comment much about it), but exceptions are just that. They don’t change the overall picture.

    I’ll have to read in some detail the papers you linked above to comment further on this last point. Especially interesting for me is the origin of those I2a haplogroups in the Unetice Culture. Back then the resolution was too low to know: not all I2a comes from Neolithic communities (ultimately from WHG), but some lineages were on the steppe and came from there. From a quick look at the paper you linked, those may have come from “Early CWC”, which would mean steppe in origin (should be expected), but I should check more carefully.

    I’ll also leave the more complicated points (like those about language diversity in European IE languages) for another reply. That’s a complex topic, as I’ve discussed in the blog before. Not only is there no standard rate at which languages evolve, but even the direction of their evolution can go either way (“more time = more diversity” or “more time = less diversity”). In any case, it is a good point to raise the difficulty of reconciling the shallowness of IE languages in Europe with the significant diversity among them.

  45. Before continuing I wanted to make sure my memory served me well regarding the steppe admixture in Etruscans. And I can confirm that they do have enough steppe for a convincing case. Yes, we can’t have almost 100% certainty that their language came from the CWH like in the case of Basque/Iberian, but it’s clearly a good possibility. Here you can see that they can be modelled as deriving 85% of their ancestry from Bell Beakers from Italy, while of the 25 Y Chr haplogroups available, 19 belong to R1b-L51+ (76%). For a run using Yamnaya directly as a source (not realistic, but to get an idea of how much actual steppe admixture there is) here it is, showing some 29% (I also checked with BB from Bavaria, and it was around 52%).

    That’s clearly more steppe than any population from the Balkans, let alone West Asia and beyond.

  46. @Alberto: Thx for the extensive replies.

    1. Etruscan may be a case for CWH/NEBA. Or equally, if representing a surviving EEF language, a case that Basque may also represent such a language. I am open to both options. But it certainly makes sense to explore the cases of Basque and Etruscan [Raetic] simultaneously in more depth.

    2. Britain: Although Patterson e.a. 2022 (https://pmc.ncbi.nlm.nih.gov/articles/PMC8889665/) report signs of the EEF share increasing already during the EBA, including in Scotland, which was relatively unaffected by MBA immigration from the Continent, I agree that the shift is minor, and Britain is indeed a case of virtually complete population replacement by “Steppe” immigrants.
    Which would make Insular Celtic a prime subject for identifying NEBA (“Vasconic”) substrate. While Vennemann’s “Germania Vasconia” theory has received some review, and overwhelming rejection, when it comes to [West] Germanic, I am not aware of similar studies for Insular Celtic. This blog post https://euskerarenjatorria.eus/?p=38538&lang=en actually suggests Vasconic substrate in Insular Celtic. However, this may also stem from language contact, which certainly has existed, especially with Ireland, and may have been quite intensive over the last 3-4 millennia. Anyway, the issue deserves further consideration and study.

    Some side notes:

    a.) The Orkneys (unsure, whether they technically even qualify as Britain) seem to have been an outlier. As per https://www.pnas.org/doi/10.1073/pnas.2108001119: “As elsewhere in Bronze Age Britain, much of the population displayed significant genome-wide ancestry deriving ultimately from the Pontic-Caspian Steppe. However, uniquely in northern and central Europe, most of the male lineages were inherited from the local Neolithic. This suggests that some male descendants of Neolithic Orkney may have remained distinct well into the Bronze Age.”
    Just a footnote to the overall picture, but maybe relevant when it comes to Pictish (possibly also the substrate in Irish). I also remember one paper co-authored by Schrijver that dealt with substrate in North Germanic and Saami, and identified some non-IE substrate shared by Scottish Gaelic and North Germanic [Old Norse]. I haven’t bookmarked the paper and will need to try finding it again.

    b.) Patterson e.a. 2022 report that “average EEF ancestry increased in North-Central Europe (Czech Republic/Slovakia/Germany)” during the EBA – actually significantly (from approx. 34% to 48% acc. to their Fig. 4). For the Netherlands, they report an increase from approx. 27% to 31%. Now, that may reflect EBA immigration from EEF-rich regions, e.g. the Balkans – although the Netherlands are quite remote from there. More likely IMO, however, is that your NEBA hypothesis has been formulated too radically, ignoring sizeable pockets of non-Steppe-affected populations (in the case of NL e.g. the flint mines of Spiennes, and also Rijckholt, see my comments above).
    To the extent Davidski has processed the Patterson 2022 data, it might make sense to explore it further. I would focus more on the HG than on the EEF element – differentiation between Goyet/ El Miron, Villabruna/Loschbour, KO1/Iron Gates and EHG (UA EN) seems to be the best way to distinguish local admixture from immigration, and eventually to identify the direction of immigration, if there was any.

    c.) Patterson e.a. also report substantial additional Steppe introgression into Iberia between the CA and the MBA (EEF down from 64% to 59%), w/o further discussion/ analysis.

    3. The above observation takes me to Iberia. Let me concede in advance that I stopped following aDNA studies intensively in 2020, and may not be up to date on recent findings. Your guidance is certainly appreciated in this respect. Nevertheless, I am aware of the Olalde papers, and also Villalba-Mouco e.a. 2021 (https://pmc.ncbi.nlm.nih.gov/articles/PMC8597998/). My take-aways from them are:

    a.) While Steppe ancestry in Iberia arrived during the CA (BB-period), it wasn’t until the EBA, i.e. after 2.200 BC, that specific R1b lineages assumed a dominant position. Which means that we have to consider some kind of “founder effect”, or maybe “survivor effect”, after the apparent socio-economic crisis and turnover that hit at least S. Iberia around 2.200 BC (discussed in more detail in Villalba-Mouco e.a. 2021, with possible reasons relating, among others, to the 4.2k climate event well documented for the E. Mediterranean, maybe also epidemics [Plague], and unsustainable land use).

    b.) The “Steppization” process appears to have been gradual, over time, and on-going during the EBA (compare Patterson e.a. 2021). If so, it would have increased the likelihood of the immigrants having been absorbed w/o language shift.

    c.) At least in S. Iberia (El Argar), there is little indication for “Steppization” by immigration of “shepherds, with children and grandparents, husbands and wives”, as you have put it. In fact, Villalba-Mouco e.a. 2021 report “rejection of models involving Germany_Bell_Beaker + C_Iberia_CA. (..) Notably, Bastida_Argar also failed for the distal model.” Instead, they propose an Iran-N-enriched source: “However, these three groups returned values ≥0.05 in the proximal local CA substrate model when Iran_N was added as a third source”. While failing to identify the specific origin of that source, possibly because of undersampling of the source regions in question, they state that “adding a central Mediterranean population to the outgroups (Sicily_EBA, Greece_EBA, or Greece_MBA) decreases the model support (P values) for Almoloya_Argar_Early and Almoloya_Argar_Late, indirectly attesting to the importance of central Mediterranean BA.”
    My takeaway is that we need to consider a “Steppization” component from the central/ eastern Mediterranean, most likely (via) Sicily, which would at least complicate, albeit not necessarily invalidate, your CWH/ NEBA idea.

    Having said that: I concede again that I have not followed recent studies in detail, and appreciate any guidance from you, especially as concerns your postulated immigration of “shepherds, with children and grandparents, husbands and wives.” If such immigration can be confirmed from aDNA, it is certainly a weighty argument, although one may still question the linguistic impact of sheep-herders vs. elites controlling the tin trade from Britain to the Mediterranean, or gold and silver mining in Andalucia (El Argar).

  47. I forgot the most important note on Patterson 2021: The absence of significant IA migration into Britain makes its Celticisation difficult to explain. We need to go back to the MLBA, starting ca. 1.500 BC, and have to assume that the migrants into Britain, most likely and/or in majority from NE France, already spoke Celtic or some kind of Para-Celtic (“Nordwestblock”). Which, under your NEBA theory, would push back the IE-isation of NE France and in consequence Central Europe accordingly, i.e. into Tumulus (at the latest) instead of Urnfield.

  48. Frank,

    Let me first also admit to not have followed too closely the studies from 2020 onwards. I just followed them distantly, and missed quite a few. I had to catch up a bit while writing the post and still doing it after publishing it. My feeling during these years was that what was coming out didn’t significantly change the picture, though.

    1. Britain: Clearly Britain is the strongest case of population replacement. If the steppe populations found any surviving Neolithic groups, they were really very few and basically undetectable in the genomes of the BB groups.

    For the Insular Celtic substrate, see https://www.degruyter.com/document/doi/10.31826/jlr-2012-080111/html. But here I should reiterate my opinion that the study of substrates in N and W Europe is clearly biased, because a very large portion of them is considered IE just because it appears all over Europe (I’ll mention again that 50% pseudo-IE substrate in Catalonia, which must actually be Iberian, and is shared all over the continent). Until this is rectified, it will be very difficult to make sense of the substrates by looking only at those not shared between different regions.

    2. The paper about Bohemia you linked above has some very interesting samples regarding this discussion. For example, it has some Corded Ware samples that have no steppe ancestry, and as you can probably guess all of them are females. Three are from the site Vliněves (VLI008, VLI009 and VLI079) and one from Stadice (STD003), with dates respectively of 2894-2703 calBC, 2850-2497 calBC, 2853-2503 cal BC and 3010-2889 cal BC. So here we have the direct evidence of females being incorporated from Neolithic communities (in this case, I guess some late Globular Amphora communities, since it’s always been Globular Amphora samples the best fit for the EEF admixture in CWC groups).

    The paper also features several CWC samples that basically have no EEF admixture (KON005, modelled as 98.2% Yamnaya in table S9, dated 2868-2586 calBC; OBR003, 93.5% Yamnaya, 2911-2875 calBC; VLI076, 92.5% Yamnaya, 3018-2901 calBC), the latter one a female, with many others above 80% Yamnaya, several of them females too, showing that this area was of particular importance when it comes to the early interactions between steppe groups and Neolithic ones. It’s around this area, extending further west to Bavaria (still a relatively small zone), that the CWH people went from almost 0% EEF to around 50% EEF, and it is therefore the one that would matter most when it comes to any linguistic influence/shift (though I’d still say that is very unlikely given the type of interactions).

    Interesting too are the early CWC samples carrying R1b-M269, and R1b-L151 which later became the BBC marker, showing that there cannot be further doubts about both cultures being the same people.

    When it comes to Únětice, what the data shows is that it started with a significant input coming from the NE, bringing higher steppe admixture (compared to the late BBC preceding it) and different male lineages. They also don’t have any high resolution when it comes to the I2a lineages, but it seems incompatible with the area and genetic profile of the incoming population that those I2a could come from EEF populations.

    I’ll elaborate on your other comments when I can get more time, but as a quick note, the very high steppe females among the samples in the Bohemia paper provide direct evidence of females being part of these steppe migrations. Not that we really need that direct evidence. If I can get enough time I’ll try to show it with numbers (and maybe in Iberia if that’s a better place), but I can confidently say that a male migration (or largely male) is incompatible with the samples we have. These were communities of people, families, moving together. For example, from the Villalba-Mouco et al. 2021 paper:

    “This observation suggests a substantial amount of steppe-related ancestry in El Argar BA individuals, which we tested formally and directly with f4-statistics of the form f4(Argar_Iberia_BA/SE_Iberia_BA, SE_Iberia_CA; Yamnaya_Samara, Mbuti) (fig. S5A and table S2.7). Significantly positive f4-values confirmed the presence of steppe-related ancestry in all BA individuals. We then tested for differences in affinity to steppe-related ancestry by contrasting northern versus southern BA individuals using f4(N/NE/C_Iberia_BA, Argar_Iberia_BA/SE_Iberia_BA; Yamnaya_Samara, Mbuti) (fig. S5B and table S2.8). The resulting f4-values confirmed a smaller amount of steppe-related ancestry in individuals from the Argaric sites La Almoloya and La Bastida compared to the rest of Iberia_BA groups, especially when compared to those from northern Iberia (fig. S5B and table S2.8), despite the complete turnover to lineage R1b-P312 (except for one subadult male in La Bastida) visible in the Y-chromosome record (Fig. 3B, table S2.6, and text S8). However, at the intrasite level, we observe no significant differences with respect to the amount of steppe-related ancestry between the early and late phase of La Almoloya and La Bastida based on PCA and formal f4-statistics (Fig. 3A and fig. S5), which suggests that the contribution is homogenized across the population.”

    As expected, there is a small difference in steppe ancestry between north and south Iberia (that’s already a larger area than the “core” Bohemia-Bavaria one mentioned above) due to still incorporating a small number of females from Iberia itself. There’s not too much in it, though. In Table S2.13, they test several models for different BA populations of Iberia as a mixture of C_Iberia_CA_Stp (these seem to be the samples with highest steppe) + something else. They get the best fits when adding Iran_Ganj_Dareh_Neolithic as a second source, but notice that it’s with negative values of it. The best fits without negative amounts are with Jordan_PPNB. As such, NE_Iberia_BA is modelled as 94.6% C_Iberia_CA_Stp + 5.4% Jordan_PPNB. For N_Iberia_BA it’s 91.2% and 8.8% respectively. And for SE_Iberia_BA_Argar (I assume this is the average of all the sites of El Argar and all timeframes) it’s 86.8% and 13.2% respectively.
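
    For readers unfamiliar with the f4-statistics in the quoted passage above, here is a minimal sketch of the basic quantity: f4(A, B; C, D) is the average over SNPs of (a - b) * (c - d), where a, b, c, d are allele frequencies in the four populations. The frequencies below are invented toy numbers, not data from the paper, and the sketch leaves out what the formal tests add (hundreds of thousands of SNPs and block-jackknife standard errors to decide whether a value is significantly different from zero).

```python
from statistics import mean


def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d),
    with one allele frequency per SNP in each list."""
    if not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("all populations need the same number of SNPs")
    return mean([(pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d)])


# Toy allele frequencies at four SNPs (purely illustrative values).
outgroup = [0.1, 0.5, 0.9, 0.3]   # plays the role of Mbuti
baseline = [0.2, 0.4, 0.8, 0.3]   # plays the role of a pre-steppe group
steppe = [0.6, 0.1, 0.3, 0.8]     # plays the role of Yamnaya_Samara

# A hypothetical test group built as 70% baseline + 30% steppe-like source.
admixed = [0.7 * b + 0.3 * s for b, s in zip(baseline, steppe)]

# Positive when the test group shares extra drift with the steppe source
# relative to the baseline; zero when test group and baseline are identical.
print("f4(admixed, baseline; steppe, outgroup) =",
      f4(admixed, baseline, steppe, outgroup))
print("f4(baseline, baseline; steppe, outgroup) =",
      f4(baseline, baseline, steppe, outgroup))
```

    The sign is what the quoted tests read off: a significantly positive value of f4(test, pre-steppe; Yamnaya, outgroup) indicates steppe-related ancestry in the test group.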

    I’d still agree with your impression of my too radical formulation of the dynamics of what happened with the CWH populations and their replacement of former neolithic people. But that’s from the point of view of someone into ancient DNA genetics. The reason I didn’t go into small details and nuances was for the sake of simplicity. I honestly don’t think that those nuances are enough to have a real impact on the outcome, so when writing for linguists and needing to cut the technical information of the post I just summarised it in a possibly too radical way for some tastes. I hope your comments are helping to give a broader perspective on the subject.

    More as I get time for it…

  49. Sorry, a correction about those models from Villalba-Mouco et al. I mixed up the distance with the P-value: since they’re giving the P-value, higher is better, not lower as I was taking it. Looking at the best models, which are often two-way mixtures of C_Iberia_CA_Stp + C_Iberia_CA (i.e., local steppe + local pre-steppe), for NE_Iberia_BA the local steppe source peaks at 75%. In central Iberia it is down to 56-58%. And in El Argar (average) it is at 42.6%. For the latter, adding a third source gives a good model with Ganj_Dareh_Neolithic and with Jordan_PPNB, but both with negative percentages. However, for some specific sites from El Argar it does work well with positive values for Ganj_Dareh_Neolithic. For example, La Bastida works best with 7.2% Iran Neolithic.

    In general, it’s not surprising that the SE part of Iberia was able to incorporate the largest number of Neolithic females, given that this area was the most populated one and lasted longest, as the Los Millares Culture. Nor is it surprising that it had some contacts with the eastern Mediterranean. But still, if one wanted to argue that Iberian came from Los Millares (which, again, is very unlikely to happen by just taking some women from it), it could work for Iberian, but it would fail to explain why Aquitaine in SW France also spoke a related language. And regarding the eastern Mediterranean it’s pretty much the same. So overall, interesting details, but they don’t change the picture.

  50. Jaydeep,

    Reading that paper by Rune Iversen was quite interesting. He argues that the CWC/BBC that first came from the steppe could not have introduced the IE languages based on the lack of concepts and features that are part of the PIE language. Where I disagree is with his solution to the problem:

    “To adopt new words into a language that describes concepts and features unknown to its speakers seems to go against the paleolinguistic method. These concepts, together with signs of Indo-European mythology, first appeared in the Early Bronze Age, period IB/II, c.1600/1500 BC (i.e. c. 1200-1300 years after the Single Grave culture and the supposed introduction of Indo-European). Hence, we must expect at least a “second round” of influences from the steppes introducing new words (originating in Proto-Indo-European vocabulary) together with new features such as woollen clothes, domesticated horses, spoke-wheeled chariots and figurative mythologically loaded iconography. A driver for this development could be the Sintashta chieftains.”

    While Sintashta (via its descendant culture that moved to the west, Srubnaya) could be responsible for providing the horses and chariots, I can’t see how the figurative art could be related to either of them. While Sintashta already had a bit of influence from Turan (David Anthony elaborates on it), it was small, and the material culture of the Srubnaya Culture still looks very crude (compared to that of the Scythians that succeeded it, the difference is night and day). But Scythians are too late to have brought these changes to Denmark. We must look at SE Europe for that. Besides, it doesn’t really make sense that the CWC was non-IE but then Sintashta was IE, since they were the same people. So again, we need to look at SE Europe for a source for the language if one wants to argue IE arrived at such a date (1600-1500 BC).

  51. Frank,

    Re: the arrival of Celtic to Britain, the new paper just published is more informative than the old one you mention (Patterson 2021). Here’s the link again: https://www.nature.com/articles/s41586-024-08409-6

    In general, I think that by the middle Bronze Age the elites and the networks associated with them started to be solidly established, and much more so by the IA. Since the expansion of Celtic didn’t occur in a similar way to that of Latin (with a centralised system), it’s likely that the “Celtic package” (iron included) spread through these already established networks, and especially through their elites. In cases where elites were in conflict with each other it’s likely that there was a takeover. But in cases where they were allies, the package was probably just transferred along. The question of how the language spread with this package is open to debate, but it could have been adopted by the elites first as a form of prestige language and for communication, and then been gradually adopted by the rest of the population. Ancient DNA won’t show big movements, I’m afraid. So we’ll have to figure things out by looking at the details (sometimes genetic, sometimes archaeological).

  52. Alberto,

    I am with you on this one. Iversen’s solution does not make sense. I agree that IE must have spread into Europe via the Aegean & the Balkans. The second wave, as he calls it, is not coming from the steppe. Furthermore, this is not the second wave but the first wave of IE.

    But his paper is nevertheless important because he is honestly pointing out that the IE evidence in Europe actually appears about 1,200 years later than the supposed entry of IE from the steppe.

  53. First a note on Iversen: Wheels were already known and used in the Mesolithic and EN, e.g. as flywheels, for drilling shaftholes into axes. The LN innovation was developing strong enough bearings to allow for wheels being placed under load-bearing carts. But these people didn’t re-invent the wheel (just the bearing). So, there is no reason to assume PIE didn’t have wheel-related vocabulary already before the appearance of wheeled vehicles (which, btw., seem to be a FBC/GAC invention, so PIE terms might also have been borrowed from there).
    Also, the word “wool” may have traveled with the product, cf. “Alpaca”, “Cashmere” etc.
    That’s nothing to build any serious argumentation upon. More relevant is his description of a second push that transferred a presumably “indo-europeanised” iconography from the Mediterranean to S. Scandinavia during the MBA (Tumulus Culture). If CWH was NEBA, the MBA seems the timeframe to look for ultimate IE-isation of Central Europe, S. Scandinavia and also Britain.

    Alberto, on Iberia: My understanding is that we are talking about the following scenario (with some regional specifics, i.e. not completely uniform across the whole of Iberia):

    1. Steppe ancestry arrives around 2600 BC, in various cultural contexts, partly, but not exclusively, linked to the BB phenomenon.

    2. Initially, these people are clearly identifiable as “outsiders” genetically (including yDNA). The homogenisation process takes quite some time, possibly more than a millennium – cf. Patterson above, where “dilution” seems to be maximised by 1500 BC, before LBA/IA bring additional migrations (Iberocelts, also Punic).
    It might be interesting to take another look at the first arrivals as per the Olalde paper: What was the share of females? How much “dilution” from CE did they already carry? Intriguing in this respect is also the fresh Allentoft e.a. paper (https://www.nature.com/articles/s41586-023-06865-0, with better Supp. Mat. than the bioRxiv preprint): They postulate an initial “Steppe”/GAC admixture that dilutes subsequent moves, and can also show this for Scandinavia and Britain. However, among their handful of fresh Iberian samples (mostly Asturias) assigned to SW Europe, most don’t show that dilution (Ext. Data Fig. 9).

    3. The “takeover” of yDNA R1b only happened after ca. 2200 BC (note also one of the Allentoft samples, still labeled Iberia_LN, ca. 17% Yamnaya, yDNA I2a – though dating to 3600 BP seems to point at either contamination or dating error), i.e. some 400 years after initial arrival, and appears to be related to the general crisis during that period.

    4. There may be examples of your “immigrating CWH females”, already pre-diluted with farmer ancestry in Poland (Yamnaya/GAC mix proposed by Allentoft e.a.), or also further along the route in BB contexts. However, I am not convinced yet that this was the dominant pattern – especially not with the fresh Allentoft data. 3 females included there (Supp.Table X):
    – atp9: 27% Steppe, 5% GAC_PL, 17% Levant_BA;
    – ValeOuro10207: 23% Steppe, 4% GAC_PL, 9% Levant_BA, 6% ItalyCA;
    – NEO653: 32% Steppe, 2% GAC_PL, 13% Levant_BA (plus, intriguing, extra 1% WHG, 2% UA_HG, 2% Guanche over what is already incorporated in their SWEurope Farmer contribution).

    Plus, as R1b male in El Argar,
    – pir001: 16% Steppe, 3% GAC_PL, 14% Levant_BA [2% WHG, 3% UA_HG, 1% Guanche].

    Finally, they have clustered the Iberian samples together with an EMBA female from SW France (Castelnaudary, Dep. Aude):
    – QUIN234: 28% Steppe, 1% GAC_PL, 14% Levant_BA [2% WHG, 2% UA_HG, 0% Guanche]

    This data points to less than 20% Steppe dilution by CE (GAC) farmers (NEO653, QUIN234 below 10%); most dilution obviously took place in Iberia itself, or at least not far from the Pyrenees.

    5. There seems to have been an inflow from the Eastern Med. – a quite substantial one, when taking your 7.2% IranNeo as a starting point (by the time that ancestry had reached Crete, Egypt, or whatever more proximate origin we are talking about, it should already have been heavily watered down). That inflow certainly added to the “Steppisation”. Note also the Levant_BA contribution identified in the Allentoft paper. Their respective sources are Anatolia BA (including Hittites) from Damgaard 2018, and Canaanite BA (Sidon) from Haber 2017.
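    The “less than 20% dilution” figure under point 4 can be checked with a few lines of arithmetic. The sketch below takes the Steppe and GAC_PL percentages quoted above for the five samples and computes the GAC share of the combined Steppe + GAC component. Reading “dilution” as GAC / (Steppe + GAC) is an assumption about what was meant; the function and variable names are likewise just illustrative.

```python
# Steppe and GAC_PL percentages as quoted in point 4 of this comment.
samples = {
    "atp9": {"Steppe": 27, "GAC_PL": 5},
    "ValeOuro10207": {"Steppe": 23, "GAC_PL": 4},
    "NEO653": {"Steppe": 32, "GAC_PL": 2},
    "pir001": {"Steppe": 16, "GAC_PL": 3},
    "QUIN234": {"Steppe": 28, "GAC_PL": 1},
}


def gac_dilution(steppe, gac):
    """Share of GAC ancestry within the combined Steppe + GAC component.

    This definition of "dilution" is an assumed reading, not a quoted one.
    """
    return gac / (steppe + gac)


dilution = {
    name: gac_dilution(values["Steppe"], values["GAC_PL"])
    for name, values in samples.items()
}

for name, share in dilution.items():
    print(f"{name}: {share:.1%}")
```

    Under that reading all five samples stay below 20%, and NEO653 and QUIN234 stay below 10%. Taking the ratio GAC / Steppe instead gives slightly higher values but the same two conclusions.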

    In short: A quite complex and lengthy process, which includes the chance of each of the immigration waves becoming linguistically absorbed (to the extent required, arrivals via the Mediterranean might already have switched their language there). And the obviously strongly matriarchal El Argar setting would have made such absorption even easier (provided Mediterranean arrivals came by boat, and were predominantly male).

    You say: “it could work for Iberian, but it would fail to explain why Aquitaine in SW France also spoke a related language.” Do we have any aDNA from Aquitaine? I am not aware of anything, especially nothing relating to the LN/EBA transition. Also, the Pyrenees and the Basque country proper appear to be relatively sparsely covered by aDNA (there is, however, quite a lot from Asturias, and also the Upper Ebro). So currently, there seems to be mostly extrapolation at work, which might be misleading. Nevertheless, the QUIN234 sample from above, which clusters with EBA Iberia, comes from a place pretty close to where Aquitanian was spoken in antiquity.
    Plus, there is the possibility of later linguistic expansion during the BA. El Argar was certainly powerful enough to effect “elite dominance” language transfer (or prevent CWH from shifting language) across some distance beyond its immediate political control. And during the BA we are talking about important tin trading routes running north and south of the Pyrenees.

    But, of course (just to make things more complicated), there is a different scenario to envisage: The aDNA contribution from the Levant comes close to how speakers of Anatolian should have looked genetically (in fact, the Hittite samples are part of that cluster) – and also to what we might expect to have then brought forward IE-isation of the Balkans and beyond. The Levantine, or better Anatolian, influence on El Argar, as limited as it ultimately was genetically, had high socio-political relevance (pithoi burials etc.), so we might think of El Argar, at least at some later stage, having adopted IE from that influence. That would be the “para-Celtic” element attested in Iberia. It would also imply quite some MLBA shuffling-around of languages, as the El Argar area looks quite safely to have been Iberian-speaking during the IA (at least, until the Carthaginians established Cartagena).

  54. Alberto: Still have to read your Celtic Britain paper.

    For the late IA (from the 2nd c. BC onwards), the Celtisation of at least S. England is easily explainable from various factors:

    1. Monetarisation (use of coins) entering England from the south, with various coin sources, starting from Marseille coinage early on, ranging from Brittany in the West to the Ubians (originally Dünsberg/Hesse, relocated to Cologne by Germanic expansion, cf. Caesar’s respective account).
    https://en.wikipedia.org/wiki/Celtic_currency_of_Britain
    https://collectingancientcoins.co.uk/getting-started-with-celtic-coins/

    2. The spread of Gallo-Belgic coinage through England is probably linked to Gaulish payment for English mercenaries. One occasion was the fighting against the invasion of the Cimbri and Teutones. Caesar reports that the Armorican Veneti (Vannes & surrounds) mobilised support from Britain, until he managed to take out the Veneti’s fleet.
    This implies quite intensive Celtic language training for the recruits, and surviving/ returning veterans probably being quite wealthy and influential – and having sufficient money to open up England to Gaulish imports, e.g. wine.

    3. There is quite some written attestation of Belgic migration into S. England around Winchester (Venta Belgarum), albeit extent and timing are still disputed.
    https://en.wikipedia.org/wiki/Belgae#Great_Britain

    Whether there were previous waves of Celtisation, and how far they reached, has remained unclear to me. Some British tribes may have spoken a form of Para-Celtic that may have reached the island already during the MBA (note that “Pritannia”, as well as some tribal names, e.g. “Parisii”, are difficult to explain as Celtic because of the preserved “P”).
    Schrijver seems to argue that Celtic only (or mostly) reached Ireland after and because of Roman conquest attempts of Britain – for an older establishment there he sees too little internal development.

  55. Hi!

    Would just like to bring something interesting in the mix.

    The Sanskrit Rgveda, possibly the oldest Bronze Age IE literature we have, has very clear and multiple references to the River Saraswati (which is also a goddess) flowing from “snowy mountains to the sea”.

    This river is now dried up, but its existence was confirmed by geological studies due to its clear paleochannels.

    The problem is that the Rgveda calls the Saraswati the mightiest of all rivers, flowing from the Himalayas to the sea (the Arabian Sea), and there is no hint in the Rgveda that the river was drying up (as there is in later texts like the Mahabharata, which explicitly mention how the Saraswati has disappeared in the deserts of Rajasthan). But the last time this was the case was before 1800 BCE.

    The river began drying up from the late 3rd millennium BCE and had almost dried up entirely by 1800 BCE.

    That would place the Rgveda, at the very least, before 1800 BCE, much before any posited Indo-Aryan ingress into the Indian Subcontinent.

    This prima facie rules out Steppe Hypothesis

  56. @Kulkarni

    Lies? Who is lying?

    Alberto and FrankN are simply saying what they think and of course they are much more intelligent and educated than you.

  57. Regarding the arrival of the Celtic language in the British Isles, this is what I have written on Eurogenes:

    “What is important in this work is not the matriarchy of the Durotriges or the annihilation of the British Celts but the role played by G2a-L497 in the possible arrival of the Celtic language in the British Isles (according to Cassidy because of the remarkable increase of the EEF component in southern Britain). After Gretzinger who showed that a large part of the Central European Celtic elites belonged to this lineage, the appearance of the same markers in Britain does not seem to be a coincidence”

    *MBG010 (573 BCE)-Magdalenenberg, grave100b, Germany-G2a2b-Z1823
    *I12772 (378 BCE)-St. Merryn, Padstow, MIA_England-G2a2b-Z1816>Z1823
    *I19587 (94 BCE)-Scorton Quarry, IA_England-G2a2b2a1a1b-Z1823

    *HOC002 (515 BCE)-Hochdorf, Hallstatt, Germany-G2a2b-Z726>CTS4803
    *WBK195 (25 CE)-Winterborne, Durotriges_IA, England- G2a2b-CTS4803

  58. Frank, I am glad you are still active.

    Continental migration to the British Isles was relatively important during the Atlantic Bronze Age and later during the Iron Age, and it originated not only in France or Central Europe, but also in Iberia.

    “The earliest Iron Age samples, from Carsington Pasture Cave in Derbyshire (CE033/I12274-R1b-DF27>Z198>S11475) and Pockington in Yorkshire (I11033-mtDNA-H2a3/b), show a sharp increase in EEF ancestry. Haplotypic analysis suggests that the Carsington Pasture individual is not local to Britain, with close to zero ancestral contribution from preceding Bronze Age populations”

    “One genome (I12774) in this sample set is an outlier with respect to date (Early Iron Age) and ancestry. SOURCEFIND estimates this male individual’s ancestry to derive mainly from Spanish and French populations, in agreement with his positioning on PCA close to the median value for the Spanish Iron Age population. This individual was previously identified as an outlier with respect to EEF ancestry. This is suggestive of mobility along the Atlantic seaboard in the Early Iron Age. We observe three individuals from Winterborne Kingston with SOURCEFIND estimates of British Bronze Age ancestry one standard deviation below the mean (WBK02, WBK03 and WBK04). WBK02 is the most pronounced outlier with his continental ancestry modelled as coming almost entirely from the Spanish Bronze Age. While the majority of tested Middle and Late Bronze Age genomes show only minor contributions from continental sources, two extreme outliers in Kent are observed from Margetts Pit (I13716) and Cliffs End Farm (I14861-R1b-P312). These individuals were previously identified as outliers with respect to EEF ancestry and used as the proxy source for incoming continental ancestry during the Middle and Late Bronze Age in Patterson et al. 2022. This study found these outliers were genetically similar to individuals from the Knoviz culture of central Europe, a subgroup of the Urnfield cultural complex. SOURCEFIND models the ancestry of the Margetts Pit individual as a mix of French and Spanish Bronze Age components. The Cliffs End individual also has a substantial contribution from the Spanish Bronze Age, as well as German and Czech sources”

  59. @Frank

    It is evident that there was a cultural (pithoi burials etc.) and genetic relationship of the Argar culture with Sicily and Greece. However, the percentages of blood from the eastern Mediterranean are very small and they were produced thanks to exogamy (all the male markers are Iberian R1b-P312 and R1b-DF27, but there are two or three mtDNA markers of eastern origin). But this exogamy also occurred with cultures north of the Pyrenees because we also found Central European or French mtDNA in the Argaric samples.

    On the other hand, the Argarians were simply proto-Iberians because they are genetically indistinguishable (both uniparental and autosomal) from the Iberian samples we have (Ilerkavones, Indiketes, Layetanos, Basques). It is necessary to make very complicated mental gymnastic exercises to think that the Argar culture and the Iberian culture spoke different languages, that is to say, in my opinion, the Bronze Age in Iberia was absolutely NOT_Indo-European.

  60. @Alberto

    In my opinion and given that the male markers of the Yamnaya culture are overwhelmingly of WHG origin, that culture spoke the language of European hunter gatherers (in which I include EHGs, who are very similar genetically to Westerners). I have no idea to which linguistic family that language belonged but evidently there are only two possibilities.

    1- It was a NON-Indo-European language. That would explain why, in those regions of Europe where the percentage of WHG ancestry remained at acceptable levels during the Neolithic (Iberia, GAC-Poland, Germany, France), a Paleo-European language survived. Obviously the male markers involved would be I2a-M438 and R1b-L754, and it would be a perfect explanation for the existence of non-Indo-European languages in Western Europe at the arrival of the Romans (Basque-Aquitanian, Iberian, Tartessian, Etruscan, Raetian etc). In this case, and if people continue to defend the linkage of Yamnaya with Indo-European, then this culture had to be Indo-Europeanized by Darkveti-Meshoko or Maykop, the latter being technologically superior to Yamnaya.

    2- It was a Proto-Indo-European language. In this case, not only the Yamnaya culture but a large part of Europe would have spoken this language. Neolithization could have brought non-Indo-European languages to mainland Europe, and then the explanation for this linguistic family in Iberia would be that the Central European migrants who reached the peninsula quickly acculturated and lost their mother tongue; that is, Basque-Aquitanian-Iberian and Tartessian would simply be Iberian Neolithic languages.

    In any case, I must admit that I believe that the linguistic debate contaminates the genetic debate because of the difficulty of scientifically proving the linguistic theories that are presented.

    Everything seems to be a question of faith except for the fact that Etruscans, Basques, Iberians and Tartessians spoke NON-Indo-European languages and were overwhelmingly R1b-P312, while Mycenaeans spoke IE and were mostly J2a (in addition to many other male markers), while Z2103 and other steppe lineages were a clear minority.

    Explaining this paradox has proved to be an impossible task for the Harvardians, so I think they will continue to make fools of themselves for a long time to come.

    Un saludo

  61. Do you have a link to the paper by Rune Iversen? I have always thought that the spread of Indo-European is a Bronze and Iron Age issue, nothing to do with the Chalcolithic.

    By the way Alberto, I disagree with you regarding the arrival of Central European markers in Iberia. While (at the moment) the oldest R1b-P312 is in Germany (the second in Spain), we have hardly any evidence of the arrival of mtDNA markers with origin north of the Pyrenees during the Chalcolithic (3000-2000 BC), so it could be that those P312 men (hunters, miners, traders, explorers) were a minority and lost their mother tongue. It must be taken into account that Iberia is a very mountainous territory of 600,000 km² and that the colonization could have lasted hundreds of years.

  62. Two other important issues that everyone ignores are:

    1-Regarding R1b-L151 in the early CWC, it must be remembered that Papac found a northern or Baltic signal in those samples totally incompatible with an origin in the Yamnaya culture, so this marker does not have its origin in the steppes, no matter how many times the Kurganists repeat the contrary. Archaeologically, too, the R1b-L151 tombs in Bohemia have absolutely nothing to do with CWC or Yamnaya, except one in which there is a battle axe (which was used as grave goods in Central Europe since the Neolithic period). So be very careful with relating R1b-L151 to the CWC.

    2-Regarding R1b-M269, this is what we currently have

    *I2181 (4527 BCE)-Smyadovo, Gumelnita-Karanovo culture, Bulgaria-R1b-M269-Mathieson 2018

    *PIE064 (4499 BCE)-Pietrele, Gumelnita, Bulgaria-R1b1a1b-PF6517-M269-Penske 2023

    *KST001 (3824 BCE)-Konstantinovskiy, Maykop culture, Russia-R1b1a1b-PF6517-M269-Ghalichi 2024

    *NV3003 (3714 BCE)-Nevinnomyssky3, Maykop early, Russia-R1b1a1b-M269-Ghalichi 2024

    *ATP3 (3389 BCE)-El Portalón, Atapuerca, Chalcolithic, Iberia-R1b-M269-Günther 2015 (Harvard-R1b-P297)

    *AF023 (3245 BCE)-Trou Al’Wesse, Neolithic-B, Belgium-R1b1a1a2-M269-Fichera 2021

    Recall that the two oldest have been recognized as such in the Harvard-Reich database, so we cannot even claim a steppe origin for M269.

  63. @ Gaska:
    1. WHG language: I think we may get an impression by looking at proto-Germanic terms w/o a clear IE etymology. These include, from the extended Swadesh list as per wiktionary (https://en.wiktionary.org/wiki/Appendix:Proto-Germanic_Swadesh_list), the following (I stopped somewhere in the middle, the list was long enough):
    – *allaz (“all”): “unclear origin”,
    – *swēraz (“heavy”), only some tentative connection to Baltic;
    – *fluglaz (“bird”, c.f. “to fly”), IE connections only to “feather”-terms in Latin and Balto-Slavic;
    – *lewH- (“louse”), IE parallels only in Celtic and Proto-Slavic;
    – *lewbʰ- (“leaf”), possibly connected to Balto-Slavic *lubъ (luba) “bast (of trees)”, Albanian labe “rind, cork”, Latin libros “book”, no further IE connections;
    – *rindǭ (“bark”, “rind”) – no IE parallels provided;
    – *raipaz (“rope”) – “of unclear origin”;
    – *blōþą (“blood”) – “of unclear origin”;
    – *bainą (“bone”, “leg”) – “of disputed origin”;
    – *handuz (“hand”) – “Uncertain. .. often considered of non Indo-European origin.”;
    – *breustą (“breast”), w. cognates limited to Proto-Celtic and Proto-Slavic;
    – *drinkaną (“to drink”) – “Cognate with Lithuanian drė́gti (“to become moist”)” [Already that one looks a bit far-fetched];
    – *slēpaną (“to sleep”) – “of unclear origin”;
    – *libjaną (“to live”) – connected to PIE *leyp-, “to stick”, plus various IE terms designating sticky substances, including Sanskrit: रेपस् (répas, “dirty spot, stain”). Hmm – call me unconvinced.
    – *þankijaną (“to think”) – only somewhat plausible IE parallel provided is Latin tongeo “to know”;
    – *fehtaną (“to fight”) – in wiktionary linked to PIE *pek “to comb, to shear”. Semantically rather unconvincing, and the PIE root is restricted to Western IE;
    – *snīþaną (“to cut”) – “Of uncertain origin”. A few parallels in Celtic and Slavic that may be semantically linked, but do not formally agree in their phonetics. Which in itself is an indicator for a substrate term.
    – *sitjaną (“to sit”). Only formal parallel with Ancient Greek. The Proto-Germanic reconstruction is unclear: “There is dialectal variation with and without the *j, the reason for which is disputed”. Possibly dual borrowing into Germanic.
    – *gebaną (“to give”). Connection is being made to “Proto-Slavic *gabati (“to seize, take”) and Sanskrit गभस्ति (gábhasti, “arm, hand”).” Looks a bit far-fetched, but I wouldn’t completely rule out the connection.
    – *sagjaną (“to say”): Connection is being made to PIE *sekʷ-, “to follow”, e.g. Lat. secutus “having followed”. Call me unconvinced.
    – *singwaną (“to sing”): As related IE term, Marathi: सांगणे (sāṅgṇe)”to tell, narrate” is provided. Hmm..

    Now, mind you, this is not some Germanic seafaring terminology (which is also full of apparent substrate), but we are talking about very basic (Swadesh list) vocabulary. And even if some of the above examples ultimately turn out to be IE-derived, it will still leave quite a long list of non-IE substrate in Germanic. Hence, I am pretty certain that North European HGs didn’t speak something related to IE. And since nobody so far (including Vennemann, with his “Germania Vasconia” approach) has been able to connect this substrate to Vasconic, or NE Caucasian for that matter, I am also anything but convinced of Alberto’s NEBA theory.

    Otherwise, “El Argar speaking IE” was more of a thought experiment. From an aDNA perspective, with the data provided by Allentoft 2024 and Villalba-Mouco, it looks possible. But geographically, El Argar looks too much inside the Iberian-speaking area (pre-Punic/Cartagena) to expect it to have spoken anything different from Paleo-Iberian. The intriguing question then is: If El Argar withstood immigration pressure toward IE-isation (Hellenic, Anatolian), why should it have given in to “Steppe” pressure from the inland? In terms of economic & cultural relevance, the E. Mediterranean was certainly much more powerful.

  64. @Ugra

    Saraswati argument is as boring and stale as it can get, bring better coping pointers. Rigvedic Aryans primarily (and only) inhabited Kuru and parts of Panchala, with the main focus on the Kuru region through much of the Rigveda. Saraswati is the only river they would be aware of and would’ve venerated (along with Ganga and Yamuna, but they just know about them rather than venerating them), so it’s quite obvious they will consider their own river (no matter its strength compared to others) as the mightiest, as it brings them prosperity. Also, on flowing from mountains to oceans: pretty much every river flows from mountains to oceans, even if as a tributary, so again it’s quite obvious that if they’re talking about the origin they’d have talked about the end (regardless of them knowing that it actually ends anywhere). Rigvedic Aryans never inhabited Sindh, so it makes zero sense for them to ACTUALLY know it ends in the ocean. Your prima facie is just the rinsed and repeated BS nonsense, Steppe hypothesis has far better evidence than this delusional theory being framed by Ashish/”Vasistha” Company.

  65. Gaska,

    Here is the link to the Rune Iversen paper: https://www.researchgate.net/publication/381364698_Issues_with_the_steppe_hypothesis_An_archaeological_perspective_Iconography_mythology_and_language_in_Neolithic_and_Early_Bronze_Age_southern_Scandinavia

    Frank,

    To be fair to the author, he does state that the wheel/cart predates the CWC (Single Grave Culture, as known in Denmark):

    “What I have tried to show in this section is that we see significant gaps of c. 500-600 years and c. 1200-1300 years, respectively, between the supposed introduction of Indo-European language in southern Scandinavia and the material things referred to in the common Proto-Indo-European steppe vocabulary. According to the prevalent scenario, the Indo-European language (incl. its wagon, wool and horse terminology) came with the Single Grave culture c. 2850 BC. However, the wheel and wagon technology was already present in southern Scandinavia from the mid/late 4th millennium BC (i.e. c. 500-600 years before the Single Grave culture) as can be deduced from preserved cart tracks and supposed wagon burials (Piggott 1969: 308; N. N. Johannsen & Laursen 2010; Mischka 2011). This does not in itself constitute a problem as the “old Neolithic” words associated with wagons could be replaced by the new Indo-European vocabulary.”

    Later, domestic horses and chariots may have been introduced from the steppe via the Srubnaya Culture (although I’ll point out that the Kivik cist he mentions, with its depiction of two-wheeled chariots pulled by horses, shows these wheels to have four spokes, like the Mycenaean ones, which were invented in the Near East according to David Anthony: “When horse-drawn chariots appeared in the Near East they quickly came to dominate inter-urban battles as swift platforms for archers, perhaps a Near Eastern innovation. Their wheels also were made differently, with just four or six spokes, apparently another improvement on the steppe design.”), but it’s the iconography that seems the most relevant thing in the article, and that, to my knowledge, couldn’t really come from the steppe at that time.

  66. @Frank

    I think that the eastern Mediterranean influences in BA Iberia (on its Mediterranean side) are there, but, as Gaska pointed out too, their genetic influence is minimal at best. I’ll look into why in that model of the samples from “La Bastida” they got 7.2% Iran Neolithic, but I also mentioned that when running all the “El Argar” samples together they got a negative value for Iran Neolithic (La Bastida has only 10 samples, and I’m not sure how many were good enough for the run they did, while the total from El Argar must be around 100). Then there are the male lineages too. As you point out, those sailing across the Mediterranean would have been males, so any offspring in Iberia would carry their lineages, which we don’t see either.

    A language shift through this sort of distant cultural influence is really very unlikely to happen. Notice that the Iberians had a strong and long-lasting influence from well-established colonies of both Phoenicians and Greeks that lasted for centuries. Phoenicians were the first ones to introduce iron working to Iberia (one of the reasons that might have kept Iberians free of Celtic influence), and many other things like coinage or the writing system were taken from these foreign cultures. And yet Iberians didn’t adopt their language as their native one.

    I will run Iberian samples from the first appearance of steppe admixture onwards as soon as I can, but I’m not sure how the Globular Amphora ancestry (or Danubian EEF more generally) could dilute independently from the steppe ancestry. They arrived tied together, without the possibility of steppe ancestry reaching Iberia directly in any way.

    2600 BC seems a bit early for the arrival of steppe admixture to Iberia, but I’ll check the earliest dates. I think that it mostly happened between 2400-2300. I’ll see if I can find the earliest steppe samples and the latest no-steppe samples. Unless there’s some exception, they shouldn’t be more than 200 years apart, a rather fast process for the size of Iberia.

  67. @Vara – yes the person who denies IE as a concept is best fringe source to rely on for BMAC chariot invention theory, I get it. Btw late BMAC (and perhaps Namazga VI) was most likely formed with early Mitanni related elites/mercenaries from Sintashta-Petrovka similar to Copper Hoard

  68. @Alberto – you fell off to complete BS and crap. I’ve read your older blogs and comments and they were at least somewhat sane like Vasistha’s. If you have to go for non-Kurgan hypothesis a framework of Heggarty et al. that Vasistha considered is what is 10 times better and sane than the conspiracy that you’ve made in this blog. Same goes for FrankN. You guys fell off, to absolute wretched conspiracies I must say

  69. @Ashish

    I am not really sure what you’re on about. I guess your weird obsession with Vasistha has addled your mind.

    BTW, It’s 2025 and you still didn’t realize that we have Namazga VI, Late BMAC and Yaz I samples and they are a continuation of early BMAC?

  70. @Ashish Kulkarni

    I don’t think I’ve ever heard from you before and have nothing against you or whatever theory you prefer, but your comments are really of too poor quality and contain insults to me and others here. I hope you understand that here we debate ideas, with some logical arguments based on the evidence available.

    If you’re able to adhere to those standards, you’re welcome to keep commenting. Otherwise you may just ignore this blog and go on with your life. There are many other blogs out there that may be interesting for you and your comments.

  71. @Ashish Kulkarni

    Sorry, but “your theories are plain garbage” doesn’t cut it. If you want to make any good counter argument take your time to think about it and write it carefully and I will publish it.

  72. @Vara
    I might be over emphasizing on Vasistha (cause his theories are very wrong about many things he has touched and have caused all the misinformation on Reddit and Twitter), but I’m not obsessed lol. Sure yes Namazga VI is genetically BMAC (Sumbar/Parkhai_LBA) with few Hasanlu Woman type Mitanni elites/mercenaries (Parkhai_LBA_o). And it’s also true Yaz I-II has some 40% BMAC (rest 60% is Tazabagyab i.e. Proto-Iranic).

    @Alberto
    I have no interest in breaking down and wasting time point by point countering those huge walls of text by you, Frank and Maju. Maybe sometime later

  73. Thank you Alberto! Also thank you Jaydeep for alerting people.

    two unrelated questions:
    1) You state that Basque is related to NEBA, what about Etruscan? Do Etruscan and Basque share a substrate and do they converge to a macro family? I don’t think there’s any consensus on this.

    2) India samples are sparse but what about Swat? Does it not give us the required snapshot of the very crucial period of the postulated time window of the AIT? It tells us that the steppe signal remains small in NW India and only increases after the Iron Age.

  74. @ Alberto:
    “A language shift by this sort of distant cultural influence is really very difficult to happen.”
    Well, think about Latin America – or India under British rule. And we know that El Argar was heavily centralised, including elite control over grain resources and milling (the Egyptian model). So, in principle, the elites had means and ways to enforce language shift – and in a hypothetical multi-lingual environment (partly “Steppe”, partly “Iberian LN farmers”) shifting to a “neutral” language might even have been appreciated by a good part of the population.

    I am not saying that this happened – in fact, I don’t think that El Argar was ever speaking some kind of “oriental” IE. However, these are the kind of mechanisms we have to consider anyway for IE-isation of at least those parts of Europe (and maybe Indo-Iran), where the demographic model with 50% plus “Steppe” doesn’t work. Stuff like Gothic serving as “lingua franca” at Attila’s court, Slavic under Avar rule, Hungarian serving to unify Germanic and Slavic speakers (plus fending off any claims from German, Polish, Byzantine rulers).

  75. ” I’ll look into why in that model of the samples from “La Bastida” they got 7.2% Iran Neolithic, but I also mentioned that when running all the “El Argar” samples together they got a negative value for Iran Neolithic.”

    Well, “La Bastida” should capture the elite. “El Argar” would include a lot of peasantry. A difference is to be expected – and actually interesting to analyse also from a (pre-)historic perspective.

    Otherwise, check out Allentoft’s approach, starting with digging into their “Supplementary Tables” and the proximal models there. You’ll find the GAC stuff well documented there, especially when it comes to SGC in Denmark, and also Andronovo/ Sintashta. And they have, instead of Iran_N, taken Hittites and Sidon_BA as proximal source (at least for the Hittites, Davidski should have data).
    Unfortunately, they just provide percentages for their proximate modelling, w/o P-values to judge the goodness of fit. But that kind of checking shouldn’t be too difficult to add.
    Of course, it might also be instructive if you played around with other CE “steppe” samples, e.g. BB. Maybe they yield better fits for Iberia (Allentoft e.a. obviously optimised their model for Scandinavia, not for SW Europe).

    Anyway, I am curious about your results. Pls. keep me (us) informed.

  76. Postneo,

    Good to see you active too. To your questions:

    1. Yes, Basque should come from the BBC (part of the CWH), but the case of Etruscan is less clear. It could come from that same source or it could come from the Neolithic farmers from Europe. Genetics allow for both possibilities, so it’s ultimately a linguistic problem to determine if they are related or not. At this point I’d say there’s no evidence of them being related, but we have to take into account the scarcity of data to work with. There’s a lot of work to be done in the field of non-IE languages from Europe, and this post is an effort to stimulate that research.

    2. Yes, the Swat Valley samples are an important source of information since they show that between 1200 BC and 1 CE the population of the Swat Valley was local and only had some steppe admixture, probably acquired second hand and via female exogamy. Maybe the later samples did get some admixture from people who were directly from the steppe, but also via females. In any case, that’s too late to matter.

    Still, it would be much better to have a decent transect of the core Vedic areas during the second mill. BC so that everyone can be sure that the population was of local origin too.

  77. Frank,

    I looked at the samples from La Bastida and nothing really stands out. The steppe ancestry is diluted to some 17% Yamnaya in the 7 samples available, and they have a similar amount of Globular Amphora admixture. The rest is from Iberia (maybe some from France too, but that’s hard to detect).

    https://adnaera.com/wp-content/uploads/2025/01/vah_bastida.png

    As you can see, nothing from the Levant or Anatolia. So I tried adding Ganj_Dareh_Neolithic directly (and also added GRC_Koufonisi_Cycladic_EBA in case) and one sample (BAS018) did get 3.6% Ganj_Dareh and still 0% from the Eastern Mediterranean sources, while two others (BAS002 and BAS022) got noise levels (0.4% and 0.5% respectively). So this looks more like an artefact to me (and maybe some other sample not available in G25 due to low quality pushed that number higher).

    Not much difference with the samples from La Almoloya:

    https://adnaera.com/wp-content/uploads/2025/01/vah_almoloya.png

    It’s hard to determine the date of the earliest steppe admixed samples due to the lack of carbon dating, but at least from these ones available they all look post 2500 BC:

    https://adnaera.com/wp-content/uploads/2025/01/vah_iberia_bb.png
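
    For anyone not familiar with how the percentages and distances in models like these are obtained, here is a minimal sketch of the general idea: look for non-negative source weights adding up to 100% that minimise the Euclidean distance between the weighted mix and the target coordinates. This is not the tool used for the runs above. The coordinates, the source names and the brute-force search are all assumptions made up for illustration (real G25 coordinates have 25 dimensions, and real tools use faster solvers).

```python
from itertools import product
from math import sqrt

# Hypothetical toy coordinates (3 dimensions, invented for this sketch).
sources = {
    "Source_A": (0.10, 0.02, -0.03),
    "Source_B": (0.01, 0.08, 0.02),
    "Source_C": (-0.04, 0.01, 0.06),
}

# Toy target built as an exact 20/30/50 mix of the three sources above,
# so that we know what the search should recover.
target = tuple(
    0.2 * a + 0.3 * b + 0.5 * c
    for a, b, c in zip(*sources.values())
)


def distance(u, v):
    """Euclidean distance between two coordinate vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def fit_mixture(target, sources):
    """Try every whole-percent weight combination that sums to 100
    and keep the one whose mix lies closest to the target."""
    names = list(sources)
    best = None
    for combo in product(range(101), repeat=len(names) - 1):
        last = 100 - sum(combo)
        if last < 0:
            continue  # weights must be non-negative
        weights = combo + (last,)
        mixed = [
            sum(w / 100 * sources[n][d] for w, n in zip(weights, names))
            for d in range(len(target))
        ]
        dist = distance(mixed, target)
        if best is None or dist < best[0]:
            best = (dist, dict(zip(names, weights)))
    return best


best_distance, best_weights = fit_mixture(target, sources)
print(best_weights, best_distance)
```

    With similar sources, several weight combinations can give almost the same distance, which is one reason why small percentages in individual samples should be read with caution.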

  78. Great post. A few points as I go through it:

    – Neolithic expansion/colonization also brought animal husbandry with it. Throughout history, ancient and recent, farmers often relied mainly on pastoralism for food, milk, material etc. Animal husbandry was more or less pioneered by agriculturalists. All agricultural economies were hybrid ones, because this is most efficient.

    – Some of the HG in EEFs is not just WHG but also EHG. In Iberia there’s also Magdalenian iirc. (Rivollat et al 2020) There are also other papers going a bit deeper than that. But yes most EEFs had (primarily) WHG-like HG ancestry. Some HGs also had a high % of EEF ancestry but these were rare and eventually disappeared.

    – There is a slight mating bias in CWC and no mating bias in BB. (Lazaridis & Reich 2017) Also Villalba-Mouco has a related paper out touching on that. I don’t think there’s much need to comment on any other of the theorizing about mating patterns.

    – “Then around 3000 BC something happened throughout Europe, affecting specially all of the northern and western parts of it, and causing a big population collapse.”
    A significant reason for this could be increased conflict among EEFs. They were particularly warlike, and were the first to invent organized warfare. They also didn’t hold back from killing off entire communities (see: mass graves). Fibiger has some great papers on that.
    In comparison, HGs were a bit less (or about as much) warlike/violent, and steppe pastoralists actually seem to be less warlike. It was Beakers who turned it up again, 2000+ years later.

    – Presence of steppe-related populations in mainland Europe is seen much earlier than 3000 BCE. The communities they resulted in just didn’t last or rather, got replaced by the later Yamnaya-related ones.

    – The admixture patterns show not a significant collapse, but a significant assimilation. CWC started off as 100% steppe, went to 75%, and some CWC-labeled samples are 50%. This can explain the lineage change in them (predominantly R1b→R1a), and coincides with their shift into a sedentary farming/hybrid economy. The ones who got (mostly) culturally assimilated by EEFs were the ones who outcompeted the rest of the CWC clans and genetically assimilated EEFs. Whatever remained of the now-smaller and less successful EEF communities simply scattered. These populations weren’t very numerous – Balkans were more densely populated and we can see a smaller turnover there, for this reason.

  79. – Wheeled vehicles are older than 3500BCE (see CTC/TRB artefacts.)

    – The genetic signature that resulted in BBs did not start out as Bell Beaker culturally, not entirely at least. Earliest BB samples have no steppe-related ancestry iirc.

    – “What all the process described in the above paragraphs basically means is that Northern and Western Europe were completely…”…migrated into by steppe-rich newcomers who over about 1000 years had become a 50-50 mix of EEF and Yamnaya-like. In Western Europe it was more like 60-40.
    “The people, the communities of people with their culture and language, that populated all of these parts of Europe were originally from the steppe.” About half of their ancestors were, most of their culture (remember, way of life. The way they lived, not just some rites one performs a few times per year) was EEF-derived, language is currently being debated. Some were from the steppe, others not.
    The ensuing communities, however, did not have any EEF holdouts which is what you’re probably trying to get to (judging from the rest of the post). This is true.

    – Mainstream studies don’t use the term “Indo-Slavic” and several mainstream papers are explicitly saying that the language families of Indo-Iranian and Balto-Slavic are unrelated. (Which means, different origins.)
    “But that would imply that such language was also spoken throughout Western Europe, something that we don’t have any evidence of whatsoever. Moreover, it would imply that Celtic and Italic would be descendants of Indo-Slavic, something that is at odds with basic linguistics.” Precisely. On top of a lot of other evidence against the existence of such a node, there’s basically zero proof for its existence whatsoever. It’s a bit like someone claiming that “humans have three arms” when there’s no proof of this and all the facts show that humans don’t have three arms.

    – “they spoke an older form on Indo-European language from which all others descended from” Probably not that old. The split of Yamnaya and CWC cultures coincides with the fragmentation of the proto-family that resulted in Italic, Celtic and Germanic (Heggarty et al 2023) so whatever CWC spoke, it was something related to these three (or two, whatever) languages. Nothing to do with Balkanic or Indo-Iranian, who by that time had likely already split somewhere else, independently, not necessarily as a single family (Greco-Aryan) but definitely not in the CWC. This aligns with genetics of Greco-Armenian as well.
    Balto-Slavic is itself a breakaway from this Italo-Celtic-Germanic cluster, possibly showing a split in what would be CWC and BB communities. They aren’t necessarily the exact same, linguistically (and definitely weren’t culturally when BB materialized).

  80. – Basques and Iberians are indeed the same. The only difference is that Iberians picked up some Iran-related ancestry from SE Mediterranean (observed in Italy too). It follows that their languages remained intact as they were, drift aside.

    – (Your linguistic analysis is VERY interesting. I love this type of writing.)

    – As an afterthought, it’s possible that BB culture was not unified. Indeed they had genetic differences from South to North. It could be a phenomenon of homogenization that did not however alter language, with some remaining IE and some non-IE. Even some CWC branches could’ve simply changed language, there are no rules against this. And vice versa as you said for Paleo-Sardinian.

    – “(Italy) Neolithic communities still persisted during and after the arrival of the people from the steppe” Wasn’t it mostly BB (50% EEF) + local EEFs? The only migration directly from the steppe occurred into North and Central-East Europe.

    – “Unlike the rest of Europe, the Balkans didn’t see any migration from the CWH people” There’s actually strong CWC and independently BB presence, although not uniformly. A lot is gathered on the North Balkans, mixing (or not) with Yamnaya ancestry, and some is even observed as far south as Greece (with archaeological findings corroborating the link in some case).

    – “Indo-European populations started to enter SEE Europe during the period from 2400-2000 BC. They came from West Asia (North West Anatolia was the immediate origin, but ultimately their origin had to be deeper into West Asia, around the South Caucasus) and settled the area of Thrace during this period.” There actually is aDNA evidence for this but at a generally earlier date. Most Balkanic languages had split by 2400-2000 BCE anyway so this works out only with an earlier entrance date, which is what we see in excess CHG/Iran in the Balkans during the EBA/MBA and some y haplos. The influence was obviously not pure CHG/Iran by that point so it’s safe to double it by 100-120% if not more. Not sure if someone has run DATES to date the admixtures.

    – “(Greece) There was a continuity since the early neolithic until after 2500 BC” Incorrect, Greece sees some 20-25% admixture from Chalcolithic Anatolia (see Skourtanioti’s recent papers). This is Greece (late) Neolithic. Earlier Greek Neolithic is just 100% EEF with no HG. However this was confined in Crete, the Aegean and Peloponnese (giving rise to early Helladic, Cycladic and Minoan cultures later on – Minoans themselves are almost 50% Chalcolithic Anatolia).
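
    As a side note on the “double it by 100-120%” remark above, the arithmetic behind that kind of rescaling can be sketched as follows. The assumption (mine, for illustration) is that a model using a pure distal source returns some percentage x, while the real incoming population carried only a fraction f of that distal ancestry, so its actual contribution would be x / f. Raising x by p% then corresponds to f = 1 / (1 + p/100). The 10% input below is a hypothetical figure, not a result from any paper.

```python
def implied_source_fraction(increase_pct):
    """Fraction of the distal component that the real source would carry
    if the distal estimate has to be raised by increase_pct percent."""
    return 1.0 / (1.0 + increase_pct / 100.0)


def rescale(distal_pct, increase_pct):
    """Contribution of the real (mixed) source, given the distal estimate."""
    return distal_pct * (1.0 + increase_pct / 100.0)


# Hypothetical distal estimate of 10%, raised by 100% and by 120%.
for increase in (100, 120):
    f = implied_source_fraction(increase)
    print(f"+{increase}%: source carries {f:.2f} of the distal component, "
          f"contribution becomes {rescale(10.0, increase):.1f}%")
```

    In other words, an increase of 100-120% amounts to assuming that the incoming population was itself a bit less than half CHG/Iran.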

  81. – Best fits for Logkas samples are 25 & 40% CWC and the rest can be Helladic-like. They don’t have significant CHG/Iran so they’re just from the North Greece/Balkans genepool.

    – Indeed Mycenaeans have an interesting West Asian input, seen not just in their elevated CHG/Iran ancestry (which would be reduced if they really were only 10:1 Minoan:Yamnaya) but also some traces of Levant-like ancestry.
    Nonetheless their steppe ancestry comes directly from Yamnaya, not from an intermediate Balkan source. For the case you’re making, this would be a strong indicator that this Yamnaya source did not speak IE since Greek is a language forming at the tail-end of Balkanic, and is not standalone. There were already several splits since then, and thus possibly mixings – which we don’t see.
    New samples that provide better fits can be discovered later, though.

    – There’s some additional Anatolian input in post-Mycenaean Greece.

    – “it’s difficult to pinpoint the exact origin of the Mycenaean people, but it had to be somewhere around Thrace or North West Anatolia” Culturally, Mycenaeans appear in the Peloponnese and perhaps even southern Peloponnese (see Dickinson’s work for that). Genetically, at that time their genetic profile had already been formed somewhere in that area, since we don’t see it anywhere else. Samples with Steppe ancestry in Greece prior to that (if they are indeed the source for Mycenaeans and not CWC/BB-related) have higher steppe % than Mycenaeans, double if not more.

    – “This leaves us with two options about the affiliation of the Minoan language: it could either come from the local Neolithic inhabitants (EEF) which would basically make it an isolated language, or it could come from the Anatolian side and be an IE language of the Anatolian branch.”
    Ah, a topic I love. Minoan is obviously not Greek, or even Balkanic, but a drifted language… of what origin, remains to be seen. What we can however already see is that most of the papers on the language (I’ve cited several papers in one of my posts in the past) agree it’s Anatolian IE. Since it’s old, it’s most likely something related to a proto-language or some language sub-family within the Anatolian branch and not e.g. a form of Luwian as was initially theorized by some.

    – Anatolian arriving in Anatolia from south of the Caucasus is further corroborated with similarities between IE (as a whole) and Kartvelian/Caucasian non-IE languages suggesting contact. It was in a recent linguistic paper, can’t remember the title.

  82. – “arguing that they came from North of the Caucasus” There’s a bit of backstage play for Harvard’s “new position”. To be brief, Harvard is more or less convinced that IE does come from West Asia, Aknashen_N as a (distal, obviously – the actual one would be 100-200% increased in admixture %) source in both of their papers should be telling about what’s coming when any kurganist worth his terminally online salt would have discarded these models immediately in favor of his theory (funny thing to consider that the steppe hypothesis is now just a pet theory peddled by schizos. How times change lmao). Because Anthony is involved (and he’ll kick the bucket soon) they can’t shit on it outright, and they’re trying to get many of the steppeboos on their side, convincing them. Sure it’s an effeminate and boring way to do it but it works in academia since academia operates more or less under the same normie rules. It’s working, though, and even Anthony has ceded many of his positions. Basically, Harvard and Max Planck keep winning, which is why the only shift we’ve seen in the mainstream and big labs is away from the steppe hypothesis. Gamkrelidze & Ivanov would be laughing their ass off.

    – Linking Basque and Caucasian, now that’s interesting! The only possible connection here is obviously through a steppe pastoralist origin regardless of anything else.

    – Megiddo outliers as the vector of Yamnaya non-IE languages would’ve worked just fine in the kurganist rationalization book if a Hittite outlier with steppe ancestry was discovered among many other non-steppe ones. If their “evidence” is not to be discarded then neither is the evidence you’re presenting.
    (Yes another reason why steppe hypothesis proponents don’t publish papers. They are heavily emotionally invested & boring agenda peddlers that don’t actually have any scientific interest.)

  83. – “Basically no evidence at all for the sort of huge events that should have happened in order for the Indo-Iranian languages to spread from the steppe to such a big area in such a short period of time. Nor any evidence that the people from the steppe could have spoken an IE language in the first place (quite the contrary, as already seen from other areas). Instead, we have a much easier explanation for the steppe populations to have acquired the Indo-Iranian language from their southern neighbours, along with much of the culture, technology, rituals and economy (for the change in the economy of the steppe population before and after the contact with the populations of Turan and IAMC”
    Spot on. There are also several more hurdles (maybe I’ll post a summary later). Basically a Sintashta/Andronovo origin of I-Ir was commonly accepted only because an alternative explanation was lacking. Now that there finally is an alternative, it’s no surprise that we see the shift becoming more prominent. The only ones who seem invested in it are (ironically) Harvard, and casual archeogenetics hobbyists but who cares about them. Over time the same thing will happen just like with IE urheimat, even if the issue remains unsolved: steppe origin is simply not convincing and will be discarded in light of stronger evidence, with the only ones still interested in it being heavily invested morons who annoyingly harp about their preferred pet theories.
    Actually the only IE language that is a good fit for Sintashta/Andronovo is Tocharian and nothing else. Lmao
    Steppe hypothesis was always the midwit of hypotheses. No wonder it’s also lauded by midwits.

    – Overall, dating of languages could reinforce or challenge some aspects of what you present here. It has already been a bane for the traditional steppe hypothesis, to the point where Gimbutas et al are forgotten as names in the mainstream (and outside of it lmao) or scarcely brought up.
    There is no rule against IE being a pretty old language (PIA in 6000 BCE let’s say), or Anatolian split happening earlier. At the same time it is not a HG nor a purely pastoralist language.

    – The suggestion about the spread of Balkanic is interesting. I’m on the fence about that one (doesn’t matter much anyway). Guess we’ll need more samples and more careful look at DATES and whatnot, taking into account linguistic split dates.

    – “And the language transfer would go the natural way, from the more settled, higher culture society to the more mobile, pastoralist one” This is plausible, and I think this also follows: the steppe-rich people who were more successful were the ones who could easily converse, trade and mingle with IE-speaking neighbors of the places they went. (Meanwhile elsewhere we can see that they didn’t actually fare very well and eventually disappeared, e.g. in NC Asia)

  84. – “war chariots may have already played a role in their spread through the Balkans, at least Greece)”
    BA Greek chariots follow the model of Aegean box chariots from 1900-2000 BCE Kultepe and spread northward from Greece into the Balkans. They seem to appear (imported) at the time Mycenaean culture started to emerge.

    – wrt B-S: Their split date (becoming mutually unintelligible) is ~1000 BCE as per Heggarty et al 2023 which agrees with archaeology and iirc genetics too. Scythians were known for raiding European communities at the time so if they wrecked some settlements in the area, this could be further proof of movement leading to splits, as well as contact with Iranic (not I-Ir).

    – Paleo-Sardinian could be a language that simply drifted too far away to be identified. I don’t think there’s an issue if it remains uncategorized.

    Great post overall, much food for thought.

    @Vara Good to see you here. BBs could be from a clan parallel to CWC and Yamnaya alike, kind of like a “second CWC” where Yamnaya split into two separate clans (while also retaining its own). This would also work for the non-IE BB.

  85. Hi Orpheus,

    Thanks for taking the time to read and comment on the post. There are a lot of points made there (many small corrections which are welcome, since the post itself had to be simplified regarding many details), but others require further discussion.

    I’m not sure if answers can be found in your blog (I just found out about it, and anyone can go to it by clicking on your name, though be warned it’s in Greek for the most part), so it would be good if you could clarify some of your comments. Especially relevant seems to be this one for now:

    “The admixture patterns show not a significant collapse, but a significant assimilation. CWC started off as 100% steppe, went to 75%, and some CWC-labeled samples are 50%. This can explain the lineage change in them (predominantly R1b→R1a), and coincides with their shift into a sedentary farming/hybrid economy. The ones who got (mostly) culturally assimilated by EEFs were the ones who outcompeted the rest of the CWC clans and genetically assimilated EEFs. Whatever remained of the now-smaller and less successful EEF communities simply scattered. These populations weren’t very numerous – Balkans were more densely populated and we can see a smaller turnover there, for this reason.”

    It’s unclear what you’re trying to say here:

    – Which were the steppe communities who got culturally assimilated by EEFs?
    – Which were “the rest of CWC clans”?
    – Which were the “genetically assimilated EEFs”?

    The first ones are the ones who outcompeted the other two? And apparently also the:

    – “remained of the now-smaller and less successful EEF communities” (which I guess were the typical Neolithic communities in Europe?).

    Could you elaborate a bit on this so that we all understand correctly what you’re saying?

  86. @Vara

    I always thought that the (mis-)interpretation of the Shaft Graves from Greece as coming directly from Sintashta was pretty fringe and no Hellenist would ever take that seriously. But it may have had a bigger influence in other circles, judging by that article you posted. I’m not sure if the steppe theory ever tried to reconcile Sintashta being Proto-Indo-Iranian with Greeks speaking Greek as a result of that putative invasion from the steppe. I’ve only ever heard some amateur steppists defending that archaeological connection. But there it seems that no one cares about facts anyway.

  87. @Alberto

    Thanks for the Iversen paper.

    Regarding the mtDNA steppe markers that entered Iberia during the Bell Beaker culture, I have only found these.

    -mtDNA-U5a1b1
    Russia, Marinskaya5, Maykop culture-MK5007 (3505 BCE), Wang et al. 2018
    Bohemia, Obritsvi-OBR003-R1b-L151 (2893 BCE)
    Iberia, el Hundido, Monasterio de Rodilla (2412 BCE), Olalde 2019

    -mtDNA-X2b4a
    Netherlands, Oostwoud, BB culture-I5748 (2406 BCE)
    France, Sierentz, BB culture-I1389/I1390 (2345 BCE)
    Iberia, cueva de la Fragua-I23569-R1b-L52 (2088 BCE)

    However, genetic and cultural exchanges during the Bronze Age continued to be frequent as evidenced by the large number of Central European mtDNA (Unetice) found in Argaric sites. This undoubtedly contributed to the fact that the steppe ancestry did not disappear from Iberia during the Bronze Age.

    -mtDNA-I1a
    Russia, Voronkovo, Fatyanovo, CWC-VOR001 (2519 BCE)
    Germany, Haunstetten (2112 BCE)/Kleinaitingen-AITI77A (1790 BCE)
    Ireland, Ploopluck-PP6-R1b-L21 (1911 BCE)
    Iberia, Puntal Carniceros, El Argar culture-PUC002-R1b-Z195 (1679 BCE)

    -mtDNA-H6a1b
    England, River Thames, BB culture (2299 BCE)
    Iberia, La Almoloya, El Argar culture-ALM047-R1b-P312 (1875 BCE)

    -mtDNA-K2a
    Denmark, Jutland, SGC-Gjerrild8 (2460 BCE)/Gjerrild5-R1b-V1636 (2159 BCE)
    Croatia, Jagodnjak-Krčevine-JAG34 (1760 BCE)
    Iberia, La Almoloya-ALM084 (1650 BCE)

    -mtDNA-T2b21
    Netherlands, Molenaarsgraaf, BB culture-I13027 (2090 BCE)
    Iberia, La Almoloya, El Argar-ALM017-R1b-Z195 (1791 BCE)

    -mtDNA-U5b1@16189
    Germany, Singen, Bronze Age-MX258-R1b-P312 (1965 BCE)
    Iberia, La Almoloya-Argar-ALM014-R1b-Z195 (1875 BCE)

  88. Thanks Gaska. Good that you keep such a good record of all the samples.

    Would you be able to tell us the date of the first R1b-P312 community that we know from Iberia? (By that I mean a community where the sampled males carried R1b-P312 or at least R1b-L51). And then the date of the last community from Iberia where the males were not R1b-P312? I guess it can give us an idea of broadly how long it took for the former to replace the latter.

  89. Alberto:
    Thx for running the models. The distances look good to me, except for BAS018, but you have already explained at least part of that by additional GanjDareh.

    First takeaways:
    1. The “Iranian” element identified in the recent studies appears to in fact be an artefact. Villalba-Mouco et al. had already cautioned that it may also relate to the EEF, i.e. Cardial Pottery being more “Levantine” (in the sense of Iran_EN-, also a bit Natufian-enhanced) than Central European EEF. Which means that it might make sense to check your Iberian MN/CA proxies for “Levantine” elements, and possibly run your models without Iberian sources of “Levantine” admixture to see if then more “Levantine” BA sources are captured (and to which extent the distances change).

    2. Allentoft e.a. might be correct that Poland_GAC was the main CE source for “watering down” Steppe ancestry. And actually, Steppe ancestry appears to already have arrived in Iberia diluted by GAC. However, I note quite a variation in the GAC-Yamnaya proportion in your Almoloya runs, from ALM006 with 54% GAC, 17% Yamnaya (3.2:1) to ALM050 with 7.8% GAC, 15% Yamnaya (1:2). I would have expected much more fixation to somewhere around 1:1 GAC-Yamnaya.
    As such, it might make sense to include further prospective Central European Steppe-admixed sources, e.g. BB-NL, possibly also Steppe-rich British BA, in your models in order to see whether they possibly work better.
    Baalberge_MN, btw., might fail as proxy because of the too low HG content (which would be quite EHG rich in GAC_Poland). If available in G25, I would go for EHG-richer MN samples for modelling.

    3. The HG component might be an issue anyway, as illustrated by the strong variation within the Iberian sources. How about including El Miron, or another source rich in post-Magdalenian HG ancestry, in the modelling to possibly remove that effect?

    4. In any case, your models point to 32% (Bastida) to 40% (Almoloya) Non-Iberian / Maghrebine elements. While that may have arrived “with woman, grandfathers and sheep”, as you have put it, it is far from constituting a complete population replacement, and possibly – given that we are talking about a longer process of at minimum 3-4 centuries – also insufficient to demographically enforce language shift.
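
    To make the spread in point 2 explicit, here is a small sketch that recomputes the GAC:Yamnaya ratios from the two sets of percentages quoted there. Only those four percentages come from the models; the function and variable names are made up for illustration.

```python
# GAC and Yamnaya percentages quoted in point 2 for two La Almoloya samples.
samples = {
    "ALM006": {"GAC": 54.0, "Yamnaya": 17.0},
    "ALM050": {"GAC": 7.8, "Yamnaya": 15.0},
}


def gac_to_yamnaya(gac_pct, yamnaya_pct):
    """Return the GAC percentage divided by the Yamnaya percentage."""
    if yamnaya_pct <= 0:
        raise ValueError("Yamnaya percentage must be positive")
    return gac_pct / yamnaya_pct


ratios = {
    name: gac_to_yamnaya(v["GAC"], v["Yamnaya"])
    for name, v in samples.items()
}

for name, ratio in ratios.items():
    print(f"{name}: GAC:Yamnaya = {ratio:.2f}:1")

# How many times larger the highest ratio is than the lowest one.
spread = max(ratios.values()) / min(ratios.values())
print(f"spread between the two samples: {spread:.1f}x")
```

    The two ratios differ by roughly a factor of six, which is the variation that seems hard to reconcile with a single, already homogenised GAC-Yamnaya source.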

    @Orpheus: Various interesting points. I will need time for digesting and commenting. Here only on the “warlike EEF”:
    There is in fact quite some archeological documentation of a “war zone” in eastern Germany, approximately along the line Erfurt-Potsdam-Stettin (most of it in German, but IIRC, J. Müller has also written a bit in English). We are talking late FBC (post-Michelsberg, enhanced by lots of Danish/NW German HG ancestry) vs. late Baden (Danubian EEF-derived). However, that war seems to have culminated (and been decided) some time before 3000 BC with FBC conquering and destroying Salzmünde, allowing for subsequent Bernburg expansion southwards, including into Bohemia.
    That, of course, was several centuries before Steppe people arrived in E. Germany, so any de-population caused by that war should already have been over. Otherwise, relations between FBC/Bernburg and GAC appear to have been cordial (cohabitation zone with mixed cemeteries), and the same applies to the FBC-Wartberg relation. So, at least for NC Germany, your explanation doesn’t fit the archeological record.
    One issue seems to have been worsening climate (the so-called Holocene Climate Optimum ended around 3500 BC), and especially decreasing rainfall (less humidity picked up over the Atlantic Ocean/ North Sea). This seems to have resulted in farmers giving up areas with low precipitation, and it is actually these areas (Thuringia in the rain shadow of the Harz, also the Tauber Valley) where we have first evidence of CWC starting to dominate the archeological record. West Jutland, another Steppe (SGC) settlement focus, may have to do with rising sea levels and soil salination rendering the land unsuitable for farming (but not for animal husbandry).

  90. @Orpheus

    It’s good to have you here.

    “Good to see you here. BBs could be from a clan parallel to CWC and Yamnaya alike, kind of like a “second CWC” where Yamnaya split into two separate clans (while also retaining its own). This would also work for the non-IE BB.”

    I am on the fence on what languages BBC and CWC spread, though I now lean heavily towards BBC being non-IE. I think Yamnaya is the first IE culture.

    “Basically a Sintashta/Andronovo origin of I-Ir was commonly accepted only because an alternative explanation was lacking.”

    The earlier alternative steppe scenario where Indo-Iranian derives from Yamnaya and makes its way across the Caucasus to northeast Iran makes more sense. In fact, even though the archaeological evidence doesn’t support it anymore as KAC does stop at the Alborz you could still argue with it based on the Shahtepe samples. The reason why Andronovo was heavily pushed is obvious. Because the Armenian hypothesis could easily use the South Caucasus > Iran argument as well. Either way, I’ll publish a few posts on I-Ir some time later.

    “There is no rule against IE being a pretty old language (PIA in 6000 BCE let’s say), or Anatolian split happening earlier.”

    I disagree. The devil and smith story is reconstructed for every IE branch except Anatolian. The concept of a smith of royal descent (except in the German version) who assists the dragonslayer is a core part of IE belief.

    Even Anatolian could have some aspects of this myth, or it could have split before the myth reached its final development:

    https://www.academia.edu/43290390/Indo_European_Smith_and_his_divine_colleagues

  91. Frank,

    While the models seem to have worked more or less okay when it comes to averages, I can’t say that the individual variation is too accurate. With relatively similar sources (all EEF are similar enough in this respect), certain patterns of DNA damage, imputation due to incomplete data, etc. may weigh more heavily. That may partly account for the variation in GAC vs Iberian EEF proportions. However, another part may be real, given that BB samples from Germany are far from homogeneous: there is significant variation in the steppe % across samples. This was a pretty rapid expansion without much time for homogenisation.

    Let’s see if Gaska can give us those dates from the earliest R1b community and the last non-R1b community in Iberia to have a reference. From the samples I ran above (the Bell Beaker ones), the earliest carbon dated ones are:

    – I0461: 2453-2201 calBCE (North Iberia, female)
    and:
    – EHU001: 2287–2044 calBCE (2-3rd_rel_EHU002)
    – EHU002: 2562–2306 calBCE
    Both males, both R1b. Given that they are close relatives, they must have lived around 2300 BC.
    The rest have approximate, non-carbon-dated ranges like 2500-2000.
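
    A quick way to check the dating argument for the two relatives is to treat the calibrated ranges as plain intervals and see where they come closest. This is only a sketch: the helper names are made up for illustration, and real calibrated dates are probability distributions rather than hard intervals.

```python
# Calibrated ranges quoted above, as (older, younger) bounds in years calBCE.
ranges = {
    "I0461": (2453, 2201),
    "EHU001": (2287, 2044),
    "EHU002": (2562, 2306),
}

def overlap(a, b):
    """Shared (older, younger) span of two BCE ranges, or None if they do not overlap."""
    older = min(a[0], b[0])      # the later of the two start dates
    younger = max(a[1], b[1])    # the earlier of the two end dates
    return (older, younger) if older >= younger else None

def gap(a, b):
    """Years separating two BCE ranges (0 if they overlap)."""
    return max(0, max(a[1], b[1]) - min(a[0], b[0]))

def closest_point(a, b):
    """Midpoint of the gap between two non-overlapping ranges, in years BCE."""
    return (max(a[1], b[1]) + min(a[0], b[0])) / 2

print(overlap(ranges["EHU001"], ranges["EHU002"]))
print(gap(ranges["EHU001"], ranges["EHU002"]))
print(closest_point(ranges["EHU001"], ranges["EHU002"]))
```

    The two El Hundido ranges do not overlap, but they are separated by only 19 years (2306 vs. 2287 calBCE), which is why a date around 2300 BC is the natural reading for two close relatives.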

    About the Levant admixture, I guess that if it was already present in the Chalcolithic, it may not be relevant to the post-R1b arrival contacts. In any case we’d be looking for a 1-2% amount or so?

    What I still fail to see is what you are looking for exactly in Iberia. It seems that you might be suggesting that El Argar was an EEF matriarchal culture that assimilated R1b males from steppe communities? That would be easy to see given the good sampling of this culture. We would start with samples that are 100% (or almost) EEF and whose male lineages would be 100% (or almost) of Neolithic origin. Then we would see a gradual increase in R1b lineages and steppe admixture, but rising very slowly: if those males had 40% max. steppe admixture to begin with, and if the community incorporated 10% of such males per generation, their average contribution would be 5% per generation, which would result in the first admixed generation having around 2% steppe admixture and growing slowly with each generation (a generation being around 29 years). Do we have anything that even remotely resembles that in the available record?
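
    The slow-growth scenario described above can be written as a simple recurrence. This is a sketch under assumptions made only for illustration: a constant inflow of males in every generation, incoming males always at 40% steppe, a starting population with 0% steppe, and a hypothetical function name.

```python
def steppe_trajectory(source=0.40, male_inflow=0.10, generations=10):
    """Steppe fraction of the community after each generation of male inflow.

    source      -- steppe fraction of the incoming males (assumed constant)
    male_inflow -- share of the fathers in each generation who are incomers
    """
    parental_share = male_inflow / 2   # fathers are half of all parents
    steppe = 0.0                       # assumed starting point: 100% EEF
    trajectory = []
    for _ in range(generations):
        steppe = (1 - parental_share) * steppe + parental_share * source
        trajectory.append(steppe)
    return trajectory

for generation, value in enumerate(steppe_trajectory(), start=1):
    # 29 years per generation, as in the comment above
    print(f"generation {generation:2d} (~{generation * 29} years): {value:.1%}")
```

    With these inputs the first admixed generation comes out at 2%, and the value then creeps towards the 40% ceiling without ever reaching it.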

    More generally about the whole population replacement (not just in Iberia, but all of N and W Europe), the same principle applies. If this was a male migration whose members took EEF females as they advanced, their steppe admixture would be reduced by half per generation. The first admixed generation would be 50% steppe, the second 25% steppe, the third 12.5% steppe, etc., which would make it disappear very fast. If instead you want to argue that it was not a 100% male migration, but that they did carry a few women with them (10%) while getting the rest (90%) from EEF communities, this would only slightly slow down the pattern above.
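
    The halving argument can be sketched in the same way. The function name and the `own_women` parameter are hypothetical, and the sketch assumes that any women travelling with the group carry the same steppe fraction as the men of that generation, while local EEF women carry none.

```python
def dilution(start=1.0, own_women=0.0, generations=3):
    """Steppe fraction of the migrating group after each admixed generation.

    start     -- steppe fraction of the original migrants
    own_women -- share of mothers who come from the group itself
                 (the remaining mothers are local EEF with 0% steppe)
    """
    steppe = start
    series = []
    for _ in range(generations):
        mothers = own_women * steppe      # average steppe fraction of mothers
        steppe = (steppe + mothers) / 2   # children average both parents
        series.append(steppe)
    return series

print(dilution())                # male-only migration
print(dilution(own_women=0.10))  # 10% of the mothers from the group itself
```

    With no women of their own the series is 50%, 25%, 12.5%; with 10% of their own women each generation keeps 55% of the previous value instead of 50%, so the decline is only slightly slower.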

    The 100% population replacement is different from 100% genetic replacement. I tried to stress this in the post itself, so I don’t want to repeat myself. I’d rather you present me (us) some scenario you’re able to envisage which is compatible with the data that we have, because I think that once you try to do that exercise is when you will better see what I’m trying to say in this post.

  92. Frank,

    I’ve run the El Argar samples using Iberia_N + Iberia_Meso and to me it seems to work worse, with more ancestry going to GAC. There’s only one sample affected when it comes to East Med ancestry, ALM041, showing 11.4% from West Anatolia, but I don’t know if that’s very trustworthy. On the second tab I also ran the samples with Barcin_N instead of Iberia_N, but that worked even worse, with almost all EEF going to GAC (which, by the way, seems to decrease Yamnaya as it goes up).

    https://docs.google.com/spreadsheets/d/1I4xQMo4KU9opytgMvhuV0AAXlhHsZydj8XucCgz8E7M/edit?usp=sharing

  93. -El Hundido BB tombs are the oldest samples of R1b-P312 in Iberia and are perfectly dated.

    “Las Tumbas Campaniformes del monumento funerario de El Hundido (Monasterio de Rodilla, Comarca de la Bureba, Burgos)-Carmen Alonso Fernández (2.013)”

    These are the three oldest samples of P312 in Europe

    *RISE563 (2.456 BCE)-Osterhofen, BBC, Germany-R1b-P312
    *EHU002 (2.434 BCE)-El Hundido, Monasterio de Rodilla, BBC, Iberia-R1b-P312
    *I1390 (2.432 BCE)-Sierentz, Alsace, BBC, France-R1b-P312

    The curious thing about El Hundido is that the R1b are buried in the corridor of a collective Neolithic tomb and that the grave goods are typical of the Iberian Bell Beaker package, i.e. wristguards, copper daggers, and Ciempozuelos style ceramics. As you know, this style is exclusive to Iberia, so either these gentlemen were the ones who brought this style from Germany or it was simply invented just after crossing the Pyrenees.

    Glockenbecherbezüge zwischen Mitteldeutschland und der Meseta-Region, mit Blick auf die Schönfelder Kultur-Ralph Großmann (2.017)-The following article is based on articles by E.Sangmeister and H.Behrens from the 1960s and focuses on supraregional similarities among the Bell Beaker phenomenon especially between Central Germany and the Meseta region. Not only the Beaker package, but also decoration motifs and the Begleit-Komplementärkeramik of the younger Bell Beaker phase illustrate interrelations between these regions. Especially decoration motifs might be interpreted as a form of reflux from central to southwest Europe. 14C data support an earlier implementation of the two-zone decoration in central Germany and these elements reached the Iberian Peninsula and the Meseta region along the Upper Rhine region and southern France. Moreover, the Iberian Bell Beaker phenomenon and the Schönfelder culture are also interconnected. Carinated bowls and Kalottenschalen reveal the influence of the Iberian Beaker phenomenon. Throughout Europe, network centres can be identified within the Bell Beaker phenomenon which illustrate the importance of the exchange of material and immaterial resources

    “Furthermore, the connection between the Meseta region and Central Germany is not only evident within the Bell Beaker phenomenon, but also between the Bell Beaker phenomenon and the Ammensleben group of the Schönfeld culture. Thus probably from the 24th century BC onwards, the idea of the kinked wall bowl and calotte bowl from the Iberian Peninsula reached Central Germany and the Schönfeld culture”

    Regarding your question about the male lineages we have R1b-V88 in Neolithic sites of Els Trocs and cueva de Chaves, and ATP3 (3.500 BCE) which after 10 years has been recognized by Harvard as R1b-P297 (although it is positive for two SNPs under M269).

    Between 5.000 and 2.500 BCE we have 121 male samples: 77 I2a-M438 (different subclades, 63.64%), 14 H2a-P96 (11.57%), 26 G2a (21.49%), 3 R1b-V88 (2.48%) and 1 R1b-P297 (0.83%), i.e. the male descendants of the Iberian HGs continued to dominate Iberia throughout the Neolithic and Pre-BB Chalcolithic.

    During the BB culture (2.500-2.000 BCE) we have 24 I2a-M438 (48%), 3 H2-P96 (6%), 3 G2a (6%) and 20 R1b (different clades: P312, L51, L151, DF27, etc.; 40%), i.e. in Iberia, I2a-M438 is still the majority during the BBC, although I suppose that when more sites are analyzed, this will no longer be the case.

    The Bronze Age represents a radical change: 99 R1b-M269 (94.29%), 4 I2a1a-P37 (3.81%) and 2 G2a2b-P303 (1.90%).

    The Iron Age is overwhelmingly R1b-P312 except for a Celtiberian and a Basque-Navarran which are I2a.
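
    The percentages above follow directly from the counts and can be recomputed with a few lines. A minimal sketch; the dictionary layout and the function name are just for illustration.

```python
# Male sample counts per period, as listed above.
counts = {
    "Neolithic / pre-BB Chalcolithic": {
        "I2a-M438": 77, "H2a-P96": 14, "G2a": 26, "R1b-V88": 3, "R1b-P297": 1,
    },
    "Bell Beaker": {"I2a-M438": 24, "H2-P96": 3, "G2a": 3, "R1b": 20},
    "Bronze Age": {"R1b-M269": 99, "I2a1a-P37": 4, "G2a2b-P303": 2},
}

def frequencies(period_counts):
    """Percentage of each lineage within one period, rounded to two decimals."""
    total = sum(period_counts.values())
    return {lineage: round(100 * n / total, 2) for lineage, n in period_counts.items()}

for period, period_counts in counts.items():
    print(period, sum(period_counts.values()), frequencies(period_counts))
```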

  94. I think the distal models are better to distinguish the origin of the autosomal components. We can see the increase (Iberia_BBC 9% > Iberia_BA 13%) in steppe ancestry (CHG+EHG) thanks to the Central European exogamy, and a slight increase of the Levantine component in the Argar culture, more accentuated in La Bastida than in La Almoloya. This may be a consequence of exogamy with the central Mediterranean, but look at the CWC and Netherlands_BBC: according to Lazaridis (2022), their Levantine component is slightly higher than that of the Iberian BBs.

    -Iberia_BB culture
    60.98-Anatolia_Neolithic
    23.43-Western_HG
    6.38-Levant_PPN
    4.61-Eastern_HG
    4.42-Caucasus_HG
    0.00-IRN_Ganj_Dareh
    0.00-Morocco_Eneolithic

    -Argar_BA_Almoloya
    55.48-Anatolia_Neolithic
    24.28-Western_HG
    7.78-Caucasus_HG
    7.62-Levant_PPN
    4.86-Eastern_HG
    0.00-IRN_Ganj_Dareh
    0.00-Morocco_Eneolithic

    -Argar_BA_Bastida
    53.92-Anatolia_Neolithic
    23.35-Western_HG
    10.44-Levant_PPN
    7.55-Caucasus_HG
    4.75-Eastern_HG
    0.00-IRN_Ganj_Dareh
    0.00-Morocco_Eneolithic

    -Sicily_EBA
    68.29-Anatolia_Neolithic
    11.37-Western_HG
    9.05-Levant_PPN
    7.23-Caucasus_HG
    4.06-Eastern_HG

    -Italy_EBA
    45.96-Anatolia_Neolithic
    15.49-Eastern_HG
    13.46-Caucasus_HG
    12.95-Western_HG
    12.14-Levant_PPN

    -Greece_Mycenean
    53.70-Anatolia_Neolithic
    21.00-Caucasus_HG
    19.60-Levant_PPN
    4.90-Eastern_HG
    0.80-Western_HG

    -Germany_CWC
    34.24-Eastern_HG
    32.16-Caucasus_HG
    16.55-Anatolia_Neolithic
    10.49-Western_HG
    6.56-Levant_PPN

    -Netherlands_BBC
    28.53-Eastern_HG
    25.76-Anatolia_Neolithic
    23.53-Caucasus_HG
    14.63-Western_HG
    7.55-Levant_PPN
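
    To make the CHG+EHG comparison explicit, the figures listed above can be summed per population. A minimal sketch, assuming (as in this comment) that CHG+EHG is used as a rough proxy for steppe ancestry; the dictionary and function names are illustrative only.

```python
# Caucasus_HG, Eastern_HG and Levant_PPN percentages from the distal models above.
models = {
    "Iberia_BB":         {"CHG": 4.42,  "EHG": 4.61,  "Levant": 6.38},
    "Argar_BA_Almoloya": {"CHG": 7.78,  "EHG": 4.86,  "Levant": 7.62},
    "Argar_BA_Bastida":  {"CHG": 7.55,  "EHG": 4.75,  "Levant": 10.44},
    "Sicily_EBA":        {"CHG": 7.23,  "EHG": 4.06,  "Levant": 9.05},
    "Italy_EBA":         {"CHG": 13.46, "EHG": 15.49, "Levant": 12.14},
    "Greece_Mycenean":   {"CHG": 21.00, "EHG": 4.90,  "Levant": 19.60},
    "Germany_CWC":       {"CHG": 32.16, "EHG": 34.24, "Levant": 6.56},
    "Netherlands_BBC":   {"CHG": 23.53, "EHG": 28.53, "Levant": 7.55},
}

def steppe_proxy(population):
    """CHG + EHG percentage, used here as a crude stand-in for steppe ancestry."""
    components = models[population]
    return round(components["CHG"] + components["EHG"], 2)

for population in models:
    print(f"{population:18s} CHG+EHG {steppe_proxy(population):5.2f}   "
          f"Levant {models[population]['Levant']:5.2f}")
```

    This gives about 9% for Iberia_BB and 12-13% for the two El Argar sites. The proxy is crude: in a population such as the Mycenaeans, much of the CHG may not be steppe-derived at all.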

  95. Gaska,

    Thanks for all the detailed information. I think that the pattern is clear and as expected: communities of R1b-P312 males entering Iberia around 2400-2300 BC and communities of non-R1b-P312 males disappearing around 2300-2100 BC.

    It’s the same pattern seen throughout Northern and Western Europe. The main difference is that while in Central-Eastern Europe the CWC people started to arrive with close to 100% steppe admixture, when they started to arrive in Iberia they had around 40% steppe left. That makes a difference in the final genetic outcome, but no difference in the population dynamics outcome.

    Maybe rather than Iberia, where all is clear, someone should look at why Sintashta had close to 50% EEF admixture if they headed east, where there were no EEF populations around them. Or why no one ever argued that Sintashta spoke an EEF language based on that admixture level (the same applies to late Central European CWC and BBC).

  96. Really beautifully composed theory. I know some archeogenetics and know quite a lot about linguistics. I don’t really know what to think though.
    You raised some good points, but at the same time, Occam’s razor tells me that PIE originated with and was carried across Europe by Sredny Stog, given that it would be the simplest solution. Of course, it might not be correct.

    At the same time, I recommend reading Roland Pooth’s work on Proto-Indo-European, if you don’t already know about him. Here is a link to his work: https://www.academia.edu/34772821/PIE_Alignment_Reasons_for_a_Change. Basically, Pooth reconstructs, through internal means, an Indo-Hittite morphology with a hybrid split-ergative alignment, where animate nouns take on the absolutive-ergative system and inanimate nouns take on the nominative-accusative system, and this is reflected in a newly reconstructed verbal system that fits this more archaic, Hittite-compatible morphology.
    I am mentioning Pooth because his Proto-Indo-European/Indo-Hittite reconstruction might align better with your theory, given that he posits an Indo-Hittite stage which then, by a process that I still haven’t read about fully, comes to form a lingua franca later on, that is, basically the classical Late Indo-European morphology with its three-gender system.

    On the opposite side, we can look at Indo-Iranian. Indo-Iranian had already split into multiple dialects by 1200BC: we have the Rigveda attested from 1200BC and the Zoroastrian Old Avestan texts from 1200BC as well.
    If we calculate how long it would take for these two languages to have diverged from a common proto-Indo-Iranian stage we get a date no older and no younger than 2200BC, and with a high probable point of origin around the Southern Urals, Central Siberia, near the Yenisei river, where they probably had some contact with the people of the Botai Culture.

    2200BC and somewhere in Central Siberia is most likely associated with the Andronovo or Sintashta Culture, the vocabulary of Indo-Iranian fits perfectly with the archeology of Sintashta.
    This culture was an offshoot of the Fatyanovo culture which most likely can be associated with an Indo-Slavic dialect of Late-Indo-European. 2200BC in Central Siberia is quite a large distance and timegap, it means Indo-Iranian evolved from a Late-Indo-European dialect no later than 2800BC. At the sametime, we can attest Mycenean Greek around 1500BC, morphologically very similar to Indo-Iranian, but with radically different phonology, closer to Late-Indo-European. I do not see how this could be compatible with the spread of IE languages through Iran, if Indo-Iranian is clearly associated with Andronovo/Sintashta and Mycenean Greek with Greece.
    The spread of the Greek language seems more likely to have come from some offshoot of the Catacomb culture, an offshoot of Yamnaya itself, via the Balkans. We can confirm this more or less by looking at samples of Yamnaya men which have more CHG than CWH men, the same amount of CHG is present in Mycenean Greeks.

    Again, however, I do not see how the Bell Beakers that occupied Iberia and wiped out 100% of the native male lineages didn’t just impose their language, and instead took on the languages of their newly acquired brides from northern Spain. It could happen, of course, but it seems a bit far-fetched. Plus, the Bell Beaker Culture was preserved for a long time, which tells me the customs of the men were preserved, so there is that to consider as well.

    Something tells me we are missing something from both sides of the picture. I still don’t think that IE comes from Iran, but I also see the possibility that the Steppe people that invaded all of Eurasia most likely spoke distinct dialects and languages that at some point might have formed the Late Indo-European lingua franca. When did this happen? The archeology seems to point to 3000BC and CWH, but again, Basques and Etruscans…

    Morphologically, Basque does seem to fit well with Caucasian, Hurrian, Dene-Yeniseian, Hattic and Burushaski. Maybe they form a macrofamily of early Siberian hunter-gatherers, but this is extremely speculative, just some fun hypothesizing from my side.

    The oldest R1b-M269 (basically the only important subclade for the steppe expansion) sample we have is of a Samara HG from 5500BC, and the oldest R1a is of a HG from the Veretye Culture (10,000BC). These two haplogroups don’t seem to interact too much with each other until we get to the early Neolithic cultures of the Middle Volga. My hypothesis is that the R1bs of Samara spoke some kind of language derived from this macrofamily of early Siberian hunter-gatherers that comprises Basque and Dene-Yeniseian, while the tribes of R1as 1700km north of the Samara HG spoke some kind of language related to Uralic, and possibly Turkic languages. Again, this is highly speculative.

    Anyway, sorry for the rambling, but it is something that keeps me up.

  97. @Artur

    Thanks for reading and commenting. The origin of IE languages has always been an interesting debate, and the advent of ancient DNA has made it possible to finally solve the riddle based on more clear evidence. Hopefully we’re not too far away from solving it for good, so I hope you’ll keep an eye on the developments and keep this alternative view in mind while looking at the evidence presented at each time.

    “If we calculate how long it would take for these two languages [Rigvedic Sanskrit and Avestan] to have diverged from a common proto-Indo-Iranian stage we get a date no older and no younger than 2200BC, and with a high probable point of origin around the Southern Urals, Central Siberia, near the Yenisei river, where they probably had some contact with the people of the Botai Culture.”

    I guess you can get an estimated date, but how do you get an estimated geographical location? (By the way, I think you mean Western Siberia, and maybe the Tobol river? Just a geographical mismatch anyway).

    “2200BC and somewhere in Central Siberia is most likely associated with the Andronovo or Sintashta Culture, the vocabulary of Indo-Iranian fits perfectly with the archeology of Sintashta.”

    Here you probably mean the Rigvedic hymns rather than the Indo-Iranian vocabulary itself. If the Rigveda was composed around the Punjab region of India, isn’t it strange that it matches the archaeology of a different place instead of the archaeology of the place and time where it was actually composed, by the people who composed it? I’ve always been a bit puzzled by this interpretation.

    “At the sametime, we can attest Mycenean Greek around 1500BC, morphologically very similar to Indo-Iranian, but with radically different phonology, closer to Late-Indo-European. I do not see how this could be compatible with the spread of IE languages through Iran, if Indo-Iranian is clearly associated with Andronovo/Sintashta and Mycenean Greek with Greece.”

    One solution to this problem would be that Indo-Iranian may not be associated with Sintashta (Andronovo did eventually become Indo-Iranian). After all, it’s not like Mycenaean Greeks had any strong association to the steppe.

    There are many questions still open, but we’re getting closer. Let’s see how everything unfolds once we have all the needed evidence that aDNA can bring us.

  98. @Artur

    Which is the oldest M269* found in Samara? Until recently even Lazaridis said that the oldest was Smyadovo I2181 and now we have another sample in the Carpathian Basin.

    In case Yamnaya spoke an Indo-European language only some subclades of R1b could be related to this language, i.e. R1b-L23>Z2103, R1b-V1636, R1b-PF7562 and even R1b-Y13200 which had Baltic origin but has appeared in Yamnaya and in the Mycenaean culture. Other markers such as R1b-M269* or R1b-L51>L151 have nothing to do with the Pontic steppes.

    I think many people agree that the BB culture did not speak IE, among other reasons because in Iberia, southern France, Sicily, Sardinia, Hungary and even Germany there are many samples of beakers that are not R1b.

    @Alberto

    I think you are right to talk about (Chalcolithic) population replacement, but not only women maintained the genetics of WHGs and EEFs; I2a men also participated (especially in western Europe), because there the EEF migration wave had already lost much strength. I seem to recall that not a single G2a has been found in the British Isles until the Iron Age; both in that region and in southern Scandinavia the western megalithic culture was overwhelmingly dominated by different I2a clades. If you apply this to linguistics, why would these male descendants of the WHGs have changed their mother tongue? They didn’t have to.

    In my humble opinion, you don’t need to force an issue that is hardly justifiable (i.e. the genetic and cultural continuity between CWC & BBC) to defend that the BB culture, the Western Bronze Age cultures and the historical Iron Age peoples spoke NON-IE languages. It would be enough for you to justify the genetic continuity through the male line between the WHGs (or even part of the EHGs) and the Iron Age, then defend that the language of the European hunter-gatherers was NON-Indo-European, and you have a reasonable explanation for the linguistic issue in Western Europe during the Iron Age.

    Now the problem would be that even in the “idolized” Yamnaya culture the male markers found are overwhelmingly of “western” origin, understanding that even the Scandinavian, Baltic and Balkan HGs were variants of the WHGs.

    The genetic weight of South Caucasian male markers in the steppes and even northern Russia is very small, insufficient in my opinion to justify a change in language. However, exogamy was widespread with about half of the mtDNA in Yamnaya-Afanasievo of South-Caucasian & Anatolian origin (this is the justification for the important % of CHG in the steppes). Common sense tells me that Yamnaya and the European HGs spoke the same language regardless of the language family it belonged to. Defending that Maykop Indo-Europeanized Yamnaya is not far-fetched due to technological superiority and the transmission of the language by the South-Caucasian male elites.

  99. Gaska,

    The situation with I2a becoming dominant in the middle-late Neolithic in Europe is quite different from the one of R1a-M417 and R1b-L23. The latter, as we have been discussing, is due to populations who can be identified by those male lineages replacing other populations that didn’t have them.

    In the case of I2a, its rise in frequency was an internal process of the EEF communities. It came from outside initially, but at low frequency, and then once inside the EEF communities it rose in frequency for reasons that are not exactly known (we know that this phenomenon happens quite often with male lineages, whether due to bottlenecks, founder effects, selection,… but we can’t always pinpoint the exact reason why). It’s a process that usually takes a long time and generations and it’s unrelated to the population it first came from.

    In this kind of scenario, where a lineage rises in frequency within a population, there’s no reason to relate it to any language shift. I see it as much more likely that the EEFs continued speaking the same language they did in Anatolia (with its internal evolution, obviously) than that they shifted to the language of the European HGs. In interactions between these two types of communities, it’s much more common that the HGs adopt the language of the farmers than the other way around, so it’s likely that late groups of HGs in Europe had shifted to the language of their Neolithic neighbours, though it’s something that I guess we’ll never know.

  100. @Gaska

    “In my humble opinion, you don’t need to force an issue that is hardly justifiable (i.e. the genetic and cultural continuity between CWC & BBC)”

    The problem with the term BBC is that it’s been attributed to two populations of different origin, so I don’t really like the term. The genetic continuity between CWC and the R1b-L51+ communities labelled as BBC is perfectly justified. While with the non-R1b-L51+ communities labelled as BBC there’s clearly no relationship to the CWC.

    This is an issue that archaeology needs to address to avoid further confusion between both BBC culture groups.

  101. @Alberto

    1-Yes, it could be that the I2a HGs abandoned their language and adopted that of the Neolithic farmers, I do not deny that this could have been the case, but especially in Britain where there is hardly any H2-P96 it seems very difficult for those I2a men to adopt the language of their women. As you say, I believe that no one will ever be able to prove what really happened.

    2-You may be right and the archaeologists would have to rename the BBC, of course this culture in Iberia is peculiar both genetically and culturally. But it is also true that it was an authentic thalassocracy with great nautical knowledge undoubtedly inherited from the megalithic culture and that the cultural uniformity (for example the maritime style pottery) of this culture throughout Europe required population movements (men and women) and an absolute control of trade routes. So in my opinion different regional variants and even different male lineages are not sufficient arguments to rename this culture.

    3-I have already mentioned to you the archaeological doubts of the assignment of the early burials of R1b-L151 and R1b-U106 to the early CWC in Bohemia and also the impossibility of genetically linking them to Yamnaya.

    4-Leaving aside the geographic origin of the L151 marker, let’s assume that these men really belonged to the CWC. For your scenario to be possible, where do you think the BBC originated? All the regional variants of the CWC (except theoretically Bohemia) are overwhelmingly R1a-M417, I mean, Battle Axe culture (Sweden), Estonia, Latvia, Lithuania, Germany, Poland (except for some very late R1b-L52, 2,400 BCE), Ukraine & Russia (Fatyanovo). There are even CWC domains where the dominant marker is I2a-M438, as in Switzerland (Spreitenbach), and sites where this marker has been found such as Brandysek & Velké Žernoseky (Czechia), Pitutkowo (Poland), Stenderup Hage & Naes (Denmark-Single grave culture). In addition, it seems that Harvard has genomes of this culture in Holland and that they are (according to Reich) typical of the local Neolithic cultures, which has led him to say that steppe ancestry could have been transmitted through exogamy.

    5-As you will understand we must discard all those regions where there is no M269>L51>L151 as the place of origin of the BBC, ergo (correct me if I am wrong), what you are thinking is that the first Bohemian’s L151 (dated 2,900 BCE) separated from the CWC, left their companions R1a-M417, R1b-Z2103, I2a-M438, Q1b2a-Z5902 & R1b-V1636 that never reached Western Europe, invented the BBC in Bohemia (or in Germany or Holland???) and then, in a kind of reflux movement they returned to Bohemia as P312 around the year 2400 to found the eastern domain of the BBC while other P312 traveled west until reaching Iberia?

    I don’t think that whole process would be linguistically relevant, because all those markers would speak the same language, right?, and they would do it regardless of whether they belonged to the CWC or the BBC, right?

    Un saludo

  102. @Gaska,

    1 – Yes, regarding the language of late HGs of Europe we’ll never know, so it’s irrelevant to speculate. But my point regarding the frequency of I2a in Neolithic communities is that the dynamics is really different from what happened later with R1a and R1b.

    In Britain, the predominance of I2a is because the Neolithic communities that arrived there happened to carry that lineage. It’s not so much the ultimate origin of the lineage that matters, but the population that carried it and spread it. It’s inconceivable to think that the HGs from Britain (if there were any left) were taking EEF women as they were arriving to Britain, killing the males, and adopting the Neolithic lifestyle. Besides, their autosomes show that this (even if it had been possible) just didn’t happen.

    2, 3, 4, 5 – There I’m just following what seems to be shown by the aDNA data. But we can dig deeper into this, since it’s a subject you’re well versed on and maybe you’ll show me that I’m getting it wrong.

    The exact origin of the BBC (here I mean the one with the R1b-P312 males) is difficult to pinpoint, since it expanded quite fast through a large area. But in general, it had to be somewhere around Western Germany or the Netherlands. Not in Bohemia.

    What allowed such rapid expansion had to be the fact that the area it covered was largely depopulated. Otherwise it wouldn’t have been possible for a very small initial population to not only cover all that area, but in doing so replace the previous ones if those had been present in large numbers, as they were centuries before.

    And yes, it was a small population that carried R1b-P312 that expanded all through Western Europe. Whichever other lineages were present in other clans of the population didn’t participate in that expansion.

    If we agree on the above, then we can go on to the next point, which is the origin of that population, and which is where I believe we disagree. I think it was part of the CWC initially, and I guess you think it was part of the Megalithic Culture initially?

    My reasoning is that the R1b-P312 people appear together with the R1a-M417 people and they are genetically identical other than the male lineage. And while P312 is probably already from Western Europe, its predecessors (essentially from L23, which is the one that matters, with its two daughters L51 and Z2103) are from Eastern Europe and came with the CWC. I don’t know how else it may have happened, but if you give me a good alternative I may change my mind.

  103. @Alberto

    1-Origin of the BBC: for me the debate is closed. The international maritime style that gives its name to this culture, and that is the only one common to all of Europe, has its origin in the Tagus estuary. Now, I believe that Sangmeister could be right and that there were reflux movements from Germany (maybe P312 and Ciempozuelos). The Netherlands? No way, that debate is also closed and only some fanatic kurganists continue defending that possibility.

    2-“Rapid expansion in relation to depopulation”, totally agree with you, there is no other coherent explanation

    3-“Other clans didn’t participate in that expansion”- Here I disagree with you. The BBC occupied much of Europe for hundreds of years and there were many small migratory movements; look at these samples from Sardinia, whose origin is Iberia.

    *MIR202-037 (2.623 BCE)-Cueva de El Mirador, Iberia-I2a1b1b1a1-M223>Y3259>PF692>Y34539
    *SUC007 (2.376 BCE)-Su Crucifissu Mannu, Porto Torres, BBC, Sardinia-I2a1b1b1a1-Y34539

    *I0458 (2.332 BCE)-Dolmen del Arroyal, BBC, Iberia-I2a1b1b-L460>M223>Y3259>PF692
    *ISC001 (2.214 BCE)-S’Iscia ‘e sas Piras, BBC, Sardinia-I2a-M223>Y3259>PF692

    4-The origin of P312* is undoubtedly western (Germany, France) and appeared practically at the same time as the BBC, I don’t think it appears in any other European culture and certainly not in the CWC. L151 is Central European (Bohemian) and L51 is not currently a steppe marker, perhaps it could appear in the Baltic or even in the Balkans or the Carpathian Basin. I think the key is the northern signal (excess WHG) that appears linked to this marker in the early CWC-Bohemia and that can only have its origin in the Baltic or northern Ukraine (forest steppe).

    5-R1a-M417 never appears linked to R1b-L51, not even in Yamnaya or in the CWC because the former appears in Bohemia together with R1b-Z2103 and Q1b, 150 years later (2,750 BCE) than the latter (2,900 BCE), that is, if they have the same origin they made separate trips which is highly improbable.

  104. Will be going through the comments in the coming weeks.

    @Vara iirc Anthony pushes for Sredny being LPIE after PIA supposedly splits in “the north Caucasus or north of the Caucasus” per the hybrid theory Harvard cooked up just to entice him to join them.

    @Alberto Hey, I’ll be further clarifying. My English can be a bit unclear sometimes as I’m not fluent in expression.

    “– Which were the steppe communities who got culturally assimilated by EEFs?”
    The CWC clans who switched to sedentary agriculture/agropastoralism.
    “– Which were “the rest of CWC clans”?”
    Presumably the non-R1a heavy ones (like the early R1b ones).
    “– Which were the “genetically assimilated EEFs”?”
    Yes this one is unclear as I wrote it. I mean the EEF communities from which EEF admixture entered CWC. The remaining communities were dispersed, as were other (presumably non-sedentary) CWC clans.

    – “remained of the now-smaller and less successful EEF communities” (which I guess were the typical Neolithic communities in Europe?).
    Correct.

    @FrankN “So, at least for NC Germany, your explanation doesn’t fit the archeological record.”
    Possible misunderstanding here. I wasn’t making a case for depopulation, only arguing against the (now already refuted) theory that steppe pastoralists (who instantly switched to sedentary agropastoralism) were “warlike” and EEFs “peaceful”, with HGs also being called “peaceful” at times. Virtually none of this is true.

    (If anything steppe communities are less warlike than PIE (or even LPIE) communities were.)

    Yes, climate conditions seem to have been harsh. There could also be many other factors at play. Poor economy, raids from wandering EEF bands, dysgenic genetic mutations, many things.

  105. @Vara An additional clue for BB not being IE is that there is BB influence in BA/MBA Greece (and also LBA), not just archaeologically but genetically (seen in Skourtanioti et al. 2023), and there’s no discernible BB “IE” substrate.

    “The earlier alternative steppe scenario where Indo-Iranian derives from Yamnaya and makes its way across the Caucasus to northeast Iran makes more sense. In fact, even though the archaeological evidence doesn’t support it anymore as KAC does stop at the Alborz you could still argue with it based on the Shahtepe samples. The reason why Andronovo was heavily pushed is obvious. Because the Armenian hypothesis could easily use the South Caucasus > Iran argument as well. Either way, I’ll publish a few posts on I-Ir some time later.”
    Agreed.

    “I disagree. The devil and smith story is reconstructed for every IE branch except Anatolian. The concept of a smith of royal descent(except in the German version) who assists the dragonslayer is a core part of IE belief.”
    Metallurgy defines the split date I think, not necessarily the existence of PIA as a whole. Heggarty’s model for a ~6000 BCE beginning of splits (could also be later, as he admits) could be correct about the existence of PIA around that time, just not about the split. What I meant is that Anatolian being earlier, by the commonly proposed date of 4200-4000 BCE, is possible. I don’t really have a position here, just mentioning some possibilities.

  106. @Artur – I agree with most of it but there are a few things to consider. The earliest M269* is now NV3003 (Novodanilovka profile, i.e. Early Novodanilovka + CentralAsia_N) and the earliest proto-M417 is somewhere in the Upper Don/Baltic/Karelia HGs. My hypothesis is that Non-Core Sredniy Stog, i.e. the higher-Ukraine_N Sredniy (in upper Ukraine), is the source of M417 (itself deriving it from Ukraine_N), giving it to the Eastern CWC (Battle Axe, Middle Dnieper, Fatyanovo). Basically we have CoreYamnaya (PIE, M269) + non-core SS (M417) + GAC (I1, I2, G2) = CWC. As for the Rig Veda, I can comment with more precision: it would be 1600-1200 BCE rather than what you said. And the Avesta is tied to the Yaz culture, so it’s likely 1300-1000 BCE, with Zoroaster being born around the 1000 BCE epoch. We know Yaz (E Iranic, Margiana & Bactria) and Archaic Dehistan (W Iranic, Dahaea and Parthia) derive from Tazabgyab (65% and 35% admix respectively), whereas GGC (Dardic) and PGW (Vedic) derive from Vakhsh-Bishkent (22-24% and 28-38% admix respectively). The Balkans see a uniform 30-40% Catacomb influx at some point around 2500 BCE; this is the Balkanic node (Illyrian-Albanoid, Daco-Thracian, Graeco-Phrygian) arriving. This profile is consistent as far as Northern Greece, from where the Theopetra-Sarakenos-Logkas type Proto-Greeks migrate southwards and admix with the Helladic, Cycladic, and Minoan cultures.

  107. @Gaska

    So if I understand you correctly, we’re mostly debating semantics here. Labels given by archaeology to certain sites (largely burials). That’s what seems to be confusing, since the rest is clear.

    1 – BBC has its origin in the Tagus estuary (around Lisbon, Portugal), possibly around 2700 BC? I can’t disagree with that culture being native and having an origin right there. It’s a Chalcolithic culture from West Iberia, not very different from that of Los Millares in SE Iberia, though clearly not the same. It came from the Atlantic Megalithic culture and the people who bore it were descendants of the EEF who arrived thousands of years before. Essentially we’re talking about this population:

    https://adnaera.com/wp-content/uploads/2025/01/vah_bbc_sw_iberia.png

    2 – We agree about the depopulation issue.

    3 – “Other lineages didn’t participate in the expansion”, I was talking about the expansion of R1b-P312. In those communities, that lineage was by far the most prevalent (though there may be a few rare exceptions, as always). In the expansion of the BBC that had its origin in West Iberia there were other lineages, obviously. So here I think we agree too.

    4 – R1b-P312 from Western Europe (West Germany or near) we agree too. It appears around 2500 BC, yes, practically at the same time as the BBC. We agree.

    “I don’t think it appears in any other European culture and certainly not in the CWC”

    Yes, the R1b-P312 people appear to form a culture of their own. We could give it a completely new name or call it the Bell Beaker Alternative Culture (BBAC), to distinguish it from the BBC from West Iberia. A matter of name. We are talking about this population:

    https://adnaera.com/wp-content/uploads/2025/01/vah_bb_central_europe.png

    “L151 is Central European (Bohemian)”

    Yes, L151 could be from Bohemia, where it’s first found. We’re talking about this population:

    https://adnaera.com/wp-content/uploads/2025/01/vah_cwc_bohemia_r1b.png

    “I think the key is the northern signal (excess WHG) that appears linked to this marker in the early CWC-Bohemia and that can only have its origin in the Baltic or northern Ukraine (forest steppe).”

    Yes, this population was not native to Bohemia. They had just arrived there from the east, since they’re 75% steppe-like and have just 25% of local EEF admixture. The origin of the population (and the male lineage) may indeed be the forest steppe in Northern Ukraine (a better candidate than the Baltic, given that the Yamnaya people have a sister branch of that lineage and the Baltic is quite far from the steppe). Here we agree too.

    Now, R1b-L151 is the parent of R1b-P312. So here we have a population which is ancestral to the one found in Western Germany and Netherlands shown above, not only in the male lineage, but also in the autosomes.

    As I understand, you argue that these burials are wrongly attributed to the CWC. I haven’t looked at the details of the burials, so I can’t comment. But let’s say that’s correct and this was a different culture that came from around the same area but a bit earlier. Let’s call this culture the Corded Ware Alternative Culture (CWAC) to differentiate it from the CWC. For reference, here are the samples correctly attributed to the CWC:

    https://adnaera.com/wp-content/uploads/2025/01/vah_cwc_bohemia_r1a.png

    This last population is ancestral to the CWC from Germany, but not to the BBAC from Germany and beyond (which descend from the CWAC shown above).

    So in summary, we are not disagreeing on any of the basics. It’s just the labels used to define each population that seem to be the confusing factor.

    We’ve also agreed above that the R1b-P312 people (what I’m here referring to as the BBAC) replaced the previous populations in the areas where they went (including those of the original BBC).

    The only pending question would be whether the R1b-L151 CWAC population from Bohemia spoke the same language as the R1a-M417 CWC from the same area or a different one. Given that they are genetically identical, and therefore came from the same area and moved to the same area, it seems quite reasonable to think that they spoke the same language. I don’t think that you’ve ever argued that this R1b-L151 CWAC from Bohemia instead spoke the same language as the BBC from the Tagus estuary in Western Iberia, so that should not be an issue here.

    And finally, did the CWAC population transfer their language to their descendants further west (the R1b-P312 BBAC population)? I personally don’t see why not. I don’t know if you agree with this last one or not, but I know we do agree that in any case it was non-IE.

  108. @Alberto

    Yeah, this Iberian Chalcolithic population (mostly I2a-M438, with some mtDNA of WHG origin and at least 25% WHG, very similar to GAC) was the first to spread the Iberian BBC. Now, can you model some of those males that reached the BBC? You have to take into account that some of them shared ceramic styles and sites like Humanejos with R1b-P312. I suppose that some of them would already have steppe ancestry in spite of their male marker.

    I0826 (2656 BC)-Cerdañola del Vallés-I2a1b1-L460>M436>M223>Y3259
    I1970 (2500 BC)-Cueva Verdelha, Lisbon-I2a1b1b-Y6098>S23680>PF692
    I1976 (2459 BC)-Dolmen del Sotillo-I2a1b2a-S2555>S2524>L38
    NEO609 (2381 BC)-Hipogeo de Sao Paulo2, Almada, Lisbon-I2a1a-CTS595
    I6587 (2350 BC)-Humanejos, Bell Beaker-I2a1b1-M223>Y3259
    I4229 (2335 BC)-Cueva da Moura, Torres Vedras-I2a1a1a1a1-Y3992>L160
    I0460 (2335 BC)-Dolmen del Arroyal-I2a1b1b-L460>M223>Y3259>PF692
    I0458 (2332 BC)-Dolmen del Arroyal-I2a1b1b-L460>M223>Y3259>PF692
    I2467 (2315 BC)-Dolmen del Sotillo, Bell Beaker-I2a1b1-M223>Y3259
    CDM264 (2250 BC)-Cueva da Moura, Iberia-I2a1a2-Y3104>L161
    I6543 (2212 BC)-Camino de las Yeseras 13a, Area10-I2a1a2-P37>M423

    I believe that the BBC spoke the language of the European HGs, whichever family it belonged to. Given that in Iberia there is genetic continuity down the male line to the Iberians and Basque-Aquitanians of the Iron Age, and given that these gentlemen spoke NON-IE languages, common sense tells me that their ancestors of the Bronze Age cultures and the BBC did not speak IE. If you apply this reasoning to the CWC & Yamnaya, these cultures would not have spoken IE either. In my opinion, R1a, R1b, I2a-M438 and I1-M253 spoke the same language, a kind of archaic European.

  109. Regarding the CWC, I think it is the key to understanding the situation in Eastern Europe but it has absolutely nothing to do with Western Europe.

    The first thing everyone needs to understand is that the failure of Harvard (and to some extent also Max Planck) can, for the time being, be considered colossal. They jumped the gun 10 years ago when, upon discovering that the CWC was 75% Yamnaya, they wrongly deduced that the CWC descended directly from Yamnaya. The impossibility of linking these two cultures by male line (for God’s sake, after analyzing thousands of steppe genomes there is no R1a-M417 or R1b-L151 in Yamnaya) has led them to constantly modify their initial position. For example, regarding the ethnogenesis of the Yamnaya culture, it is no longer exclusively EHG + CHG; now there is also an important component of WHGs and, moreover, their genetic makeup existed 1000 to 1500 years earlier than originally thought. In other words, not only did they not know what the autosomal components of Yamnaya really were, but they also did not know how long this genetic component had actually existed. They also had to recognize that this special autosomal mixture may have existed in other territories outside the steppes (e.g. the Carpathian Basin, eastern Balkans or the Ukrainian forest steppe) long before Yamnaya existed.

    This means that the CWC has its origin not in the steppes but in Eastern Europe and that its genetic composition was at first very similar to Yamnaya despite the fact that its men did not belong to that culture (i.e. thanks to exogamy). Genetics is demonstrating that the Yamnaya culture did not monopolize a certain genetic composition and that this could have occurred previously or simultaneously in other territories.

  110. @Reznov

    Hi, thanks for taking the time to read and comment on the post. One question:

    “GGC (Dardic) and PGW (Vedic) derives from Vakhsh-Bishkent (22-24% and 28-38% admix respectively).”

    Do we have samples from the Painted Grey Ware culture? And which samples are those from the Vakhsh or Bishkent culture you refer to?

  111. The great Kurganist scare was Papac’s paper on Bohemia; all alarms sounded because the steppe theory seemed definitely debunked. I say this because there was a first abstract of the work (2019) that was later changed, which read as follows-“Interestingly, we identify a possible admixture cline between our Late neolithic Bohemian individuals and a source with high eastern hunter-gatherer related ancestry, currently best represented by Lithuanian neolithic individuals of the Narva culture. We also detect a number of interesting outlier individuals which add to our understanding of the dynamics and regional nuances of population interactions in 3rd millennium B.C. central Europe”

    -In his autosomal analysis Papac says-“Modelling Bohemia_CW_Early as a two-way and three-way mixture using proximal sources-Interpretation: known Yamnaya groups are an unsatisfactory source for “steppe” ancestry in Bohemia-CW-Early-Modelling Bohemia_CW_Early using proximal sources. Interpretation: the addition of Latvia_MN, Ukraine_Neolithic, or PittedWare as a source improves almost all model fits (column O, column X, and column AG) and increases the number of working models (p>=0.05, row 106). Bohemia_CW_Early carries some ancestry related to Latvia_MN, Ukraine_Neolithic or PittedWare that is not present in known Yamnaya and European Neolithic/Middle Neolithic farmers”

    -Ergo we have a northern signal (10-13%) incompatible with Yamnaya, which is the signal of the Narva culture and the outliers referred to in their first abstract. The Kurganist solution to the problem, i.e. that the signal did not come from the Baltic but from the Ukrainian forest steppe, was in my opinion absolutely absurd and merely sought to keep the steppe theory alive by modifying the place of origin of the migrations; they no longer spoke of Yamnaya but of Sredni-Stog, etc. BUT if this Narva-PWC-like admixture was present in the Forest Steppe before the migration, then shouldn’t we see it in Fatyanovo or Baltic CWC? It seems specific to this R1b-L151 group, maybe acquired due to a North Sea migration route to the Rhine before ending up in Bohemia.

    -Therefore, regardless of whether you use professional qpAdm or G25 tools, that percentage of Narva signal will always appear in the early-CWC samples. For me, all this is an indication of the origin of L51 or maybe L151 in the Baltic

  112. 1-Regarding L151 in Bohemia-It is true that Czech archaeologists have published works in support of Papac’s paper where they categorically state that these tombs belong to the CWC. However, the reality is very different, because the main archaeological indicator that a site belongs to a culture is its ceramic styles (especially in the CWC, with its corded style), and in this case there is not even a single example of typical CWC ceramics. Besides, there is only one battle axe, and anyone who knows anything about Central European archaeology should know that this item was used in Bohemian cultures more than a thousand years before the CWC.

    -PNL001 (2896 BC)-Plotiště nad Labem_LX-R1b-U106-Grave goods: two bone belt clasps, bone awl, chipped industry blade-Archaeological dating: Corded Ware culture, aceramic.
    -OBR003 (2893 BC)-Obritsví-R1b-L151-Grave goods: pot, two bone belt clasps, stone battle axe (A-type), chipped industry blade-Archaeological dating: CWC, early stage.
    -STD002 (2777 BC)-Stadice, Bohemia-R1b-L151-Grave goods: no finds-Archaeological dating: Corded Ware culture, aceramic.
    -VLI011 (2775 BC)-Vlineves, Bohemia-R1b-L151-Grave goods: no finds-Archaeological dating: Corded Ware culture, aceramic.
    -VLI015 (2775 BC)-Vlineves-R1b-L151-Grave goods: chipped industry, two blades-Archaeological dating: CWC, aceramic.
    -VLI092 (2755 BC)-Vlineves, Bohemia-R1b-L151-Grave goods: chipped industry, blade-Archaeological dating: CWC, aceramic.
    -VLI085 (2719 BC)-Vlineves-R1b-L151-Grave goods: no finds-Archaeological dating: CWC, aceramic.
    -VLI081 (2700 BC)-Vlineves-R1b-P310-L52-Grave goods: no finds-Archaeological dating: Corded Ware culture, aceramic.
    -KON005 (2727 BC)-Konobrzé-R1b-M269-Grave goods: two bone belt clasps, chipped industry blade-Archaeological dating: Corded Ware culture, aceramic.

    It is therefore perfectly possible that these R1b-L151 men did not belong to the CWC and had their origin in local Neolithic cultures (this does not seem to be the case because there are no samples of this lineage in Bohemia before 3,000 BCE) or that they were migrants from the Baltic (where there are tons of R1b from the Mesolithic).

    2-All these R1b-U106 and L151 samples are dated between approx. 2900 and 2725 BCE, while R1a-M417, R1b-Z2103 and Q1b are later.

    *DRO001 (2752 BCE)-Droužkovice, CWC, Bohemia-Q1b2a-Z5902
    *PNL002 (2721 BCE)-Plotiště nad Labem, CWC, Bohemia-R1a-M417
    *OHR001 (2519 BCE)-Praha 5-Malá Ohrada, CWC_Bohemia-R1b-Z2103>Z2109

    These three male lineages, together with a good number of mtDNA markers, do have their origin in Yamnaya-Afanasievo, but it must be understood that their arrival in Bohemia is later than that of R1b-L151, which is inexplicable if all these markers had shared the same culture in the steppe or forest steppe. In my opinion this is enough evidence to conclude that they have different origins.
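
    As a rough check of the timing argument, here is a minimal Python sketch that takes the point dates quoted in the two lists above (single values as given in this comment, not the full calibrated ranges) and compares the earliest R1b-U106/L151 burial with the earliest burial of the later group. The grouping and the names used in the code are only illustrative assumptions.

```python
# Point dates (years BCE) as quoted in this comment; calibrated ranges are wider.
early_group = {  # R1b-U106 / R1b-L151 burials from Bohemia
    "PNL001": 2896, "OBR003": 2893, "STD002": 2777, "VLI011": 2775,
    "VLI015": 2775, "VLI092": 2755, "VLI085": 2719,
}
later_group = {  # Q1b, R1a-M417 and R1b-Z2103 burials from Bohemia
    "DRO001": 2752, "PNL002": 2721, "OHR001": 2519,
}

def earliest(samples):
    """Return (sample_id, date) for the oldest sample, i.e. the largest BCE value."""
    return max(samples.items(), key=lambda item: item[1])

first_early = earliest(early_group)
first_later = earliest(later_group)
gap_years = first_early[1] - first_later[1]

print(f"Earliest U106/L151 sample: {first_early[0]} ({first_early[1]} BCE)")
print(f"Earliest later-group sample: {first_later[0]} ({first_later[1]} BCE)")
print(f"Gap between first appearances: {gap_years} years")
```

    With these point dates the gap between first appearances comes out at 144 years, close to the roughly 150 years mentioned earlier in the thread. Since the radiocarbon ranges are much wider than single values, this is only an order-of-magnitude check.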

    mtDNA examples

    -mtDNA-R1a1a
    Russia, Nizhnaya-Orlyanka, Yamnaya-I6727 (2653 BC)
    Czechia, Praha 5-Malá Ohrada, CWC-OHR002 (2525 BC)

    -mtDNA-U5a1b1
    Russia, Marinskaya5, Maykop culture-MK5007 (3505 BC)
    Bohemia, Obritsvi, CWC-OBR003 (2893 BC)

    3-In any case, the result is that now everyone thinks that L151 is a typical CWC marker, when it has only been found in Bohemia and is missing in the rest of the regional variants of this culture. My interest in L51>L151 is because I am R1b-DF27 and I am curious to know the origin of my lineage.

  113. @Gaska

    Yes, in general the populations related to the CWC have more admixture from the hunter-gatherers from the forest steppe compared to the Yamnaya Culture that stayed in the steppe. It shouldn’t be a surprise that if one group moved to the forest steppe it got admixture from the area. Usually Ukraine_N is the best source. I now tried including both Latvia_MN and Ukraine_N as sources and mostly all goes to Ukraine_N (though with so much overlapping it’s hard to say how accurate this may be):

    https://adnaera.com/wp-content/uploads/2025/01/vah_steppe_forest.png

    It’s the same in Fatyanovo or Baltic CWC:

    https://adnaera.com/wp-content/uploads/2025/01/vah_forest_steppe2.png

    Regarding the lack of ceramics in the R1b-L151 burials in Bohemia I have no idea why it may be. But having a quick look at the info now, I see for example from the site Plotiště nad Labem where we have both R1b-U106 (PNL001) and R1a-M417 (PNL002) that they’re quite similar:

    “Grave LX. Skeleton: right-sided crouched burial, head towards the north-west. Sex: archaeology – M, anthropology – M, aDNA – M. Age: adultus I (25–30). Grave goods: two bone belt clasps, bone awl, chipped industry – blade. Archaeological dating: Corded Ware culture, aceramic. Radiocarbon dating: MAMS-41376 (4271±25) 2914–2879 cal BC 2-sigma (157, 158). Pandora No.: PNL001. NM Prague Inv. No.: P7A 36100.

    Grave 221B. Skeleton: right-sided crouched burial, head towards the north-north-west. Sex: archaeology – M, anthropology – ?, aDNA – M. Age: juvenis (14–16). Grave goods: fragment of stone flat axe. Archaeological dating: Corded Ware culture, aceramic. Radiocarbon dating: Poz-86648 (4110±35), 2869–2573 cal BC 2 sigma (159, 161). Pandora No.: PNL002. NM Prague Inv. No.: P7A 43219.”

    PNL001 is from an earlier date and has more steppe ancestry (almost unadmixed), but otherwise the burials look to be similar.

    The other two R1a-M417 that are labelled as the CW_early (RDV001 and TRM006) are not too different, though the latter does have a beaker among the grave goods:

    “Skeleton: right-sided crouched burial, head towards the west. Sex: archaeology – M, anthropology – M, aDNA – M. Age: maturus II (50–60). Grave goods: stone battle axe, bone pin, chipped industry – blade. Archaeological dating: Corded Ware culture, middle stage. Radiocarbon dating: MAMS-45791 (4081±25) 2852–2498 cal BC 2-sigma (119). Pandora No.: RDV001. NM Prague.

    Grave 109/82. Skeleton: right-sided crouched burial, head towards the west-south-west. Sex: archaeology – M, anthropology – M, aDNA – M. Age: maturus II (50–60). Grave goods: beaker, stone battle axe, chipped industry – blade. Archaeological dating: Corded Ware culture, middle stage. Radiocarbon dating: MAMS-45796 (4105±25) 2859–2576 cal BC 2-sigma (184, 185). Pandora No.: TRM006. NM Prague Inv. No.: P7A 38608.”

    Not sure this really shows that the R1b male burials have nothing to do with the CWC, but I guess I’d trust the archaeologists who actually examined the burials.

  114. Doesn’t the Narasimhan paper estimate that the Iranian Zagros farmers mixed with the ‘AASI’ between 4700 BCE and 3000 BCE, and that there was no evidence for the reverse? Is this not relevant for a possibly earlier IA in the region?

  115. @ashwath

    Yes, if I remember correctly they estimated around those dates for the admixture between a West Eurasian population and an AASI one. This, apart from other anthropological evidence, is why I suggest that these West Eurasian people may have arrived in India around 4500 BC and that they may have already been IE speaking (I expect that they probably arrived from North Iran/Turan rather than the Zagros itself). However, with no ancient DNA from India to corroborate this, it’s still a very preliminary estimate.

    When you say “a possibly earlier IA in the region” I’m not sure if you mean earlier than those dates and my own estimate or you were referring to some other comment suggesting a later arrival.

  116. I think the language of Middle and Late Djeitun (Monjukli LateN, i.e. Early Djeitun + Zagros LateN), distantly related to Burushaskic, continued subsequently into the Anau Culture (Chalcolithic Parkhai, Anau, Geoksyur, Sarazm, Bustan, Tepe Hissar, Shah Tepe, i.e. Monjukli LateN + Zagros C + Hissar/Kelteminar) and from there into BMAC, i.e. Anau-type + IVC + Chalcolithic Elamite. And for the formation of IVC there are likely two waves involved – the 1st EarlyN/PPN wave (Dravidian) contributing to Mehrgarh/Bhirrana ca. 7200-7000 BCE, and the 2nd LateN wave (Burushaskic substratum) contributing to Mehrgarh II, Sothi I, Rakhigarhi I, Bhirrana II, etc. ca. 5000-4800 BCE. If at all (in a Heggarty framework), the 5000-4800 BCE wave (Zagros LateN = Zagros EarlyN + Shulaveri-Shomu) may be relevant for the introduction of Indo-Iranian, but it’s most likely not. Further, I also think Nalchik has Darkveti-Meshoko / West Caucasian EarlyN input rather than Shulaveri Shomu (South Caucasian – NW Iranian). So there is probably not even an autosomal link connecting Chalcolithic-EBA Luwio-Hittites (Cayonu-type + excess NW/W Anatolia N + Usatovo-Cernavodo), Chalcolithic Indus-Saraswati (Zagros EarlyN + AASI + Hissar/Kelteminar + Zagros LateN), and EarlyC Lower Don (ML Don Meso + Ekaterinovka-type + Darkveti/Meshoko-type).

  117. @Ashish Kaul

    Thanks for your comment.

    A lot of information compressed into a few lines, but no reasoning or evidence for any of the points made, nor any rebutting argument or evidence against what’s proposed in any part of the post, so I can’t reply with anything else.

  118. To the extent that anyone is still reading the comments this deep in, I offer a gentle reminder that one of the problems in this area is that it is a cross-disciplinary one, straddling genetics and linguistics. Kudos to the author for thinking against the grain.

    However, a lack of true cross-disciplinary knowledge (of HISTORY) is the main barrier: theorists lack the real-world examples of how to model these things.

    Imagine a scientist from 2500 years from now trying to understand the complex movements into DELAWARE. I’m going to simplify things in the below example, but if you bear with me, you’ll learn an IMPORTANT point.

    “A DNA analysis of the Y Chromosome from burials dated to approximately 1500AD showed predominantly Haplogroup Q” (Native Americans)

    “A DNA analysis of the Y Chromosome from burials dated to approximately 1700AD showed predominantly Haplogroup I1” (New SWEDEN colony).

    “A DNA analysis of the Y Chromosome from burials dated to approximately 2000AD showed the plurality of Haplogroup R1b with significant admixture” (MEXICAN immigrants).

    This is three changes, in 500 years.

    The Swedes were interlopers. Hard to tell by their dominance of New Sweden in 1650, but they left no lasting linguistic and little genetic evidence.

    The Mexicans who streamed into the place in modern times were not conquerors. They are refugees. Economic refugees, and some real refugees.

    The reasons why they show up in greater numbers are complex, cultural and economic. Simply put, they have more kids and others moved away rather than live close to poor people.

    To call these Mexican immigrants “conquerors” is such a fallacy, I can’t even.

    If you understand this, you understand the “male-mediated Steppe migrations” blah blah blah.

    The Mexican immigrants will show lots of R1b, but “descendants of conquerors”, while perhaps accurate, is not a term that anyone would use just 500 years after the Spanish conquest. Why? Because the reason they are outpopulating Delaware in this example (and much of the American Southwest, like Los Angeles, in real life) is NOT superior weaponry, linguistic preferences, metallurgy, or anything of the kind.

    It’s just because.

    In the same way, some non-Mexicans marry Mexicans simply because they live in proximity.

    The same applies to the Steppe movements. After the initial 1-5 generations and the initial encounters (in Poland? Czechia?), the people were Mestizo. They were children of conquerors and conquered.

    As they spread West across Europe, there is no sign in the record of conquest. It was more likely overpopulation.

    This (*yawn*) boring explanation accounts for population movements, from Syrians into Lebanon, from Ukrainians into Poland, from Mexicans and Central Americans into California, from Goths into the Roman Empire. It is almost always NOT conquest. It is overpopulation combined with fleeing wars.

    Within a few hundred years, people would LAUGH if you called the humble Central American immigrant a conquistador.

    If you grasp this, everything comes into focus.

  119. @Chris Moore

    Yes, trying to look at the fragmentary data we have from prehistory and put it into context is a really difficult task that requires a multidisciplinary approach and careful consideration of all of the data. This has been a problem when it comes to studies dealing with ancient DNA data. While the teams have tried to increasingly incorporate archaeologists, historians, linguists, anthropologists, etc… it’s still been really difficult for them to put things together. Each one may be an expert in one field, but not knowing about the others makes it challenging for them to work together efficiently. This blog post after years of being absent is my modest effort to try to encourage an effort in that direction.

  120. @Alberto and OTHERS:

    Can I ask what perhaps might be a stupid question?

    Given that R1b was so heavily present in Western Europe BEFORE the spread of Beaker and Corded Ware folk, why is it generally accepted that they spread R1b?

    Here is a list of its long pre-Bronze Age presence in Western Europe:
    1. Villabruna 1 (individual I9030), a Western Hunter-Gatherer (WHG), found in an Epigravettian culture setting in the Cismon valley (modern Veneto, Italy), who lived circa 14000 BP and belonged to R1b1a.
    2. Several males of the Iron Gates Mesolithic in the Balkans buried between 11200 and 8200 BP carried R1b1a1a. These individuals were determined to be largely of WHG ancestry.
    3. Several males of the Mesolithic Kunda culture and Neolithic Narva culture buried in the Zvejnieki burial ground in modern-day Latvia c. 9500–6000 BP carried R1b1b. These individuals were determined to be largely of WHG ancestry.
    4. A WHG male buried at Ostrovul Corbuli, Romania c. 8700 BP carried R1b1c.
    5. A Neolithic male buried at Els Trocs, Spain c. 7178-7066 BP, who may have belonged to the Epi-Cardial culture, was found to be a carrier of R1b1.
    6. An Early Copper Age male buried in Cannas di Sotto, Carbonia, Sardinia c. 6450 BP carried R1b1b2.

    Can someone please explain this? Isn’t it possible that these WHG were simply acculturated by a swirl of cultures, and picked up some Steppe genes through normal demic diffusion?

  121. @JDL

    Yes, R1b was widespread throughout Europe since the Mesolithic. Especially in Eastern Europe (the Iron Gates, the Narva culture and that other from Romania are all samples from Eastern, not Western Europe). However, the vast majority of those lineages went extinct and today 99% of the R1b in Western Europe is under the R1b-L51 branch (which is a sister branch of R1b-Z2103, the one most common among the Yamnaya and Afanasievo people). Both of these branches descend from R1b-L23, which is estimated to have formed some 6400 years ago and its Time to Most Recent Common Ancestor (TMRCA) around 6100 years ago (the TMRCA usually gives us an estimation of when a lineage started to grow and diversify). You can check the Rqb tree and dates for each branch here: https://www.yfull.com/tree/R1b/ .

    So the question is where did those lineages under R1b-L23 originated. It doesn’t matter if there was some R1b-V88 in Western Europe before the Bronze Age, because the one that replaced most other male lineages in Western Europe (G2a, I2a, some J, and other R1b lineages) was R1b-L51, which has a TMRCA of 5700 ya, so 3700 BC. That means that every living person in Western Europe who has an R1b haplogroup under that L51 branch descend from a single male who lived around 3700 BC.

    There are other people much more versed than myself about the details of each haplogroup, so maybe they want to add or correct something, but basically the evidence we have is that R1b-L23 (the father of L51) was somewhere in or close to the Pontic-Caspian steppe around 4100 BC. This is because the earliest sample that we have under that branch must be from around ~3600 BC from the steppe (I33307, R1b-Z2106, from a sister branch of L51). Then we have many samples closer to 3000 belonging to the Yamnaya and Afanasievo cultures from that same branch. The earliest L51 samples are probably also from those cultures, somewhere around or after 3000 BC (the last published paper has some of those samples in the Supplementary Tables: https://www.nature.com/articles/s41586-024-08531-5).

    Then we have the earliest R1b-L151 (the “son” of L51) in Bohemia around 2900-2800 BC in some Corded Ware Culture samples (the oldest one an almost unadmixed Yamnaya-like male, but all of them mostly Yamnaya-like). And then the earliest R1b-P312 (the “son” of L151) already in Western Europe, in the Bell Beaker samples from Germany and the Netherlands around 2500 BC.

    So that’s basically the explanation of why it was the BBC that spread R1b in Western Europe (it’s one specific branch of R1b, not R1b of any kind), and why it’s said to come from Eastern Europe with the CWC, with people who ultimately came from the Eurasian steppe.

  122. @Alberto

    It seems like this is very much NOT settled science. I understand your point, that R1b-L23 is the key branch, and that LIKELY originated in the East.

    (But as you note in your post — almost everything in Western Europe originated in the East, because that is the way genetics get into Europe. Haplogroups I and G and J, etc. — they all “originated in the East.”)

    (Also as an aside: What’s fascinating is that the Yamnaya, who many people still believe to be “the original Indo-Europeans” are NOT considered any longer a candidate for the R1b prevalence, because their clade is a brother clade that did NOT spread into Western Europe at any large scale.)

    Now let’s get to the evidence that R1b-L23 did NOT spread with Indo Europeans or even NEBA people from the East:

    -Many of the samples I listed above were for WESTERN Hunter Gatherers. Doesn’t matter if they were found in East(ish) Europe — these were the WHG branch, so it means there could easily have been a small tribe of WHGs somewhere that carried the precursor to R1b-L23.

    -Many of the samples I listed above were not tested for terminal/downstream SNPs and need to be re-tested. This is key!!!

    -The people with the most “Steppe” ancestry today are the Sami of Northern Finland. R1b-L23 is almost absent among them. They also clearly mark the later spread of Uralic people!

    -People want us to ignore our lying eyes. The idea that every crag and corner of Wales and Spain and Western France and the Pyrenees was ALL completely replaced by people from the East is so much horse malarkey as to defy all common sense.

  123. @JDL

    I think that if you consider the clades, dates and people who carried those clades at those dates you will understand that this is quite settled science.

    I understand your point, that R1b-L23 is the key branch, and that LIKELY originated in the East.

    If you understand that point, why do you go back to:

    Now let’s get to the evidence that R1b-L23 did NOT spread with Indo Europeans or even NEBA people from the East:

    -Many of the samples I listed above were for WESTERN Hunter Gatherers. Doesn’t matter if they were found in East(ish) Europe — these were the WHG branch, so it means there could easily have been a small tribe of WHGs somewhere that carried the precursor to R1b-L23.

    There were indeed small tribes of European hunter-gatherers (both Eastern and Western) who carried the precursor of R1b-L23 (R1b-M269). Some farmers too around Europe. But from all of those, L23 descends from one and only one male.

    So again, the question is where that male and its immediate descendants were at the time this mutation happened. Examine the evidence for this.

    -Many of the samples I listed above were not tested for terminal/downstream SNPs and need to be re-tested. This is key!!!

    Given what I just said above, you can see that this is not key, nor even relevant. L23 didn’t exist at that time. M269 could have been in Britain, but if L23 was born in Russia or Ukraine, then that M269 branch from Britain becomes irrelevant.

    People want us to ignore our lying eyes. The idea that every crag and corner of Wales and Spain and Western France and the Pyrenees was ALL completely replaced by people from the East is so much horse malarkey as to defy all common sense

    I would agree with you if we didn’t have the actual evidence. When the first study came out in 2015 showing some samples from the Yamnaya culture from the area of the Volga bend (near modern-day Samara) and some from Germany showing high admixture from this Yamnaya population, I thought that it was very premature to say that they replaced 50% of the population of Northern Europe. We had no samples back then from between Germany and the Volga to back that up. But then more and more samples came out, and they showed that there was indeed a 50% replacement of the genes. However, when it comes to the people (understood as ethnic groups, communities of people with a common culture, language, religious beliefs, etc.), the reality is that they replaced 100% of the previous (Neolithic) people. The only explanation for this is a very big population collapse in Europe at the end of the Neolithic. The reasons for such a collapse are not exactly known, but that doesn’t change the fact that the Neolithic communities that had lived there for 3000 years disappeared within a few centuries, and that new people came and repopulated those depopulated areas.

  124. @Alberto

    Appreciate the dialogue. I hope you’ll answer the following questions:

    I have come to grasp the circularity of migration theories. In other words, the ones who come later, assuming they have larger population sizes, ALWAYS leave a Chromosomal signature that can be exaggerated easily. It’s an easy mistake, and it becomes a self-fulfilling prophecy. But it’s wrong.

    Every generation, there is a chance that a Y-chromosome line dies out if a man has no sons. The longer a sireline has existed, the greater the chance that it has died out. We have seen this on Pitcairn Island. We have seen it in Europe. It’s a simple concept, really.

    40,000 years ago, the males in Europe would have borne the Neandertal Y Chromosome.

    30,000 years ago, they bore C, in small numbers, but higher numbers than the previous group.

    The next people to come in bore I, in small numbers, but higher numbers than the previous group.

    The next people bore G, in medium numbers, but higher numbers than the previous group.

    The next people bore R1, in greater numbers than the previous group.

    If you start with a small initial population size, and you grasp that the odds of a male not having a male child are roughly 12.5% each generation, totally at random, then the “newer” a migrant is, the more it will appear to dominate.

    The longer a population has existed in a locale (and being free of mutations), the more generations go by, the greater the chance that random happenstance, chance, etc. will make it appear that a Hg either never existed or was slaughtered in a mass killing/enslavement/mate preference.
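
    The per-generation odds mentioned above can be turned into a small worked calculation. This is only a sketch under assumptions that are not in the comment: it supposes that each man has exactly three children and that each child is a son with probability 1/2, which is one simple way to arrive at the 12.5% figure (0.5^3 = 0.125). The function name and its parameters are illustrative. It follows a single founder's male line as a branching process and returns the probability that the line has died out within a given number of generations.

```python
def extinction_probability(generations, children=3, p_son=0.5):
    """Probability that one founder's male line is extinct within `generations`.

    Assumption (hypothetical, for illustration only): every man has
    `children` children, each independently a son with probability `p_son`.
    """
    q = 0.0  # a line that has just started cannot already be extinct
    for _ in range(generations):
        # The line is extinct one generation later if every child is either
        # a daughter or a son whose own line has already died out.
        q = (1.0 - p_son + p_son * q) ** children
    return q


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 50):
        print(n, round(extinction_probability(n), 4))
```

    Under these assumptions the chance is 12.5% after one generation and levels off at sqrt(5) - 2, about 23.6%, because a line that survives its first few generations usually has enough male descendants to keep going. With a lower average number of sons the long-run figure would be higher, so the result depends entirely on the assumed family sizes.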

    Again, we have seen this with REFUGEES in modern times. Palestinians have more kids than Israelis. Syrian refugees have more kids than Europeans. Over time there will be some intermarriage. This is not conquest though: it’s just recent emigres + demographics.

    Why is this NEVER mentioned as a mechanism for the spread of chromosomes? I’ll tell you why: people have pseudo-racist fantasies for their own chromosomes. Davidski’s self-esteem seems so tied to R1a. Many male commenters from Western Europe — who just happen to be R1b — feel the same way about R1b. It’s hogwash.

    Moreover, even if the scenarios were as people described, it would be like a Mexican mestizo claiming to be a Spanish conquistador. It’s just not accurate.

    I’d love your honest response, and I’ll post another set of questions right after this post.

  125. @Alberto

    OK, as promised, here’s another set of questions for you. Others are welcome to chime in.

    I call this: If Hg distribution is not random — AND YET.

    -Some of the HIGHEST “Steppe autosomes” are in Scandinavian people. AND YET, Scandinavian males bear some of the highest levels of I1, a very minor and local Paleolithic heritage, totally ABSENT from Steppe populations, which seemed to expand magically with the advent of population growth due to lactose persistence or whatever.

    -Some of the highest “EEF autosomes” are in Sardinian people. AND YET, Sardinian males bear some of the highest levels of I2, a Hunter Gatherer haplogroup, largely absent from Early European Farmers. As posted above, I2 was a Hunter Gatherer hg, and is also present in several Yamnaya samples (alongside R1b). In other words: the people with the most EEF autosomes have a haplogroup that is either WHG or Western Steppe Herder.

    -The highest Steppe autosomes are in the Saami. AND YET, they bear the highest levels of Haplogroup N in Europe — again, largely absent from the Yamnaya etc. who were mostly R1b with some I2a.

    Tell me again this isn’t the result of randomness, founder effect, and genetic drift.

  126. @JDL

    I appreciate the dialogue too. Those are fair questions and I hope I can answer at least part of them (it’s not like I know everything, of course).

    – Regarding local haplogroups being at a disadvantage over migrating haplogroups I don’t think this is always the case. It depends on the circumstances. So each case has to be examined in its own context.

    In the Paleolithic, it was common for populations to go extinct in places like North Eurasia. Survival was complicated. Most likely, each wave of migrants who came didn’t even meet the previous ones, since there may already have been a gap between them. They were all HGs, and there’s no reason to think that incoming ones had higher numbers than local ones. In fact, if locals had been successful, they would have outnumbered the migrants. But for the most part those local populations were gone before the migrants came in.

    The next people bore G, in medium numbers, but higher numbers than the previous group.

    Here is where things start to change. The people who bore G had already made the leap to food production. They were farmers, and as such they could have larger communities than HGs, and more communities in a given area. This is why this G bearing population outnumbered the previous I bearing one. It’s an expected outcome.

    The next people bore R1, in greater numbers than the previous group.

    In this case, this was completely unexpected. The populations from the steppe (who were largely R1) were pastoralists, since they lived in an area where crops didn’t work well. Pastoralist populations are small. In order to have a large population you need crops. The Egyptians would never have been able to build their temples and pyramids if they had had to feed the workers with meat. It was the abundance of grain that allowed them to feed such a large number of people.

    So the idea that populations from the steppe came to Europe and replaced the Neolithic ones was something that didn’t make any sense at first. The idea that they were horse riding warriors never worked for me either, as it didn’t work for many others. That’s why David Anthony proposed a different mechanism for the spread of languages from the steppe to Europe (based on long distance relationships, a patron-client type of relationship, etc…). Essentially a small population with certain know-how that worked for those times was able to impose their language on a larger one.

    However, then ancient DNA came and showed that what happened was very different. The people from the steppe indeed replaced the Neolithic populations from Europe. Something that can only have one explanation: a population collapse within Europe at the end of the Neolithic. This is the only reason why the people from the steppe replaced the local Neolithic ones. In areas where the collapse was much smaller (like in the Balkans), the steppe people remained a minority and their haplogroups too.

    Again, we have seen this with REFUGEES in modern times. Palestinians have more kids than Israelis. Syrian refugees have more kids than Europeans. Over time there will be some intermarriage. This is not conquest though: it’s just recent emigres + demographics.

    Yes, this is mostly correct. Usually migrants (in modern societies anyway) have more children than local populations. That’s usually because migrants are the lowest class too. And so, they grow faster. However, this is a process that goes slowly and has a very different pattern from those described above. It’s a different case.

    Why is this NEVER mentioned as a mechanism for the spread of chromosomes? I’ll tell you why: people have pseudo-racist fantasies for their own chromosomes. Davidski’s self-esteem seems so tied to R1a. Many male commenters from Western Europe — who just happen to be R1b — feel the same way about R1b. It’s hogwash.

    Personal preferences have plagued the forums dedicated to ancient DNA, where people would first come because they had taken a genetic test and found out that they belonged to this or that haplogroup, and thought it had a different meaning from what it really has, so they “defend” their haplogroups and want them to be better.

    I personally don’t have any interest in haplogroups (or genetics in general) other than using the data as a tool that can help us to understand history. And fortunately, this is the case with the professionals working in this field. Their conclusions have nothing to do with their own genetics or any personal preference. The limitations in their case come from their different backgrounds, as I commented in a response to another user a bit earlier in this thread.

    Moreover, even if the scenarios were as people described, it would be like a Mexican mestizo claiming to be a Spanish conquistador. It’s just not accurate.

    I don’t know of any modern Mexican claiming to be a conquistador, but if they did it would certainly not be accurate. Some Mexicans can claim to descend from the Spanish conquistadors, though, and maybe that’s true.

    In the case of Europe, we’re the mestizos who descend from the populations from the steppe who repopulated most of Europe at the end of the Neolithic. Since those were not conquerors in any significant way, all we can claim is to descend from those migrants, which is essentially correct (though we can add that from a genetic point of view, we descend from the Anatolian farmers just as much or more, given that the genes of those Neolithic people were preserved through the admixture that the steppe people got from them as they moved into Europe).

  127. @JDL

    Some of the HIGHEST “Steppe autosomes” are in Scandinavian people. AND YET, Scandinavian males bear some of the highest levels of I1, a very minor and local Paleolithic heritage, totally ABSENT from Steppe populations, which seemed to expand magically with the advent of population growth due to lactose persistence or whatever.

    Yes, once a haplogroup enters a population it becomes part of it. Whatever happens afterwards with the frequency of that haplogroup is already an intra-population event and can’t be linked to the population from which the haplogroup originally came. Imagine that a group of those pastoralists who settled in Scandinavia found a HG woman with a child and decided to take them into the tribe. The child (a male) had haplogroup I1. He then married within the tribe, then his sons married women from a neighbouring pastoralist clan, etc. So after 500 or 1000 years that haplogroup has grown in frequency for some unknown reason (lucky circumstances, or selection due to being linked to some genes that gave an advantage in survival rates). This is something that has nothing to do with the HGs from Scandinavia from whom that first child came. Therefore, it’s historically, culturally and linguistically irrelevant.

    Some of the highest “EEF autosomes” are in Sardinian people. AND YET, Sardinian males bear some of the highest levels of I2, a Hunter Gatherer haplogroup, largely absent from Early European Farmers. As posted above, I2 was a Hunter Gatherer hg, and is also present in several Yamnaya samples (alongside R1b). In other words: the people with the most EEF autosomes have a haplogroup that is either WHG or Western Steppe Herder.

    Yes, this is another of those cases. I already mentioned it in the post. Throughout the Neolithic, haplogroup I2a rose in frequency within the Neolithic populations of Europe. It originally came from the European HGs, but entered the gene pool of the farmers early on. After 2-3 thousand years it became the most prevalent one displacing G2a to a second place. We don’t know the exact reason why, but it’s not relevant from a historical, cultural or linguistic point of view.

    The highest Steppe autosomes are in the Saami. AND YET, they bear the highest levels of Haplogroup N in Europe — again, largely absent from the Yamnaya etc. who were mostly R1b with some I2a.

    Yes, haplogroup N has been very successful in populations from Northern Europe where it initially didn’t belong. It’s a Siberian haplogroup that for some reason has risen in frequency among populations to the east of the Baltic Sea, regardless of their cultural or linguistic affiliation (it’s common in Estonia, Latvia and Lithuania, as well as in many northern Russians and Uralic and Turkic populations from North Eastern Europe). It’s again a similar case to the two described above, without any particular relevance from a historical point of view (though unfortunately for Uralic speakers, people have insisted on linking the origin of Uralic languages to the origin of this haplogroup even though there’s no good reason for doing so).

    There are more cases like the ones above. Some have not caused much confusion in the field (E1b-V13 in the Balkans, for example), while others, as in the case of N1c, have caused a great deal of confusion (R1a-L657 in India, for example).

    Tell me again this isn’t the result of randomness, founder effect, and genetic drift.

    It is the result of randomness, founder effects and genetic drift, and maybe selection too. You are right in that.

  128. I have updated the post with a few references in two places:

    – In the part about Central Asia and North India I have added some more quotes and links to make it more explicit that the archaeology on which the steppe hypothesis relies, which comes mostly from Elena E. Kuzmina, is clearly outdated and rejected by modern archaeology.

    – In the final part about the expansion of IE languages from the Balkans to the rest of Europe I have added some quotes and links from papers dealing with the Nordic Bronze Age. Thanks to Jaydeep for pointing me in that direction.

  129. I had not read this paper about the genetic prehistory of Denmark before, but it shows the same as any other place in Northern and Western Europe:

    Allentoft, M.E., Sikora, M., Fischer, A. et al. 100 ancient genomes show repeated population turnovers in Neolithic Denmark. Nature 625, 329–337 (2024). https://doi.org/10.1038/s41586-023-06862-3

    They find a near total replacement first between the Mesolithic and the Neolithic, with low if any input from local hunter-gatherers. And 1000 years later, with the transition from the Funnelbeaker Culture (TRB) to the Single Grave Culture (SGC), they again find a near complete replacement:

    “Insights from a few low-coverage genomes have indeed shown a link to the Steppe expansions, but by mapping out ancestry components in the 100 ancient genomes we now uncover the full impact of this event and demonstrate a second near-complete population turnover in Denmark within just 1,000 years. This genetic shift was evident from PCA and ADMIXTURE analyses, in which Danish individuals dating to the SGC and Late Neolithic and Bronze Age (LNBA) cluster with other European LNBA individuals and show large proportions of ancestry components associated with Yamnaya groups from the Steppe (Figs. 1 and 3 and Extended Data Fig. 1). We estimate around 60–85% of ancestry related to Steppe groups (Steppe_5000BP_4300BP), with the remainder contributed from individuals with farmer-related ancestry associated with Eastern European GAC (Poland_5000BP_4700BP; 10–23%) and to a lesser extent from local Neolithic Scandinavian farmers (Scandinavia_5600BP_4600BP; 3–18%)”

    The transition is again quite abrupt:

    “The age of the Gjerrild skeletons (from around 4,600 cal. bp) matches the earliest example of steppe-related ancestry in our current study, identified in a skeleton from a megalithic tomb at Næs (NEO792). We estimated around 85% of Steppe-related ancestry in this individual, the highest amount among all Danish LNBA individuals (Extended Data Fig. 6a). Notably, NEO792 is also contemporaneous with the two most recent individuals in our dataset showing Anatolian farmer-related ancestry without any steppe-related ancestry (NEO580, Klokkehøj and NEO943, Stenderup Hage) testifying to a short period of ancestry co-existence before the FBC disappeared—similar to the disappearance of the Mesolithic Ertebølle people of hunter-gatherer ancestry a thousand years earlier.”

  130. The debate about the origin of R1b has always been important for understanding the genetics of Western Europe. JDL is right in the sense that Harvard jumped to conclusions 10 years ago when it claimed that R1a and R1b brought Indo-European languages to mainland Europe, when there were already known cases of R1b in some European regions.

    Since then, pre-Yamnaya R1b has continued to appear in both WHGs and EHGs, but until proven otherwise the oldest R1b-L754 remains Villabruna and everyone knows that Italy is in western Europe. The conclusion is simple: all the R1b lineages that appear later, descend from the Mesolithic Western, Baltic or Balkan HGs.

    It is also important that people assimilate once and for all that the first two samples of M269 we have are neither in Russia nor in Ukraine but in Bulgaria and Romania (Carpathian Basin) in deposits of the Gumelnita culture.

    The non-existence of M269* in the steppes until the Maykop culture has to make us reflect on the origin of L23*, and common sense tells us that this lineage could have arisen in the Baltic or the Balkans and that the overwhelming presence of Z2103 in the steppes is due to a massive founder effect, while its sibling L51>L151 did not share culture with them.

    In my opinion, the conclusions are obvious:

    1-The paternal lineages of the Yamnaya culture have their origin overwhelmingly in the WHGs (and their variants in the Baltic and the Balkans).

    2-Ergo, if we accept the theory of patriarchy in the transmission of language (i.e. in prehistoric patriarchal societies it is the male lineages that ensure the transmission of language), the Yamnaya culture had to speak the language of the European hunter-gatherers.

    3-The presence of M269* in a typical culture of Old Europe complicates this interpretation because both this marker and its descendant L23 had to speak the language of the EEFs, unless the farmers had adopted the language of the HGs (quite unlikely in my opinion).

    In any case, these reflections do not affect Alberto’s theory regarding the language spoken by Yamnaya and the CWC since there is no genetic continuity through the male line between these cultures. In my opinion Yamnaya was Indo-Europeanized by Majkop and both the CWC and the BBC and their descendants spoke Non-Indo-European languages.

    It should be noted that R1b-L151 not only did not appear in Yamnaya, but also did not participate in the hypothetical Indo-Europeanization of any region of the world where this language family was spoken. That is to say, there is no L151 in the Balkans, Greece, Eastern Europe, the Baltic, Central Asia, Anatolia, Iran or India. With the data we have today, attributing to this lineage the Indo-Europeanization of Western Europe is a bad joke.

    Kurganist despair has degenerated into an absurd theory that is limited to stating that the transmission of Indo-European languages occurred whenever a marker that has been found in the steppes appears in any of the Indo-European regions (Z2103, V1636, I2a-L699, etc.), even if its presence is merely token. They even go further, saying that 2, 5 or 10% of hypothetical steppe ancestry is sufficient proof of the transmission of a language.

    What has happened to the linguistic principle that massive migration or colonization is necessary to produce a language change in a society?

    The first R1b-L51>L151 in Bohemia or Switzerland are buried in tombs typical of the Neolithic cultures of those regions (even in collective burials such as Swiss dolmens); that is, they are outliers, solitary explorers. What chance did these adventurers have of imposing their language on the Neolithic societies that welcomed them? In my opinion, none. The Indo-Europeanization of Western Europe by L151 is a Kurganist chimera, unless as descendants of the WHGs they spoke a kind of Proto-Indo-European of Mesolithic origin.

  131. Hello Alberto,

    Have you seen the pre-print released just this month at https://www.biorxiv.org/content/10.1101/2025.02.03.636298v1.full.pdf and entitled “Ancient DNA indicates 3,000 years of genetic continuity in the Northern Iranian Plateau, from the Copper Age to the Sassanid Empire” by Amjadi et al?

    It contains salient remarks such as (pp. 16-17):

    “Stability in the Historical periods of the Northern Iranian Plateau

    The CHG and EN Iranian ancestries became intermixed by the end of the prehistoric periods, consistent with observations across different periods13,14,52-56. These ancestries persist at levels of around 45-51% in the IA groups of the northwest Iranian Plateau (Hajji Firuz, Hasanlu, Dinkha Tepe sites) in our supervised ADMIXTURE analyses. Furthermore, they remain consistently predominant in the northern Iranian Plateau during the historical period, including the Achaemenid to Sassanid era burial sites of Marsin Chal, Liarsangbon and Vestemin. Most of these individuals can be modelled with CHG as a single source in both individual and group-based qpAdm models (including the combined Iran Historical group). However, the Liarsangbon individuals require an additional western component (16-26% ANF, Neolithic Armenia or Chalcolithic Israel). This finding is further supported by significant f4 results in the form of f4(Liarsangbon.SG, MersinChal.SG, ANF, Mbuti.DG) and aligns resembles the Liarsangbon site’s closer location to the western Iranian Plateau (Supplementary Table S11). Additionally, it also coincides with the cultural differences of Liarsangbon, evidenced by the discovery of non-local artefacts of Egyptian origin from the Roman times (Figures 3-4)57.

    Evaluating other possible source populations, we demonstrate through f4-statistics in the form of f4(CHG, Test, Samara_EBA_Yamnaya, Mbuti.DG) and qpAdm models, that the BA Steppe affinities is only apparent due to shared CHG-related ancestries, which were previously defined in the BA Steppe communities52 (represented in our dataset with Samara_EBA_Yamnaya, Supplementary tables S11,13). The AHG-type ancestry detected in ADMIXTURE persists into the historical period. Moreover, in some deep ancestry qpAdm models of the historical individuals, the AHG (Onge) component reaches detectable thresholds.

    After assessing the distal ancestry of the historical-period individuals (Marsin Chal, Liarsangbon, Vestemin), we modelled their proximal ancestries using a more focused approach. On the PCA, ADMIXTURE, and f4 analyses (Figure 3-4, Supplementary Figure S11.2-3, S11.6-8), these groups display ancestry patterns similar to those of the northeastern Iranian prehistoric samples, as well as to the BA Gonur and Sapalli Tepe from Turkmenistan and Uzbekistan. In accordance, the qpAdm models indicate a shared genetic ancestry of the historical period samples with the prehistoric northeast Iranians (Chalcolithic Tepe Hissar, BA Shah Tepe) or South-Central Asians (represented with Chalcolithic Geoksyur and BA Gonur group 1). Our analyses reveal that everyone out of the seven historical-period individuals yielding sufficient genome-wide data can be modelled either with one of these prehistoric groups as a single source or in a two-way model with an additional western Iranian source. In group-based analyses, shotgun genomes, such as those of the Marsin Chal and Liarsangbon groups (n=2 in each) can also be modelled with the same sources. However, the proportions of the northwestern IA components with elevated ANF/Levant-related ancestries are more prominent in Liarsangbon (Supplementary Table S14). IA Hajji Firuz is the only additional western source that provides plausible models for all historical groups. Using this source, Marsin Chal can be modelled with 78% BA Shah Tepe and additional ~22% IA Hajji Firuz, while for the Liarsangbon group these amounts are 37% BA Shah Tepe and 63% IA Hajji Firuz. The combined group, referred to as Iran_North_Historical, produces similar results and can be modelled with 52% BA Shah Tepe and 48% IA Hajji Firuz-type contributions (Supplementary Table S14). 
    We interpret these findings as evidence that the genetic profiles of historical-period groups of the northern Iranian Plateau reflect their position along the broader east-west genetic cline, rather than resulting from specific admixture events.”

    And reiterated in the concluding remarks (p. 19):

    “We demonstrated a strong Iranian Neolithic and CHG substrate in the historical-period samples from northern Iran, where these genetic components persisted in the pre-Medieval era. We confirmed the continuity from the Chalcolithic-Bronze Age into this period in northeastern Iran, despite this area hosting part of the Silk Road, which facilitated extensive human movement. Bronze Age Steppe ancestry remained relatively minor during the historical period in northern Iran. Instead, the historic period population of the northern Iranian Plateau exhibited strong genetic affinities with the Chalcolithic and Bronze Age communities of Turkmenistan, and northeastern-eastern Iran, forming homogeneous groups in our analyses as a part of the described east-west cline. As only one Iron Age genome is available from Turkmenistan, and there are none from the northeastern Iranian Plateau, further sampling is necessary to investigate the dynamics of this era, particularly to determine whether contacts between the two regions were sustained or disrupted after the Bronze Age.”

    There weren’t any R1 among the Y Hgs available for the few new samples. The Achaemenid and Parthian samples were under J, there was no Y data for the Sassanid samples (Table 1, pp. 6-7).

  132. @ak2014b

    Yes, I had the chance to take a look at that preprint about Iran. I think that the results they found are what could be expected from those populations, with clear continuity since prehistory.

    Maybe some people expected to see more steppe admixture, but given that Andronovo must have been fully Indo-Iranian by around 1400 BC, that would no longer be relevant from a linguistic point of view, just as the odd steppe sample in North India in the first millennium BC won’t be.

    As Gaska said just above, people have to stop looking for a sample that will support their view and instead look at the whole of the evidence, because these things have been very clear for very long already.
