West Iranian vs. East Iranian ancestry (with Vahaduo’s tool tutorial)

As you all know, two aDNA papers covering the area from Central Asia to North India have been released recently. I didn’t dedicate a post to them (though there are comments about them in the previous thread), mostly for two reasons. The first paper (The formation of human populations in South and Central Asia, Narasimhan et al. 2019) had already been extensively commented on when the preprint came out, and while it did bring more samples, these mostly add quantity to already sampled populations, with few new ones (and none relevant enough to deserve a new post). The second paper (An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers, Shinde et al. 2019) finally brought the first ancient sample from within modern India, but it was a single low-quality one that didn’t add much to the better quality “Indus periphery” samples already present in the former paper.

However, there’s still a bit of confusion regarding the ancestry of the people of the Indus Valley (and, more generally, the genetic structure of SC Asian populations), so here I’ll try to give some insights that might help to clarify the situation for further, better informed, analysis.

The basic premise here would be to split Iranian ancestry into West and East Iranian. The main difference would be the ratio of Basal Eurasian to ANE ancestry (higher in the west, lower in the east), but given the lack of Mesolithic samples we’re still unable to get the whole picture. However, some basic concepts can still help us to better understand the situation. So let’s start.

Vahaduo’s online modelling tool

I’ll also use this post to introduce a recently released online tool that deserves more attention, given its quality and usefulness. It’s been written by Vahaduo, with a similar purpose to my own Xmix, but it’s more complete, faster and doesn’t require any local installation. Below I’ll show how to use it, so that any reader can try their own models and test for themselves whatever they are interested in.

The first (and only) thing you’ll need is a set of datasheets that work with Vahaduo’s program. The best (and recommended) ones are the Global 25 scaled datasheets from Eurogenes: one has all the individual samples and the other has the averages of each population. Once you have these, you can proceed to the site and start testing. Here we’ll go directly to test what I mentioned above: East vs. West Iranian ancestry.
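For readers who want to inspect the datasheets outside the browser, here is a minimal Python sketch of reading one. It assumes the usual layout of these sheets (one comma-separated row per sample or population, a label followed by the coordinates, with an optional first row of PC labels). The function name `parse_g25` and the toy rows are made up for illustration; real sheets have 25 coordinates per row.

```python
def parse_g25(text):
    """Parse datasheet text into {label: [coordinates]}.

    Assumed layout: 'label,c1,c2,...' per line. A row whose
    coordinate fields are not numbers (the PC header) is skipped.
    """
    rows = {}
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 2:
            continue  # blank or malformed line
        try:
            coords = [float(f) for f in fields[1:]]
        except ValueError:
            continue  # header row with PC labels, not numbers
        rows[fields[0]] = coords
    return rows


# Toy rows with 3 made-up coordinates each.
sample_text = """,PC1,PC2,PC3
Pop_A,0.10,0.00,-0.02
Pop_B,0.00,0.10,0.03
"""
sheet = parse_g25(sample_text)
print(sorted(sheet))
```

Skipping the header here mirrors what you have to do by hand when pasting into the tool.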

For West Iranian ancestry, I’ll use the average of the Early Neolithic samples from the Ganj Dareh site in the Zagros mountains. And for East Iranian I’ll use the average of the easternmost samples we have so far: Sarazm_Eneolithic. So I’ll need to copy the coordinates of these into the “SOURCE” tab (one per line):

Now, two sources will probably not be enough to test the samples from SC Asia and Indus periphery, since there are more streams of ancestry in them (at least one related to ANF and another to AASI). So I’ll go ahead and add the average of the Barcin_N samples and the averages of the modern Onge and Naxi populations.

Then for the targets, I’ll use individuals instead. In this case I’ll start with the “Indus periphery” samples, which are labelled in the datasheets as IRN_Shahr_I_Sokhta_BA2 and TKM_Gonur2_BA, so again I copy and paste them, one per line, into the “TARGET” tab:

And now we’re ready to run the program and get the results. Since we’ve added multiple target samples, we should go to the “MULTI” tab and click on the “RUN” button, which will show us this:

As you see, Naxi doesn’t appear in the results, and that’s because all the samples got 0% ancestry from it. If we wanted to see all the sources in the output, we’d just have to click on the “PRINT ZEROES – NO” button (which would change to “PRINT ZEROES – YES”) and click “RUN” again. The “AGGREGATE – YES” button aggregates the percentages of multiple sources with the same label. For example, if instead of using the average of Ganj_Dareh_N we had used all its individuals as sources, we could choose either to see the results with each individual listed separately or to aggregate them into a single column with their sum.
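Conceptually, what “RUN” does is look for the source proportions (none negative, adding up to 100%) whose weighted average of coordinates lies closest to the target. The sketch below is only an illustration of that idea and is not Vahaduo’s actual algorithm: it uses plain projected gradient descent as a stand-in. The function names, the toy 3-coordinate sources and the assumption that aggregation groups labels by the text before the colon are all hypothetical.

```python
import math


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    theta, cumulative = 0.0, 0.0
    for j, u_j in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def fit_mixture(target, sources, iterations=5000):
    """Return ({label: weight}, distance) for the best mix of `sources`."""
    labels = list(sources)
    coords = [sources[name] for name in labels]
    weights = [1.0 / len(coords)] * len(coords)
    # Step size from an upper bound on the curvature of the squared distance.
    step = 1.0 / (2.0 * sum(x * x for row in coords for x in row))

    def residual(w):
        return [sum(wi * row[k] for wi, row in zip(w, coords)) - t
                for k, t in enumerate(target)]

    for _ in range(iterations):
        r = residual(weights)
        gradient = [2.0 * sum(rk * xk for rk, xk in zip(r, row)) for row in coords]
        weights = project_to_simplex(
            [w - step * g for w, g in zip(weights, gradient)])
    distance = math.sqrt(sum(rk * rk for rk in residual(weights)))
    return dict(zip(labels, weights)), distance


def aggregate(weights):
    """Sum the weights of sources sharing the label before ':' (assumed rule)."""
    totals = {}
    for name, w in weights.items():
        group = name.split(":")[0]
        totals[group] = totals.get(group, 0.0) + w
    return totals


# Made-up coordinates; the target is built as 40% + 30% West and 30% East.
sources = {
    "West:ind1": [0.10, 0.00, 0.00],
    "West:ind2": [0.08, 0.00, 0.06],
    "East": [0.00, 0.10, 0.00],
    "Other": [-0.10, -0.10, 0.00],
}
target = [0.064, 0.030, 0.018]

weights, distance = fit_mixture(target, sources)
totals = aggregate(weights)
# Like "PRINT ZEROES - NO": hide sources that round to 0%.
print({k: round(100 * v, 1) for k, v in totals.items() if round(100 * v, 1) > 0})
```

A source like “Other” that ends up at zero is simply dropped from the printed result, which is what happens to Naxi in the run above.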

Then we can download a .CSV file to import into a spreadsheet and make further calculations if needed (or for sharing purposes using Google Docs, for example). The “DISTANCE” tab is also useful to calculate the distance from a sample to all the sources (you could copy, for example, the whole datasheet, being careful not to copy the first row with the PCA labels) and get the top 25 closest samples/populations.
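The distance ranking is easy to reproduce by hand. Here is a minimal sketch, assuming the distance is the plain Euclidean distance between coordinate rows (the usual choice for these datasheets, though I’m not asserting that the tool does exactly this). The function name `closest` and the toy 2-coordinate rows are made up.

```python
import math


def closest(target, sheet, top=25):
    """Return the `top` (distance, label) pairs nearest to `target`."""
    ranked = sorted(
        (math.dist(target, coords), label) for label, coords in sheet.items()
    )
    return ranked[:top]


# Toy sheet with made-up coordinates.
sheet = {
    "Pop_A": [0.10, 0.00],
    "Pop_B": [0.00, 0.10],
    "Pop_C": [0.09, 0.01],
}
for dist, label in closest([0.08, 0.01], sheet):
    print(f"{dist:.4f}  {label}")
```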

It just takes some minutes to get familiar with the program and the options so go ahead and try it. It’s definitely a very useful tool.

Some insights into SC Asian and Indus Valley ancestry

So let’s start with what we see in the above model of the Indus periphery samples. Leaving aside (for now) the fact that they may have some recent admixture from the places where they were found, one striking thing is the very variable ratio of West to East Iranian ancestry. In the following spreadsheet the above results can be seen (Sheet 1) together with a second run with the 100AHG simulation provided by Matt in the previous thread (Sheet 2), and both include the calculated ratio of West to East Iranian ancestry. It’s easy to see that there is no correlation between that ratio and the amount of AASI in each sample, which makes it irrelevant for this matter whether they have any admixture from the local populations or not. Either way, we’re seeing a diverse population not just in terms of AASI to West Eurasian, but also in the more subtle, but still important, West to East Iranian ancestry.
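If you download the .CSV and want to repeat this check yourself, the calculation is just a ratio per sample and a correlation across samples. A minimal sketch follows; the function names are made up, and the percentages are arbitrary placeholders chosen only to make the code run. They are not the values from the spreadsheet and say nothing about the real samples.

```python
import math


def west_east_ratio(west, east):
    """West/East Iranian ratio; None when the East Iranian share is zero."""
    return west / east if east > 0 else None


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder rows: (West Iranian %, East Iranian %, AASI %). Not real data.
rows = [(30.0, 60.0, 10.0), (45.0, 45.0, 10.0),
        (20.0, 40.0, 30.0), (35.0, 35.0, 30.0)]

ratios = [west_east_ratio(w, e) for w, e, _ in rows]
aasi = [a for _, _, a in rows]
r = pearson(ratios, aasi)
print(ratios, round(r, 3))
```

A coefficient near zero is what “no correlation” means here; values near +1 or -1 would mean the ratio tracks the AASI level.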

This pattern of significantly different ratios of West and East Iranian ancestry is equally seen in the regular Shahr-I-Sokhta BA samples (Sheet 3) and in the Turan Eneolithic samples (Sheet 4). The Iranian-like ancestry in the Indus periphery samples is therefore very similar to the one in those places. But they’re not all homogeneous, and point to mixed populations with probable input from West Iran. Modelling the samples with more proximate sources and using the 100AHG simulation again, it looks like this:

And using the sample with the highest amount of AASI as a source instead of the 100AHG simulation, something like this:

So what does this mean? First, that things are a bit more complicated than taking the average of a population and building a tree that estimates its divergence time from another one under the assumption that there is no admixture between them just because they’re not the same. In simpler words, we can’t really know with certainty if there was some migration from the Zagros Neolithic to North India or if there was none. Both options are possible. What we can say, though, is that we’re talking about a significantly different case from the Neolithic transition in Europe, since there must not have been a large replacement by outside farmers in any case.

All of this opens some interesting questions regarding the genetic history of South Asia. Unfortunately, we don’t have the data to give any answer to those questions, but it’s still worth knowing them and the different possible answers. For example:

  • Who were the Mesolithic hunter-gatherers from North India?
  • Who were the first farmers?
  • Was there any subsequent migration before the Bronze Age?

Let’s break up the genetic structure of (putative) IVC samples into the 3 main streams of ancestry:

  1. AASI
  2. West Iranian
  3. East Iranian

This does not mean necessarily three different populations. Two or more of these ancestries could have been already mixed since very early. But let’s examine the possibilities:

First, a basic look at the geography of India tells us that there are no major barriers within it, compared to the big barriers with the outside. This makes less likely the possibility of two extremely different populations living in South and North India during the Mesolithic, with the one in North India being almost identical to the ones outside (Iran and Turan). It could be (only aDNA can tell us), but it looks like the least parsimonious option.

Together with the diversity in the ratios of those 3 streams of ancestries, it’s really unlikely that we could be talking about an isolated India-specific population. We have to think in terms of some degree of migration to India from outside before the Bronze Age.

The possibilities about who was where at each point in time are many, and I won’t argue for any of them. It’s speculative at this point. But as possible examples:

We could have an AASI-rich population, but with significant East Iranian ancestry too, during the Mesolithic. Then we could have a moderate migration from the Zagros Neolithic and no more migrations up to the IVC time where we have samples. This would be somewhat similar to the Neolithic transition in Turan, where presumably a mostly East Iranian population was there in the Mesolithic and received some migration from West Iran during the Neolithic transition. The difference (apart from the lack of AASI ancestry in Turan) is that communication between West Iran and Turan is easier, and gene flow continued (both ways) throughout the Chalcolithic and Bronze Age.

The problem with this scenario is how to explain the presumed differences in levels of AASI in the IVC and their lack of correlation with the East Iranian ancestry that would have been associated with it.

Scenarios where we separate the three streams of ancestry could better explain the situation, though given that Turan Chalcolithic already had a diversity in East and West Iranian ancestry ratios, that could serve as a single migration too (note that neither West Iran Chalcolithic nor Turan Bronze Age would fit well as admixing sources due to their excess of ANF-related ancestry). I’ll leave to the comments any further variations within these constraints.

The Steppe ancestry in Turan and North India

This subject has already been discussed everywhere in great detail, for a very long time. So I didn’t plan to look at it again. I don’t have much more to say, but I’ll go through it fast.

The post-BMAC samples that we have hardly show any steppe admixture. In the same spreadsheet linked above (Sheet 5), I’ve added the samples with an average date of <3700 calBP, in descending order (note that the Present is defined as 1950 CE, so you’d need to add 69 years to get the real BP as of today). There’s one Parkhai_LBA_outlier (1497-1413 calBCE) that shows 9.2% Sintashta_MLBA admixture. The rest, up to the last BA samples (3250 BP), are at noise levels. It’s only the single Iron Age sample from Turkmenistan (912-799 calBCE) that has a big increase, to 50%.
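Since the spreadsheet uses calBP while the text quotes calBCE ranges, a quick conversion helps to compare them. This is a small sketch of that arithmetic; the function names are made up, the BCE conversion ignores the missing year zero (a one-year approximation), and 2019 is simply the year implied by the “69 years” above.

```python
PRESENT = 1950  # "Present" in BP dates is fixed at 1950 CE by convention


def calbp_to_bce(cal_bp):
    """Approximate BCE year for a calBP date (ignores the missing year 0)."""
    return cal_bp - PRESENT


def years_before(cal_bp, current_year):
    """Years elapsed between a calBP date and `current_year` CE."""
    return cal_bp + (current_year - PRESENT)


print(calbp_to_bce(3700))        # 1750
print(calbp_to_bce(3250))        # 1300
print(years_before(3700, 2019))  # 3769
```

So the <3700 calBP cut-off corresponds to samples dated after about 1750 BCE, and the last BA samples at 3250 BP fall around 1300 BCE.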

In the Swat Valley, we have the earliest samples from the period 1200-800 BCE. They have significantly more steppe admixture, ranging between 20% and 0% and an average of around 10%. The variability of the amount of steppe ancestry doesn’t seem very compatible with their estimate of admixture happening 26 generations before in that same place, in that same population. But the shortcomings of their observations that provide evidence of the arrival of steppe ancestry to South Asia in the first half of the second mill. should have been already evident without looking at individual variability with up to 0% levels.

Another of the inferences supporting such evidence was their observation that after the MLBA the steppe got Siberian/East Asian admixture, which is not found in modern India. However, they could model modern populations using the Kangju samples from Kazakhstan (II-V CE). Modern samples are never a good way to make inferences about prehistory (including modern frequencies of certain uniparental markers). It seems rather arbitrary why populations would choose Sintashta or Kangju (though maybe Kshatriya ones make sense),

or indeed why they would choose Sintashta_MLBA or the Kashkarchi_BA samples from 1200-1000 BCE, which are almost identical,

or even adding the Turkmenistan_IA sample mentioned above too as a source (as suggested in the comments from the previous thread), which further splits the steppe ancestry in a relatively random way.

Overall not much to add about all of this steppe part. We’ll have to wait to see those samples from the first half of the second mill. BC around the Punjab before we can know with certainty how all this went.




164 thoughts on “West Iranian vs. East Iranian ancestry (with Vahaduo’s tool tutorial)”

  1. Thanks Alberto. Vahaduo’s modelling tool is cool. I myself won’t try and tackle the complexities of the present-day Indian population structure. Instead, using two of the simulations I generated and ancient pops, I’ll look at the InPe and Swat/PAK_IA:

    #1: Modelling Indus_Periphery, then Swat, using my 0AHG and 100AHG sims and a set of later steppe and Central Asia populations: https://imgur.com/a/8YgJvPg

    (The set of pops in this scenario was: KAZ_Kumsay_EBA, RUS_Sintashta_MLBA, TJK_Dashti_Kozy_BA, RUS_Catacomb, KGZ_Aigyrzhal_BA, KAZ_Kangju, TKM_IA, IRN_Hajji_Firuz_BA, IRN_Hajji_Firuz_IA, IRN_Hasanlu_IA, IRN_Tepe_Hissar_C, TKM_Gonur1_BA_o, TKM_Gonur1_BA, IRN_Shahr_I_Sokhta_BA1, IRN_Ganj_Dareh_N, 0AHG, 100AHG, Gonur2BA, Shahr_ISokhta2BA. I didn’t realise about the “Show zeros” button so some won’t show up in runs, but they are being tested.)

    Indus_Periphery individuals fit fairly well as a composite of the 0AHG and 100AHG, though in fairness I did not use much of a range of Central Asian Neolithic pops, and some fit better than others, and Alberto’s analysis is quite probably the superior one on this.

    The Swat_IA set largely seem to show a preference for Kumsay_EBA+Kangju on the steppe side, and other than that not much unity. Kumsay are largely a combination of WSHG with a dose of Steppe Piedmont ancestry about 55:45, while Kangju are a later complex composite of largely the typical Steppe_MLBA (European Corded Ware like) ancestry with Turan and some low level of East Asian ancestry.

    Only the outliers Loebanr_IA_o:I12138 (who also takes a chunk of Aigyrzhal) and a sample I label PAK_Saidu_Sharif_H_o2:I6893, who seems to me to be an outlier, seem to need direct Sintashta / Sintashta-like Dashti-Kozy ancestry, while others largely prefer Hasanlu_IA and Hajji_Firuz_IA to pick up anything which Kumsay+Kangju doesn’t cover. On the “southern” side they don’t clearly prefer the 0AHG sim over various real pops.

    #2: Putting 0AHG in competition with the real Indus_Periphery set: https://imgur.com/a/uYJO4k1

    It seems that the 0AHG sim is universally dispreferred to the real Indus_Periphery pops (does not contribute). However, some populations do fit somewhat better when combining my AHG sim with real Eastern Iran+Turan populations than when using the real Gonur2 / Shahr_I_Sokhta2.

    The dominant proportions of the steppe related ancestry from Kumsay+Kangju seem pretty robust to including the real Indus_Periphery.

    #3: Removing both my sims: https://imgur.com/a/wi86pPT
    That tends to swell Shahr_I_Sokhta2 as the biggest potential contributor of AHG/AASI. It also results in a slight rebalancing of ancestry from Kumsay->Kangju, but largely doesn’t change too much.

    (#4: Though I hadn’t planned to do this when starting this post, some modern “cline terminal” populations, by no means an exhaustive analysis – https://imgur.com/a/2abV0pv. Most populations tend to swerve for IRN_IA / KAZ_Kangju_IA / TKM_IA contributors over direct Sintashta ancestry, except for Ror, who get a relatively good fit with it, and Balochi and Brahui, who seem to like a composite of relatively direct Ganj_Dareh with some Sintashta related, as well as a dose of Iran_IA, and are relatively resistant to the 0AHG sim that represents an idealised Indus_Periphery_West with 0AHG.)

  2. “In the Swat Valley, we have the earliest samples from the period 1200-800 BCE. They have significantly more steppe admixture, ranging between 20% and 0% and an average of around 10%. The variability of the amount of steppe ancestry doesn’t seem very compatible with their estimate of admixture happening 26 generations before in that same place, in that same population. But the shortcomings of their observations that provide evidence of the arrival of steppe ancestry to South Asia in the first half of the second mill. should have been already evident without looking at individual variability with up to 0% levels.”

    Totally agree. Their 2000-1500bce dating for steppe entry into NW south asia, along with the idea that the steppe folk chose to skip BMAC on their route seems very random.
    If you plot the location of R1a and their radiocarbon dates on a map for Iran, Uzbekistan, Turkmenistan, Tajikistan, Kazakhstan, Pakistan – you will see a clear path from North to south starting around 2000bce in North Kazakhstan to 1400bce kokcha uzbekistan, to 1200bce kashkarchi, uzbekistan to couple of samples in 1000-800bce swat valley.
    There is one kokcha, south uzbekistan R1a sample which is poorly dated 2500-1500bce, but maybe thats closer to 1500bce as the other r1a sample from same place is dated 1400bce.

    They use ALDER admixture dating to push forward their point of a 2000-1500bce entry, but it seems to me that the R1a dominant male population entry into south asia could be rather late.. otherwise it looks like a female mediated entry in the samples found so far.. we do find 3 steppe women in dashty_kozy dated to 1500bce on their way south.

  3. @ Alberto

    With regard to Neolithic South Asia, it is hard to have solid opinions, even if at present I find the suggestion that Farming has a completely native origin in north India somewhat surprising. Did you note that Hotu Belt Cave sample has now been dated to 10,000 BC ?
    And in your analogy- ”What we can say, though, is that we’re talking about a significantly different case to the Neolithic transition in Europe, since there must not have been a large replacement by outside farmers in any case.” I would point out that this is not in fact the case, despite academic papers making such a claim. This is because such views are based on looking at LBK, who are the immigrants (from the Aegean) which moved into their own niche which was initially sparsely populated by HGs; followed by 2-phase miscegenation

    As to the ‘steppe ancestry progression’ in South Asia, I agree that some of the claims are unconvincing. Instead, the impact should be understood in terms of at least 2 mechanisms:

    1) initial contact between steppe pastoralists & BMAC groups c. 2000 BC
    – this was limited & ritualised
    – mostly female mediated (as we see no R1a in BMAC chiefs)
    – related to a diverse range of steppe groups incl. Dali-EBA-like, to Yamnaya like to Andronovo like. These would have been all quite distinctive linguistically & culturally, so they cannot be conflated under one rubric.

    2) Then we see the actual ‘spillover’ beginning after 1500 BC, with an actual range expansion of steppe pastoralists predominantly of the Sintashta-derivation, moving beyond their initial territorial domains.

  4. Yes, the data about the steppe progression that we have so far is what it is, and not what it’s said to be. We’ll see with further data.

    The Neolithic is not too well researched so far in South Asia, at least as far as I know (maybe there are more works not published in English yet, or that I just don’t know about), so it’s hard to say much. But it’s still a fundamental part of the history and so I wanted to start with some preliminary observations about it. The transition to the Neolithic and the transition to the Chalcolithic (which seems to take off rapidly and we soon find a large civilization emerging) are two very interesting subjects.


    Yes, there are some mountains all over central India, but do you think they are major barriers for more or less continuous contacts? The word may mean “obstructor”, but this is what the link to Wikipedia says:

    The Vindhya Range (also known as Vindhyachal) (pronounced [ʋɪnd̪ʱjə]) is a complex, discontinuous chain of mountain ridges, hill ranges, highlands and plateau escarpments in west-central India.

    What is your take in this matter? Do you think that AASI-rich populations of hunter-gatherers were confined to the south until the Chalcolithic/EBA when they started to move north and overtake the fully West Eurasian farmers from the north? That seems a strange proposal, no?

  5. “What is your take in this matter? Do you think that AASI-rich populations of hunter-gatherers were confined to the south until the Chalcolithic/EBA when they started to move north and overtake the fully West Eurasian farmers from the north? That seems a strange proposal, no?”
    Although I wasnt asked, Id like to put forth a few points. Thanks.
    Before the rakhigarhi sample was published, the widespread consensus was that AASI dominated the north of India. However, the rakhigarhi and Indus periphery samples are Indian_IranN dominated, rather than AHG/AASI. This should be the first setback to the consensus and make them reassess.

    Also, there is hardly any ancient population in the north that needs to be proximally modeled with AHG as source. some indus_periphery sample + steppe_mlba is good enough for most swat iron age pops. so there’s no need for extra AHG between 2500bc rakhigarhi and 1000bc to model the swat valley populations. correct me if Im wrong here. this is what Im basing my claim on –

    “We next characterized the post-2000-BCE Steppe Cline, represented in our analysis by 117 individuals dating to between 1400 BCE and 1700 CE from the Swat and Chitral districts of northernmost South Asia (Figs. 2 and 4). We found that we could jointly model all individuals on the Steppe Cline as a mixture of two sources, albeit different from the two sources in the earlier cline. One end is consistent with a point along the Indus Periphery Cline. The other end is consistent with a mixture of ~41% Central_Steppe_MLBA ancestry and ~59% from a subgroup of the Indus Periphery Cline with relatively high Iranian farmer–related ancestry”

    On the other hand, the richest AASI group is still 40% Indian_IranN. This leads to the speculation that Iran like ancestry wasnt restricted only to the north either.

    All in all it conveys to me that pure AASI was scant in the north in 5000bce, probably existed from Madhya pradesh (below vindhyas) till south during that time.

  6. ancient indian literature is pretty clear about the divide and there is no evidence of linguistic replacement north or south even with all the genetic mixing

    south indians likely had oversea links from sumer to indonesia possibly australia very early so nobody confined or overtaken either

  7. Interestingly, the islanders of the Persian gulf sampled in the new Iranian study plotted close to present day North Indians. There seems to have been a complex population dynamic along the gulf starting in the Neolithic – the region seemingly connected all the early centers of civilization in Eurasia.

  8. @A

    Yes, I think the consensus you mention about North India being what was then called ASI (now AASI or AHG-related) until after 2000 BC was totally unrealistic, and already proved wrong with the first samples that came in with the Narasimhan preprint.

    But let’s leave that behind and think in realistic terms. During the Harappan period it seems that there was still a significant variation in the levels of West Eurasian and AASI. This means that there were two different populations still admixing with each other. At some point back in time, these two populations would have been exactly that, two different populations. Who lived where? Who were the first farmers and who the hunter-gatherers? Did the Neolithic bring migration from outside India? And the Chalcolithic? Those are the questions that are interesting and that will be answered eventually by aDNA. I’m not advocating anything in particular, just asking people interested in it to think about what seems to be the best explanation to get us to the IVC time with what we know.

    @raj, yes, literature may be clear, but I’m talking about preliterate societies. From Mesolithic to the Harappan period. And I’m not talking about language replacements or languages at all. Just about populations and their roles in Indian prehistory.

  9. Has anyone expanded on the relationship between Onge and Iran_Neo? IIRC, the authors of the South Asian paper modelled Onge as deriving 30% of its ancestry from an Iran_Neo population, but my guess would be that this is wrong. Could it be the other way around? Can Iran_Neo be modelled as Dzudzuana + Onge or something the like?

  10. About probable migration to India from outside before the Bronze Age, in Narasimhan et al. 2019, we read : ”Our finding, based on the sizes of blocks of ancestry (13) (fig. S59), that the mixture that formed the Indus Periphery Cline occurred by ~5400 to 3700 BCE—at least a millennium before the formation of the mature IVC—raises two possibilities. One is that Iranian farmer–related ancestry in this group was characteristic of the Indus Valley hunter-gatherers in the same way as it was characteristic of northern Caucasus and Iranian plateau hunter-gatherers. The presence of such ancestry in hunter-gatherers from Belt and Hotu Caves in northeastern Iran increases the plausibility that this ancestry could have existed in hunter-gatherers farther east. An alternative is that this ancestry reflects movement into South Asia from the Iranian plateau of people accompanying the eastward spread of wheat and barley agriculture and goat and sheep herding as early as the seventh millennium BCE and forming early farmer settlements, such as those at Mehrgarh in the hills flanking the Indus Valley (59, 60). However, this is in tension with the observation that the Indus Periphery Cline people had little if any Anatolian farmer–related ancestry, which is strongly correlated with the eastward spread of crop-based agriculture in our dataset.
    Thus, although our analysis supports the idea that eastward spread of Anatolian farmer–related ancestry was associated with the spread of farming to the Iranian plateau and Turan, our results do not support large-scale eastward movements of ancestry from western Asia into South Asia after ~6000 BCE (the time after which all ancient individuals from Iran in our data have substantial Anatolian farmer–related ancestry, in contrast to South Asians who have very little)…”

    From the Neolithic Mehrgarh there is continuity for many aspects of course, but there is a change with the Chalcolithic Mehrgarh , a change involving also burials and anthropological features of the skeletons. After that, as remarked by Kennedy,there is not a significant anthropological change until Iron age Sarai Khola after 800 BCE, of course too late for the IAMT and not involving the whole of the subcontinent.

    Here some remarks from the book of Possehl ”The Indus Civilization: A Contemporary Perspective”:-
    And also from the book of Bryant and Patton ”The Indo-Aryan Controversy: Evidence and Inference in Indian History”: –

  11. @ alberto “At some point back in time, these two populations would have been exactly that, two different populations. Who lived where? Who were the first farmers and who the hunter-gatherers? Did the Neolithic bring migration from outside India? And the Chalcolithic?”

    Firstly, many thanks for this tool. Its easy to use for a newbie like me. What distance % is acceptable?

    Maybe i didnt put my view across properly or maybe the facts i presented were wrong, in which case kindly correct me.

    1. It would be fair to say, and most will agree, that the indus periphery samples (dated 3000-2000bce) were from between the area of western afghanistan and rakhigarhi.
    2. As per Narasimhan’s modeling, the swat valley IA (1000bce) samples (which fall in the above geographic area) all fall on the indus periphery and steppe_mlba cline, with need for no extra AHG ancestry. Pic of the cline modeled by Narasimhan that swat valley samples fall on (green bubbles) https://ibb.co/1dSmpBb. Correct me if im wrong in this analysis.
    3. So it would be fair to say that at least in the swat valley and surrounding regions, there was no hidden AASI rich population (ie. AASI>50pc) which admixed between 2500 & 1000bce. Thats a huge time gap. Again, correct me if Im wrong in this conclusion.
    4. So, if theres no AASI rich population in the swat region post 2500bce, its most likely that there was no AASI rich population there prior to 2500Bce as well. Unless they were present but culled or driven out to south east.

    This leads us to the conclusion that the swat area was never AASI rich prior to 2500bce, but was IranN rich. Reich sort of agrees when he states “We say ‘Iranian-related’ because we don’t know where they lived,” Reich says. “They could have lived in the Iranian plateau, but the team’s data point to them having lived in South Asia for many thousands of years before the Indus Valley Civilisation, he adds.”

    More aDNA will be great, of course and will make the picture clearer.

  12. @Nirjhar

    Thanks, that’s what I understood from the scarce available data too. Mehrgarh seems to be the only site with anthropological data from the Neolithic to the Chalcolithic, and some sort of discontinuity is found between both.

    It seems correct too that West Iran was getting Anatolian admixture since around 6000 BC (though Seh_Gabi_LN from c. 5700 BC still have very small amount, I think), so a Chalcolithic migration from that area doesn’t fit well with the Indus periphery samples. However, Chalcolithic Turan is still a fitting source for Indus periphery samples.

  13. @A

    I don’t think I need to correct you about what you’re saying. And I agree that the Swat Valley is very unlikely to have been a place with a AASI-rich population before 2000 BC. From all the possible places in historical India that one seems the least likely of all.

    But my question is different. I’m wondering about how the Indus Cline (if it gets confirmed with further sampling from the area in question) came to be. I thought this was an interesting question regarding the prehistory of South Asia, but maybe from a South Asian point of view it’s actually an uncomfortable one. On the other hand, I don’t think that if someone was uncomfortable with finding out about their prehistory they would be here in the first place. So I don’t know what to think.

  14. if people are curious about indian genetic history why dont you put your theories and questions to the authors of the recent papers at harvard or to niraj rai directly

    they seem eager to talk these days

  15. @alberto
    if your question is why there is variability in the indus periphery samples wrt iranN & AHG, my answer is that i dont know. Maybe it has to do with how western/eastern the location of admixture is. then again, it could be social standing. Also, i dont understand why the question would make someone uncomfortable.

    Modern indian pops prefer SiS_BA2(InPe with higher AHG) over lowest AHG InPe, except for Kalash. I also found that Kangju seems to be the best steppe source for Kashmiri Pandits, for other north indians not so much.

    -Central_Steppe_Emba at Kumsay & Mereke chooses Indus_periphery_0AHG as provided by Matt over Ganj_Dareh_N in all 6 samples, if this is even a valid test. Results

  16. @a, when I use a lot more competing populations KAZ_Kumsay_EBA (which is like Mereke_EBA), tends to select Piedmont_Eneo (samples from Progress and Vonyuchka sites in the Caucasus which can be used to model most of the ancestry of Yamnaya), with some Sarazm_EN, while KAZ_Dali_EBA tends to prefer Sarazm_Eneo, compared to the 0AHG zombie, but with some Piedmont_Eneo. The separate samples KAZ_EMBA prefer just being NE Asian+WSHG.

    I would guess each of these populations is a mix of Botai like WSHG like pops with different balances of the influences entering Kazakhstan at the time?

    (Graphic: https://imgur.com/a/HELwJiV – just done after the end of my other modelling upthread to see how Vahaduo behaves generally and if it replicates other fits well. It looks pretty good, but I’m not sure how it handles very distal modelling.).

    I guess it’s not implausible that if the hypothetical 0AHG population did exist, then it may have some ancestral relationship with Sarazm_EN, though.

  17. @A

    Ok, sorry, I probably got the impression from lumping together the answers from yourself and @raj, neither of which addressed the question but it seems for different reasons.

    The Kumsay_EBA samples are interesting, as they are contemporary with the earliest Yamnaya and Afanasievo. The only male is Q1a, and as Matt said above they’re mostly a mix of West Siberia/Kazakhstan hunter-gatherers and some Progress-like population. They’re too late for any sort of Iran_N admixture, so you’d need to include something like Geoksiur_En to get better models: https://ibb.co/r3tYH1Y

    Also to complement the models from the post with steppe admixture in post BMAC I ran a non-exhaustive list of samples from the central steppe LBA (Sheet 1) and IA (Sheet 2):


    Also rather patchy admixture from the south, though more significant in the LBA period especially (where steppe admixture in Turan is hardly showing up). Low levels of Indus periphery too all around.

  18. Hi Alberto

    Iran Neo like ancestry was almost certainly present in India in Mesolithic and even before.
    But I am pretty sure that there was another migration from West Asia to South Asia before IVC formed. Some haplotypes there are too young to be of Mesolithic origin.

  19. the rakhigarhi paper was pretty clear about the separation timeline and indicating it was out of india flow at various times

    people should just contact the authors if the paper was not clear enough

    the genetics data matches both archaeology and some of the linguistic out of india theories so all evidences line up

    the kowtow to steppe orthodoxy was likely unavoidable for publication in a western journal

    but just the mention of the phrase out of india broke the cone of silence and western walled garden of ignorance

    various attempts to resurrect anatolian and out of iran theories however contrived are to be expected since oit basically implies romans greeks persians etc were all punjabis once haha

  20. @Aram

    Yes, that’s what I think is most likely. Rather than an all or nothing, there’s probably a mix of both local West Eurasian and other coming from migration. We’ll have to see with ancient DNA the details.

  21. @matt @alberto i was only trying to distally model central steppe emba as vagheesh did with inputs as AnatoliaN, PPN, EEHG, WEHG, WSHG, ganj_darehN & 0AHG. only that i found it pulling towards 0AHG and not Ganj_dareh.

    I agree that for proximal modeling the best sources would be different and you guys have analyzed that above.

    My question to Matt & Alberto is this – is the tool good enough to differentiate between Ganj_dareh_N & 0AHG, or are the 2 so close and the input data so crude that the results are useless?
    For eg Parkhai_En is modeled by the vahaduo tool distally as 50% 0AHG, 42% Ganj_Dareh_N, rest Anatolia_N with a distance of 3.4%. Removing either of 0AHG or Ganj_Dareh worsens distance to 4.5+%. Can we conclude that Parkhai EN contains both of these cousin ancestries?
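
    To make the kind of fit described above concrete: the target is approximated by a weighted average of the source coordinates, with non-negative weights that sum to 100%, and the reported distance is how far that average still is from the target. The following is only a minimal sketch of that idea. It is not Vahaduo’s actual algorithm, the function names are hypothetical, and the coordinates are made-up 3-D placeholders rather than real Global 25 rows.

```python
import math

def mix(weights, sources):
    """Weighted average of source coordinate rows."""
    dims = len(sources[0])
    return [sum(w * s[k] for w, s in zip(weights, sources)) for k in range(dims)]

def distance(a, b):
    """Euclidean distance between two coordinate rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fit(target, sources, min_step=1e-4):
    """Find non-negative weights summing to 1 that bring the mix closest to target.

    Greedy pairwise transfer (an assumption for illustration): move a slice of
    weight from one source to another whenever that lowers the distance, and
    halve the slice when no transfer helps.
    """
    n = len(sources)
    weights = [1.0 / n] * n
    best = distance(mix(weights, sources), target)
    step = 0.25
    while step >= min_step:
        improved = False
        for i in range(n):          # donor source
            for j in range(n):      # receiving source
                if i == j or weights[i] <= 0.0:
                    continue
                amount = min(step, weights[i])
                trial = list(weights)
                trial[i] -= amount
                trial[j] += amount
                d = distance(mix(trial, sources), target)
                if d < best - 1e-15:
                    weights, best, improved = trial, d, True
        if not improved:
            step /= 2
    return weights, best

# Hypothetical placeholder coordinates, NOT real Global 25 data.
src_a = [1.0, 0.0, 0.0]
src_b = [0.0, 1.0, 0.0]
src_c = [0.0, 0.0, 1.0]
target = mix([0.50, 0.42, 0.08], [src_a, src_b, src_c])

weights3, dist3 = fit(target, [src_a, src_b, src_c])   # all three sources
weights2, dist2 = fit(target, [src_a, src_b])          # one source removed
print([round(100 * w, 1) for w in weights3], round(dist3, 4))
print([round(100 * w, 1) for w in weights2], round(dist2, 4))
```

    In this toy case, dropping a source that really contributes makes the distance worse, which is the same kind of comparison as removing 0AHG or Ganj_Dareh_N from the Parkhai_En model. With real data, sources that are close to each other can trade weight at little cost, so a worse distance on removal is suggestive rather than conclusive.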

    Another issue could be that Ganj_Dareh is older whereas 0AHG is simulated from a 3000bce sample. How does it affect the comparison between ganj_dareh_N & 0AHG especially when targets are closer to 3000bce and 8000bce respectively?

    Matt said “if the hypothetical 0AHG population did exist”
    it was my understanding that such a population did exist as per Shinde2019. The only question was where and at what time. Did I read it wrong?

  22. @A

    Yes, the tool is good enough to differentiate Ganj_Dareh_N and the 0AHG simulation, since they are not too similar to each other. Ganj_Dareh_N is what I called in the post “West Iranian”, while the 0AHG simulation is closer to “East Iranian” (close to Shahr-i-Sokhta_BA1 samples and to the Eneolithic samples from Turan). Adding it as the target and using the whole Global 25 datasheet with individuals as sources, here’s the top 25 closest ones to it.


    All the samples between West Iran and Eastern Central Asia are a mix of Iran_N and something like Sarazm_En. There was probably a cline since the Mesolithic, but movements continued all the time, with later arrival of Anatolian ancestry too.

    So both Parkhai_En and the 0AHG simulation can be modelled as a mix of those ancestries. 0AHG can only represent one individual, not so much a population from that time, since we’ve seen that there is enough variability between samples. So I guess that populations with individuals very similar to 0AHG indeed existed over a very broad area of Iran, SC Asia and maybe North India.

    Proximate sources are usually favoured in modelling, but sometimes older ones are picked up too if they are needed for a better model, so there is no rule about the time of the samples.

  23. talageri danino kazanas and some others all have their own models

    basically some version of people left west and north west

    i have posted papers possibly dating some of this to circa 4500 BC to 2200 BC anatolia caucasus caspian etc

    but it really is european history not ours so it is upto europeans to figure out the details not indians

    as long that doesnt involve denying indian history denigrating hindus taking credit for sanskrit or playing divisive indian politics it is not any of indian business and we can go our own ways

  24. “So both Parkhai_En and the 0AHG simulation can be modelled as a mix of those ancestries. 0AHG can only represent one individual, not so much a population from that time, since we’ve seen that there is enough variability between samples. So I guess that populations with individuals very similar to 0AHG indeed existed over a very broad area of Iran, SC Asia and maybe North India.”

    Yes, this west-iranian to east-iranian cline is quite clear, from Tepe-Hissar to Parkhai to Namazga to Sarazm.

  25. https://ibb.co/jrQtTgk

    Modern North Indian pops do like Kangju more than Sintashta for steppe. Kokcha & kashkarchi are rejected.

    Only Kashmiri Pandits settle for Kangju/Kushana as steppe source. This makes sense as Kushanas ruled Kashmir for over 2 centuries.

    So it seems likely that steppe entered india in at least 2 waves. The 2nd wave was in historical period.

  26. @A: it was my understanding that such a population did exist as per Shinde2019. The only question was where and at what time. Did I read it wrong?

    As I understand it, the thing with hypothetical populations is this. Take a scenario where you have populations A, B, C. They mix to create AC and BC, which then drift and then mix with each other to form ABC, which drifts more. But if you tried to extract C from ABC, you’d end up with a population AB that actually never existed historically!

    It’s that kind of thing – you may be able to extract an AB (0 AHG) by removing C (100 AHG), but such a population may never have existed historically without C.
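
    The arithmetic behind “extracting” a component can be sketched as follows. If a mixed population sits at mixed = (1 - share) * rest + share * component in coordinate space, then rest can be solved for directly. The thread does not spell out how the 0AHG simulation was actually built, so this is only the generic subtraction, with made-up 2-D placeholder coordinates, a hypothetical function name, and drift ignored.

```python
def remove_component(mixed, component, share):
    """Solve mixed = (1 - share) * rest + share * component for `rest`."""
    if not 0.0 <= share < 1.0:
        raise ValueError("share must be in [0, 1)")
    return [(m - share * c) / (1.0 - share) for m, c in zip(mixed, component)]

def blend(p, q, w):
    """Mix two populations: (1 - w) of p plus w of q."""
    return [(1.0 - w) * a + w * b for a, b in zip(p, q)]

# Hypothetical placeholder coordinates for three real populations A, B, C.
A = [1.0, 0.0]
B = [0.0, 1.0]
C = [-1.0, -1.0]

AC = blend(A, C, 0.5)       # A mixes with C
BC = blend(B, C, 0.5)       # B mixes with C
ABC = blend(AC, BC, 0.5)    # the two mixed groups then mix with each other

# ABC is 50% C overall; removing that share leaves an "AB" point.
AB = remove_component(ABC, C, 0.5)
print(AB)
```

    The resulting AB point lies halfway between A and B, even though in this scenario A and B never mixed with each other without C being present. The subtraction is valid as arithmetic, but it says nothing about whether the result ever existed as a population.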

    I think Alberto has answered on the point about Iran_N vs 0AHG.

  27. “It’s that kind of thing – you may be able to extract an AB (0 AHG) by removing C (100 AHG), but such a population may never have existed historically without C.”

    Thanks for the explanation. This is a problem only if there are 3 populations? What if there are only 2 separate populations A & C (0AHG & 100AHG, a 3rd population is not being expected). The above conundrum does not apply in this case? If C is removed, you should be left with A which has to exist in reality?

  28. @A

    Obviously if you have a population which is a mix of two other ones, then those two other ones must have existed. So under this assumption, as you said before, the question is where and when. Which ends up being the same question I was asking from the beginning.

  29. what difference does it make to europeans turan or punjab

    zagros caucasus are borderlands turan was crossroad of civilizations not a center of anything let alone high culture robust social ordering or technological innovations

    why seek genesis amongst the barbarian hordes its really a puzzle

  30. ”why seek genesis amongst the barbarian hordes its really a puzzle”
    We don’t exactly get the impression of ”barbarian hordes” when we study Vedas, Avesta, Iliad etc do we? 🙂 , those emerged when life was more or less settled & there was a civilization .

  31. @matt
    Is kangju being selected for modern pops due to the high steppe component (>60%)?
    Narasimhan theorized that ANI was such a population (with 53% steppe_mlba on the steppe Cline). So he thinks there’s such a ghost pop prior to swat valley IA which gave rise to modern indians.

    Also I’m surprised as to how Kangju could not be rejected by Narasimhan, but Kushan was. Both Kangju and Kushan seem to have a similar Han component.

  32. @raj, i don’t think anyone is denying presence of civilization/settled life in india/south asia prior to the supposed steppe migration.

  33. the r1a paper supposedly claims himalayan roots and possibly out of india migration to the steppes

    so this could be archaeologically distinctive evidence in support

  34. “the r1a paper supposedly claims himalayan roots and possibly out of india migration to the steppes” – @Raj, I saw the slides from the upcoming paper in Dr Chaubey’s presentation. When do you think this migration occurred? We already have some AASI/AHG related ancestry in samples dating ~2500 BCE in North india and eastern iran.

  35. sometime before corded ware appeared on the steppes and europe with r1a and the autosome

    and the doggie haha

  36. @ Raj

    How does an IE homeland in EE denigrate Indians ? Sanskrit was obviously invented somewhere down that way
    If you propose an OIT homeland, then we need to account for all Phyla; not just Indo-Aryan. I’m open to scenarios .

  37. i meant denigrating hindus with all the labelling and spurious dating of our history

    there is nothing obvious about the invention of sanskrit and lot of debate about the various dates

    talageri and others have some models maybe right or wrong and anyway not relevant to the indian big picture since it is for europeans

    btw lets see the hindu number system referred as such in the west

    or do you call iphones as apple fedex iphones

  38. and if people check the phoenician dna paper closely the winds of change are blowing for the origin of your alphabets as well

    lots of fireworks in the decades to come

    or maybe the same old same old lets see

  39. “talageri and others have some models maybe right or wrong and anyway not relevant to the indian big picture since it is for europeans” —- @Raj, How do you explain the varying Steppe_MLBA related ancestry in modern south asians(varying along geographical and jaati cline) ?

    “How does an IE homeland in EE denigrate Indians ?” — I don’t think it matters for most indians 🙂 , however, raj might be scared that it might aggravate existing faultlines.

  40. steppe mlba doesnt matter if r1a itself is out of india

    seems to have been mostly women anyway and i dont judge peoples marital preferences haha

    as explained it is not the ie homeland issue per se but the denigration of hindus who talk about oit and getting labelled xyz to shut down the debate

    i dont mind faultlines and hope such lines are well respected

  41. “seems to have been mostly women anyway and i dont judge peoples marital preferences haha” – @Raj, that seems to be the case with swat valley aDNAs, however, interior india’s modern day samples show the opposite pattern.

  42. @raj

    If you have been following this blog for some time you probably know that no one here is against anything that’s well argued and based on facts. This blog is about West Eurasian history, and not about politics. You won’t see here debates of whether the people living in the Eurasian steppe in the EMBA were European or not, because there’s no point in back projecting modern political entities into deep prehistory.

    So I hope that you can understand that your comments, the tone in them and the content, are out of place here. There are many places out there where you will find people willing to have some fight about those political topics, but not here.

    If you want to contribute respectfully to any topic being debated you’re welcome, and you can post some interesting paper that we (many of us) might not know, like the one about corded incised pottery above (interesting paper, though it’s about the possible interactions between East India and SE Asia and possibly China, and not relevant for the rest of West Eurasia as far as I can tell from my quick glance at it now).

    I welcome a speculative, but realistic enough, scenario about an out of India hypothesis if you want to elaborate. But you’ll have to deal with the questions I opened in the post in order to do that, because you’d need an out of India migration of a population that should be 100% West Eurasian (East Iranian, basically), so that should predate the admixture with an AASI-rich population that we already see in the Harappan period. And you’d need to explain why such an AASI-rich population living beyond the Vindhyas (at a time when they would still be hunter-gatherers) decided to cross them and enter the urban centres in the NW part of India and had such a significant genetic impact in that urban BA population.

    If you are willing to participate here in such a way (without negative attitudes against anyone, with respectful and informative comments that may provide interesting alternative views) you’re welcome. You will be received with respectful and helpful feedback, even when in disagreement.

  43. Re. Chaubey’s presentation, I’m waiting to see the paper. If the conclusions came from some amateur I would not give them a second thought, but Gyaneshwer Chaubey is a well respected scientist and has been working for many years in that field.

    From one of the slides, I kind of understood that what he proposes is that the split between R1a-Z2123 and R1a-L657 is considerably older (some 2000 years) than previously estimated. But we’ll need to wait for the paper to really see what the findings are.

  44. This is a summary of Chaubey’s talk:
    1. R* rooted in Himalaya , but also found in North, South, East and Central India.
    2. Indian branch is exclusively Indian.
    3. Gradient is opposite of migration hypothesis, it is going from east(Bihar) to west.
    4. Gangetic plain has highest diversity of M780.
    5. South India has highest frequency of M780.
    6. M780 is surely not from Steppe.
    7. Full continuity from origin to the modern day, spanning 20 ky, without a break.
    8. They are pointing out the flaw in Silva et al’s paper based on ancient DNA from David Reich’s lab, as they do not sequence full genome, they do only capture sequencing. If any one mutation is missing then the whole tree goes haywire.
    9. European branch is a cousin branch which split 6 to 10 kya. The common ancestor is not known.

    Now if you add this information with the fact that Narasimhan et al’s study already states that AASI admix happened at around 6kya then this split predates that mix, ergo, if it went out of india, it went only with a West Eurasian type ancestry.
    There is good evidence to suggest that the West Eurasian ancestry in India was having a deep presence, and at least 10 ky without admixing with AASI. This post summarises that evidence well.

    If it went out, there are two directions it would go, one towards Western Iran/Eastern Anatolia(some evidence in above link also tweeted by https://twitter.com/NirajRai3/status/1169686739251654656?s=19) , and another route East of Caspian Sea, mixing with populations like Khvalynsk which does see an increase in Iran like admix, and is also one of the oldest sites on Steppe to show up R1a.

    We also now have a paper suggesting local origin of farming in India, which many, including Mallory, have stated is a more important feature of PIE than some speculation about wheel, horse etc. So that part is also covered.

    The affinity of Balto-Slavic with Indo-Iranian is purely due to proximity and to later migrations of Indo-Iranized Scythians/Sarmatians, who were absorbed wholesale.

    Somebody mentioned that Paul Heggarty has an earlier dating for the Indo-Iranian split, and perhaps the entire tree, based on his phylogenetic analysis. Does anyone know what it is?

  45. I think my earlier question might be relevant with regard to Bhikshu’s comments: what’s the east Eurasian ancestry in Iran_N, and where did it come from?

    Andamanese people are supposed to have become isolated before the Mesolithic, and their Iran_N affinity has been estimated to be about 30%.

  46. @Bhikshu

    About Chaubey’s presentation I’ll still need to see the paper before having a more informed opinion. The last point about an earlier split is the only really relevant one in this context.

    Leaving R1a aside, the article you linked to proposes one of the possible scenarios I wrote in the post, but it does not attempt to address the problems about it. So let’s see:

    – We have an east-Iranian like population in the north from 20-15kya. And we have an AASI population from at least as long somewhere else (presumably in the south?). Then Neolithic development starts in India (let’s say as an autochthonous development) within that East Iranian-like population from the north, and continues for a couple of millennia until the Chalcolithic.

    – At this point there is a large migration out of India that brings a language shift to West and Central Asia and eventually to Europe. But somehow all the haplogroups that are from India don’t expand outside with this migration (the evidence of long split times between Indian and out of India haplogroups works in both ways: if it prevents people from having moved in, it also prevents people from having moved out). Unless that population was 100% R1a and that haplogroup was the only one that went out of India. But it disappeared fast from West and Central Asia, since it’s not found there in the Chalcolithic.

    – Meanwhile, the AASI population from the south, who were still hunter-gatherers decided that after 10-15k years without moving it was about time they moved, so they went north and mixed big time with an advanced and settled agricultural population. Apparently they replaced the R1a too with their own haplogroups from the south, because all the samples we have from South Asia lack R1a until very late (so somehow it rebounded again after almost extinction to achieve modern levels).

    All this, as you see, is very problematic. If one is able to see the problems with the steppe hypothesis, he/she should be able to see the problems with any other one too and keep the same level of critical thinking.

  47. Regarding Haplogroups moving out, R2 was found in Iranian Neolithic, and most likely expanded from India, as R2* is found in India. Haplogroup L is another one that is shared with West Asian populations, these along with R1a do give some indications.

    Why AASI was restricted only to the South or East(?), we can’t say with certainty, we would need more sampling from India. We hardly have anything from areas where R1a dominates. Though in Sanskrit literature there is a mention about a sage Agastya (probably with his clan), moving south to have a spiritual balance, and Vindhya mountains bowing to let him cross to those lands. So even though it is a semi-mythological account, it does tell about a deep memory of the people of how the spread or interaction happened, with a different kind of northern population making contacts with southern one, for spiritual/religious reasons. Also the same argument could be made about Eastern Iran and Northern India: why would they (west Eurasian pop) be restricted only to Eastern Iran? There is no reason to do so, and we also know that CHG replaced Dzudzuana on this type of ancestry’s western edge, so the origin is most likely further east.

    I am not claiming that we have all answers, but none of it is out of the realm of possibility, especially given the backdrop of a weak and failing steppe hypothesis.

  48. @alberto
    I have no special OIT hypothesis, and have no intention of proving genesis of all languages from India as such. If others want to understand origin of their language, onus is on them to do it. But we wish they dont include sanskrit and vedic culture in their 150 yr long quest akin to finding eldorado.

    There is a lot of work already done which firmly pushes the date of the vedic period prior to 2000bce. This includes the drying of the saraswati, finding fire altars at various locations, sanauli warrior culture & chariots buried, independent dating of astronomical details in vedic era literature, etc.
    I’m quite convinced that the Indo Iranian loanwords in Mitanni came from NW south asia. I agree with Talageri’s claim that the loanwords are characteristic of the latest of RV mandalas. We now have 2 separate papers confirming that zebu, water buffalos and asian elephants from the IVC area reached Syria & the near east post 2000bce through human agency. You also see IVC migrants in SiS first, and then Gonur (2000bce), in the general direction of the near east.

    That makes the steppe ancestry moot, which anyway is female mediated (with the data we have, will change opinion if we find R1a rich samples with high steppe ancestry) , too little, too slow. Scythians, Kushanas, Huns, Greeks, Parsis all came to India and mixed, but none of them could even keep their own language, forget about imposing theirs. Why should we accept that Sintashta Steppe could do that? We dont even know what languages they spoke. All we know is that they had horses and chariots in their own homeland (no archaeo proof they brought it on their way into NW india).

    Now, if theres some evidence as to how vedic culture entered india pre 2000bc, then i’m game for that.

    As for Chaubey’s new R1a paper, it is irrelevant to the Aryan question imo. It might help in understanding later population movements. R1a does not dominate, and is not important to Zoroastrian/parsi Y haplogroups, even in their priestly caste.

  49. I think the major obstacle to OIT or Out-of-Deep-Iran/Turan is this, from Damgaard 2018:

    >PCA (Fig. 2B) indicates that all the Anatolian genome sequences from the Early Bronze Age (~2200 BCE) and Late Bronze Age (~1600 BCE) cluster with a previously sequenced Copper Age (~3900 to 3700 BCE) individual from Northwestern Anatolia and lie between Anatolian Neolithic (Anatolia_N) samples and CHG samples but not between Anatolia_N and EHG samples. A test of the form D(CHG, Mbuti; Anatolia_EBA, Anatolia_N) shows that these individuals share more alleles with CHG than Neolithic Anatolians do (Z = 3.95), and we are not able to reject a two-population qpAdm model in which these groups derive ~60% of their ancestry from Anatolian farmers and ~40% from CHG-related ancestry (P = 0.5). This signal is not driven by Neolithic Iranian ancestry, because the result of a similar test of the form D(Iran_N, Mbuti; Anatolia_EBA, Anatolia_N) does not deviate from zero (Z = 1.02).
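
    For readers unfamiliar with tests of the form D(W, X; Y, Z), here is a rough sketch of how such a statistic can be computed from per-SNP allele frequencies, using one commonly cited frequency-based form. This is an illustration only and not the pipeline used in the quoted paper: the function name is hypothetical, the frequencies are invented toy numbers, sign and ordering conventions differ between tools, and the Z-scores quoted above would additionally need a block jackknife over the genome, which is left out here.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (one value per SNP in each list).

    Numerator: correlation of the W-X and Y-Z frequency differences.
    Denominator: normalisation term that keeps D between -1 and 1.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in zip(w, x, y, z):
        num += (a - b) * (c - d)
        den += (a + b - 2 * a * b) * (c + d - 2 * c * d)
    if den == 0.0:
        raise ValueError("no informative SNPs")
    return num / den

# Invented toy frequencies at four SNPs, NOT real data.
pop_w = [0.9, 0.1, 0.8, 0.2]   # plays the role of the first population (e.g. CHG)
pop_x = [0.1, 0.1, 0.2, 0.2]   # plays the role of the outgroup (e.g. Mbuti)
pop_y = [0.7, 0.1, 0.6, 0.2]   # test population
pop_z = [0.3, 0.1, 0.3, 0.2]   # comparison population

d = d_statistic(pop_w, pop_x, pop_y, pop_z)
print(round(d, 4))
```

    Under this convention a positive value means Y shares more alleles with W than Z does, and a value near zero means no detectable difference, which is how the two tests in the quote are being read.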

    I do not believe the ‘tracer dye’ hypothesis that Reich and colleagues came up with, but since all extant IE languages split after the Anatolian this finding requires an explanation – that would put the onus on those who propose OIT. Do samples from Turan provide a better fit for those Anatolians?

    EDIT: Namazga_CA plots closer with CHG than Iran_N does, so these might be tried as a source for the eastern ancestry in Anatolia.

  50. Alberto,

    If we are looking for the possible uniparental markers that could have accompanied an Out of India migration the following may be suggested besides R1a –

    R2 – present in Neolithic Iran and is also present in modern Central Asians including the Uighurs.

    L1a – present in the Chalcolithic period in the Caucasus and also in Bronze Age Central Asia.

    Q1a/Q1b clades – some of these have shown in the more recent samples from the steppe and a recent paper on Q showed some of these lineages are perhaps rooted in South Asia.

    J2b2 – This is shared with the Aegean and could have had a spread from an Eastern origin.

    J2a – Perhaps from Central Asia into Anatolia & the Aegean ?

    R1b Z2103 – also from Central Asia where it has a sizeable presence in some native groups ?

    Among the maternal lines we have,

    M52 in a Maykop sample.

    M5a and U7 in Tarim Basin samples circa 2000 BC.

    Perhaps some W clades ?

    Besides, we have Zebu admixture all across the Near East and also into the Podolian cattle which are considered an ancient breed of steppe cattle and also includes the Ukrainian Grey Steppe.

    We also have Elephants and Water Buffalos brought into the Near East.

    We may also recall the corded ware dog genome which showed Indian/Iranian dog & wolf admixture.

    All these if looked at collectively can open a reasonable avenue of inquiry.


    As for why AASI HGs migrated North, maybe they did so after making a transition to farming ? Remember that the Eastern Gangetic plain was an independent center of rice domestication as early as 8-9 kya bp. This rice cultivation reaches the Harappans after 3500 BC as known from sites in Haryana.

  51. Yes, this is more constructive. I do understand some level of frustration when it’s been basically Western scholars who have been researching the PIE question and proposing theories that were at odds with Indian history. But that was then and this is now, and now is the time to find out the reality. For everyone.

    @A, Sanskrit is fundamental in any IE research. Many people, including from India, are interested in the origin of IE languages. You probably are interested too. So it’s not about leaving Sanskrit out of the research unless someone doesn’t want to know about its origin.

    I agree that there is quite some evidence that pushes Vedic to a period where it’s probably incompatible with the steppe hypothesis. I hope to have some guest post(s) that will explain some of the textual information available better than I could do.

    I don’t think that Vedic culture could possibly have developed out of India. Why would it? The Vedas were composed in India, by people who considered themselves natives to that place (except from some distant past, semi-mythical I guess, that talks about a place in the north with days that last 6 months or whatever).

    But the language that would become Sanskrit could perfectly well have come from outside India (because it belongs to the same family as many others from all across West Eurasia). I guess one should try to understand this as deep prehistory, somehow like for any Spanish person (I’m Spanish) it’s very clear that our language came from the Latium and with it a great cultural impact. We know this and we are happy about it. Where the ancestor of Latin was 4000 years earlier is irrelevant for the Spanish people/culture. It’s just a thing from prehistory that’s interesting, but that’s it.

    @Marko, I don’t remember where those stats with Anatolia_EBA were. Damgaard et al. had a lot of samples, but in the analysis they mostly concentrated on a subset of them from the steppe.

    I’d like to check if they had some Turan samples there because I don’t think the evidence is against it. In fact, somewhere in North Iran, with an early entrance into Turan (and probably to India) seems to me the only alternative that is surviving all the incoming data.


    And you can’t explain Mycenaeans without a good amount of Kura-Araxes type of ancestry.

  52. @marko that seems incorrect. Namazga_En plots farthest from CHG. Indian Iran like component is closest.

    Distance to: TKM_Namazga_Tepe_En
    0.05958605 0AHG
    0.08360807 IRN_Ganj_Dareh_N
    0.13845439 GEO_CHG

    Target: TKM_Namazga_Tepe_En
    Distance: 2.9421% / 0.02942091
    37.6 IndianIranian0AHG
    32.4 IRN_Ganj_Dareh_N
    16.6 CHG
    8.2 Anatolia_N
    5.2 WSHG

    Presence of J2a1 in Mycenaeans, Minoans & Anatolian bronze age needs some investigation imo. That imo is also an IE marker. J2 specifically, and also L dominate modern zoroastrians.

  53. @Jaydeep

    Yes, that’s a good collection of peripheral evidence to support an out of India scenario. But we need more solid evidence. If we get Chalcolithic samples from North India and they lack any AASI admixture then that opens up much better possibilities. Though we’d need to know that these were there since the Neolithic and not recent incomers from Turan.

    I still find it difficult to explain that an AASI population was from the Mesolithic (or probably Paleolithic) in the Gangetic plain while an eastern Iranian was to the west around the Indus and they stayed isolated for thousands of years. It could be, but is there any good reason to think that could have been the case?

  54. @A

    I was referring to the two-dimensional PCA, the proximity might be a result of complex admixtures of course. The aim should be to confine the eastern ancestry that enters Anatolia in the Copper Age. I suspect the reason Iran_N isn’t a good proxy might be its inflated Basal Eurasian ancestry, pulling it away from the northern regions due to the extreme divergence of said component.

    I think I’m more and more in agreement with what Alberto said here:

    >In fact, somewhere in North Iran, with an early entrance into Turan (and probably to India) seems to me the only alternative that is surviving all the incoming data.

  55. Alberto,

    As I said, the Gangetic plains had early rice farming while the more western farmers had barley & wheat. This represents a stark contrast in subsistence strategies.

    I would not say that such populations existed next to each other without genetic mixing. We may envisage a limited trickling gene flow in both directions.

    To better understand the genetic isolation we may envisage the possibility of ecological barriers. For much of the last 100 ky, South Asia largely, except its Northwest, has remained very suitable for human and animal habitat. So the notion of major population turnovers may not apply to South Asia. However there was no uniform geography and, as Michael Petraglia & colleagues have shown, there was a mosaic of different ecozones existent in South Asia across the Paleolithic. Now it is certainly possible that a HG population accustomed to thrive in a particular ecozone may cross over into a neighbouring ecozone but find the new habitat much less conducive to thrive and also being inhabited by other hostile populations who are much more at home there. In such a scenario, the chances of survival of the migrant group would be minimal. Perhaps this may lead to HG populations across different ecological zones not mixing much except through a minor trickle like admixture.

  56. @Jaydeep,

    Yes, the paper about corded ware linked above by @raj supports longstanding contacts between East India and SE Asia, but not with NW India. This goes well with the idea that AASI could have arrived from the East with rice cultivation.

    I’ll have to revisit the subject of rice in the IVC. It seems it was there, but it only became important after 2000 BC? Apart from rice, does anything suggest that the IVC could have experienced such a big growth due to a rather big migration from the Gangetic plain?

  57. “I don’t think that Vedic culture could possibly have developed out of India. Why would it? The Vedas were composed in India, by people who considered themselves natives to that place (except from some distant past, semi-mythical I guess, that talks about a place in the north with days that last 6 months or whatever).”

    I agree that whichever way you look at it the Vedic culture is Indian, and there is nothing to be defensive about it. But I think, the anger comes from the way a story has been pushed on very flimsy grounds, and then a very ugly kind of scholarship built on top of that shaky foundation, to tear apart a living tradition, which was never meant for such pseudo scientific pre-historical analysis. The western scholars took the texts as-is from the tradition, arbitrarily declared interpretations as Brahmanical corruption, became modern era ‘Brahmins’ themselves, offering the ‘correct historical’ interpretation, so, the texts were passed correctly down the ages but not the interpretation, very convenient.
    The 6 months day and night claim has come from one such analysis, a tortuous extrapolation of some verses, by some overly eager enthusiasts of the “historical method” like Tilak. There is no direct reference of anything like that. The Agastya story I told is not just an arbitrary reference but very much part of legends and accounts with variations in both North and South of India, though I just used it as a reference point for future research, not a historical claim in itself.

    But my intention is not to spam this space. I think this is one of the most open-minded and genuine blogs on archaeogenetics and IE population history, unlike a lot of other agenda-driven ones out there.

    My only contention is that we have to consider all possibilities on things like the source of East Eurasian affinity in populations like AAF. We may go for now with what data we have but not rule out further developments, especially given the lack of data from older dates from an Indo-European area the size of half of Europe.

    Cheers and Peace!

  58. Kashmiri Pandits definitely get their steppe from Kangju/Kushanas. They have a Han component.
    Target: Kashmiri_Pandit
    Distance: 3.4863% / 0.03486321
    71.4 IRN_Shahr_I_Sokhta_BA2
    28.6 RUS_Sintashta_MLBA

    Much better fit with Han
    Target: Kashmiri_Pandit
    Distance: 2.3667% / 0.02366717
    69.6 IRN_Shahr_I_Sokhta_BA2
    24.8 RUS_Sintashta_MLBA
    5.6 Han

    Kushanas did rule over Kashmir for a couple of centuries, at least after 0 CE.

    Keeping Sintashta, Kangju & Kushana as steppe sources along with the Indus Periphery source, and removing Han.

    Target: Kashmiri_Pandit
    Distance: 2.1622% / 0.02162236
    56.6 IRN_Shahr_I_Sokhta_BA2
    26.8 KAZ_Kangju
    16.6 TJK_Ksirov_H_Kushan

    Removing Kangju as a source, as only Kushana is attested in Kashmir. However, the Kushanas descend from the Kangju themselves, so both models are possible fits.

    Target: Kashmiri_Pandit
    Distance: 2.4445% / 0.02444549
    54.0 IRN_Shahr_I_Sokhta_BA2
    39.2 TJK_Ksirov_H_Kushan
    6.8 RUS_Sintashta_MLBA
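
    For readers unsure how to read outputs like the ones above, here is a minimal sketch of what such a fit does, under the assumption that the tool looks for non-negative source weights summing to 100% that minimise the Euclidean distance between the weighted mix of source coordinates and the target. This is not Vahaduo's actual code: the pairwise-exchange optimiser, the function name `fit_mixture` and the three-dimensional toy coordinates are all made up for illustration and are not real G25 data.

```python
import math


def fit_mixture(target, sources, sweeps=200):
    """Fit non-negative weights (summing to 1) so that the weighted mix of
    the source coordinates lies as close as possible to the target.

    Illustrative optimiser: repeatedly shift weight between pairs of
    sources, each time by the amount that minimises the distance."""
    names = list(sources)
    dims = range(len(target))
    weights = {name: 1.0 / len(names) for name in names}

    def residual():
        # Current mix minus target, one value per dimension.
        return [sum(weights[n] * sources[n][d] for n in names) - target[d]
                for d in dims]

    for _ in range(sweeps):
        for a in range(len(names)):
            for b in range(a + 1, len(names)):
                i, j = names[a], names[b]
                diff = [sources[i][d] - sources[j][d] for d in dims]
                norm2 = sum(x * x for x in diff)
                if norm2 == 0.0:
                    continue  # identical sources, nothing to exchange
                res = residual()
                # Best amount of weight to move from source j to source i.
                delta = -sum(res[d] * diff[d] for d in dims) / norm2
                # Keep both weights non-negative.
                delta = max(-weights[i], min(weights[j], delta))
                weights[i] += delta
                weights[j] -= delta

    distance = math.sqrt(sum(x * x for x in residual()))
    return weights, distance


# Toy coordinates (hypothetical, NOT real samples).
sources = {
    "toy_source_a": [0.10, 0.00, 0.00],
    "toy_source_b": [0.00, 0.10, 0.00],
    "toy_source_c": [0.00, 0.00, 0.10],
}
target = [0.07, 0.03, 0.00]

weights, distance = fit_mixture(target, sources)
print(f"Distance: {100 * distance:.4f}% / {distance:.8f}")
for name, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{100 * weight:.1f} {name}")
```

    The printout mimics the layout used in this thread: the distance as a percentage and as a raw value, then one line per source. A lower distance only says that the mix sits closer to the target in this coordinate space; it does not by itself show that the sources are historically correct.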

  59. @Alberto

    “If we get Chalcolithic samples from North India and they lack any AASI admixture then that opens up much better possibilities. Though we’d need to know that these were there since the Neolithic and not recent incomers from Turan.

    I still find it difficult to explain that an AASI population was from the Mesolithic (or probably Paleolithic) in the Gangetic plain while an eastern Iranian was to the west around the Indus and they stayed isolated for thousands of years. It could be, but is there any good reason to think that could have been the case?”

    Indeed, Indus and Gangetic plains have no special barrier between them in the northern part (in the south, there is the Thar desert), although the Indus valley genetically is much more ‘Western’ than other parts of India, especially in mtDNA as I remember. Also in prehistory, the dental analysis of Hemphill and Lukacs has given two groups: Indus valley sites and peninsular India sites (Mesolithic Ganga valley and Chalcolithic Inamgaon). But there is an exception: Neolithic Mehrgarh. Its teeth are close to those of Inamgaon in Maharashtra, they had Sundadont traits (typical of SE Asia) and few Carabelli cusp (typical of Europeans). Chalcolithic Mehrgarh is very different, also craniometrically, and closer to Harappa and Gandhara grave culture Timargarha, that are close to Tepe Hissar in Iran.
    So, the affinity with Inamgaon (that has roots in Malwa culture of Central India) suggests that Neol. Mehrgarh was more AASI, although it had at least trade with the west (turquoise, lapis lazuli, wheat). In the Chalcolithic, apparently there was a wave of more Iranian-like people. The age of Chalc. Mehrgarh (4500 BCE) matches the calculated age of mixing of IranN and AASI. I think that some IranN component must already have been in Neol. Mehrgarh, but possibly it became dominant in the Chalcolithic period.

  60. When the Iran-like component split from the ancestral branch 12000 years ago, the split component (let’s assume it happened in Iran) would have had to remain in isolation from the other Iranian ancestry for at least 6-7000 years until it mixed with AASI. Where was it? For NW India to go from AASI-dominated to 80:20 Iran-like:AASI in 2600 BCE Rakhigarhi, it would need an overwhelming migration of Iranian-like ancestry. Where is the archaeology for that? The Mesolithic Ganga Valley skeletons have an average height of 6 feet, while the southern HGs are pretty short, so who were they?

  61. Chalcolithic Mehrgarh has new traits compared to Neolithic Mehrgarh (copper melting, use of gold, seals, beads, new stone industry, different burials, increase of wheat, appearance of oats), and the skeletons reveal a new population, the only real discontinuity in prehistoric India before Sarai Khola (800-200 BC) according to Hemphill and Lukacs. Unfortunately, at Mehrgarh there is the only cemetery found of this period, as Possehl remarks.
    Mesolithic people of the Ganga valley are isolated from other Indian samples, but they have some ‘peripheral association’ with Neol. Mehrgarh and Chalc. Inamgaon.

  62. @Giacomo
    modern balochis show good fit with bmac ancestry ie BMAC1 for Gonur. if that helps.

  63. @Giacomo Benedetti

    Yes, for now I also think that’s the most parsimonious scenario. The available data from India (archaeological, anthropological, apart from the lack of aDNA) is scarce, so we are still guessing and we’ll have to wait for aDNA to clarify all of this. But in general the idea that AASI only arrived to NW India after the Chalcolithic looks problematic, even if there have been some possible reasons outlined above.


    The split time calculated between Iran_N and the Iranian-like ancestry in India has many caveats. Besides, it was already obvious that the Iranian-like ancestry in India cannot be directly descended from the Iran_N (Zagros) samples, simply because it’s different (even if there could be some Iran_N admixture in India).

    However, the resemblance between this Iranian-like ancestry and the whole East Iranian (including SC Asia / Turan) is very high and there’s no need for it to have diverged many thousands of years earlier (this, again, goes both ways: for an Out of India to work you do need these ancestries to be very similar, which allows an Into India too. Otherwise neither option would be possible).

    So overall I think that a decent amount of gene flow must have happened between NW India and East Iran/Turan around the Chalcolithic (which is also the main alternative to the steppe model for these populations to speak closely related languages). The direction of it is uncertain, so we’ll wait for aDNA to know.

  64. @Alberto

    Yes, fair enough. This could very well have been the scenario, we’ll wait for further data on this.

  65. https://www.nature.com/articles/s41598-019-40399-8#Sec2
    Munda paper is available.

    The Munda-speaking Austroasiatics arrived in Orissa in SE India somewhere between 2400 & 1200 BCE. They mixed with a population which had slightly less West Asian (Indian Iran_N) ancestry than modern Paniya, about 22%.

    So we can at least reject the Munda substrate hypothesis for Vedic sanskrit, for one.

  66. Slightly off topic, but if anyone is interested in getting back to the y-dna questions, for visualisation I have made some plots which colour code the Swat/PAK IA-H samples by y-haplogroup (if male – colour star, if female – black triangle) and then plotted them against the Eurogenes West Eurasia 9 PCA (just as it’s the PCA that seems to be able to get most on there).

    See: https://imgur.com/a/pOZVaww

    May be a useful visual reference for anyone looking to identify when the first of some particular y turns up in Swat/PAK IA-H, in what context and at what time, and how this correlates with the main cline within these samples.

    E.g. R1a is somewhat scarce and uncorrelated with position on the cline from its first appearance at around 1000 BCE (sample I12457) through the Iron Age and probably the Buddhist/historical period; it is then somewhat associated with a more Central Asian related position as we get into the post-Medieval period.

    The early enriched steppe related Swat samples are: I1992, a male called as E1a – who is described as being in a family group with I6194, I1799, I3262, who largely didn’t make it through to this analysis other than I6194 – and I12138, a female individual.

    Btw, since I1992’s higher quality first degree relative pair I1799+I3262 was called as the same E1b1b1b2a as most of the early Udegram_IA males, it seems reasonably possible that I1992 is E1b1b1b2a rather than E1a.

    The plots against time are a little busy in places, unfortunately, as a very large number of the samples essentially cluster around 900 BCE.

    List at the end of gallery only includes those samples that were available on the PCA being cross plotted with. It may be worth cross checking this against the supplementary data from the paper to see if there are any more who are not on there, but the missing samples are essentially all either first degree relatives and not independent data points, and/or very low quality.

  67. Giacomo,

    With the advent of aDNA, I am not very inclined to give much importance to skeletal craniometric & dental studies which cannot give such a high resolution as genome wide data provides.

    Neolithic Mehrgarh undoubtedly had some linkages with people of Iranian Neolithic as the archaeological assemblage of both cultures testify. So how do you square this people of Neolithic Mehrgarh being AASI like. Isn’t this a major contradiction ?

    On the other hand, the linkages of Chalcolithic Iran with Chalcolithic Mehrgarh are much more tenuous. Plus, Jarrige places the Baluchistan Chalcolithic starting from 6000 BC as a major or primary regional center of innovation which then spread both westward & northward. Jarrige even disagrees that there is any Geoksiur influence at Mundigak but traces its origin from Baluchistani Chalcolithic.

    At any rate, all Iranian Chalcolithic samples after 6000 BC had high levels of Anatolian Farmer ancestry, which is completely missing from Indus Periphery samples. So a migration from Iran in the Chalcolithic period has to be rejected.

    In contrast, archaeologists are unanimous that there is some proto-Elamite influence at Shahr I Sokhta and also some linkages with Namazga Chl. Not surprisingly we find Anatolian Farmer ancestry in Shahr I Sokhta samples which are not Indus Periphery.

  68. Alberto,

    The closer links between the East Iranian Farmer ancestries in Turan and in South Asia looks quite probable. But I am not sure of when it started.

    In most of the uniparental studies I have seen which focused on markers spread between Iran, Central Asia & South Asia, such as Y-DNA Q or mtDNA U7, invariably the deepest splits are between Iran & South Asia. Nevertheless there are some younger lineages present in Central Asia which are older than 10 kya. So do we surmise from this that Iran herder/HG ancestry in Turan is also pre-Neolithic? If true, this will complicate matters even further.

    From around 4000 BC there are definitely signs of interactions or at least pottery similarities between the North & South of Hindu Kush but the knowledge of this period is still sketchy and we await more research data to come forth.

  69. Jaydeep,

    When they speak about Iranian farmers having too much ANF ancestry from 6000 BC to fit as a source for Indus periphery samples, they refer to West Iran (Seh Gabi and Hajji Firuz). So I agree that a significant migration from West Iran during the Chalcolithic should be rejected.

    But in more eastern parts we have samples that can fit as sources at around 3000 BCE and later (Shahr-I-Sokhta BA, Parkhai, Geoksiur, Sarazm, Bustan, Anau, Namazga… Even some Teppe Hissar is required in the models I posted for Indus periphery).

    The case of the deep splits in uniparental markers between modern populations from India and Iran/Turan is of limited value given how unreliable the studies of uniparental markers of modern populations have been to infer what happened 6000 years earlier. Above, however, you pointed out a few that could be related between India and outside when arguing for an Out of India scenario, so I guess those same ones could work the other way around too.

    The reality is that we have poor data, be it archaeological, anthropological or from aDNA from the Neolithic/early Chalcolithic, so it’s hard to make a strong case either way. We do know, however, that there were intensive contacts from the late Chalcolithic throughout all these areas, and we know that by at least the MLBA they spoke closely related languages. We have the genetic evidence of shared ancestry in the form of East Iranian one. So all this should tell us that we should expect gene flow to have happened at a significant level.

    For now, Out of India has the problem of the putative presence of AASI, while East Iran/Turan doesn’t have that problem. *If* aDNA from Neolithic/Chalcolithic India shows that the population there was 100% East Iranian-like then things will be more even. But that’s still quite a big if. Let’s see if they don’t keep us waiting for another few years before we can get some answers about this.

  70. @Alberto thanks for lowering the barrier. I don’t have too much time nowadays to do much reading. I noted recently from Nirjhar’s post on FB that pictorial depictions of BMAC cattle show Bos taurus and not Bos indicus. Bos taurus is pretty exotic in the IVC though.

  71. One should not forget the 3 Armenians from 4000 BCE with L1a1. They prefer 20% of Indian Iran farmer over Ganj Dareh.
    This Indian Iran farmer-like ancestry is again found in the sole female sample in western Anatolia at Barcin, 3800 BCE.

    Target: ARM_Areni_C
    Distance: 3.4041% / 0.03404109
    36.8 Anatolia_Barcin_N
    21.0 NW_Indian_0AHG
    20.2 GEO_CHG
    11.4 RUS_Samara_HG
    10.6 Levant_PPNC
    0.0 IRN_Ganj_Dareh_N
    0.0 RUS_Sosonivoy_HG
    0.0 RUS_Shamanka_N
    0.0 Baltic_LVA_HG
    0.0 Baltic_LVA_MN
    0.0 100AHG
    0.0 Levant_PPNB

    Target: Anatolia_Barcin_C
    Distance: 3.0835% / 0.03083526
    56.6 Anatolia_Barcin_N
    20.2 GEO_CHG
    14.6 NW_Indian_0AHG
    6.6 Levant_PPNC
    1.8 RUS_Samara_HG
    0.2 RUS_Sosonivoy_HG
    0.0 IRN_Ganj_Dareh_N
    0.0 RUS_Shamanka_N
    0.0 Baltic_LVA_HG
    0.0 Baltic_LVA_MN
    0.0 100AHG
    0.0 Levant_PPNB

  72. @Nirjhar

    Yes, the Sanauli findings are quite amazing and can change a lot of things. I guess it will still require some time until we have a more clear context of these findings. I hope they will get DNA results from this site soon.


    Yes, those Areni Cave samples are very interesting showing already a connection to the east (and to the west). If we could get more samples from that period it would be interesting.

  73. Here are the first Craniofacial reconstructions of Harappan people, coming from two individuals of Rakhigarhi ~4500 YBP:
    Craniofacial reconstruction of the Indus Valley Civilization individuals found at 4500-year-old Rakhigarhi cemetery, Won Joon Lee et al. 2019
    They apparently had typical north indian features. See the videos in the link.

  74. Does anyone know how to create a west Eurasian PCA instead of an all Eurasian one? This clusters InPe samples regardless of AHG level. I’m using PAST

  75. @ “ A”

    “Presence of J2a1 in Mycenaens, minoans & anatolian bronze age needs some investigation imo. “

    In Europe & Anatolia; J2a1 is a possible association with Hatto-Minoan languages.
    Overall it’s South Caucasian, & spread through northern Iran/Turan & Anatolia/Aegean during the Chalcolithic & Bronze Age.

  76. @A
    “modern balochis show good fit with bmac ancestry ie BMAC1 for Gonur. if that helps.” Thanks! Balochis should come (at least partly) from the west, because of their linguistic position in Iranian languages, anyway, it can be a sign that BMAC is ancestral to Iranian speakers.

    “Neolithic Mehrgarh undoubtedly had some linkages with people of Iranian Neolithic as the archaeological assemblage of both cultures testify. So how do you square this people of Neolithic Mehrgarh being AASI like. Isn’t this a major contradiction ?”
    I agree that there is apparently a contradiction. When I first read about the many similarities between Iranian Neolithic and Mehrgarh I was surprised, because I already knew about the South Asian features of Neol. Mehrgarh people. So, a possibility is that they had already the arrival of some (East) Iranian farmers but the dominant component was local, AASI-like, although culturally they assimilated many elements with farming and goat-herding.
    However, I have just discovered a study on “Regional variation in incisor shoveling in Indian population” that reveals that shoveling is common especially in West India (Rajasthan, Gujarat, Maharashtra and Goa) with even 85% of full shovel-shaped incisors (similar to Mehrgarh with 83-89% on the upper incisors), while in South India 91% of the subjects had no shovel at all! So, apparently this trait is not connected with South Indians. Inamgaon had 91% shovel-shaped upper incisor 1, and it is in Maharashtra, so it belongs to West India. Harappa has 55%, Timargarha in the north, instead, had only 14% (see p.282 here: https://books.google.it/books?id=Qm9GfjNlnRwC&pg=PA289&lpg=PA289&dq=sundadont+mehrgarh&source=bl&ots=7WdLf3iT1h&sig=ACfU3U1Ov_B7CzZ1wLHF0qhOTpauJTf9TQ&hl=en&sa=X&ved=2ahUKEwiuqs3di4jlAhUJr6QKHVhMCYIQ6AEwBXoECAgQAQ#v=onepage&q=shovel&f=false). So, what can be the source of shoveling? It is typical of East Asians and Amerindians, but so why is it so frequent in West India?
    On the other hand, according to another study, 72% of Rajputs have no shovel shaped incisor 1.

    Another interesting datum we can find in the same table cited above is the change in frequency of Carabelli’s trait in the first molar between Neol. Mehrgarh (26% only) and Chalc. Mehrgarh (61%). Harappa has less, 44%. Average in Europe is 65%, while in modern Isfahan, Iran, on 500 individuals even 96% had this trait!

    But there is another surprising fact: crania from Kish in Mesopotamia, dated 3000 BC, have only 24% of Carabelli’s trait:
    Interestingly, Hemphill-Lukacs-Kennedy 1991 shows that crania from Kish are close to those from Cemetery H open burials. Kish also has some shovel-shaped incisors, as do Metal Age Anatolia and, curiously, 27% of Middle Minoan Knossos: http://www.royalacademy.dk/Publications/High/354_Alexandersen,%20Verner.pdf
    Unfortunately I have not found data for Iranian neolithic sites, but Jarmo, that is a Neol. site in Iraq often compared also with Mehrgarh, has no shoveling.

  77. “Thanks! Balochis should come (at least partly) from the west, because of their linguistic position in Iranian languages, anyway, it can be a sign that BMAC is ancestral to Iranian speakers.” —>

    Hey Giacomo, i tried to model Balochis using a chalcolithic western source “Iran_Tepe_Hissar_C” (which has plenty of ANF).
    Here’s how they look

    Target : Balochi
    Distance: 2.7145% / 0.02714534

    51.0 Iran_Tepe_Hisar_C
    27.2 Iran_Shahr_i_Sokhta_BA2
    17.8 Sintastha_MLBA
    4.0 Onge

    Now, if one assumes that Shahr-i-Sokhta BA2 like folks ( Eastern iranian + AASI) existed during neolithic mehrgarh, could one assume that there was a migration of Tepe_Hissar like population (western Iranian + ANF ) from further west during the chalcolithic ?

  78. @tim

    I think that using modern Balochis to infer a Chalcolithic migration is very dubious. They are recent migrants and a modern population.

    Whoever migrated to India during the Chalcolithic should already be represented in the BA samples we have from the Indus periphery, and therefore have low ANF admixture.

  79. @tim @alberto
    “I think that using modern Balochis to infer a Chalcolithic migration is very dubious. They are recent migrants and a modern population.”
    I agree. Balochis are attested in Eastern Iran in the 9th century, and it is thought they came from the Caspian region. https://en.wikipedia.org/wiki/Baloch_people#History
    The strong Tepe Hissar connection can confirm this Caspian origin. Tepe Hissar people were not recent migrants, according to Narasimhan’s paper, but were quite stable in time. Archaeologically, Tepe Hissar 3C has clear BMAC elements, then it finishes, in the 2nd mill. BCE. It is interesting that Hemphill & co. found anthropological affinities between people of Tepe Hissar 3 and Harappa (cemetery R37). Now we can attribute this similarity to the dominant Iranian-farmer component.

    BTW, Hemphill has faced very directly the Indo-Aryan issue in two much more recent papers: https://www.academia.edu/8627556/Bioanthropology_of_the_Hindu_Kush_Highlands_A_Dental_Morphology_Investigation

    In the last one, Hemphill recognizes that also his own previous model, suggesting the arrival of Dravidians in Chalcolithic Mehrgarh, does not work, because it has no affinity with present Dravidians from SE India. Instead, it has some affinity with present Kho people, Dardic speakers with a particularly archaic, even close to Sanskrit language. It is curious that he has not mentioned the possibility that Chalc. Mehrgarh was actually Indo-Iranian, but he states that the Aryan Invasion theory has no ground because there is no affinity of post-Harappans with Central Asians, except Sarai Khola that is too late. On the other hand, his study shows also affinity of Kho with Djarkutan…
    Maybe you remember the Eurogenes post on them: http://eurogenes.blogspot.com/2018/01/the-kho-people-archaic-indo-aryans.html

    The source saying they have 80% R1a has disappeared from wikipedia, do you know it? I have found only a paper on mtdna: https://www.researchgate.net/publication/331844587_Genetic_structure_of_Kho_population_from_north-western_Pakistan_based_on_mtDNA_control_region_sequences

  80. Giacomo,

    Ullah, Olofsson et al. (2017) has a sample of 20 Kohistani Dardic speakers.

    1 G2a
    10 H1a
    1 L1
    2 Q
    1 R
    5 R1a

    Some of the Pashto speakers in the Dir region have ~80% R1a. Other Pashtun tribes seem to have ~80% G2a – I think that suggests strong bottlenecks on the Y-chromosome among those groups.

    The nomad Gujars in the region are dominated by haplogroup L1.

  81. @Alberto

    I see what you’re looking at. After going through my stuff, I am kinda leaning towards a Hyrcanian homeland even though I would like some Anatolian samples. Though, I still think PII or pre-PII came from the west.

    @Giacomo Benedetti

    “Archaeologically, Tepe Hissar 3C has clear BMAC elements, then it finishes, in the 2nd mill. BCE.”

    It’s the other way around. Hissar IIIB (IIIB: ca. 2400-2170 cal) and IIIC (2170-1900 cal. BCE) elements (grey ware..etc) took over during the later stages of BMAC after 1800BCE. These elements were wrongly interpreted by Kuzmina and Anthony as Andronovo nomads taking over BMAC. Some archaeologists also call it the Elamite influence in BMAC, which is nonsense.

  82. @Vara

    While I don’t have a very specific homeland, my preference for an early presence in East Iran / Turan comes down to one linguistic and one genetic reason. The linguistic one we mentioned before already: it’s the difficulty of explaining Tocharian with any other model. The genetic one is shown in the post, where any significant migration to India within a PIE time frame c. 4500 BC must be from an eastern area given the low Anatolian admixture in India.

    But this is a very generic idea and not something I would argue strongly for. Still waiting for aDNA to answer some questions before going deeper into the problem.

  83. @Vara

    ““Archaeologically, Tepe Hissar 3C has clear BMAC elements, then it finishes, in the 2nd mill. BCE.”
    It’s the other way around. Hissar IIIB (IIIB: ca. 2400-2170 cal) and IIIC (2170-1900 cal. BCE) elements (grey ware..etc) took over during the later stages of BMAC after 1800BCE.”
    I am not speaking of grey ware, it is commonly said that IIIC has strong presence of BMAC elements, for instance here in Encyclopaedia Iranica (http://www.iranicaonline.org/articles/tepe-hissar): “many connections with Margiana (Marv) and Bactria occur in Hissar IIIC. These include mini-columns, alabaster discs, animal figurines, bidents, tridents, axe-adzes, compartmented copper stamp seals, lanceheads with bent tangs, metal horns, cosmetic bottles, beads with incised circles, etc.”
    Do you think that these elements came from Hissar to BMAC?
    BTW, it is also said there that Hissar IIIB has a building with a fire altar…

    Related to Inamgaon that is often cited for the skeletons, its roots are in Malwa culture that also had fire altars. It had a barley and wheat agriculture, and there were also horses at Inamgaon. I think they were already Indo-Aryan colonists, so the affinity of Inamgaon with Neol. Mehrgarh does not mean they were all Dravidians, although probably in Maharashtra they mixed with proto-Dravidian speakers.

    In Karnataka and Tamil Nadu the first agriculture has millet and no barley and wheat (https://www.britannica.com/place/India/The-end-of-the-Indus-civilization), so it seems really independent, which can explain the formation and spread of Dravidian languages.

  84. Have you guys read the Lech Valley paper? It really looks like those R1b-L51 folks made a special point to completely replace competing male lineages everywhere they went.

  85. I have been troubled by the described 30% or so autosomal impact on Northern South Asia post-1700 BC from the postulated Steppe Migration.

    Is it possible there is a major problem with the modelling? I know I am making a massively controversial statement here, but here are the supporting points:

    1. Y-DNA does not support this massive shift. L657 is not found in Steppe MLBA. The dominant R1a clade in Steppe MLBA is low/negligible in South Asia.

    What is postulated by the AMT is a large autosomal impact with minimal effect on Y-DNA. This is the opposite of what happens in an elite takeover.

    2. A 30% autosomal shift would require a migration of families who then have large numbers of children. Elite takeovers tend to result in men having more offspring with local women and diluting their original autosomal composition.

    Who wants to bring their families over difficult, inhospitable, politically unstable terrain, through mountains and possibly deserts?

    3. Phenotype data. A recent reconstruction of 2 Rakhigarhi skulls showed mostly Caucasoid features, notably a “hawk-shaped, Roman” nose.

    If you guys are knowledgeable about South Asia you will know that this is a very common and in many cases quite extreme feature in Northern South Asia. Without data on how phenotypes, especially those based on multiple genes, like nose shape, are affected by population migrations and mixing, it is difficult to interpret this scientifically. However, it does seem such a ‘sensitive’ phenotype (one which is present to varying degrees) would be more affected by population mixing, and could easily disappear by out-mixing, certainly by out-mixing of 30%. The Habsburg Jaw is a good example of a comparable trait. I don’t think that would survive out-breeding to the extent of 30%.

    “Shriver found that there was a very strong statistical correlation between the amounts of admixture and the facial traits.”

    I was wondering what you guys think about this?

    If we take the autosomal modelling away, do the Y-DNA data support the conclusions of the recent papers?

    If not, why are we placing more emphasis on autosomal modelling instead of Y-DNA, when we are interested in a takeover by elite dominance? The papers use autosomal modelling to push ‘results’ as this promotes their proprietary work. Y-DNA is easy and not innovative.

    Kudos to Frank for mentioning on Eurogenes that L657 had not even been found in Steppe MLBA. Without that comment I wouldn’t even know; people are looking at the wrong things.

    With 100 or so samples from Steppe MLBA, and modern Indian populations, we don’t really need autosomal modelling, which currently seems to require a lot more data and refinement for it to produce uncontentious results.

    Genetics is not a major area of expertise for me so I don’t have good data to hand. What do you guys think: does Y-DNA support a migration of MLBA into South Asia?

  86. @mzp

    There is no doubt that Swat_IA samples pull towards the steppe component; you can see that on the PCA plots. In a 2-way qpAdm of the Indus Periphery pool (all 11 samples) and central_steppe_mlba, the steppe MLBA autosomal component of Swat_IA (85 samples labeled as Iron Age, not including the other 30 historical and medieval samples) is ~22.3% (±1.1%) and not 30%.

    There are a lot of issues with Narasimhan’s modeling. No one has dissected his paper’s modeling thoroughly. I have been doing so for over a week now using qpAdm, trying to reproduce his results. Narasimhan does a bad job by rejecting any other steppe sources apart from MLBA, in my opinion.

    For the above Swat_IA = IndusPeriphery + central_steppe_mlba qpAdm model,
    the p-value is too low (with allsnps = YES the p-value is 0.001198, without allsnps it is 0.00017). As per Narasimhan’s supplement (pdf page 283, Fig S50) his p-value is 0.006, so I’m guessing the difference comes from the right outgroups we each chose. I can’t seem to find the ones he used in his model.
    Regardless, this p-value would normally be rejected (usually >0.05 is accepted, or >0.01 if one is pushing it). But Narasimhan lowers it to >0.005 just so as to accept his favourite model.

    There is another issue with his modeling. The 11 Indus Periphery samples are hardly representative. They don’t even cluster together neatly enough to choose the steppe source properly, i.e. choosing a subset of those 11 InPe samples as source can easily make the model accept Molaly_LBA as steppe source over MLBA. Given that no one knows what Swat ancestry was exactly like prior to steppe folk arriving, his conclusion that only MLBA is a possible steppe source is very premature. He just doesn’t have enough Indus or Swat samples to make this conclusion.

    E.g. Indus5 (5 samples) + Molaly_LBA is accepted for Swat_IA with p-value = 0.04 and coefficients (65% ± 3.6%, 35% ± 6%), whereas central_steppe_mlba instead of Molaly is rejected with p-value 0.0000007.

    This is also supported by Vahaduo Global25. Swat_IA samples choose Han over Onge or even Matt’s 100AHG pure AASI component, i.e. there is some LBA ancestry involved which has an East Asian component. This might also explain the 1 Q1a found in the Swat valley.

    Most likely, both MLBA & LBA were involved in the migration; however, different groups need to be modeled differently, unlike what Narasimhan has done.
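
    To make the threshold argument in this comment easier to follow, here is a small sketch that checks the p-values quoted above against the three cut-offs mentioned (0.05, 0.01 and 0.005). It does not run qpAdm; the p-values are copied from the comment, and the model labels and helper names (`passes`, `verdicts`) are descriptive placeholders chosen for this illustration.

```python
# Cut-offs mentioned in the comment: the usual one, a lenient one,
# and the one said to be used in the paper.
THRESHOLDS = (0.05, 0.01, 0.005)

# p-values as quoted in the comment (labels are illustrative only).
models = {
    "InPe(11) + central_steppe_mlba, allsnps YES": 0.001198,
    "InPe(11) + central_steppe_mlba, without allsnps": 0.00017,
    "same model, p-value quoted from the supplement": 0.006,
    "Indus5 + Molaly_LBA": 0.04,
    "Indus5 + central_steppe_mlba": 0.0000007,
}


def passes(p_value, threshold):
    """A model is not rejected when its p-value exceeds the threshold."""
    return p_value > threshold


def verdicts(model_p_values, thresholds=THRESHOLDS):
    """Return, per model, whether it passes at each threshold."""
    return {
        label: {t: passes(p, t) for t in thresholds}
        for label, p in model_p_values.items()
    }


for label, result in verdicts(models).items():
    summary = ", ".join(
        f">{t}: {'pass' if ok else 'reject'}" for t, ok in result.items()
    )
    print(f"{label} (p={models[label]}): {summary}")
```

    Laid out this way, the point of the comment is visible at a glance: the supplement's 0.006 passes only at the 0.005 cut-off, while the Indus5 + Molaly_LBA model passes at 0.01 but not at 0.05.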

  87. Can someone help me understand this from the Narasimhan supplement?

    “Using previously reported calls on 1000 Genomes Project Y chromosomes (223), we observe that 62 out of the 221 South Asian males have an R1a Y chromosome corresponding to a ninety-five percent binomial confidence interval of 22-34% for Steppe MLBA ancestry on the entirely male line, which is significantly higher than the ninety-five percent confidence interval of 9-14% on the autosomes in the same set of individuals. These results shows the process of admixture of Central_Steppe_MLBA into the ancestors of the ANI was male biased, and reveal that the directionality of sex bias was opposite to the pattern observed for the contribution of Central_Steppe_MLBA to SPGT.”

    Isn’t this circular reasoning? Is it not possible that R1a L657 (wherever its origin is) expanded in India much later from a few founders, after female-mediated steppe ancestry had already come in? E.g. the Mauryan expansion after 500 BCE (which would explain how R1a reached Sri Lanka).

    Apart from the minor presence of R1a in Swat, the two outlier samples from Swat with the highest steppe ancestry (~50%, Loebanr_IA_o and Udegram_IA_o) both have it mediated through steppe females. The first is a male with Y haplogroup E1; the other is a female with steppe mtDNA T1a1.

  88. Found something that will put an end to the steppe Indo-Aryan hypothesis.

    Here’s the archaeological context for Bustan BA (1600-1300 BC), from the metadata in the Narasimhan supplement:
    “Archaeological investigations at Bustan Burial Mound have revealed a complex funerary ritual related to the usage of fire. On top of the graves there were piled rocks, showing the influence of Steppe traditions. There were inhumation as well as cremation burials. There was a dedicated chamber for cremation of bodies at Bustan, including multi-usage hearths and altars. The altars were functionally classified into ones used for libations, ones used for meals, and ones used for sacrifices. The funerary rite documented at Bustan, specifically in relation to the role of fire, is not known at this time from any other site Iran, South Asia, or the Central Eurasian Steppes.”

    More details available here http://www.archeo.ru/izdaniya-1/archaeological-news/annotations-of-issues/arheologicheskie-vesti.-spb-1995.-vyp.-4.-annotacii

    “Three bonfires were made for each cremation act. Their traces were found at the level of buried soil south, west, and east of the incinerators (figs. 1; 2: B). These finds are closely paralleled by the Vedic texts, where cremation, described as an offering to the sacred fire carrying the body to heaven, is said to be made in three open fires (Rigveda X, 16, 18; Atharvaveda XVIII, 2, 7; Asvalayana-grihyasutra IV, 1, 2).”
    These are late Vedic practices.

    Of course, Bustan BA has no trace of steppe ancestry (not even in the outliers, except one which has elevated steppe as well as IVC ancestry).

    Before the genetic data, this was explained by assuming that the site was overrun by incoming Aryans. But now you have Aryan culture with zero Steppe MLBA or LBA genetics.

    The dominant Y haplogroup here is J2a (also dominant in Brahmins and, more specifically, modern Zoroastrians).

  89. @AK
    Yes, I noticed this when the preprint came out. And from the Harappan period we also have similar data from sites like Kalibangan, Banawali and Lothal.
    But I have seen Narasimhan argue that the cultural aspects were of local origin, while the language was brought by the steppe migrations 😉.

  90. Hi, Alberto, Matt and A

    I’m curious about what you guys think about Dzudzuana ancestry in Iran_N and CHG as suggested by Lazaridis…

    “Iran_N/CHG are seen as descendants of populations that existed in the Villabruna→Basal Eurasian cline alluded to above, but with extra Basal Eurasian ancestry (compared to Dzudzuana), and also with ENA/ANE ancestry. ”

    “CHG/Iran_N were Dzudzuana+Basal Eurasian (or, equivalently Villabruna+Basal Eurasian) derived populations also modified by ENA/ANE admixture.”

    How would you guys interpret this in terms of Iran_N, as well as Iran_N ancestry in South Asia?

  91. @AK, L657 is from South Asia and is also well spread across the Persian Gulf and Arabia. But non-L657 R1a in India is not negligible, and it is also quite well spread with no notable structure.

    “Using previously reported calls on 1000 Genomes Project Y chromosomes (223), we observe that 62 out of the 221 South Asian males have an R1a Y chromosome corresponding to a ninety-five percent binomial confidence interval of 22-34% for Steppe MLBA ancestry on the entirely male line, which is significantly higher than the ninety-five percent confidence interval of 9-14% on the autosomes in the same set of individuals”

    62/221 ≈ 28% is interpreted as 22-34%, versus 9-14% on the autosomes.

    Is this a valid comparison? It seems dubious: a coarse SNP call versus an autosomal component. Is there a way of doing component analysis on just the Y chromosome, or is it too small for good statistics? Can the experts weigh in, please?

    If we assume half of those R1a are L657, which radiates out from Nepal (as per Anthrogenica), then we are left with an MLBA signal with a fairly gender-neutral distribution.
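
    The arithmetic behind this can be checked with a short Python sketch. It uses the Wilson score interval, which is my choice for a self-contained example and not necessarily the method used in the supplement. The counts 62 and 221 and the 9-14% autosomal range come from the quote; the count of 31 is simply the "half are L657" assumption above, and the helper names are hypothetical:

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


AUTOSOMAL = (0.09, 0.14)                # autosomal interval quoted from the supplement
all_r1a = wilson_interval(62, 221)      # every R1a call counted as Steppe MLBA
half_r1a = wilson_interval(31, 221)     # assumption: half of the R1a calls are L657

for label, interval in (("all R1a", all_r1a), ("half of R1a", half_r1a)):
    low, high = interval
    print(f"{label}: {low:.1%} to {high:.1%}, "
          f"overlaps autosomal range: {overlaps(interval, AUTOSOMAL)}")
```

    Under these assumptions the full count gives roughly 22.5% to 34.3%, in line with the quoted 22-34%, while the halved count gives roughly 10% to 19%, which overlaps the autosomal range.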

    Also, Swat MLBA is more female-mediated, and males come later during the Iron Age and historic period. It’s a typical pattern where females diffuse first, versus males who are patrilocal.
