West Iranian vs. East Iranian ancestry (with Vahaduo’s tool tutorial)

As you all know, two aDNA papers covering Central Asia to North India have been released recently. I didn’t dedicate a post to them (though there are comments about them in the previous thread). The first one (The formation of human populations in South and Central Asia, Narasimhan et al. 2019) had already been extensively commented on when the preprint came out, and while it did bring more samples, these mostly add quantity to already sampled populations, with few new ones (and none relevant enough to deserve a new post). The second one (An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers, Shinde et al. 2019) finally brought the first ancient sample from within modern India, but it was a single low-quality one that didn’t add much to the better quality “Indus periphery” samples already present in the former paper.

However, there’s still a bit of confusion regarding the ancestry of the people of the Indus Valley (and, more generally, the genetic structure of SC Asian populations), so here I’ll try to give some insights that might help to clarify the situation for further, better informed analysis.

The basic premise here would be to split Iranian ancestry into West and East Iranian. The main difference would be the ratio of Basal Eurasian to ANE ancestry (higher in the west, lower in the east), but given the lack of Mesolithic samples we’re still unable to get the whole picture. However, some basic concepts can still help us to better understand the situation. So let’s start.

Vahaduo’s online modelling tool

I’ll also use this post to introduce a recently released online tool that deserves more attention, given its quality and usefulness. It was written by Vahaduo with a similar purpose to my own Xmix, but it’s more complete, faster and doesn’t require any local installation. I’ll show how to use it so that any reader can try their own models and test for themselves whatever they are interested in.

The first (and only) thing you’ll need is to get some datasheets that are valid to use with Vahaduo’s program. The best (and recommended) ones are the Global 25 scaled datasheets from Eurogenes. One has all the individual samples and the other the averages of each population. Once you have these, you can proceed to the site and start testing. Here we’ll go directly to test what I mentioned above: East vs. West Iranian ancestry.

For West Iranian ancestry, I’ll use the average of the Early Neolithic samples from the Ganj Dareh site in the Zagros mountains. And for East Iranian I’ll use the average of the easternmost samples we have so far: Sarazm_Eneolithic. So I’ll need to copy the coordinates of these into the “SOURCE” tab (one per line):

Now, two sources will probably not be enough to test the samples from SC Asia and Indus periphery, since there are more streams of ancestry in them (at least one related to ANF and the other to AASI). So I’ll go ahead and add the average of Barcin_N samples and the average of modern Onge and Naxi populations.
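If you’d rather build your own population averages than copy them from the averages sheet, the idea can be sketched in a few lines of Python. This is only an illustration under some assumptions: each row is taken to have the form label,coordinate,coordinate,... with individuals labelled Population:SampleID, and the PopA/PopB rows below are made-up three-column examples (real Global 25 rows have 25 coordinates).

```python
from collections import defaultdict


def parse_rows(text):
    """Parse 'label,c1,c2,...' lines into (label, [floats]) pairs."""
    rows = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        try:
            coords = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # skip a header row such as ',PC1,PC2,...'
        if coords:
            rows.append((parts[0], coords))
    return rows


def population_of(label):
    """Assumed convention: 'Population:SampleID' (no colon = already a population)."""
    return label.split(":", 1)[0]


def average_by_population(rows):
    """Average the coordinates of all individuals sharing a population label."""
    groups = defaultdict(list)
    for label, coords in rows:
        groups[population_of(label)].append(coords)
    return {
        pop: [sum(column) / len(members) for column in zip(*members)]
        for pop, members in groups.items()
    }


# Made-up toy rows, NOT real Global 25 coordinates.
sheet = """
,PC1,PC2,PC3
PopA:ind1,0.10,0.20,0.30
PopA:ind2,0.30,0.40,0.50
PopB:ind1,-0.10,0.00,0.10
"""
averages = average_by_population(parse_rows(sheet))
for pop, coords in averages.items():
    print(pop + "," + ",".join(f"{c:.4f}" for c in coords))
```

The printed lines have the same label-plus-coordinates shape as the input rows, one population per line.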

Then for the targets, I’ll use individuals instead. In this case I’ll start with the “Indus periphery” samples, which are labelled in the datasheets as IRN_Shahr_I_Sokhta_BA2 and TKM_Gonur2_BA, so again one per line I copy and paste them in the “TARGET” tab:

And now we’re ready to run the program and get the results. Since we’ve added multiple target samples, we should go to the “MULTI” tab and click on the “RUN” button, which will show us this:

As you see, Naxi doesn’t appear in the results, and that’s because all the samples got 0% ancestry from it. If we wanted to see all the sources in the output, we’d just have to click on the “PRINT ZEROES – NO” button (which would change to “PRINT ZEROES – YES”) and click “RUN” again. The “AGGREGATE – YES” button aggregates the percentages of multiple sources with the same label (for example, if instead of the average of Ganj_Dareh_N we had used all the individuals as sources, we could choose either to see the results with each individual specified or to aggregate them into a single column with their sum).
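For readers curious about what a run like this does in principle, here is a rough Python sketch. It is not Vahaduo’s code, and I’m not claiming it matches his algorithm: it simply looks for non-negative source weights that sum to 100% and bring the weighted average of the sources as close as possible (Euclidean distance) to the target, by repeatedly shifting weight between pairs of sources. The function names, the print_zeroes flag and the toy three-dimensional coordinates are all made up for illustration.

```python
def fit_mixture(target, sources, sweeps=200):
    """Weights (>= 0, summing to 1) minimising the distance between the
    weighted average of `sources` and `target`. Illustrative stand-in only."""
    n = len(sources)
    dims = range(len(target))
    weights = [1.0 / n] * n
    mix = [sum(w * s[k] for w, s in zip(weights, sources)) for k in dims]
    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                # Direction that moves weight from source j to source i.
                d = [sources[i][k] - sources[j][k] for k in dims]
                dd = sum(x * x for x in d)
                if dd == 0.0:
                    continue  # identical sources, nothing to gain
                residual = [mix[k] - target[k] for k in dims]
                step = -sum(r * x for r, x in zip(residual, d)) / dd
                # Clip so that neither weight becomes negative.
                step = max(-weights[i], min(weights[j], step))
                weights[i] += step
                weights[j] -= step
                mix = [mix[k] + step * d[k] for k in dims]
    distance = sum((mix[k] - target[k]) ** 2 for k in dims) ** 0.5
    return weights, distance


def aggregate(labels, weights):
    """Sum the weights of sources sharing the part of the label before ':'."""
    totals = {}
    for label, w in zip(labels, weights):
        pop = label.split(":", 1)[0]
        totals[pop] = totals.get(pop, 0.0) + w
    return totals


# Made-up toy sources in three dimensions, NOT real coordinates.
labels = ["West:a", "West:b", "East", "Other"]
sources = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [0.2, 0.3, 0.0]

weights, distance = fit_mixture(target, sources)
aggregated = aggregate(labels, weights)
print_zeroes = False  # mimics the idea of the "PRINT ZEROES" switch
for pop, w in aggregated.items():
    if print_zeroes or round(100 * w, 1) > 0:
        print(f"{100 * w:.1f} {pop}")
```

With the flag off, a source that gets no weight is simply left out of the printout, which is the same reason Naxi is missing from the results above.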

Then we can download a .CSV file to import into a spreadsheet and make further calculations if needed (or to share, using Google Docs for example). The “DISTANCE” tab is also useful to calculate the distance from a sample to all the sources (you could, for example, copy the whole datasheet, being careful not to copy the first row with the PCA labels) and get the top 25 closest samples/populations.
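The distance ranking is easy to reproduce offline too. A minimal sketch, assuming plain Euclidean distance between coordinate rows (the population names and values here are made up, and the real tool may differ in details):

```python
import math


def closest(target, candidates, top=25):
    """Return the `top` (distance, label) pairs nearest to `target`."""
    ranked = sorted(
        (math.dist(target, coords), label) for label, coords in candidates.items()
    )
    return ranked[:top]


# Made-up toy coordinates, NOT real datasheet rows.
candidates = {
    "PopA": [0.0, 0.0, 0.0],
    "PopB": [0.1, 0.0, 0.0],
    "PopC": [0.0, 0.3, 0.4],
}
target = [0.09, 0.0, 0.0]

for dist, label in closest(target, candidates):
    print(f"{dist:.8f} {label}")
```

Each printed line is a distance followed by a label, closest first.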

It just takes some minutes to get familiar with the program and the options so go ahead and try it. It’s definitely a very useful tool.

Some insights into SC Asian and Indus Valley ancestry

So let’s start with what we see in the above model of the Indus periphery samples. Leaving aside (for now) the fact that they may have some recent admixture from the places where they were found, one striking thing is the very variable ratio of West to East Iranian ancestry. In the following spreadsheet the above results can be seen (Sheet 1) together with a second run with the 100AHG simulation provided by Matt in the previous thread (Sheet 2); both sheets include the calculated ratio of West to East Iranian ancestry. It’s easy to see that there is no correlation between that ratio and the amount of AASI in each sample, which makes it irrelevant for this matter whether they have any admixture from the local populations or not. Either way, we’re seeing a diverse population not just in terms of AASI to West Eurasian, but also in the more subtle, but still important, West to East Iranian ancestry.
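Anyone who downloads the .CSV can check the ratio and the (lack of) correlation themselves. The sketch below shows one way to do it with a plain Pearson coefficient. The four rows are invented placeholders, not values from my spreadsheet, so the number they produce means nothing by itself.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists.
    Undefined (division by zero) if either list has no variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# (label, AASI %, West Iranian %, East Iranian %): invented placeholder rows.
rows = [
    ("sample_1", 10.0, 40.0, 50.0),
    ("sample_2", 20.0, 20.0, 60.0),
    ("sample_3", 30.0, 45.0, 25.0),
    ("sample_4", 40.0, 30.0, 30.0),
]

# The ratio is undefined when East Iranian is 0%, so such rows are skipped.
usable = [row for row in rows if row[3] > 0]
aasi = [row[1] for row in usable]
ratios = [row[2] / row[3] for row in usable]

corr = pearson(aasi, ratios)
print(f"West/East ratios: {[round(x, 2) for x in ratios]}")
print(f"Correlation between AASI and West/East ratio: {corr:.2f}")
```

A coefficient near 0 would agree with what I describe above, while values near 1 or -1 would contradict it. With only a handful of samples, though, any such coefficient should be taken as a rough indication at best.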

This pattern of significantly different ratios of West and East Iranian ancestry is equally seen in the regular Shahr-I-Sokhta BA samples (Sheet 3) and in the Turan Eneolithic samples (Sheet 4). The Iranian-like ancestry in the Indus periphery samples is therefore very similar to the one in those places. But they’re not all homogeneous, and point to mixed populations with probable input from West Iran. Modelling the samples with more proximate sources and using the 100AHG simulation again, it looks like this:

And using the sample with the highest amount of AASI as a source instead of the 100AHG simulation, something like this:

So what does this mean? First, that things are a bit more complicated than taking the average of a population and building a tree that estimates its divergence time from another one, under the assumption that there is no admixture between them just because they’re not the same. In simpler words, we can’t really know with certainty whether there was some migration from the Zagros Neolithic to North India or none. Both options are possible. What we can say, though, is that this is a significantly different case from the Neolithic transition in Europe, since in neither case can there have been a large replacement by outside farmers.

All of this opens some interesting questions regarding the genetic history of South Asia. Unfortunately, we don’t have the data to give any answer to those questions, but it’s still worth knowing them and the different possible answers. For example:

  • Who were the Mesolithic hunter-gatherers from North India?
  • Who were the first farmers?
  • Was there any subsequent migration before the Bronze Age?

Let’s break up the genetic structure of (putative) IVC samples into the 3 main streams of ancestry:

  1. AASI
  2. West Iranian
  3. East Iranian

This does not mean necessarily three different populations. Two or more of these ancestries could have been already mixed since very early. But let’s examine the possibilities:

First, a basic look at the geography of India tells us that there are no major barriers within it, compared to the big barriers with the outside. This makes less likely the possibility of two extremely different populations living in South and North India during the Mesolithic, with the one in North India being almost identical to the ones outside (Iran and Turan). It could be (only aDNA can tell us), but it looks like the least parsimonious option.

Together with the diversity in the ratios of those 3 streams of ancestries, it’s really unlikely that we could be talking about an isolated India-specific population. We have to think in terms of some degree of migration to India from outside before the Bronze Age.

The possibilities about who was where at each point in time are many, and I won’t argue for any of them. It’s speculative at this point. But as possible examples:

We could have an AASI-rich population, but with significant East Iranian ancestry too, during the Mesolithic. Then we could have a moderate migration from the Zagros Neolithic and no more migrations up to the IVC period from which we have samples. This would be somewhat similar to the Neolithic transition in Turan, where presumably a mostly East Iranian population was there in the Mesolithic and received some migration from West Iran during the Neolithic transition. The difference (apart from the lack of AASI ancestry in Turan) is that communication between West Iran and Turan is easier, and gene flow continued (both ways) throughout the Chalcolithic and Bronze Age.

The problem with this scenario is how to explain the presumed differences in levels of AASI in the IVC and their lack of correlation with the East Iranian ancestry that would have been associated with it.

Scenarios where we separate the three streams of ancestry could better explain the situation, though given that the Turan Chalcolithic already had a diversity in East and West Iranian ancestry ratios, a single migration from there could work too (note that neither West Iran Chalcolithic nor Turan Bronze Age would fit well as admixing sources due to their excess of ANF-related ancestry). I’ll leave any further variations within these constraints to the comments.

The Steppe ancestry in Turan and North India

This subject has already been discussed everywhere in great detail, for a very long time. So I didn’t plan to look at it again. I don’t have much more to say, but I’ll go through it fast.

The post-BMAC samples that we have hardly show any steppe admixture. In the same spreadsheet linked above (Sheet 5), I’ve added the samples with an average date in calBP of <3700 years in descending order (note that the Present is defined as 1950 CE, so you’d need to add 69 years to get the real BP as of today). There’s one Parkhai_LBA_outlier (1497-1413 calBCE) that shows 9.2% Sintashta_MLBA admixture. The rest, until the last BA samples (3250 BP), are within noise levels. It’s only the single Iron Age sample from Turkmenistan (912-799 calBCE) that has a big increase to 50%.
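Since the dates above mix calBP and calBCE, here is a small Python sketch of the conversions. The helper names are mine, and I’m assuming the usual conventions: 0 calBP is 1950 CE, and there is no year 0 between 1 BCE and 1 CE.

```python
def calbp_to_calendar(bp):
    """Convert calBP (years before 1950 CE) to a (year, era) pair."""
    year = 1950 - bp
    if year > 0:
        return year, "CE"
    return 1 - year, "BCE"  # no year 0: astronomical year 0 is 1 BCE


def calbce_to_calbp(bce):
    """Convert a calBCE year to calBP."""
    return bce + 1949


def years_before(bp, today=2019):
    """Years before `today` (2019 gives the 69-year offset mentioned above)."""
    return bp + (today - 1950)


print(calbp_to_calendar(3700))  # the 3700 calBP cutoff
print(calbp_to_calendar(3250))  # the last BA samples
print(calbce_to_calbp(1497), calbce_to_calbp(1413))  # the Parkhai outlier range
print(years_before(3700))
```

The one-year shift in the BCE branch rarely matters at this resolution, but it keeps the two scales consistent when converting back and forth.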

In the Swat Valley, we have the earliest samples from the period 1200-800 BCE. They have significantly more steppe admixture, ranging between 20% and 0%, with an average of around 10%. The variability in the amount of steppe ancestry doesn’t seem very compatible with their estimate of admixture happening 26 generations earlier in that same place, in that same population. But the shortcomings of the observations they offer as evidence for the arrival of steppe ancestry in South Asia in the first half of the second millennium BCE should already have been evident, even without looking at individual variability that goes down to 0%.

Another of the inferences supporting such evidence was their observation that after the MLBA the steppe got Siberian/East Asian admixture, which is not found in modern India. However, modern populations can be modelled using the Kangju samples from Kazakhstan (II-V CE). Modern samples are never a good way to make inferences about prehistory (including modern frequencies of certain uniparental markers). It seems rather arbitrary why populations would choose Sintashta or Kangju (though maybe Kshatriya ones make sense),

or indeed why would they choose Sintashta_MLBA or the Kashkarchi_BA samples from 1200-1000 BCE which are almost identical,

or even adding the Turkmenistan_IA sample mentioned above too as a source (as suggested in the comments from the previous thread), which further splits the steppe ancestry in a relatively random way.

Overall, there’s not much to add about this steppe part. We’ll have to wait to see samples from the first half of the second millennium BC from around the Punjab before we can know with certainty how all this went.

164 thoughts on “West Iranian vs. East Iranian ancestry (with Vahaduo’s tool tutorial)”

  1. This is a summary of Choubey’s talk:
    1. R* is rooted in the Himalaya, but is also found in North, South, East and Central India.
    2. Indian branch is exclusively Indian.
    3. Gradient is the opposite of the migration hypothesis: it goes from east (Bihar) to west.
    4. Gangetic plain has highest diversity of M780.
    5. South India has highest frequency of M780.
    6. M780 is surely not from Steppe.
    7. Full continuity from the origin to the modern day, spanning 20 ky, without a break.
    8. They are pointing out the flaw in Silva et al.’s paper based on ancient DNA from David Reich’s lab: they do not sequence the full genome, they only do capture sequencing. If any one mutation is missing, then the whole tree goes haywire.
    9. European branch is a cousin branch which split 6 to 10 kya. The common ancestor is not known.

    Now if you add to this information the fact that Narasimhan et al.’s study already states that the AASI admixture happened at around 6kya, then this split predates that mix; ergo, if it went out of India, it went only with a West Eurasian type of ancestry.
    There is good evidence to suggest that the West Eurasian ancestry in India had a deep presence, at least 10 ky without admixing with AASI. This post summarises that evidence well.
    http://t-o-i-h.blogspot.com/?m=1

    If it went out, there are two directions it would go: one towards Western Iran/Eastern Anatolia (some evidence in the above link, also tweeted by https://twitter.com/NirajRai3/status/1169686739251654656?s=19), and another route east of the Caspian Sea, mixing with populations like Khvalynsk, which does see an increase in Iran-like admix and is also one of the oldest sites on the Steppe to show R1a.

    We also now have a paper suggesting a local origin of farming in India, which many, including Mallory, have stated is a more important feature of PIE than speculation about the wheel, horse, etc. So that part is also covered.

    As for the affinities of Balto-Slavic with Indo-Iranian, they are purely due to proximity and to later migrations of Indo-Iranized Scythians/Sarmatians, who were absorbed wholesale.

    Somebody mentioned that Paul Heggarty has an earlier dating for the Indo-Iranian split, and perhaps for the entire tree, based on his phylogenetic analysis. Does someone have an idea what that is?

  2. I think my earlier question might be relevant with regard to Bhikshu’s comments: what’s the east Eurasian ancestry in Iran_N, and where did it come from?

    Andamanese people are supposed to have become isolated before the Mesolithic, and their Iran_N affinity has been estimated to be about 30%.

  3. @Bhikshu

    About Chaubey’s presentation, I’ll still need to see the paper before having a more informed opinion. The last point about an earlier split is the only really relevant one in this context.

    Leaving R1a aside, the article you linked to proposes one of the possible scenarios I wrote in the post, but it does not attempt to address the problems about it. So let’s see:

    – We have an east-Iranian like population in the north from 20-15kya. And we have an AASI population from at least as long somewhere else (presumably in the south?). Then Neolithic development starts in India (let’s say as an autochthonous development) within that East Iranian-like population from the north, and continues for a couple of millennia until the Chalcolithic.

    – At this point there is a large migration out of India that brings a language shift to West and Central Asia and eventually to Europe. But somehow all the haplogroups that are from India don’t expand outside with this migration (the evidence of long split times between Indian and out of India haplogroups works in both ways: if it prevents people to have moved in, it also prevents people to have moved out). Unless that population was 100% R1a and that haplogroup was the only one that went out of India. But it disappeared fast from West and Central Asia, since it’s not found there in the Chalcolithic.

    – Meanwhile, the AASI population from the south, who were still hunter-gatherers decided that after 10-15k years without moving it was about time they moved, so they went north and mixed big time with an advanced and settled agricultural population. Apparently they replaced the R1a too with their own haplogroups from the south, because all the samples we have from South Asia lack R1a until very late (so somehow it rebounded again after almost extinction to achieve modern levels).

    All this, as you see, is very problematic. If one is able to see the problems with the steppe hypothesis, he/she should be able to see the problems with any other one too and keep the same level of critical thinking.

  4. Regarding Haplogroups moving out, R2 was found in Iranian Neolithic, and most likely expanded from India, as R2* is found in India. Haplogroup L is another one that is shared with West Asian populations, these along with R1a do give some indications.

    Why AASI was restricted only to the South or East(?), we can’t say with certainty; we would need more sampling from India. We hardly have anything from areas where R1a dominates. Though in Sanskrit literature there is a mention of a sage Agastya (probably with his clan) moving south to have a spiritual balance, and the Vindhya mountains bowing to let him cross to those lands. So even though it is a semi-mythological account, it does tell of a deep memory among the people of how the spread or interaction happened, with a different kind of northern population making contact with a southern one, for spiritual/religious reasons. Also, the same argument could be made about Eastern Iran and Northern India: why would they (the West Eurasian population) be restricted only to Eastern Iran? There is no reason for that, and we also know that CHG replaced Dzudzuana on this type of ancestry’s western edge, so the origin is most likely further east.

    I am not claiming that we have all the answers, but none of it is out of the realm of possibility, especially given the backdrop of a weak and failing steppe hypothesis.

  5. @alberto
    I have no special OIT hypothesis, and have no intention of proving the genesis of all languages from India as such. If others want to understand the origin of their language, the onus is on them to do it. But we wish they wouldn’t include Sanskrit and Vedic culture in their 150-year-long quest, akin to finding El Dorado.

    There is a lot of work already done which firmly pushes the date of the Vedic period to before 2000 BCE. This includes the drying of the Saraswati, the finding of fire altars at various locations, the Sanauli warrior culture and buried chariots, independent dating of astronomical details in Vedic-era literature, etc.
    I’m quite convinced that the Indo-Iranian loanwords in Mitanni came from NW South Asia. I agree with Talageri’s claim that the loanwords are characteristic of the latest of the RV mandalas. We now have 2 separate papers confirming that zebu, water buffalos and Asian elephants from the IVC area reached Syria and the Near East after 2000 BCE through human agency. You also see IVC migrants in SiS first, and then Gonur (2000 BCE), in the general direction of the Near East.

    That makes the steppe ancestry moot, which anyway is female mediated (with the data we have; I will change my opinion if we find R1a-rich samples with high steppe ancestry), too little, too slow. Scythians, Kushanas, Huns, Greeks, Parsis all came to India and mixed, but none of them could even keep their own language, let alone impose it. Why should we accept that Sintashta Steppe could do that? We don’t even know what languages they spoke. All we know is that they had horses and chariots in their own homeland (no archaeological proof that they brought them on their way into NW India).

    Now, if theres some evidence as to how vedic culture entered india pre 2000bc, then i’m game for that.

    As for Chaubey’s new R1a paper, it is irrelevant to the Aryan question imo. It might help in understanding later population movements. R1a does not dominate, and is not important to Zoroastrian/parsi Y haplogroups, even in their priestly caste.

  6. I think the major obstacle to OIT or Out-of-Deep-Iran/Turan is this, from Damgaard 2018:

    >PCA (Fig. 2B) indicates that all the Anatolian genome sequences from the Early Bronze Age (~2200 BCE) and Late Bronze Age (~1600 BCE) cluster with a previously sequenced Copper Age (~3900 to 3700 BCE) individual from Northwestern Anatolia and lie between Anatolian Neolithic (Anatolia_N) samples and CHG samples but not between Anatolia_N and EHG samples. A test of the form D(CHG, Mbuti; Anatolia_EBA, Anatolia_N) shows that these individuals share more alleles with CHG than Neolithic Anatolians do (Z = 3.95), and we are not able to reject a two-population qpAdm model in which these groups derive ~60% of their ancestry from Anatolian farmers and ~40% from CHG-related ancestry (P = 0.5). This signal is not driven by Neolithic Iranian ancestry, because the result of a similar test of the form D(Iran_N, Mbuti; Anatolia_EBA, Anatolia_N) does not deviate from zero (Z = 1.02).

    I do not believe the ‘tracer dye’ hypothesis that Reich and colleagues came up with, but since all extant IE languages split after Anatolian, this finding requires an explanation, and that would put the onus on those who propose OIT. Do samples from Turan provide a better fit for those Anatolians?

    EDIT: Namazga_CA plots closer to CHG than Iran_N does, so these might be tried as a source for the eastern ancestry in Anatolia.

  7. Alberto,

    If we are looking for the possible uniparental markers that could have accompanied an Out of India migration the following may be suggested besides R1a –

    R2 – present in Neolithic Iran and is also present in modern Central Asians including the Uighurs.

    L1a – present in the Chalcolithic period in the Caucasus and also in Bronze Age Central Asia.

    Q1a/Q1b clades – some of these have shown in the more recent samples from the steppe and a recent paper on Q showed some of these lineages are perhaps rooted in South Asia.

    J2b2 – This is shared with the Aegean and could have had a spread from an Eastern origin.

    J2a – Perhaps from Central Asia into Anatolia & the Aegean ?

    R1b Z2103 – also from Central Asia where it has a sizeable presence in some native groups ?

    Among the maternal lines we have,

    M52 in a Maykop sample.

    M5a and U7 in Tarim Basin samples circa 2000 BC.

    Perhaps some W clades ?

    Besides, we have Zebu admixture all across the Near East and also into the Podolian cattle which are considered an ancient breed of steppe cattle and also includes the Ukrainian Grey Steppe.

    We also have Elephants and Water Buffalos brought into the Near East.

    We may also recall the corded ware dog genome which showed Indian/Iranian dog & wolf admixture.

    All these if looked at collectively can open a reasonable avenue of inquiry.

    ———////——-

    As for why AASI HGs migrated North, maybe they did so after making a transition to farming ? Remember that the Eastern Gangetic plain was an independent center of rice domestication as early as 8-9 kya bp. This rice cultivation reaches the Harappans after 3500 BC as known from sites in Haryana.

  8. Yes, this is more constructive. I do understand some level of frustration when it’s basically been Western scholars who have been researching the PIE question and proposing theories that were at odds with Indian history. But that was then and this is now, and now is the time to find out the reality. For everyone.

    @A, Sanskrit is fundamental in any IE research. Many people, including from India, are interested in the origin of IE languages. You probably are interested too. So it’s not about leaving Sanskrit out of the research unless someone doesn’t want to know about its origin.

    I agree that there is quite some evidence that pushes Vedic to a period where it’s probably incompatible with the steppe hypothesis. I hope to have some guest post(s) that will explain some of the textual information available better than I could do.

    I don’t think that Vedic culture could possibly have developed out of India. Why would it? The Vedas were composed in India, by people who considered themselves natives to that place (except from some distant past, semi-mythical I guess, that talks about a place in the north with days that last 6 months or whatever).

    But the language that would become Sanskrit could have come perfectly from outside India (because it belongs to the same family as many others from all across West Eurasia). I guess one should try to understand this as deep prehistory, somehow like for any Spanish person (I’m Spanish) it’s very clear that our language came from the Latium and with it a great cultural impact. We know this and we are happy about it. Where was the ancestor of Latin 4000 years earlier is irrelevant for the Spanish people/culture. It’s just a thing from prehistory that’s interesting, but that’s it.

    @Marko, I don’t remember where were those stats with Anatolia_EBA. Damgaard et al. had a lot of samples, but in the analysis they mostly concentrated on a subset of them from the steppe.

    I’d like to check if they had some Turan samples there, because I don’t think the evidence is against it. In fact, somewhere in North Iran, with an early entrance into Turan (and probably to India) seems to me the only alternative that is surviving all the incoming data.

    https://ibb.co/D9GMfYF

    And you can’t explain Mycenaeans without a good amount of Kura-Araxes type of ancestry.

  9. @marko that seems incorrect. Namazga_En plots farthest from CHG. Indian Iran like component is closest.

    Distance to: TKM_Namazga_Tepe_En
    0.05958605 0AHG
    0.08360807 IRN_Ganj_Dareh_N
    0.13845439 GEO_CHG

    Target: TKM_Namazga_Tepe_En
    Distance: 2.9421% / 0.02942091
    Aggregated
    37.6 IndianIranian0AHG
    32.4 IRN_Ganj_Dareh_N
    16.6 CHG
    8.2 Anatolia_N
    5.2 WSHG

    The presence of J2a1 in Mycenaeans, Minoans and the Anatolian Bronze Age needs some investigation imo. That imo is also an IE marker. J2 specifically, and also L, dominate modern Zoroastrians.

  10. @Jaydeep

    Yes, that’s a good collection of peripheral evidence to support an out of India scenario. But we need more solid evidence. If we get Chalcolithic samples from North India and they lack any AASI admixture then that opens up much better possibilities. Though we’d need to know that these were there since the Neolithic and not recent incomers from Turan.

    I still find it difficult to explain that an AASI population was from the Mesolithic (or probably Paleolithic) in the Gangetic plain while an eastern Iranian was to the west around the Indus and they stayed isolated for thousands of years. It could be, but is there any good reason to think that could have been the case?

  11. @A

    I was referring to the two-dimensional PCA, the proximity might be a result of complex admixtures of course. The aim should be to confine the eastern ancestry that enters Anatolia in the Copper Age. I suspect the reason Iran_N isn’t a good proxy might be its inflated Basal Eurasian ancestry, pulling it away from the northern regions due to the extreme divergence of said component.

    I think I’m more and more in agreement with what Alberto said here:

    >In fact, somewhere in North Iran, with an early entrance into Turan (and probably to India) seems to me the only alternative that is surviving all the incoming data.

  12. Alberto,

    As I said, the Gangetic plains had early rice farming while the more western farmers had barley & wheat. This represents a stark contrast in subsistence strategies.

    I would not say that such populations existed next to each other without genetic mixing. We may envisage a limited trickling gene flow in both directions.

    To better understand the genetic isolation we may envisage the possibility of ecological barriers. For much of the last 100 ky, South Asia, except its Northwest, has largely remained very suitable for human and animal habitation. So the notion of major population turnovers may not apply to South Asia. However, there was no uniform geography and, as Michael Petraglia & colleagues have shown, there was a mosaic of different ecozones in South Asia across the Paleolithic. Now it is certainly possible that a HG population accustomed to thriving in a particular ecozone may cross over into a neighbouring ecozone but find the new habitat much less conducive to thriving, and also inhabited by other hostile populations who are much more at home there. In such a scenario, the chances of survival of the migrant group would be minimal. Perhaps this may lead to HG populations across different ecological zones not mixing much, except through a minor trickle-like admixture.

  13. @Jaydeep,

    Yes, the paper about corded ware linked above by @raj supports longstanding contacts between East India and SE Asia, but not with NW India. This goes well with the idea that AASI could have arrived from the East with rice cultivation.

    I’ll have to revisit the subject of rice in the IVC. It seems it was there, but it only became important after 2000 BC? Apart from rice, does anything suggest that the IVC could have experienced such a big growth due to a rather big migration from the Gangetic plain?

  14. “I don’t think that Vedic culture could possibly have developed out of India. Why would it? The Vedas were composed in India, by people who considered themselves natives to that place (except from some distant past, semi-mythical I guess, that talks about a place in the north with days that last 6 months or whatever).”

    I agree that whichever way you look at it the Vedic culture is Indian, and there is nothing to be defensive about it. But I think, the anger comes from the way a story has been pushed on very flimsy grounds, and then a very ugly kind of scholarship built on top of that shaky foundation, to tear apart a living tradition, which was never meant for such pseudo scientific pre-historical analysis. The western scholars took the texts as-is from the tradition, arbitrarily declared interpretations as Brahmanical corruption, became modern era ‘Brahmins’ themselves, offering the ‘correct historical’ interpretation, so, the texts were passed correctly down the ages but not the interpretation, very convenient.
    The 6 months day and night claim has come from one such analysis, a tortuous extrapolation of some verses, by some overly eager enthusiasts of the “historical method” like Tilak. There is no direct reference of anything like that. The Agastya story I told is not just an arbitrary reference but very much part of legends and accounts with variations in both North and South of India, though I just used it as a reference point for future research, not a historical claim in itself.

    But my intention is not to spam this space. I think this is one of the most open-minded and genuine blogs on archaeogenetics and IE population history, compared with a lot of other agenda-driven ones out there.

    My only contention is that we have to consider all possibilities on things like the source of East Eurasian affinity in populations like AAF. We may go for now with what data we have but not rule out further development, especially given the lack of data from older dates from an Indo-European area the size of half of Europe.

    Cheers and Peace!

  15. Kashmiri Pandits definitely get their steppe from Kangju/Kushanas. They have a Han component.
    Target: Kashmiri_Pandit
    Distance: 3.4863% / 0.03486321
    Aggregated
    71.4 IRN_Shahr_I_Sokhta_BA2
    28.6 RUS_Sintashta_MLBA

    Much better fit with Han
    Target: Kashmiri_Pandit
    Distance: 2.3667% / 0.02366717
    Aggregated
    69.6 IRN_Shahr_I_Sokhta_BA2
    24.8 RUS_Sintashta_MLBA
    5.6 Han

    Kushanas did rule over Kashmir for a couple of centuries, at least from around the start of the Common Era.

    Keeping Sintashta, Kangju and Kushana as steppe sources along with the Indus Periphery source, and removing Han:

    Target: Kashmiri_Pandit
    Distance: 2.1622% / 0.02162236
    Aggregated
    56.6 IRN_Shahr_I_Sokhta_BA2
    26.8 KAZ_Kangju
    16.6 TJK_Ksirov_H_Kushan

    Removing Kangju as a source, as only Kushana is attested in Kashmir. However, the Kushanas themselves descend from the Kangju, so both models are possible fits.

    Target: Kashmiri_Pandit
    Distance: 2.4445% / 0.02444549
    Aggregated
    54.0 IRN_Shahr_I_Sokhta_BA2
    39.2 TJK_Ksirov_H_Kushan
    6.8 RUS_Sintashta_MLBA
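
    The fits quoted above come down to one calculation: find non-negative source weights that sum to 100% and bring the weighted average of the source coordinates as close as possible to the target. A minimal Python sketch of that idea follows, assuming the tool minimises plain Euclidean distance. The three-dimensional coordinates and the source names A, B and C are made up for illustration (real Global25 rows have 25 dimensions), and the brute-force grid search is a stand-in, not Vahaduo’s actual optimiser.

```python
import math

def mix(weights, sources):
    """Weighted average of the source coordinate vectors."""
    dims = len(sources[0])
    return [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]

def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_fit(target, sources, steps=100):
    """Try every weight combination in units of 1/steps that sums to 1."""
    best_d, best_w = float("inf"), None

    def search(counts, remaining):
        nonlocal best_d, best_w
        if len(counts) == len(sources) - 1:
            # The last source takes whatever share is left over.
            weights = [c / steps for c in counts + [remaining]]
            d = distance(mix(weights, sources), target)
            if d < best_d:
                best_d, best_w = d, weights
            return
        for c in range(remaining + 1):
            search(counts + [c], remaining - c)

    search([], steps)
    return best_d, best_w

# Hypothetical toy coordinates, not real Global25 data.
A = [0.10, 0.00, 0.02]
B = [0.00, 0.10, -0.01]
C = [-0.05, -0.02, 0.12]
target = mix([0.70, 0.25, 0.05], [A, B, C])

d2, w2 = best_fit(target, [A, B])      # two-source model
d3, w3 = best_fit(target, [A, B, C])   # same model plus a third source
print(f"2 sources: distance {d2:.5f}, weights {w2}")
print(f"3 sources: distance {d3:.5f}, weights {w3}")
```

    Since a two-source model is just a three-source model with one weight fixed at zero, adding a source can never raise the distance, so a lower distance on its own (as with Han above) does not prove the extra source is real.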

  16. @Alberto

    “If we get Chalcolithic samples from North India and they lack any AASI admixture then that opens up much better possibilities. Though we’d need to know that these were there since the Neolithic and not recent incomers from Turan.

    I still find it difficult to explain that an AASI population was from the Mesolithic (or probably Paleolithic) in the Gangetic plain while an eastern Iranian was to the west around the Indus and they stayed isolated for thousands of years. It could be, but is there any good reason to think that could have been the case?”

    Indeed, Indus and Gangetic plains have no special barrier between them in the northern part (in the south, there is the Thar desert), although the Indus valley genetically is much more ‘Western’ than other parts of India, especially in mtDNA as I remember. Also in prehistory, the dental analysis of Hemphill and Lukacs has given two groups: Indus valley sites and peninsular India sites (Mesolithic Ganga valley and Chalcolithic Inamgaon). But there is an exception: Neolithic Mehrgarh. Its teeth are close to those of Inamgaon in Maharashtra, they had Sundadont traits (typical of SE Asia) and few Carabelli cusp (typical of Europeans). Chalcolithic Mehrgarh is very different, also craniometrically, and closer to Harappa and Gandhara grave culture Timargarha, that are close to Tepe Hissar in Iran.
    So, the affinity with Inamgaon (that has roots in Malwa culture of Central India) suggests that Neol. Mehrgarh was more AASI, although it had at least trade with the west (turquoise, lapis lazuli, wheat). In the Chalcolithic, apparently there was a wave of more Iranian-like people. The age of Chalc. Mehrgarh (4500 BCE) matches the calculated age of mixing of IranN and AASI. I think that some IranN component must already have been in Neol. Mehrgarh, but possibly it became dominant in the Chalcolithic period.

  17. When the Iran-like component split from the ancestral branch 12000 years ago, the split component (let’s assume the split happened in Iran) would need to remain isolated from the other Iranian ancestry for at least 6-7000 years until it mixed with AASI. Where was it? For NW India to go from AASI-dominated to 80:20 Iran-like:AASI in 2600 BCE Rakhigarhi, it would need an overwhelming migration of Iranian-like ancestry. Where is the archaeology for that? The Mesolithic Ganga Valley skeletons have an average height of 6 feet, while the southern HGs are pretty short, so who were they?

  18. Chalcolithic Mehrgarh has new traits compared to Neolithic Mehrgarh (copper melting, use of gold, seals, beads, new stone industry, different burials, increase of wheat, appearance of oats), and the skeletons reveal a new population, the only real discontinuity in prehistoric India before Sarai Khola (800-200 BC) according to Hemphill and Lukacs. Unfortunately, at Mehrgarh there is the only cemetery found of this period, as Possehl remarks.
    Mesolithic people of the Ganga valley are isolated from other Indian samples, but they have some ‘peripheral association’ with Neol. Mehrgarh and Chalc. Inamgaon.

  19. @Giacomo
    Modern Balochis show a good fit with BMAC ancestry, i.e. BMAC1 for Gonur, if that helps.

  20. @Giacomo Benedetti

    Yes, for now I also think that’s the most parsimonious scenario. The available data from India (archaeological, anthropological, apart from the lack of aDNA) is scarce, so we are still guessing and we’ll have to wait for aDNA to clarify all of this. But in general the idea that AASI only arrived to NW India after the Chalcolithic looks problematic, even if there have been some possible reasons outlined above.

    @Bhikshu

    The split time calculated between Iran_N and the Iranian-like ancestry in India has many caveats. Besides, it was already obvious that the Iranian-like ancestry in India cannot be directly descended from the Iran_N (Zagros) samples, simply because it’s different (even if there could be some Iran_N admixture in India).

    However, the resemblance between this Iranian-like ancestry and the whole East Iranian (including SC Asia / Turan) is very high and there’s no need for it to have diverged many thousands of years earlier (this, again, goes both ways: for an Out of India to work you do need these ancestries to be very similar, which allows an Into India too. Otherwise neither option would be possible).

    So overall I think that a decent amount of gene flow must have happened between NW India and East Iran/Turan around the Chalcolithic (which is also the main alternative to the steppe model for these populations to speak closely related languages). The direction of it is uncertain, so we’ll wait for aDNA to know.

  21. @Alberto

    Yes, fair enough. This could very well have been the scenario, we’ll wait for further data on this.

  22. https://www.nature.com/articles/s41598-019-40399-8#Sec2
    Munda paper is available.

    The Munda-speaking Austroasiatics arrived in Orissa in SE India somewhere between 2400 & 1200 BCE. They mixed with a population which had slightly less West Asian (Indian Iran N) ancestry than the modern Paniya, about 22%.

    So we can at least reject the Munda substrate hypothesis for Vedic Sanskrit, for one.

  23. Slightly off topic, but if anyone is interested in getting back to the y-dna questions, for visualisation I have made some plots which colour code the Swat/PAK IA-H samples by y-haplogroup (if male – colour star, if female – black triangle) and then plotted them against the Eurogenes West Eurasia 9 PCA (just as it’s the PCA that seems to be able to get most on there).

    See: https://imgur.com/a/pOZVaww

    May be a useful visual reference for anyone looking to identify when the first of some particular y turns up in Swat/PAK IA-H, in what context and at what time, and how this correlates with the main cline within these samples.

    E.g. R1a is somewhat scarce and uncorrelated with position on the cline from its first appearance at around 1000 BCE (sample I12457), certainly through the Iron Age and probably the Buddhist/historical period; it then becomes somewhat associated with a more Central Asian related position as we get into the post-Medieval period.

    The early enriched steppe related Swat samples are: I1992, a male called as E1a – who is described as being in a family group with I6194, I1799, I3262, who largely didn’t make it through to this analysis other than I6194 – and I12138, a female individual.

    Btw, since I1992’s higher quality first degree relative pair I1799+I3262 called as the same E1b1b1b2a as most of the early Udegram_IA males, it seems reasonably possible that I1992 is E1b1b1b2a rather than E1a.

    The plots against time are a little busy in places, unfortunately, as a very large number of the samples essentially cluster around 900 BCE.

    List at the end of gallery only includes those samples that were available on the PCA being cross plotted with. It may be worth cross checking this against the supplementary data from the paper to see if there are any more who are not on there, but the missing samples are essentially all either first degree relatives and not independent data points, and/or very low quality.

  24. Giacomo,

    With the advent of aDNA, I am not very inclined to give much importance to skeletal craniometric & dental studies which cannot give such a high resolution as genome wide data provides.

    Neolithic Mehrgarh undoubtedly had some linkages with the people of the Iranian Neolithic, as the archaeological assemblages of both cultures testify. So how do you square this with the people of Neolithic Mehrgarh being AASI-like? Isn’t this a major contradiction?

    On the other hand, the linkages of Chalcolithic Iran with Chalcolithic Mehrgarh are much more tenuous. Plus, Jarrige places the Baluchistan Chalcolithic starting from 6000 BC as a major or primary regional center of innovation which then spread both westward & northward. Jarrige even disagrees that there is any Geoksiur influence at Mundigak but traces its origin from Baluchistani Chalcolithic.

    At any rate, all Iranian Chalcolithic samples after 6000 BC had high levels of Anatolian Farmer ancestry, which is completely missing from Indus Periphery samples. So a migration from Iran in the Chalcolithic period has to be rejected.

    In contrast, archaeologists are unanimous that there is some proto-Elamite influence at Shahr I Sokhta and also some linkages with Namazga Chl. Not surprisingly we find Anatolian Farmer ancestry in the Shahr I Sokhta samples which are not Indus Periphery.

  25. Alberto,

    The closer links between the East Iranian Farmer ancestries in Turan and in South Asia looks quite probable. But I am not sure of when it started.

    In most of the uniparental studies I have seen which focused on markers spread between Iran, Central Asia & South Asia, such as Y-DNA Q or mtDNA U7, invariably the deepest splits are between Iran & South Asia. Nevertheless there are some younger lineages present in Central Asia which are older than 10 kya. So do we surmise from this that Iran herder/HG ancestry in Turan is also pre-Neolithic? If true, this will complicate matters even further.

    From around 4000 BC there are definitely signs of interactions, or at least pottery similarities, between the north & south of the Hindu Kush, but the knowledge of this period is still sketchy and we await more research data to come forth.

  26. Jaydeep,

    When they speak about Iranian farmers having too much ANF ancestry from 6000 BC to fit as a source for Indus periphery samples, they refer to West Iran (Seh Gabi and Hajji Firuz). So I agree that a significant migration from West Iran during the Chalcolithic should be rejected.

    But in more eastern parts we have samples that can fit as sources at around 3000 BCE and later (Shahr-I-Sokhta BA, Parkhai, Geoksiur, Sarazm, Bustan, Anau, Namazga… Even some Teppe Hissar is required in the models I posted for Indus periphery).

    The case of the deep splits in uniparental markers between modern populations from India and Iran/Turan is of limited value given how unreliable the studies of uniparental markers of modern populations have been to infer what happened 6000 years earlier. Above, however, you pointed out a few that could be related between India and outside when arguing for an Out of India scenario, so I guess those same ones could work the other way around too.

    The reality is that we have poor data, be it archaeological, anthropological or from aDNA from the Neolithic/early Chalcolithic, so it’s hard to make a strong case either way. We do know, however, that there were intensive contacts from the late Chalcolithic throughout all these areas, and we know that by at least the MLBA they spoke closely related languages. We have the genetic evidence of shared ancestry in the form of East Iranian one. So all this should tell us that we should expect gene flow to have happened at a significant level.

    For now, Out of India has the problem of the putative presence of AASI, while East Iran/Turan doesn’t have that problem. *If* aDNA from Neolithic/Chalcolithic India shows that the population there was 100% east Iranian-like then things will be more even. But that’s still quite a big if. Let’s see if they don’t keep us waiting for another few years before we can get some answers about this.

  27. @Alberto thanks for lowering the barrier. I don’t have too much time nowadays to do much reading, but I noted recently from Nirjhar’s post on FB that pictorial depictions of BMAC cattle show Bos taurus and not Bos indicus. Bos taurus is pretty exotic in the IVC though.

  28. One should not forget the 3 Armenians from 4000 BCE with L1a1. They prefer 20% of Indian Iran-farmer ancestry over Ganj Dareh.
    This Indian Iran-farmer-like ancestry is again found in the sole female sample from western Anatolia, at Barcin, 3800 BCE.

    Target: ARM_Areni_C
    Distance: 3.4041% / 0.03404109
    Aggregated
    36.8 Anatolia_Barcin_N
    21.0 NW_Indian_0AHG
    20.2 GEO_CHG
    11.4 RUS_Samara_HG
    10.6 Levant_PPNC
    0.0 IRN_Ganj_Dareh_N
    0.0 RUS_Sosonivoy_HG
    0.0 RUS_Shamanka_N
    0.0 Baltic_LVA_HG
    0.0 Baltic_LVA_MN
    0.0 100AHG
    0.0 Levant_PPNB

    Target: Anatolia_Barcin_C
    Distance: 3.0835% / 0.03083526
    Aggregated
    56.6 Anatolia_Barcin_N
    20.2 GEO_CHG
    14.6 NW_Indian_0AHG
    6.6 Levant_PPNC
    1.8 RUS_Samara_HG
    0.2 RUS_Sosonivoy_HG
    0.0 IRN_Ganj_Dareh_N
    0.0 RUS_Shamanka_N
    0.0 Baltic_LVA_HG
    0.0 Baltic_LVA_MN
    0.0 100AHG
    0.0 Levant_PPNB

  29. @Nirjhar

    Yes, the Sanauli findings are quite amazing and can change a lot of things. I guess it will still require some time until we have a more clear context of these findings. I hope they will get DNA results from this site soon.

    @A

    Yes, those Areni Cave samples are very interesting showing already a connection to the east (and to the west). If we could get more samples from that period it would be interesting.

  30. Here are the first Craniofacial reconstructions of Harappan people, coming from two individuals of Rakhigarhi ~4500 YBP:
    Craniofacial reconstruction of the Indus Valley Civilization individuals found at 4500-year-old Rakhigarhi cemetery, Won Joon Lee et al. 2019
    https://link.springer.com/article/10.1007%2Fs12565-019-00504-3
    They apparently had typical North Indian features. See the videos in the link.

  31. Does anyone know how to create a west Eurasian PCA instead of an all Eurasian one? This clusters InPe samples regardless of AHG level. I’m using PAST

  32. @A

    “Presence of J2a1 in Mycenaeans, Minoans & Anatolian Bronze Age needs some investigation imo.”

    In Europe & Anatolia, J2a1 is possibly associated with Hatto-Minoan languages.
    Overall it’s a South Caucasian lineage, and it spread through northern Iran/Turan & Anatolia/the Aegean during the Chalcolithic & Bronze Age.

  33. @A
    “Modern Balochis show a good fit with BMAC ancestry, i.e. BMAC1 for Gonur, if that helps.” Thanks! Balochis should come (at least partly) from the west, because of their linguistic position in Iranian languages; anyway, it can be a sign that BMAC is ancestral to Iranian speakers.

    @Jaydeep
    “Neolithic Mehrgarh undoubtedly had some linkages with the people of the Iranian Neolithic, as the archaeological assemblages of both cultures testify. So how do you square this with the people of Neolithic Mehrgarh being AASI-like? Isn’t this a major contradiction?”
    I agree that there is apparently a contradiction. When I first read about the many similarities between Iranian Neolithic and Mehrgarh I was surprised, because I already knew about the South Asian features of Neol. Mehrgarh people. So, a possibility is that they had already the arrival of some (East) Iranian farmers but the dominant component was local, AASI-like, although culturally they assimilated many elements with farming and goat-herding.
    However, I have just discovered a study on “Regional variation in incisor shoveling in Indian population” that reveals that shoveling is common especially in West India (Rajasthan, Gujarat, Maharashtra and Goa) with even 85% of full shovel-shaped incisors (similar to Mehrgarh with 83-89% on the upper incisors), while in South India 91% of the subjects had no shovel at all! So, apparently this trait is not connected with South Indians. Inamgaon had 91% shovel-shaped upper incisor 1, and it is in Maharashtra, so it belongs to West India. Harappa has 55%, Timargarha in the north, instead, had only 14% (see p.282 here: https://books.google.it/books?id=Qm9GfjNlnRwC&pg=PA289&lpg=PA289&dq=sundadont+mehrgarh&source=bl&ots=7WdLf3iT1h&sig=ACfU3U1Ov_B7CzZ1wLHF0qhOTpauJTf9TQ&hl=en&sa=X&ved=2ahUKEwiuqs3di4jlAhUJr6QKHVhMCYIQ6AEwBXoECAgQAQ#v=onepage&q=shovel&f=false). So, what can be the source of shoveling? It is typical of East Asians and Amerindians, but so why is it so frequent in West India?
    On the other hand, according to another study, 72% of Rajputs have no shovel shaped incisor 1.

    Another interesting datum we can find in the same table cited above is the change in frequency of Carabelli’s trait in the first molar between Neol. Mehrgarh (26% only) and Chalc. Mehrgarh (61%). Harappa has less, 44%. Average in Europe is 65%, while in modern Isfahan, Iran, on 500 individuals even 96% had this trait!
    http://www.srmjrds.in/article.asp?issn=0976-433X;year=2013;volume=4;issue=1;spage=12;epage=15;aulast=Mosharraf

    But there is another surprising fact: crania from Kish in Mesopotamia, dated 3000 BC, have only 24% of Carabelli’s trait:
    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.6774&rep=rep1&type=pdf
    Interestingly, Hemphill-Lukacs-Kennedy 1991 shows that crania from Kish are close to those from Cemetery H open burials. Kish also has some shovel-shaped incisors, as do Metal Age Anatolia and, curiously, 27% of Middle Minoan Knossos: http://www.royalacademy.dk/Publications/High/354_Alexandersen,%20Verner.pdf
    Unfortunately I have not found data for Iranian neolithic sites, but Jarmo, that is a Neol. site in Iraq often compared also with Mehrgarh, has no shoveling.

  34. “Thanks! Balochis should come (at least partly) from the west, because of their linguistic position in Iranian languages, anyway, it can be a sign that BMAC is ancestral to Iranian speakers.” —>

    Hey Giacomo, I tried to model Balochis using a Chalcolithic western source, “Iran_Tepe_Hissar_C” (which has plenty of ANF).
    Here’s how they look:

    Target : Balochi
    Distance: 2.7145% / 0.02714534

    51.0 Iran_Tepe_Hisar_C
    27.2 Iran_Shahr_i_Sokhta_BA2
    17.8 Sintastha_MLBA
    4.0 Onge

    Now, if one assumes that Shahr-i-Sokhta BA2-like folks (Eastern Iranian + AASI) existed during Neolithic Mehrgarh, could one assume that there was a migration of a Tepe_Hissar-like population (Western Iranian + ANF) from further west during the Chalcolithic?

  35. @tim

    I think that using modern Balochis to infer a Chalcolithic migration is very dubious. They are recent migrants and a modern population.

    Whoever migrated to India during the Chalcolithic should already be represented in the BA samples we have from the Indus periphery, and therefore have low ANF admixture.

  36. @tim @alberto
    “I think that using modern Balochis to infer a Chalcolithic migration is very dubious. They are recent migrants and a modern population.”
    I agree. Balochis are attested in Eastern Iran in the 9th century, and it is thought they came from the Caspian region. https://en.wikipedia.org/wiki/Baloch_people#History
    The strong Tepe Hissar connection can confirm this Caspian origin. Tepe Hissar people were not recent migrants, according to Narasimhan’s paper, but were quite stable in time. Archaeologically, Tepe Hissar 3C has clear BMAC elements, then it finishes, in the 2nd mill. BCE. It is interesting that Hemphill & co. found anthropological affinities between people of Tepe Hissar 3 and Harappa (cemetery R37). Now we can attribute this similarity to the dominant Iranian-farmer component.

    BTW, Hemphill has faced very directly the Indo-Aryan issue in two much more recent papers: https://www.academia.edu/8627556/Bioanthropology_of_the_Hindu_Kush_Highlands_A_Dental_Morphology_Investigation
    https://www.academia.edu/8627533/Are_the_Kho_an_Indigenous_Population_of_the_Hindu_Kush_A_Dental_Morphometric_Approach

    In the last one, Hemphill recognizes that also his own previous model, suggesting the arrival of Dravidians in Chalcolithic Mehrgarh, does not work, because it has no affinity with present Dravidians from SE India. Instead, it has some affinity with present Kho people, Dardic speakers with a particularly archaic, even close to Sanskrit language. It is curious that he has not mentioned the possibility that Chalc. Mehrgarh was actually Indo-Iranian, but he states that the Aryan Invasion theory has no ground because there is no affinity of post-Harappans with Central Asians, except Sarai Khola that is too late. On the other hand, his study shows also affinity of Kho with Djarkutan…
    Maybe you remember the Eurogenes post on them: http://eurogenes.blogspot.com/2018/01/the-kho-people-archaic-indo-aryans.html

    The source saying they have 80% R1a has disappeared from wikipedia, do you know it? I have found only a paper on mtdna: https://www.researchgate.net/publication/331844587_Genetic_structure_of_Kho_population_from_north-western_Pakistan_based_on_mtDNA_control_region_sequences

  37. Giacomo,

    Ullah, Olofsson et al. (2017) has a sample of 20 Kohistani Dardic speakers.

    1 G2a
    10 H1a
    1 L1
    2 Q
    1 R
    5 R1a

    Some of the Pashto speakers in the Dir region have ~80% R1a. Other Pashtun tribes seem to have ~80% G2a – I think that suggests strong bottlenecks on the Y-chromosome among those groups.

    The nomad Gujars in the region are dominated by haplogroup L1.
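
    For quick comparison with the ~80% figures, the Kohistani counts listed above can be turned into percentages. A small Python sketch; the counts are the ones given in this comment, and the variable names are only illustrative.

```python
# Y-haplogroup counts for the 20 Kohistani samples as listed in the comment.
counts = {"G2a": 1, "H1a": 10, "L1": 1, "Q": 2, "R": 1, "R1a": 5}

total = sum(counts.values())
# Share of each haplogroup as a percentage of all sampled males.
freq = {hg: 100 * n / total for hg, n in counts.items()}

# Print from most to least frequent.
for hg, pct in sorted(freq.items(), key=lambda kv: -kv[1]):
    print(f"{hg:>4}: {counts[hg]:2d}/{total} = {pct:.0f}%")
```

    With only 20 samples, each individual is worth 5 percentage points, so small differences between groups should not be over-read.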

  38. @Alberto

    I see what you’re looking at. After going through my stuff, I am kinda leaning towards a Hyrcanian homeland even though I would like some Anatolian samples. Though, I still think PII or pre-PII came from the west.

    @Giacomo Benedetti

    “Archaeologically, Tepe Hissar 3C has clear BMAC elements, then it finishes, in the 2nd mill. BCE.”

    It’s the other way around. Hissar IIIB (ca. 2400-2170 cal. BCE) and IIIC (2170-1900 cal. BCE) elements (grey ware, etc.) took over during the later stages of BMAC after 1800 BCE. These elements were wrongly interpreted by Kuzmina and Anthony as Andronovo nomads taking over BMAC. Some archaeologists also call it the Elamite influence in BMAC, which is nonsense.

  39. @Vara

    While I don’t have a very specific homeland, my preference for an early presence in East Iran / Turan comes down to two reasons, one linguistic and one genetic. The linguistic one we mentioned before already: it’s the difficulty of explaining Tocharian with any other model. The genetic one is shown in the post, where any significant migration to India within a PIE time frame c. 4500 BC must be from an eastern area given the low Anatolian admixture in India.

    But this is a very generic idea and not something I would argue strongly for. Still waiting for aDNA to answer some questions before going deeper into the problem.

  40. @Vara

    ““Archaeologically, Tepe Hissar 3C has clear BMAC elements, then it finishes, in the 2nd mill. BCE.”
    It’s the other way around. Hissar IIIB (IIIB: ca. 2400-2170 cal) and IIIC (2170-1900 cal. BCE) elements (grey ware..etc) took over during the later stages of BMAC after 1800BCE.”
    I am not speaking of grey ware, it is commonly said that IIIC has strong presence of BMAC elements, for instance here in Encyclopaedia Iranica (http://www.iranicaonline.org/articles/tepe-hissar): “many connections with Margiana (Marv) and Bactria occur in Hissar IIIC. These include mini-columns, alabaster discs, animal figurines, bidents, tridents, axe-adzes, compartmented copper stamp seals, lanceheads with bent tangs, metal horns, cosmetic bottles, beads with incised circles, etc.”
    Do you think that these elements came from Hissar to BMAC?
    BTW, it is also said there that Hissar IIIB has a building with a fire altar…

    Related to Inamgaon that is often cited for the skeletons, its roots are in Malwa culture that also had fire altars. It had a barley and wheat agriculture, and there were also horses at Inamgaon. I think they were already Indo-Aryan colonists, so the affinity of Inamgaon with Neol. Mehrgarh does not mean they were all Dravidians, although probably in Maharashtra they mixed with proto-Dravidian speakers.

    In Karnataka and Tamil Nadu the first agriculture has millet and no barley and wheat (https://www.britannica.com/place/India/The-end-of-the-Indus-civilization), so it seems really independent, which can explain the formation and spread of Dravidian languages.

  41. Have you guys read the Lech Valley paper? It really looks like those R1b-L51 folks made a special point to completely replace competing male lineages everywhere they went.

  42. I have been troubled by the roughly 30% autosomal impact on Northern South Asia after 1700 BC that is attributed to the postulated Steppe migration.

    Is it possible there is a major problem with the modelling? I know I am making a massively controversial statement here but here are supporting points

    1. Y-DNA does not support this massive shift. L657 is not found in Steppe MLBA. The dominant R1a clade in Steppe MLBA is low/negligible in South Asia.

    What is postulated by the AMT is a large autosomal impact with minimal effect on Y-DNA. This is the opposite of what happens in an elite takeover.

    2. A 30% autosomal shift would require a migration of families who then have large numbers of children. Elite takeovers tend to result in men having more offspring with local women and diluting their original autosomal composition.

    Who wants to bring their families over difficult, inhospitable and politically unstable terrain, through mountains and possibly deserts?

    3. Phenotype data. A recent reconstruction of 2 Rakhigarhi skulls showed mostly Caucasoid features, notably a “hawk-shaped, Roman” nose.

    If you guys are knowledgeable about South Asia you will know that this is a very common and in many cases quite extreme feature in Northern South Asia. Without data on how phenotypes, especially those based on multiple genes, like nose shape, are affected by population migrations and mixing, it is difficult to interpret this scientifically. However, it does seem such a ‘sensitive’ phenotype (one which is present to varying degrees) would be more affected by population mixing, and could easily disappear by out-mixing, certainly by out-mixing of 30%. The Habsburg jaw is a good example of a comparable trait. I don’t think that would survive out-breeding to the extent of 30%.

    “Shriver found that there was a very strong statistical correlation between the amounts of admixture and the facial traits.”
    https://www.sciencedaily.com/releases/2009/02/090214162756.htm

    I was wondering what you guys think about this?

    If we take the autosomal modelling away, do the Y-DNA data support the conclusions of the recent papers?

    If not, why are we placing more emphasis on autosomal modelling instead of Y-DNA, when we are interested in a takeover by elite dominance? The papers use autosomal modelling to push ‘results’ as this promotes their proprietary work. Y-DNA is easy and not innovative.

    Kudos to Frank for mentioning on Eurogenes that L657 had not even been found in Steppe MLBA. Without that comment I wouldn’t even know; people are looking at the wrong things.

    With 100 or so samples from Steppe MLBA, and modern Indian populations, we don’t really need autosomal modelling, which currently seems to require a lot more data and refinement for it to produce uncontentious results.

    Genetics is not a major area of expertise for me so I don’t have good data to hand. What do you guys think: does Y-DNA support a migration of MLBA into South Asia?

  43. @mzp

    There is no doubt that the Swat_IA samples pull towards the steppe component; you can see that on the PCA plots. In a 2-way qpAdm of the Indus Periphery pool (all 11 samples) and central_steppe_mlba, the steppe MLBA autosomal component of Swat_IA (85 samples labeled as Iron Age, not including the other 30 historical and medieval samples) is ~22.3% (±1.1%), not 30%.

    There are a lot of issues with Narasimhan’s modeling. No one has dissected his paper’s modeling thoroughly. I have been doing so for over a week now using qpAdm, trying to reproduce his results. Narasimhan does a bad job by rejecting any other steppe sources apart from MLBA, in my opinion.

    For the above Swat_IA = IndusPeriphery + central_steppe_mlba qpAdm model, the p-value is too low (with allsnps = YES the p-value is 0.001198; without allsnps it is 0.00017). As per Narasimhan’s supplement (pdf page 283, Fig S50) his p-value is 0.006, so I’m guessing the difference is because of the right outgroups we each chose. I can’t seem to find the ones he used in his model.
    Regardless, this p-value would normally be rejected (usually >0.05 is accepted, or >0.01 if one is pushing it). But Narasimhan lowers the threshold to >0.005 just so as to accept his favourite model.

    There is another issue with his modeling. The 11 Indus Periphery samples are hardly representative. They don’t even cluster together neatly enough to choose the steppe source properly, i.e. choosing a subset of those 11 InPe samples as source can easily make the model accept Molaly_LBA as the steppe source over MLBA. Given that no one knows what Swat ancestry was exactly like prior to steppe folk arriving, his conclusion that only MLBA is a possible steppe source is very premature. He just doesn’t have enough Indus or Swat samples to make this conclusion.

    E.g. Indus5 (5 samples) + Molaly_LBA is accepted for Swat_IA with p-value = 0.04 and coefficients (65% ± 3.6%, 35% ± 6%), whereas central_steppe_mlba instead of Molaly is rejected with p-value 0.0000007.

    This is also supported by Vahaduo Global25. Swat_IA samples choose Han over Onge or even Matt’s 100AHG pure AASI component, i.e. there is some LBA ancestry involved which has an East Asian component. This might also explain the 1 Q1a found in the Swat valley.

    Most likely, both MLBA & LBA were involved in the migration; however, different groups need to be modeled differently, unlike what Narasimhan has done.

  44. Can someone help me understand this from Narasimhan’s supplement?

    “Using previously reported calls on 1000 Genomes Project Y chromosomes (223), we observe that 62 out of the 221 South Asian males have an R1a Y chromosome corresponding to a ninety-five percent binomial confidence interval of 22-34% for Steppe MLBA ancestry on the entirely male line, which is significantly higher than the ninety-five percent confidence interval of 9-14% on the autosomes in the same set of individuals. These results shows the process of admixture of Central_Steppe_MLBA into the ancestors of the ANI was male biased, and reveal that the directionality of sex bias was opposite to the pattern observed for the contribution of Central_Steppe_MLBA to SPGT.”

    Isn’t this circular reasoning? Is it not possible that R1a L657 (wherever its origin is) expanded in India much later from a few founders, after female-mediated steppe ancestry had already come in? E.g. the Mauryan expansion post 500 BCE (which would explain how R1a reached Sri Lanka).

    Apart from the minor presence of R1a in Swat, the 2 outlier samples from Swat with the highest steppe (~50%, Loebanr_IA_o, Udegram_IA_o) both have it mediated through steppe females. The 1st is a male with E1 Y haplogroup, the other is a female with steppe mtDNA T1a1.
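
    For reference, the 22-34% interval in the quoted passage can be reproduced from the raw counts (62 R1a out of 221 males). The sketch below uses the Wilson score interval, which is an assumption here since the supplement only says “binomial confidence interval”; other methods give slightly different bounds.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Counts taken from the quoted passage: 62 R1a carriers among 221 males.
low, high = wilson_interval(62, 221)
print(f"R1a share: {62 / 221:.1%}, 95% interval: {low:.1%} to {high:.1%}")
```

    Note that the interval only describes sampling uncertainty in the R1a frequency; reading that frequency as male-line Steppe MLBA ancestry is the separate assumption being questioned above.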

  45. Found something that will put an end to the steppe Indo-Aryan hypothesis.

    Here’s the archaeological context on Bustan BA (1600-1300 BC), from the metadata in Narasimhan’s supplement:
    “Archaeological investigations at Bustan Burial Mound have revealed a complex funerary ritual related to the usage of fire. On top of the graves there were piled rocks, showing the influence of Steppe traditions. There were inhumation as well as cremation burials. There was a dedicated chamber for cremation of bodies at Bustan, including multi-usage hearths and altars. The altars were functionally classified into ones used for libations, ones used for meals, and ones used for sacrifices. The funerary rite documented at Bustan, specifically in relation to the role of fire, is not known at this time from any other site Iran, South Asia, or the Central Eurasian Steppes.”

    More details available here http://www.archeo.ru/izdaniya-1/archaeological-news/annotations-of-issues/arheologicheskie-vesti.-spb-1995.-vyp.-4.-annotacii

    “Three bonfires were made for each cremation act. Their traces were found at the level of buried soil south, west, and east of the incinerators (figs. 1; 2: B). These finds are closely paralleled by the Vedic texts, where cremation, described as an offering to the sacred fire carrying the body to heaven, is said to be made in three open fires (Rigveda X, 16, 18; Atharvaveda XVIII, 2, 7; Asvalayana-grihyasutra IV, 1, 2).”
    These are late Vedic practices.

    Of course, Bustan BA has no trace of steppe ancestry (not even in the outliers, except one which has elevated steppe as well as IVC ancestry).

    Before the genetic data, this was connected with the assumption that this site was infested with incoming Aryans. But now you have Aryan culture with zero steppe MLBA or LBA genetics.

    The dominant Y haplogroup here is J2a (also dominant in Brahmins and, more specifically, modern Zoroastrians).

  46. @AK
    Yes, I noticed this when the preprint came out. And from the Harappan we also have similar data from sites like Kalibangan, Banawali and Lothal.
    But I have seen Narasimhan arguing that the cultural aspects were of local origin, but the language was brought by the steppe migrations 😉.

  47. Hi, Alberto, Matt and A

    I’m curious about what you guys think about Dzudzuana ancestry in Iran_N and CHG as suggested by Lazaridis…

    “Iran_N/CHG are seen as descendants of populations that existed in the Villabruna→Basal Eurasian cline alluded to above, but with extra Basal Eurasian ancestry (compared to Dzudzuana), and also with ENA/ANE ancestry. ”

    “CHG/Iran_N were Dzudzuana+Basal Eurasian (or, equivalently Villabruna+Basal Eurasian) derived populations also modified by ENA/ANE admixture.”

    How would you guys interpret this in terms of Iran_N, as well as Iran_N ancestry in South Asia?

  48. @AK, L657 is from South Asia and is also well spread in the Persian Gulf and Arabia. But non-L657 in India is not negligible and is also quite well spread, with no notable structure.

    “Using previously reported calls on 1000 Genomes Project Y chromosomes (223), we observe that 62 out of the 221 South Asian males have an R1a Y chromosome corresponding to a ninety-five percent binomial confidence interval of 22-34% for Steppe MLBA ancestry on the entirely male line, which is significantly higher than the ninety-five percent confidence interval of 9-14% on the autosomes in the same set of individuals”

    62/221 ≈ 28% is interpreted as 22-34% vs 9-14%.
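    For anyone who wants to check that arithmetic, here is a minimal sketch of a 95% binomial interval for 62 R1a carriers out of the 221 males in the quoted passage. The supplement only says "binomial confidence interval", so the choice of the Wilson score method here is an assumption, as is the helper name `wilson_interval`; other methods give slightly different bounds.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% coverage. The Wilson method is an
    assumption; the quoted supplement does not name the method it used.
    """
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# 62 R1a carriers among 221 South Asian males (figures from the quote)
low, high = wilson_interval(62, 221)
print(f"R1a share: {62 / 221:.1%}, 95% CI: {low:.1%} to {high:.1%}")
```

    Under these assumptions the bounds land close to the 22-34% range given in the quote, so the interval is simply sampling uncertainty on the R1a frequency. It says nothing by itself about how much of that R1a is of Steppe MLBA origin.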

    Is this a valid comparison? It seems dubious: a coarse SNP call vs an autosomal component. Is there a way of doing component analysis on just the Y chromosome, or is it too tiny for good stats? Can the experts weigh in, please?

    If we assume half of those R1a are L657, which radiates out from Nepal (as per Anthrogenica), then we are left with an MLBA signal with a fairly gender-neutral distribution.

    Also, Swat MLBA is more female-mediated, and the males come later, during the Iron Age and historic period. It's a typical pattern where females diffuse first, vs males who are patrilocal.
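    A rough sketch of what that assumption implies numerically: if only half of the 62 R1a carriers (31 of 221) are counted as a steppe-derived signal, the 95% interval can be recomputed and compared with the 9-14% autosomal range from the quote. The "half" split is purely the hypothetical above, not a measured L657 frequency, and the Wilson score method and helper names are assumptions too.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical split: half of the 62 R1a carriers assumed to be L657
non_l657 = 62 // 2
y_low, y_high = wilson_interval(non_l657, 221)

# Autosomal Steppe MLBA interval as given in the quoted supplement
auto_low, auto_high = 0.09, 0.14

# Two intervals overlap if each one starts before the other ends
overlaps = y_low <= auto_high and y_high >= auto_low
print(f"Y-line CI: {y_low:.1%} to {y_high:.1%}, overlaps autosomal: {overlaps}")
```

    With this split the male-line interval overlaps the autosomal one. Overlap is not a formal test of no sex bias, but it shows how sensitive the conclusion is to how much of the R1a is attributed to the steppe.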
