Chapter 7: Trans-ancestry analysis (Video Transcript)

7.1: Cross-Ancestry Analysis

Title: Cross-ancestry PTSD and Polygenic Findings, and the Cross-Population SIG of the PGC

Presenter(s): Laramie Duncan, PhD (Department of Psychiatry and Behavioral Sciences, Stanford University)

Caroline Nievergelt:

Welcome to the PGC Worldwide Lab meeting, I see that people are still calling in, so let’s give them a minute. Okay, looks like it’s stable now. So today’s topic is analysis across ancestries. We have three speakers today: Laramie Duncan, Elizabeth Atkinson, and Alicia Martin. All three talks are centered around GWAS analysis across ancestries, which is a really timely topic.

Today, we’re gonna do something a little bit different than usual. We are actually gonna wait until after all three talks are over to take questions. Then, we will allow you to unmute yourself. So if you have a question for any of the presenters, you can just unmute yourself and ask a question directly. And if that’s not gonna work, then we can still take questions via the chat function.

Okay, so let’s get started with Laramie Duncan, who is currently an assistant professor of psychiatry at Stanford University. Laramie received a joint PhD in neuroscience and clinical psychology from the University of Colorado, and then did a postdoc in statistical genetics in the labs of Jordan Smoller and Mark Daly at MGH [Massachusetts General Hospital] and Harvard Med School. Today, she will be talking about GWAS analysis in PTSD across diverse populations, a special interest group in the PGC. So take it away, Laramie.

Laramie Duncan:

Okay, great. Thanks so much, Caroline. I really appreciate this opportunity to present. So thank you to you and Pat. Like you mentioned, I’m going to talk about three topics today related to cross-ancestry analyses. First, the PGC PTSD analyses. Second, polygenic risk scoring analyses. And third, the cross-population special interest group within the PGC. So, starting with PGC PTSD, the original wave of the PGC PTSD analysis was led by Dr Karestan Koenen and my postdoctoral mentor, as Caroline mentioned, was Dr Mark Daly. They provided incredible leadership and mentorship for this so I wanted to mention them upfront.

And the reason that we’re talking about PGC PTSD is that, actually, compared to most large-scale GWAS, our samples were relatively diverse. So, as you can see here, we actually had a minority of European ancestry participants, and we had a rather large sample of African Americans, as well as some Latino/Hispanic individuals. This is again for the first wave of PTSD analyses within the PGC.

At the time when I made this figure, I looked at the other large psychiatric GWAS. And like the rest of medical genetics, we can see that most samples are of European ancestry. The samples that aren’t of European ancestry across medical genetics tend to be from East Asian populations. So, despite the fact that PGC PTSD does have greater representation of more populations, it’s important to note that it’s still not reflective of world ancestry.

But nevertheless, in conducting these analyses, the fact that we had these more diverse samples necessitated modifications to our analysis. And so, when we approached this problem, we did what anyone does: we thought about what might be appropriate, talked to experts, and tried a number of different analysis approaches. And there were a lot of modifications that were necessary.

Just showing one here. We decided ultimately to conduct a trans-ancestry meta-analysis. So, in order to do that, we needed to assign ancestry for each of the participants in our study, such that within each cohort, we could individually conduct ancestry-specific GWAS prior to meta-analysis within each ancestry, and then trans-ancestry meta-analysis. So, actually, though this work has continued in terms of making further analytical improvements to the pipeline, I just want to point everyone here to Caroline’s work on this, and Adam Maihofer and her group. They’ve continued to improve the pipeline with some really nice additions and changes to quality control, imputation steps, and the analysis. This second round of the PGC PTSD analysis is now available on bioRxiv. The link is here. So, for questions about the most up-to-date improvements, I would suggest that people check out this paper or talk to Caroline.
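The two-stage design described above (ancestry-specific GWAS per cohort, meta-analysis within each ancestry, then a trans-ancestry meta-analysis) can be sketched for a single variant. This is a minimal illustration that assumes an inverse-variance-weighted fixed-effect model, which is one common choice; the talk does not say which meta-analysis model the PGC PTSD pipeline used. The function name `ivw_meta` and all effect sizes below are hypothetical.

```python
import math

def ivw_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis for one variant.

    betas: per-study effect estimates; ses: their standard errors.
    Returns the pooled effect estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)

# Hypothetical per-cohort results for one variant, grouped by assigned ancestry.
cohorts = {
    "EUR": {"betas": [0.10, 0.14], "ses": [0.05, 0.07]},
    "AFR": {"betas": [0.08, 0.12], "ses": [0.06, 0.06]},
}

# Stage 1: meta-analysis within each ancestry.
within = {anc: ivw_meta(c["betas"], c["ses"]) for anc, c in cohorts.items()}

# Stage 2: trans-ancestry meta-analysis of the ancestry-specific results.
trans_beta, trans_se = ivw_meta(
    [b for b, _ in within.values()],
    [s for _, s in within.values()],
)
```

A fixed-effect model assumes the same true effect in every stratum; where that seems doubtful, random-effects models are a common alternative.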

Going back to PGC PTSD, the first wave of the PTSD analyses, though, I’d like to point out something that was interesting and informative from these analyses. So, purely by chance alone, it happened to be the case that we had about 10,000 individuals of African American ancestry and European ancestry, with about 25% cases each. Despite these comparable sample sizes, it turned out that we had no significant results in the African American samples, whereas, as expected, we had significant SNP heritability estimates in the European ancestry individuals, as well as polygenic predictions. For example, using schizophrenia external GWAS results to predict PTSD in the European ancestry samples. And based on power calculations, we thought that we would have enough power in these European ancestry samples. But due to expectations of lower transferability in African American samples, we didn’t know if we would have enough power. And in these samples, it turned out that we did not.

The reasons for this are at least somewhat well understood. Number one, we had worse coverage of African ancestry variants in our samples. This is both because African ancestry individuals have greater genetic diversity than other populations and so, for any given number of variants, you can’t cover as much of the genome. But also, as many people know, there’s a bias towards European ancestry variants on genotyping chips. So, both of these problems probably contributed. Also, the lack of external data resources from non-European populations is really a problem, particularly for these African ancestry samples. But this is true for other populations as well. And then, finally, there are also methodological inadequacies currently. For example, differences in linkage disequilibrium and allele frequencies are still not handled as well as they could be in many different analyses.

So, the good news is that in the second wave of the PTSD analyses - this paper that I mentioned before from Caroline - we had larger sample sizes for all populations. And with these larger sample sizes, we could actually estimate heritability and have polygenic predictions in the African American samples. So, it does look like this was an issue of power. And, certainly as expected, it helps to have larger sample sizes.

I’m going to turn now to a couple of quick points about polygenic risk scores. I don’t know if it’s partly just being in Silicon Valley and being near 23andMe, but everyone is talking about polygenic risk scores here. Not within my psychiatry building, but just people are very interested in these. A couple of years ago, for example, 23andMe was releasing polygenic results without any mention that these scores might, or certainly would, have differential performance across ancestry groups. So, we thought that this was a really important topic to spend some time exploring.

So, to give a bit of background, as early as 2009, the ISC paper, Purcell et al., demonstrated that polygenic risk scores that are derived from European ancestry populations have poor performance in African American samples. And this poorer performance in non-European ancestry samples has been shown many times since in lots of different polygenic risk scoring publications. So, the new question that we wanted to look at was, we wanted to quantify the decrement of performance between European ancestry populations and other ancestries.

We looked at a couple of ways of doing this. I’ll just show one bit of information here. We looked at 10 years’ worth of polygenic scoring studies. First, to just see who was included in these studies. As expected, these studies also have primarily European ancestry participants. But then, getting to the performance of polygenic scores across different ancestry groups, we made comparisons. For example, one of the results is that we found that polygenic scores performed about three times better in European ancestry individuals than in African American samples. I think Alicia may mention additional data that she has here, with, I think, even better analyses addressing this question.

So, in summary regarding polygenic scores, the performance is worse for non-European ancestry populations. This is not surprising; this was known for quite some time. But efforts are underway to quantify how much worse the performances are in different populations. I think it’s important to know what to expect both for scientific research, where we want to have accurate power calculations, and to the extent that anyone is using these scores in clinical practice ultimately, or just in their own lives getting 23andMe reports, it’s important to know how well they might work. Efforts are also underway to improve prediction across populations.

But getting to a bigger picture question, I think that the question a lot of us have on our minds is, what do we do about the substantial underrepresentation of most populations in genetic studies? So, this gets me to my last topic. And first, as a point of background information, there are many potential solutions to this problem, and I know that many researchers within the PGC have been working in this area for a considerable time. And so we are just mentioning one additional approach. But, you know, if I had more time, I would mention other efforts that have been ongoing. Something that we did, with Hailiang Huang, who is now junior faculty in Mark Daly’s group at the Broad, is that we started a cross-population special interest group within the PGC, with the goals of improving the applicability of genetic results, conducting analyses, and supporting the involvement of researchers from different parts of the world.

In one of our very first meetings, Roseann Peterson had a really great idea to write a best practices paper describing the ways in which genetic analyses ought to be modified for samples of diverse ancestry, in particular admixed samples and mixtures of different samples. Some of the topics addressed in this paper include specific recommendations for how to modify quality control steps, which parameters should be modified and how and why. As well, there are recommendations for imputation and how to analyze samples in a GWAS or a mixed-model context. And this is absolutely a huge team effort, and I want to just say thank you to this group. I think that the reason such a best practices paper doesn’t already exist is that there isn’t really any one person in the world who could have written a best practices paper. It’s thanks to the analysts in this group especially, who have expertise in particular areas, that we were able to cover these different components of analysis and really write about the various approaches, and what the pros and cons are for taking one approach over another. So that’s what’s in this best practices paper.

Hailiang spoke to an editor at Cell, and they were very interested in this paper. They suggested that we present it as a primer, so that’s the current format of the paper. We’ll be submitting it on Monday, so that’s our first work product. The special interest group is open to anyone; it’s technically within the stats group. If you’re interested, we have meetings on the first Wednesday of the month at 1:00 p.m. currently. You can get in touch with us. Thanks to all these groups for their incredible collaborative efforts. So, thank you.

7.2: Ancestry-Specific PRS

Title: Clinical use of current polygenic risk scores may exacerbate health disparities

Presenter(s): Alicia Martin, PhD (The Broad Institute of MIT and Harvard)

Caroline Nievergelt:

Our third speaker is Alicia Martin. Alicia is an instructor at MGH and an affiliate researcher at the Broad Institute of MIT and Harvard. She received her PhD in genetics from Stanford University, and her current research focus is on developing novel statistical methods to improve the generalizability of genetic risk prediction from Eurocentric genetic studies. So, her talk today is entitled “Clinical Use of Current Polygenic Risk Scores May Exacerbate Health Disparities.” Go ahead, Alicia.

Alicia Martin:

Thanks, Caroline. Um, sorry, try to move this other way. Well, hopefully this goes away. So I’m excited to talk to you guys today, and thanks for having me here. Can you guys all see this part of the zoom screen, on the screen?

Caroline Nievergelt:

We can see your slides.

Alicia Martin:

You can see my slides, okay? Perfect.

So, I think an important point to start at is in thinking about human population history. So all of our genetic differences across populations are sort of shaped by how we originated and how we migrated and mixed as we moved out of Africa. So of course, humans originated in Africa. This has been shown through genetic evidence as well as archaeological and linguistic evidence. So many sources showing this, and then humans migrated Out of Africa, and as they did so, they took a subset of genetic diversity with them as they populated Europe, Asia, Australia, and into the Americas.

Another point that I think is worth talking about here is the fact that GWAS are becoming increasingly powerful, and so that’s very, very exciting because we’re making so many more biomedical discoveries these days. Some drugs are even being developed out of this, and this has been growing at such an exponential rate that it’s been really, you know, impressive and somewhat challenging even to keep track of all of this progress. So that’s been really, really awesome to watch.

Unfortunately, however, as Laramie mentioned earlier, and Elizabeth did as well, genetics has this diversity problem. So now, shading the same growth and progress in genetics by the populations that are represented in these genetic studies, at the individual level, you can see, of course, that the vast majority of participants in genetic studies, about 80% these days, are of European descent. And this is far out of step with the global population, where about 16% of the world is of European ancestry. And perhaps even more troublingly, if we look at the fraction of individuals that have been in these GWAS studies as an overall proportion, the progress in diversifying genomics has somewhat stalled or perhaps slid a little bit since about 2014. And so this is a really big issue if we’re trying to generalize studies to everybody.

And so one of the nuances I want to mention here is that it’s not that the studies have been getting smaller; studies have generally been either staying the same or have been growing in different populations. But one thing that we’ve seen is that studies have been growing much, much, much more rapidly in European descent populations than elsewhere. And so one question that I really focus on is understanding how ancestry study biases in genetics impact the generalizability of the knowledge that we can learn from these studies. And so I break that down in a few ways in different aspects of our work. But in general, I want to highlight a few key points that I think are worth keeping in mind throughout all of these questions.

So one of these is that the fundamental biology is really shared across different populations. So if a person from one population is gonna have a heart attack, it’s probably for the same underlying reasons, for example, as an individual with another ancestry. And that’s true in genetics as well. So when we’ve looked across many different biomedical domains and tried to understand what causal variant effects are in different populations, these really tend to be mostly shared. So it’s not that there’s anything special about genetics that’s different from other biology in general. The causal genetic variants seem to be the same and shared across populations.

But there are some complications in interpreting genetics across populations for several different reasons. So, of course, there’s—it’s also worth keeping in mind that there’s more genetic variation within than between populations. So populations are not substantially genetically differentiated to the point where we’re finding completely distinct genetic populations. In fact, most of the genetic variation is shared; that’s common across populations, and there’s more genetic variation within than between.

Another point that I think is really worth keeping in mind is that the LD structure, this correlation structure of the genome, is one driving factor that’s created a lot of challenges, because this really differs across populations as a function of human history, going back to the first map that I showed you.

So, to try to address this question of how generalizable our genetic studies are, I started with computing polygenic risk scores. So Laramie talked about this earlier, this great enthusiasm in polygenic risk score space, and how it’s been really impressive to see the growth in this area in the past few years. Just so we’re all on the same page, I think most of you are familiar with a polygenic score. But in general, this is just predicting an individual’s phenotype from genotype. So we’re basically taking genotypes from some target individual, some effect size estimates from GWAS that exist, multiplying these together, summing them up across the genome, and that’s basically our phenotypic prediction. That’s a very simplistic method; there are other methods. But some considerations that sort of cut across all of these different methods for computing polygenic scores are which SNPs we should include, what weights we should use, and one thing that we always need to address in terms of the utility of our polygenic scores is how accurate is the score. And this is really going to vary a lot with sample size, the heritability, the genetic architecture of the trait, and a number of different factors. So definitely worth keeping in mind all of these complexities and interpreting polygenic scores across the literature.
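The simple score described above (multiply each genotype by its GWAS effect size estimate and sum across the genome) can be written in a few lines. This is a minimal sketch of that basic weighted-sum method only; the function name, variant IDs, and weights are hypothetical, and real pipelines add steps such as SNP selection, LD handling, and allele alignment that are not shown here.

```python
def polygenic_score(dosages, effect_sizes):
    """Weighted sum of effect-allele dosages for one target individual.

    dosages: dict mapping variant ID -> effect-allele count (0, 1, or 2).
    effect_sizes: dict mapping variant ID -> GWAS effect size estimate.
    Only variants present in both inputs contribute to the score.
    """
    shared = dosages.keys() & effect_sizes.keys()
    return sum(dosages[v] * effect_sizes[v] for v in shared)

# Hypothetical GWAS weights and genotypes, for illustration only.
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.125}
person = {"rs1": 2, "rs2": 1, "rs3": 0, "rs4": 1}  # rs4 has no weight and is ignored

score = polygenic_score(person, weights)
```

The considerations raised in the talk map onto the inputs: which SNPs to include determines the keys of `effect_sizes`, and which weights to use determines its values.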

So a study that we had done previously showed that population history really impacts genetic risk prediction across diverse populations. And so one key point here is that genetic prediction accuracy decays with increasing genetic divergence between the discovery and target populations. So, in this figure on the right, I’m showing you a polygenic score distribution for height computed in the 1000 Genomes Project. So you can see that we predicted European populations, for example, to be taller than American and South Asian populations, and we predicted East Asian and African populations to be the shortest globally. But these differences really don’t line up with anthropometric studies and they’re rather misleading. These distributional shifts are really, really massive, and so this is clearly not necessarily reflecting reality.

We also set up some coalescent simulations alongside some statistical genetics simulations and showed that these polygenic scores can differ across populations arbitrarily, and these are not necessarily meaningful. And we also showed that neutral human evolution alone can be sufficient to explain these differences. We don’t necessarily rule out selection, but we do say that neutral evolution and drift, in particular, may be driving some of these differences. There’s a really interesting couple of papers that I want to point out by Mashaal Sohail, Jeremy Berg, and colleagues that were published in eLife yesterday on this topic.

So we wanted to look at this in a large-scale setting, and so to do that, we looked at the UK Biobank data. The UK Biobank is, of course, mostly consisting of European descent individuals, and so we used those individuals to conduct GWAS for several different traits, 17 traits that were all quantitative, so things like height and BMI, and then a number of blood panel traits. And then we used the diversity in the UK Biobank in this non-European subset to try to understand how generalizable prediction accuracy is.

In general, what we saw is that if we normalize prediction accuracy across all of these 17 traits to how accurately we predicted in Europeans, we saw a pretty substantial drop-off in prediction accuracy across these different populations. So, for example, on the rightmost part of this graph, we saw a four and a half-fold improvement in prediction accuracy in Europeans with respect to how well we’re doing in African descent populations. We’re also doing about twice as well predicting in European populations as in East Asians, and you can see how well we’re doing in these other populations. So these are really, really quite large disparities here.

So why is this happening? A lot of people have written about this before, and it’s been well-covered in the literature, but in general, there’s a pretty predictable basis for disparities in polygenic risk scores, and it obviously relates to who we’re studying. So why, though? Well, as you know, GWAS are best powered to discover variants that are common in the population. So if we’re studying European populations over and over again, in general, we’re detecting variants that are most common in European populations, which are then able to explain more of the phenotypic variation in European populations than in other populations where those variants are less common. Additionally, the LD differences across populations mean that we are probably getting better tag SNPs in the GWAS in European populations than in other populations. And then there are other really much more complex topics that are influencing this generalizability: differences in environment, selection, and other more complicated differences.

But I want to emphasize that there’s a lot of consistent promise from diversifying efforts so far. So, for example, the PGC schizophrenia working group, in work led by Hailiang Huang with first author Max Lam, has been working to really increase the sample size of East Asian individuals in schizophrenia studies. So right now, the European case-control studies in schizophrenia are still about threefold larger than the East Asian studies, but the progress in the East Asians has been really, really rapid and massive lately. And so when we looked at how well we were able to predict East Asian schizophrenia risk in these case-control cohorts, in general, what we saw was that we were much better able to predict East Asian schizophrenia risk using the ancestry-matched East Asian data, despite the fact that the European training data was about threefold larger. So that indicates that there’s really a lot of promise and value in doing ancestry-specific studies here.

So lastly, I want to turn to this effort that we did to try to compare some biobank-scale analyses. So generally, what we were really interested in was doing equal-size GWAS in the UK Biobank and in the Biobank Japan, where we have a lot of overlapping traits that have been deeply studied. Then, we’re interested in predicting in the European populations and in the Japanese population to try to understand what prediction accuracy looks like, to see if this was symmetric and comparable across populations. We also tested prediction accuracy in the UK Biobank African descent populations.

So in general, what we saw was that ancestry-matched disease prediction was most accurate. So on the left, I’m showing you along the x-axis the disease, and on the y-axis, I’m showing you the prediction accuracy, and the ancestry-matched results are indicated by matching colors. So on the y-axis is the target of prediction, and the summary statistics that we’ve used to generate the predictors are shown in blue for the Biobank Japan individuals and in red for the UK Biobank individuals. And so in general, we did better in both scenarios with the ancestry-matched disease prediction. One thing that’s sort of interesting that we learned was that the overall disease prediction was more accurate in Japan, and that was sort of a consequence of how these cohorts were constructed. So the Biobank Japan cohort is a more hospital-based disease-ascertained cohort, whereas the UK Biobank is a healthier and wealthier than average population-based cohort.

A similar finding emerges from the quantitative traits. So for these general health measures—anthropometric and blood panel traits—in general, the ancestry-matched quantitative trait prediction is also most accurate. Taking a look at these y-axes, you can again see notable differences in how accurate prediction is in each of these populations, and the quantitative traits are predicted most accurately overall in general in the UK European samples, and that’s again because of this ascertainment in the UK Biobank.

Okay, so I want to stop here and wrap up and think about next steps. So in general, I think polygenic scores are really exciting and interesting and may have some power to improve clinical models, but right now I’m a little concerned or kind of a lot concerned that they’re likely to increase health disparities due to these vast Eurocentric GWAS cohorts. So I see this as a call for a few pushes that we really need to make concerted efforts. One is we need much more diverse GWAS studies, and another is that we need new statistical methods to address these major issues. And on this topic, since it can be a bit sensitive, I just want to urge everyone that’s working on these cross-cultural, you know, widespread topics to communicate their research responsibly and widely and anticipate the implications of your research, thinking about the potential negative consequences that will happen.

I also wanted to point out that there are large efforts, for example, going on in Africa to scale up GWAS there, especially in the psychiatric space, for example, for developmental disorders and for schizophrenia, and I think this is really exciting and it’s taught us a lot about how to do cross-cultural research in an ethically responsible way that accompanies some capacity-building efforts.

So I want to acknowledge that this is a huge effort involving a ton of really awesome people, including my advisor and mentor, Mark Daly.

7.3: Local Ancestry and Admixed Populations

Title: Tractor: A framework enabling the well-calibrated genomic analysis of psychiatric traits across admixed populations

Presenter(s): Elizabeth Atkinson, PhD (Department of Molecular and Human Genetics, Baylor College of Medicine)

Caroline Nievergelt:

Elizabeth Atkinson is a postdoc at MGH and the Broad with Mark Daly and Ben Neale. Elizabeth did her PhD at Washington University in St. Louis and is currently developing resources that allow for improved genetic analysis in admixed populations. She’s a member of the PGC PTSD group, and she has developed a local ancestry analysis framework that enables well-calibrated genomic analysis of psychiatric traits across admixed populations.

Elizabeth Atkinson:

All right, thank you so much again for inviting me to talk about this project. I’m really excited to hear the feedback. So, yeah, as Caroline said, today I’m going to tell you about a project I’ve been working on, hoping to improve the ability to do sophisticated genomic analyses on admixed populations.

So, to start off, there we go. I’ll reiterate a point that, you know, we’re hoping is now becoming common knowledge in the GWAS community, and that Laramie just spoke deeply about. The vast majority of our association studies are conducted on European cohorts. And if we look more closely at this sort of wedge of the pie that is non-European, we’ll notice that only a handful of percent are populations that are admixed. Just to make sure that we’re all on the same page starting off, when I say “admixed,” I’m talking about an individual whose ancestry is not homogeneous but rather is comprised of several ancestral populations.

So importantly, there are actually many more samples out there who have been genotyped or sequenced alongside phenotypes, but they’re not actually making it into this figure. They’re being excluded from analysis due to being too admixed. And there are also several significant large-scale efforts to collect psychiatric phenotypes alongside genomic data in diverse populations, some of them led right here at the Stanley Center. So there really is a pressing need for the development of novel methods that will allow the easy inclusion of admixed people into association studies. We really can’t afford to kind of leave large swaths of our data on the table anymore, and you know, let alone the other problematic aspects of leaving entire ethnic groups understudied.

So, admixed individuals are generally removed due to the challenges of accounting for their complex ancestry. So there are kind of concerns about population structure infiltrating analysis and biasing results, which can lead to false positive associations. Studies that do include admixed individuals tend to generally only correct for PCs. So PCs, actually, I think I can make a laser pointer here. Let me do that. All right, so PCs generally only correct for kind of average ancestry. So that captures whether somebody is, say, you know, 75% African and 25% European, or vice versa, but PCs can’t really account for a lot of the more fine-scale population structure that can still be present in the data. And this is important because this sort of fine-scale pattern might be different, for example, in case and control cohorts, and still leave the door open for false positives.

So here, I’m going to present a novel analytical framework to hopefully rectify this issue and allow for the easy incorporation of admixed people into association studies. We specifically do this by accounting for local ancestry, which takes into account this finer scale of population structure. So to get a little bit more into the actual kind of method, we’re calling this method “Tractor” for the moment. The core feature of the proposed framework relies on inferring population structure, as I said, informed by local ancestry. So the first step is this automated pipeline to call local ancestry tracts in your sample. Just in case you haven’t seen a painted karyogram figure before, here’s an example of a Latino individual. The autosomes are along the x-axis, position along the chromosomes is on the y-axis, and then the two strands of each chromosome are painted according to the ancestral origin of that tract.

So we use this information after we collect it to kind of improve long-range phasing and haplotype tract recovery, which I’ll go into in a lot more detail in coming slides. And then the end goal is basically to be able to extract out the component ancestries of interest. For example, if you have a large European cohort and you have this admixed population, you could use Tractor to scoop out the European bits of your admixed cohort to include alongside your European individuals. And, you know, the same goes for the African and Native American components in this example. So, in this way, you can kind of leverage the information in admixed people; you don’t have to exclude them from analysis anymore.
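The idea of scooping out one ancestry's component from an admixed individual can be illustrated with phased alleles and per-strand local ancestry labels. This is a minimal sketch of the concept only, not the Tractor implementation; the function name, the tuple-per-variant data layout, and the "AFR"/"EUR" labels are assumptions made for illustration.

```python
def ancestry_component(alleles, ancestry, label):
    """Extract the part of a phased genotype that sits on one ancestry's tracts.

    alleles:  list of (strand0, strand1) alleles per variant, coded 0 or 1.
    ancestry: list of (strand0, strand1) local ancestry labels per variant.
    label:    the ancestry to extract, e.g. "EUR".
    Returns two lists, one entry per variant:
      hap_counts - number of strands (0, 1, or 2) assigned to that ancestry
      dosages    - alternate-allele count carried on those strands
    """
    hap_counts, dosages = [], []
    for (a0, a1), (anc0, anc1) in zip(alleles, ancestry):
        on0, on1 = anc0 == label, anc1 == label
        hap_counts.append(int(on0) + int(on1))
        dosages.append(a0 * int(on0) + a1 * int(on1))
    return hap_counts, dosages

# Hypothetical admixed individual at four variants.
alleles = [(1, 0), (1, 1), (0, 1), (0, 0)]
ancestry = [("AFR", "EUR"), ("AFR", "EUR"), ("EUR", "EUR"), ("AFR", "AFR")]

eur_haps, eur_dosage = ancestry_component(alleles, ancestry, "EUR")
```

Keeping the haplotype count next to the dosage matters because a dosage of zero means something different when the individual carries zero European strands at that site than when they carry two.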

Since the first step in this pipeline involves calling local ancestry, I wanted to validate that it was performing well in the kind of target populations that we had in mind. So the use case I’ll be talking about for the rest of this talk is the African-American demographic context, and we modeled this after PGC PTSD African-American cohorts. So to test local ancestry inference using one of the very worthy existing methods to call local ancestry, RFMix, I simulated a truth dataset that resembles these realistic PGC PTSD populations using real haplotype data. This will retain the LD patterns and, you know, other genomic features present in real data that are sometimes hard to simulate. And using this truth dataset with known phase and local ancestry, I then ran Tractor step one, calling, you know, local ancestry, and quantified how often we got it correct. And, hearteningly, about 98% of the time, no matter how you slice it, we were getting the correct local ancestry calls, so this seems to perform very well.

Next, I wanted to see how the pipeline would work in a more real-looking dataset. So usually, we don’t have perfect phase; we have statistically phased data. So I took our truth dataset and ran it through a standard phasing pipeline, SHAPEIT2, using 1000 Genomes as a reference panel. And I noticed something that we hadn’t initially expected to see, which is that a lot of the originally long tracts were really broken up by these switch errors in phasing. So even though local ancestry inference is performing really well, you know, it’s calling well whether we have zero, one, or two African tracts at a given position, these long-range haplotypes are being disrupted due to switch errors. So we ended up building in an additional step in Tractor to find and unkink, as we’re calling it, this garden hose, to recover these long-range haplotypes.
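One way to picture the unkinking idea is below. The sketch assumes a switch error shows up as the two strands of a chromosome exchanging ancestry labels between adjacent windows, and it repairs that by swapping the strands from that point on. The detection rule and the function name `unkink` are assumptions made for illustration, not a description of how Tractor implements this step.

```python
def unkink(hap0, hap1):
    """Undo suspected phase switches between two strands of one chromosome.

    hap0, hap1 -- ancestry labels per window for the two phased strands.
    A switch is assumed wherever the strands carry different ancestries and
    the pair of labels is exactly the previous pair with strands exchanged.
    Returns the two relabeled strands.
    """
    fixed0, fixed1 = [hap0[0]], [hap1[0]]
    swapped = False  # whether strands are currently exchanged vs. the input
    for i in range(1, len(hap0)):
        previous = (hap0[i - 1], hap1[i - 1])
        current = (hap0[i], hap1[i])
        if current[0] != current[1] and current == (previous[1], previous[0]):
            swapped = not swapped  # ancestry hopped strands: likely a switch
        first, second = (current[1], current[0]) if swapped else current
        fixed0.append(first)
        fixed1.append(second)
    return fixed0, fixed1


# A long African tract chopped up by two switch errors:
kinked0 = ["AFR", "AFR", "EUR", "EUR", "AFR", "AFR"]
kinked1 = ["EUR", "EUR", "AFR", "AFR", "EUR", "EUR"]
strand0, strand1 = unkink(kinked0, kinked1)
```

In a real pipeline the alleles on each strand would have to be swapped together with the ancestry labels. Note also that this rule cannot tell a phase switch from a genuine pair of coincident ancestry breakpoints on both strands.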

So here’s our karyogram again. Now, after this unkinking step had been implemented, we noticed that there were still stretches where, because we had painted the chromosomes on the statistically phased data, some overhang areas were still not able to be recovered. So we decided to implement one more iteration of RFMix on this sort of ironed-out dataset to see if this could help recover these full haplotypes. And indeed, this did dramatically improve the situation.

So just to convince you that not only did this make our karyograms look prettier, but this actually made the data look more realistic, I modeled the tract length distributions using a Poisson waiting time centered at nine, which was the number of generations ago that the pulse of admixture occurred. So given this, this is what we would expect the European tracts of these African American individuals to look like, given all things being correct. And we can use this to put a p-value on how likely it is to get the distributions in the various treatments of the data. So in our truth dataset, again focusing on these red tracts, which are the European segments, we do see a distribution pretty close to expectations here, almost around nine. After statistical phasing, however, the tracts get extremely short; a distribution with this many tiny tracts has a p-value on the order of 10 to the negative 28, so very, very unlikely. After unkinking, it’s about twice as improved, and with one iteration of the pipeline, we have gotten things almost back up to where they should be. And just to zoom out and put this all into perspective another way, you can see how we really dramatically improved things from this original purple phased situation back down to this yellow line, which is as close to the truth dataset, in black here, as we’ve gotten it so far. So we are indeed making the situation look statistically significantly more realistic by recovering these long-range haplotypes, which can be important too if you’re concerned with things like LD, for example, in your dataset.
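For intuition about why tracts from a recent pulse should be long, here is a sketch under a common textbook simplification, which is not necessarily the exact model used in the talk. It assumes recombination breakpoints accumulate as a Poisson process over the generations since a single pulse, and that a tract ends whenever a breakpoint brings in a different ancestry. Tract lengths are then approximately exponential, with rate (T - 1)(1 - m) per Morgan for an ancestry making up fraction m of the population. The function names are hypothetical.

```python
import math


def tract_end_rate(generations, ancestry_fraction):
    """Rate (per Morgan) at which a tract of the given ancestry ends."""
    if generations <= 1:
        raise ValueError("need more than one generation since admixture")
    return (generations - 1) * (1.0 - ancestry_fraction)


def mean_tract_length_cm(generations, ancestry_fraction):
    """Expected tract length in centimorgans, ignoring chromosome ends."""
    return 100.0 / tract_end_rate(generations, ancestry_fraction)


def prob_tract_longer_than(length_cm, generations, ancestry_fraction):
    """Exponential survival function: P(tract length > length_cm)."""
    rate = tract_end_rate(generations, ancestry_fraction)
    return math.exp(-rate * length_cm / 100.0)


# Illustrative inputs: a pulse nine generations ago, 20% European ancestry.
expected_cm = mean_tract_length_cm(9, 0.2)
share_long = prob_tract_longer_than(10.0, 9, 0.2)
```

Under these assumptions, an excess of very short tracts relative to the exponential expectation is the kind of departure that the p-values described above would flag.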

We’ve also tested this in a bunch of other demographic scenarios, and it performs, you know, similarly well. So with different admixture fractions, putting different demographic models in, so pulse times at different points in the past, and also between different ancestries with varying divergence times.

So in the last couple of minutes here, I’d like to take a step back and look at one of the main applications of this framework, which is really to show that it improves performance in the GWAS context. So not only should we better correct for population structure by using this local ancestry information, but we can actually identify new loci through an increase in power from local ancestry as well. The statistical model that’s built into Tractor basically tests each SNP for an association with the phenotype using the logistic regression model that I’ve shown here, where X1 is the number of copies of the risk allele from the first of your ancestries, X2 is the number of copies from your second ancestry, and then you can put in whatever other covariates, importantly including global PCs, as the rest of your parameters. You’ll also notice that this is currently written for a two-way admixed context, but it could be scaled up to an arbitrary number of ancestries, and you can further test if the risk allele is ancestry-specific by evaluating the difference between beta1 and beta2 using a Z test.
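The slide itself is not part of the transcript, so the following is a reconstruction of the model from the spoken description, written for the two-way case. The covariate symbols and the covariance term in the test statistic are assumptions; the talk only says that a Z test on the difference between the two betas is used.

```latex
% y: case/control status; X_1, X_2: risk-allele copies on ancestry 1 and 2
% C_k: other covariates, including global PCs (notation assumed)
\operatorname{logit} P(y = 1) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \sum_{k} \gamma_k C_k

% Test of an ancestry-specific effect, H_0: \beta_1 = \beta_2
Z = \frac{\hat\beta_1 - \hat\beta_2}
         {\sqrt{\operatorname{SE}(\hat\beta_1)^2 + \operatorname{SE}(\hat\beta_2)^2
                - 2\,\operatorname{Cov}(\hat\beta_1, \hat\beta_2)}}
```

For more than two ancestries, the same form would presumably carry one dosage term per ancestry.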

So we wanted to see if there were any power gains using this model, and I want to acknowledge Adam here, who heroically ran a whole lot of these simulations with me. So to do this, we simulated again a realistic African-American demographic scenario. We drew a biallelic disease variant with the probability of each genotype copy drawn from the minor allele frequency. We then simulated a disease phenotype assuming a 10% disease prevalence, and an individual’s risk of developing the phenotype was modified basically by their percent admixture and the presence of the minor allele on the African genetic background. So this is basically assuming no effect in a European genetic context but an effect in an African genetic context, which would be, you know, equivalent to modifying effect sizes due to a tag SNP being present on the African background and not in Europeans for a shared causative mutation.
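A stripped-down version of this kind of simulation can be sketched as follows. The defaults (80% African ancestry, 20% minor allele frequency, an odds ratio of 1.2) are illustrative values, and the sketch makes simplifying assumptions that the talk does not state: each haplotype's ancestry is drawn independently from a single population-wide fraction, and the 10% prevalence is used as the baseline risk for people with no risk alleles on the African background. The function name is hypothetical.

```python
import random


def simulate_cohort(n, maf=0.2, afr_fraction=0.8, prevalence=0.10,
                    odds_ratio_afr=1.2, seed=0):
    """Simulate (x_afr, x_eur, y) per person for one biallelic variant.

    x_afr / x_eur -- minor-allele copies on African / European haplotypes
    y             -- 1 for case, 0 for control; only alleles on the African
                     background change the odds of disease.
    """
    rng = random.Random(seed)
    base_odds = prevalence / (1.0 - prevalence)
    cohort = []
    for _ in range(n):
        x_afr = 0
        x_eur = 0
        for _haplotype in range(2):
            on_african = rng.random() < afr_fraction  # local ancestry
            carries_minor = rng.random() < maf        # allele on this strand
            if carries_minor and on_african:
                x_afr += 1
            elif carries_minor:
                x_eur += 1
        odds = base_odds * odds_ratio_afr ** x_afr    # no European effect
        risk = odds / (1.0 + odds)
        y = 1 if rng.random() < risk else 0
        cohort.append((x_afr, x_eur, y))
    return cohort


cohort = simulate_cohort(5000, seed=42)
cases = sum(y for _, _, y in cohort)
```

Fitting a standard model to `x_afr + x_eur` and a local-ancestry-aware model to `x_afr` and `x_eur` separately, over many replicates, would then give the two power curves being compared.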

So we ran a whole lot of simulations varying all of these parameters to characterize the landscape of how Tractor would improve your power. This is the simplest context, just showing the allele frequency held constant at a 20 percent minor allele frequency. In all of these plots, the dashed line is Tractor, and the solid line is the traditional GWAS method, which does incorporate PCs. And you’ll notice that, at the same sample size, shown in blue and black, we do see a significant power gain using local ancestry, and this gain can get really big in different genomic contexts. So, for example, when you introduce a minor allele frequency difference between the ancestries, now shown in this red line, this gap between the methods grows very dramatic. And I want to dwell on this for a moment just to show how significant this power improvement is. So if you pick your favorite power, let’s say 80%, you’d be able to detect variants that are of an odds ratio of about 0.1 lower using our local ancestry incorporating method compared to your traditional GWAS. Or if you pick your favorite odds ratio, let’s say 1.2, you would basically have very little power to find this variant in a normal context, but we’d be able to detect it at very high power using our method.

So we’ve also done a lot of tests characterizing other aspects of the genomic landscape. Since I’m running a little low on time, I’ll just briefly mention the most significant one, which is the effect size difference across ancestries. So now we’ve introduced an effect in both backgrounds, not only in the African genetic context. The major take-home point is on the right-hand side here, where we have the effect now swapped to be only in the European background rather than the African background, and you’ll notice that we would have effectively zero power to detect it using a traditional model but can detect it at reasonable power with Tractor. This is because European ancestry only makes up about 20% of the sample, so the signal would really have gotten swamped out if we hadn’t deconvolved the local ancestry tracts. And importantly, in a context where we would actually expect no improvement from local ancestry, where the effect is the same and everything is the same in both ancestries, we don’t see much of a power loss, only a very minimal power loss from using Tractor. So there doesn’t seem to be, you know, a major hit from including this in your analyses. So if you’re uncertain of the effect of ancestry, it seems like it’s not a massive problem to incorporate this. And importantly, we’re talking about the perceived effect of this tag SNP, not the real causative mutation. So even if the shared causative mutation is the same across ancestries, what you actually would detect in your GWAS could differ depending on the minor allele frequency or LD patterns or environmental interactions or many other factors that could be different across these ancestries.

Alright, to sum up, today I’ve talked about this readily implemented pipeline that should allow you to include admixed individuals in your GWAS in a well-calibrated manner. I am currently optimizing it with members of the Hail team from the Broad Institute, so it should be very quick and work on all systems once it’s finalized. But it’s already written up in Python, and I can, you know, distribute it to whoever’s interested.

We have also built in these extra features to improve long-range phasing, and I’ve shown you today that we not only boost your sample size by allowing inclusion of admixed individuals, but we can actually further boost your power by leveraging this information from admixed populations to find novel variants. We’re also hoping to take this in a few kind of directions. I’m starting testing on an empirical dataset that has a very well-established, you know, ancestry-specific hit, so blood lipids in African-American individuals, to make sure this really does work not only in simulation but in real empirical data as well.

We’re hoping that, since recently admixed individuals have disrupted LD blocks, we should be able to leverage this to more kind of fine-scale pinpoint causal variants, so we’re sort of diving into this now too and developing a heterogeneity test to try to automatically suggest and eliminate candidate sites based on your admixed populations. And I’m also working with Alicia Martin, who’s speaking immediately next, and some other collaborators here at the Broad and MGH to try to build a polygenic risk score framework that would actually produce reliable estimates for admixed populations, which remains another kind of notable gap in current methods.

And I’ll just close by saying that not only is this procedure useful in the GWAS context, but it could really be applied to any situation where you need to control for population structure in a very fine-scale manner. So, for example, even things such as, you know, evolutionary studies doing genome-wide scans of selection.

So with that, I’ll acknowledge my collaborators and advisers, and I guess I’ll take questions at the very end. But thank you for your attention.