Chapter 8.1: SNP Heritability (Video Transcript)

Title: Heritability and SNP Heritability

Presenter(s): Alkes Price, PhD (Department of Epidemiology, Harvard School of Public Health)

Host:

Again, good morning, everyone, and welcome to our first virtually presented MPG session. We’re fortunate today to have Dr. Alkes Price from Harvard School of Public Health speaking. He’s a professor of statistical genetics in the Department of Epidemiology, and today, we’ll be talking about heritability and SNP heritability and the relationship to the genetic architecture of disease.

A few notes about questions: Dr. Price has kindly offered to answer questions both at the end and also, importantly, during the talk itself, to sort of make sure that those are voiced in this virtual format. Please plan on typing them into the Q&A box, which, at least on my screen, is at the very bottom. There, you can type your question and also upvote questions posted by others that are of particular interest to you, and I will do my best to keep an eye on that and to interject as needed during the talk. Also, if you have questions that you prefer to save until the end of the talk, I will keep an eye on that, and we can aim to address those then. So, again, welcome!

Alkes Price:

Thanks very much for joining, and I think we can get started with the talk. All right, well, good morning, everybody. I’m Alkes Price from the Harvard School of Public Health, and welcome to this first virtual MPG primer session. I’m going to talk about heritability and SNP heritability, which is basically an introduction to the genetic architectures of disease. And as Diane mentioned, I do encourage everybody to jump in with questions throughout the talk in the format that Diane communicated. So, let’s get started.

I think that everybody in this audience will be aware that genome-wide association studies have already been very successful at producing important biological insights. This is just one example from the schizophrenia landmark study. Schizophrenia GWAS was published in the year 2014 in Nature, with a lot of contributions from people at Broad. This landmark study had a huge number of interesting and important findings. And yet, at the same time, I think everybody in the audience will be aware that even though genome-wide association studies have found a lot of things, they certainly have not found everything. And in the case of schizophrenia, we know that the findings of the 2014 paper explained about 3% of heritability, whereas the heritability of schizophrenia has been estimated at about 64% from twin studies.

So, there’s this big gap between what we found and what we believe is out there, and this gap is classically known as the “missing heritability”. And this story about missing heritability goes back to around 2008 or so. This is a commentary of Maher 2008, Nature, and in the year 2008, people didn’t really know what the cause of the missing heritability was. Now, I think it’s much better understood, and this talk introducing the concept of SNP heritability will also provide a review of what we know about the answer to this mystery of missing heritability.

So, with that in mind, here’s an outline of my talk: I’ll start with an introduction and a definition of heritability. Then, I’ll talk about genome-wide association studies and missing heritability. And then, we’ll delve into this idea of heritability explained by SNPs, also known as SNP heritability. And if we happen to have extra time, there are some extra bonus topics pertaining to heritability that we might have time to scratch the surface on. So, let’s start with an introduction to heritability.

So, heritability is generally defined as the proportion of phenotypic variance that is due to genetic effects. And most of the time when people are talking about heritability, they’re talking about narrow-sense heritability, which is the proportion of phenotypic variance due to additive genetic effects. You could also talk about broad-sense heritability, which could include epistatic or recessive-dominant effects, but those are generally harder to estimate. So, generally, when people are talking about heritability, they’re usually talking about narrow-sense heritability denoted as lowercase h squared.

People have been trying to estimate narrow-sense heritability for a long time, at least back to the year 1886. One way that you could do this is you could take some relatives and see if they are somewhat phenotypically similar, because if a trait is genetically heritable, then you would expect that relatives are going to be somewhat phenotypically similar. And there’s a method called Haseman-Elston regression, which I’m not going to describe in detail. But here, you’re basically regressing the phenotypic similarity on the genetic similarity, or the expected genetic relationship, amongst a particular pair of relatives. So, you know, a parent-child pair or a sib-sib pair have an expectation of about 50% shared genetics, and you could ask how phenotypically similar are those pairs? What’s their phenotypic correlation?

And so this graph is sort of an amalgamation of different results that were compiled in the paper of Visscher et al. 2010, in which each point represents one study. You can kind of see that, generally, people who are very closely related to each other tend to have very correlated values of height. This is generally sex-adjusted standardized height that people are studying. Whereas on the other hand, people who are only a little bit genetically similar to each other tend to have height that’s only a little bit phenotypically correlated, and that sort of stands to reason for what you’d expect for a trait that’s largely but not completely heritable.

The slope of this line in this paper is estimated at 0.747, which might mean that height is something like 75 percent heritable in terms of narrow-sense heritability. But there’s a little bit of a surprising finding here, which is that there’s an intercept. You’d think that as your familial genetic relationship goes to zero, the phenotypic correlation ought to go to zero, right? If you’re sharing close to zero genetics, you should have close to zero phenotypic correlation. So, you’d really expect this red line to go through the point (0, 0), and yet, surprisingly, this red line seems to not go through the point (0, 0). There’s this large intercept, and there’s a lot of speculation, different hypotheses about what this could be due to. Probably the simplest explanation is shared environment; that even people who are just cousins come from a similar type of socio-economic background or something like that, which makes them more disposed to be taller or more disposed to be shorter as a consequence of environmental effects. And that is one possible explanation for why this line does not go through (0, 0).
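As a rough illustration of the regression just described, here is a minimal Python sketch that fits a straight line to phenotypic correlation against expected genetic relationship. The data points are invented for illustration and are not the values compiled in Visscher et al. 2010; the helper name `fit_line` is also just a placeholder.

```python
# Illustrative only: the (relatedness, correlation) pairs below are invented,
# not the values compiled in Visscher et al. 2010.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Expected genetic relationship for each hypothetical study of relative pairs:
# cousins 0.125, half-sibs 0.25, parent-child or full sibs 0.5, MZ twins 1.0.
relatedness = [0.125, 0.25, 0.5, 0.5, 1.0]
# Hypothetical phenotypic correlations for sex-adjusted standardized height.
phenotypic_corr = [0.20, 0.29, 0.47, 0.49, 0.86]

slope, intercept = fit_line(relatedness, phenotypic_corr)
print(f"slope (rough narrow-sense heritability): {slope:.3f}")
print(f"intercept (similarity not explained by relatedness): {intercept:.3f}")
```

With made-up points like these, the slope plays the role of the heritability estimate and a nonzero intercept is the part of the phenotypic similarity that relatedness alone does not account for, such as shared environment.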

And there are more complicated explanations as well, involving genetic ancestry and assortative mating, and so on and so on. But we just have to keep in mind as we think about narrow-sense heritability that it is complicated, and there’s a lot of room for different sorts of confounding and complex effects as we try to estimate this quantity that I have defined.

Probably the most popular way right now to try to estimate narrow-sense heritability is using the classic monozygotic (identical) and dizygotic (fraternal) twin study. The idea here is, the hope (not the guarantee, but the hope) is that monozygotic twins versus dizygotic twins have basically the same amount of shared environment, and really the only thing that differs between monozygotic twins and dizygotic twins is that monozygotic twins share 100% of their genetics, whereas dizygotic twins share only 50% of their genetics.

So, the difference in how much phenotypic correlation you see between monozygotic twins, on one hand, versus dizygotic twins, on the other hand, should give you a sense of how heritable the trait is. And this twin-based approach is widely considered the gold standard among ways to estimate narrow-sense heritability. And it should work unless there’s a difference in the amount of shared environment between monozygotic twins, on one hand, and dizygotic twins, on the other hand, for example, due to effects in the womb, or due to societal effects or family influences, that is, differences in the ways that monozygotic twins versus dizygotic twins are treated. And we have to take seriously this possibility that there could still be some confounding due to differences in the amount of shared environment between monozygotic twins and dizygotic twins, whereby monozygotic twins have more shared environment, and that might actually inflate the twin-based estimates of heritability. I will provide some evidence later in the talk that this is, in fact, likely to be the case, that the twin-based estimates may, in fact, be inflated.
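The talk does not write out a formula for the twin comparison. A common textbook version is Falconer's formula, h² = 2(r_MZ − r_DZ). The Python sketch below uses it to show how extra shared environment in monozygotic twins would inflate the estimate. All numbers are hypothetical, and the simple additive-plus-shared-environment model is an assumption made for illustration.

```python
def falconer_h2(r_mz, r_dz):
    """Falconer's estimate of narrow-sense heritability: 2 * (r_MZ - r_DZ)."""
    return 2.0 * (r_mz - r_dz)

def twin_correlations(h2, c2_mz, c2_dz):
    """Expected twin correlations under a simple additive model.

    MZ twins share all additive genetic variance, DZ twins share half of it;
    c2_mz and c2_dz are the shared-environment contributions for each twin type.
    """
    return h2 + c2_mz, 0.5 * h2 + c2_dz

true_h2 = 0.6  # hypothetical true narrow-sense heritability

# Case 1: MZ and DZ twins have the same amount of shared environment.
r_mz, r_dz = twin_correlations(true_h2, c2_mz=0.1, c2_dz=0.1)
estimate_equal_env = falconer_h2(r_mz, r_dz)

# Case 2: MZ twins have more shared environment than DZ twins.
r_mz, r_dz = twin_correlations(true_h2, c2_mz=0.2, c2_dz=0.1)
estimate_unequal_env = falconer_h2(r_mz, r_dz)

print(f"equal shared environment:   estimate = {estimate_equal_env:.2f}")
print(f"unequal shared environment: estimate = {estimate_unequal_env:.2f}")
```

In this toy model the estimate is exact when the shared environment matches, and it is inflated by twice the difference in shared environment when monozygotic twins share more.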

All right, and so, keeping in mind that these estimates might be inflated, there has been a lot of work published. Probably a broad and recent reference worth looking at is Polderman et al. 2015, Nature Genetics. For example, for height, the most quoted value from twin studies is 0.8. And then, you know, recent studies of cancer have produced estimates of around 0.3 to 0.5. Height is a bit of an outlier, in that it’s extremely heritable, but generally speaking, most of the diseases and complex traits that people tend to be interested in seem to have narrow-sense heritability estimates from twin studies that might be on the order of around 0.5 or a little bit less than 0.5, typically somewhere in that range.

All right, so that’s a brief introduction to narrow-sense heritability, and now I’m going to delve just a little bit more into the missing heritability problem from genome-wide association studies that I’ve already defined early in this talk.

So, again, the missing heritability originally was defined as this gap between what we discover from the genome-wide significant loci for GWAS, versus the estimated narrow-sense heritability from twin studies. And again, using schizophrenia as an example, 3% is the heritability explained by the 108 GWAS loci from PGC 2014, Nature, versus, on the other hand, 64% the narrow-sense heritability that’s been estimated from twin studies. That’s a very large gap. And in the early days of GWAS, people were really very interested to try to understand what the cause of this gap is. Why doesn’t GWAS find everything? Why is it so incredibly far away from finding everything?

And there are a lot of explanations out there, but these are the four that I think are most worth discussing in this MPG primer format. So, I’m gonna go over these four explanations one at a time.

So, the first explanation is common causal variants of exceedingly low effect size. And so, we could imagine some different possible genetic architectures of a disease or trait. One possible genetic architecture is that you have ten common risk variants, which each explain about 1/10 of the heritability, and this is what people thought, you know, a long time ago, way back, maybe early 2000s. This is what people thought disease architectures might look like. There’d be 10 loci; you’d run a GWAS, you’d find the 10 loci, end of story, and that’s what people were expecting.

But it might be more complicated than that. You know, there could be 10, there could be 100, or a thousand, or even 10,000 common risk variants that each explained a tiny, tiny, tiny, tiny proportion of heritability. And in this extreme case at the bottom of this slide where we have 10,000 common risk variants that each explained on the order of one ten-thousandth of narrow-sense heritability, maybe more in some cases and less in some other cases, you can imagine that even at a very large sample size, GWAS is going to be very underpowered to find them. And you might only find a very, very small fraction of the true common causal risk variants, and that’s why you’re only going to explain a very, very small proportion of the narrow-sense heritability, with the ones that you’re actually lucky enough or well powered enough to actually find. And this is what a lot of people believe, that schizophrenia happens to be an example of a particularly polygenic trait with a lot of causal variants, and so this may be what we’re looking at. We may be looking at something like 10,000 common risk variants that each need to explain a tiny, tiny, tiny proportion of heritability. And even at very large sample sizes, you’re going to be underpowered to detect most of them. You’re only going to detect the ones with the largest effects, or you’re going to detect a few because you get lucky or something like that, but most of those common risk variants, you’re just not going to detect them as being genome-wide significant in the GWAS.
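To make the power argument concrete, here is a minimal Python sketch, under simplifying assumptions, of how the power to reach genome-wide significance falls as the same heritability is split across more variants. It assumes a two-sided single-SNP test whose statistic is approximately normal with mean sqrt(N × q²), where q² is the fraction of phenotypic variance explained by the SNP. The sample size, heritability, and the function name `gwas_power` are illustrative choices, not values from the talk.

```python
from statistics import NormalDist

def gwas_power(n_samples, var_explained, alpha=5e-8):
    """Approximate power of a two-sided single-SNP association test.

    Assumes the test statistic is roughly normal with unit variance and
    mean sqrt(N * q^2), where q^2 is the variance explained by the SNP.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_mean = (n_samples * var_explained) ** 0.5
    return std_normal.cdf(z_mean - z_crit) + std_normal.cdf(-z_mean - z_crit)

h2 = 0.5             # hypothetical heritability split equally across variants
n_samples = 100_000  # hypothetical GWAS sample size

power_by_m = {}
for n_causal in (10, 100, 1_000, 10_000):
    per_variant = h2 / n_causal
    power_by_m[n_causal] = gwas_power(n_samples, per_variant)
    print(f"{n_causal:>6} causal variants: power per variant = {power_by_m[n_causal]:.4f}")
```

Under these assumptions, ten large-effect variants are essentially always found, while with ten thousand variants each one is detected only rarely, so the genome-wide significant loci together explain a small share of the heritability.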

We can even go to a greater extreme, the so-called infinitesimal model. Nobody believes that this is realistic, but it can be useful to think about as a theoretical construct. And so, the infinitesimal model is a model in which all of the common SNPs in the genome are causal risk variants with causal effect sizes that are, you know, normally distributed with mean 0 and variance h squared divided by M, where M is the number of SNPs. And so, there you’re sort of spreading the narrow-sense heritability across literally all the common SNPs in the genome. So, many common causal variants of tiny effect is one very plausible explanation as to what is going on with missing heritability. We’ve got a lot of common causal SNPs of tiny effects, and GWAS, even at large sample sizes, are not finding most of them. And we do know that there are a lot of traits that are extremely polygenic. I’ve already said that schizophrenia is a particularly polygenic trait. This is a Manhattan plot from the blood pressure GWAS of Evangelou et al. 2018, Nature Genetics, and this is another illustration of an extremely polygenic trait which clearly has an extraordinarily large number of causal loci.
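The infinitesimal model described above can be written out as follows. This is a sketch under stated assumptions that are not from the talk: genotypes x_j are standardized to mean 0 and variance 1, SNPs are uncorrelated with each other, and the phenotype has variance 1.

```latex
\beta_j \sim \mathcal{N}\!\left(0, \frac{h^2}{M}\right), \quad j = 1, \dots, M
\qquad\Longrightarrow\qquad
\operatorname{Var}\!\Big(\sum_{j=1}^{M} x_j \beta_j\Big)
  = \sum_{j=1}^{M} \operatorname{E}[x_j^2]\,\operatorname{E}[\beta_j^2]
  = M \cdot \frac{h^2}{M}
  = h^2 .
```

Each SNP contributes an expected h²/M of the phenotypic variance, which is tiny when M is in the millions, yet the contributions sum to the full heritability.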

So, a second explanation that people have been interested in for a while is the explanation of rare and low-frequency causal variants. And we know that GWAS are not well powered to identify rare and low-frequency causal variants because the power sort of scales with the allele frequency. If you have a really rare variant, it doesn’t occur very often in the sample, and so you’re not going to be well powered to detect it. Mathematically, we know that at a specific per-allele fixed effect size, the power scales something like with the sample size times P times 1 minus P, where P is the minor allele frequency. And so, if you have a really rare variant, then you’re definitely not going to be well powered to detect it as being genome-wide significant in a GWAS, where you’re conducting single-marker tests. I mean, I’m not going to get into the topic of multi-marker gene-level tests, which is a topic for a different MPG primer session. And so, GWAS are not well-powered to identify these rare and low-frequency causal variants as being genome-wide significant, and maybe that’s where a lot of the missing heritability is, and that’s an explanation people have been interested in.
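The scaling with sample size times p times 1 minus p can be illustrated with a short Python sketch. It assumes a unit-variance phenotype and Hardy-Weinberg proportions, so that a SNP with per-allele effect beta explains 2p(1 − p)β² of the variance (the factor of 2 comes from the two allele copies). The effect size and sample size are hypothetical.

```python
def variance_explained(maf, beta):
    """Variance explained by one SNP on a unit-variance phenotype,
    assuming Hardy-Weinberg proportions: 2 * p * (1 - p) * beta^2."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2

def noncentrality(n_samples, maf, beta):
    """Approximate noncentrality of a single-marker test: N * 2p(1-p) * beta^2."""
    return n_samples * variance_explained(maf, beta)

n_samples = 100_000  # hypothetical sample size
beta = 0.05          # hypothetical per-allele effect, in phenotype SD units

for maf in (0.3, 0.05, 0.01, 0.001):
    ncp = noncentrality(n_samples, maf, beta)
    print(f"MAF {maf:<6}: noncentrality = {ncp:.2f}")

# How much more signal a common variant carries than a rare one
# with the same per-allele effect.
ratio = noncentrality(n_samples, 0.3, beta) / noncentrality(n_samples, 0.001, beta)
print(f"MAF 0.3 versus MAF 0.001: {ratio:.0f} times the noncentrality")
```

At a fixed per-allele effect, the signal for a variant at 0.1% frequency is a couple of hundred times weaker than for one at 30%, which is why single-marker tests rarely reach genome-wide significance for rare variants.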

But recent work from Zeng et al. 2018, Nature Genetics, as well as other work from our group (Schoech et al.), suggests that there’s not really a lot of heritability coming from rare and low-frequency SNPs. So, according to Schoech et al., less than 10% of SNP heritability (I know I haven’t defined SNP heritability yet, we’ll get to that later) is coming from SNPs with minor allele frequency less than 1%. Now, there’s a little bit of a complicated story here involving negative selection. Under simplified assumptions where you don’t have any selection, and where you might also have to assume a constant effective population size across time (some assumptions I’m not going to get into), you might expect that SNPs with MAF less than 1% ought to explain about 1% of SNP heritability. So, there is some excess here, where they actually explain, according to this paper, about 9% of SNP heritability, which is a lot more than 1%. And that excess, whereby 9% is more than 1%, is a consequence of the action of negative selection: SNPs that have important effects are, generally, bad for the organism, and because they’re bad for the organism, those SNPs cannot rise to high frequency and stay rare. So, you might expect that rare SNPs will tend to have larger effects, and in fact, that’s exactly what you see, and that’s why SNPs with minor allele frequency less than 1% explain somewhere around 9% of SNP heritability, which is a lot more than 1% of SNP heritability. But even so, at the end of the day, according to these papers, these rare and low-frequency SNPs are not really explaining a lot of heritability, and they are probably not the primary explanation for missing heritability.

Now, to be clear, I don’t, I specifically do not want to say that rare and low-frequency causal variants are not important and we shouldn’t study them, rather, what’s going on here is that if you’re interested in heritability or in explaining a lot of heritability, then rare and low-frequency causal variants don’t contribute a lot. And by extension, if you’re interested in polygenic prediction, which kind of piggybacks off of heritability, then rare and low-frequency causal variants aren’t very important. But on the other hand, if what you’re interested in is discovering interesting disease biology that could lead to a drug target, rare and low-frequency causal variants may be really important. You may identify a rare coding variant that explains a minuscule amount of heritability, but it has a really biologically interesting sort of mechanism behind it that might lead to a drug target. Totally fine if it explains a minuscule amount of heritability if it’s going to lead to an actionable drug target. So, let’s just keep in mind that even though they’re not so relevant for heritability, they could be very important for disease biology and developing drug targets.

Ah, the next explanation that people… I see that there’s a Q&A. Diane, do you want to read out the Q&A?

Diane: Thank you. Yes, um, I will do that now. So, the question from Now Son is: “How is the heritability of rare variants estimated for a GWAS study that is underpowered?”

Alkes: Okay, so the question is, “How is the heritability of rare variants estimated for a GWAS study that is underpowered?” And the topic of SNP heritability that I will delve into in the third part of this talk is about estimating the aggregate heritability contributed by all the variants in the genome. I mean, mostly what I’m going to talk about is estimating the heritability collectively explained by all common variants in the genome. But something related to that that you could do is you could estimate the heritability explained by all rare variants in the genome. And so, these studies that I’ve quoted on this slide, Schoech et al., Zeng et al., and others, they’re doing something sort of like that. They’re sort of extending the methods to estimate SNP heritability, or heritability explained by all SNPs in the genome, which I will be talking about in the third part of this talk. They extend that to estimate the heritability explained by all SNPs in a particular minor allele frequency class, such as the heritability explained by all rare variants. And I will aim to return to this question in the third part of the talk when I talk about SNP heritability, and hopefully, what I just said will become more clear after I delve into what’s going on with SNP heritability.

All right, so moving on to explanation number three on my list, which is copy number variation, and it is in principle possible that copy number variants are biologically important and contribute a lot to heritability but are not well tagged by common SNPs. And there’s another question.

Diane: I don’t see the question in the Q&A box.

Alkes: Okay, in that case, I will keep going.

Diane: Thank you.

Alkes: It’s possible that the copy number variants could be important for disease, but their effects may not be so well tagged by SNPs. So, if you’re only looking at SNPs, you won’t see them. And we have to take that possibility seriously. Way back in the year 2010, this paper that I’ve cited suggested that common copy number variants that could be typed on existing platforms (and that’s an important qualification) did not contribute much, but that may be more about the technology than about the biology. And more recent work of Sudmant et al. 2015 is suggesting that structural variants are enriched on haplotypes identified by GWAS. So some of the common SNPs that we identify as genome-wide significant in GWAS may be tagging causal copy number variants. So, I don’t think we have a conclusive answer to this question right now as to whether untyped and untagged or only partially tagged copy number variants are responsible for a lot of the missing heritability.

I would like to go out on a limb and hypothesize that this story might be sort of similar to the story with rare variants. And some of these copy number variants, of course, are likely to be rare because you can’t, you know, you can’t knock out a big chunk of the genome and have it not have a huge effect, which would then, due to negative selection, keep the variant rare. I hypothesize that it might be true that, just like rare variants, copy number variants might not explain a lot of heritability and might not be important for quantifying heritability or, by extension, for polygenic risk prediction. But, on the other hand, they might be really important for understanding disease biology, and they might lead us to examples where we understand the biological mechanism and can develop drug targets. So, I hypothesize that they might be not so important for heritability, but at the same time, very important for disease biology and drug targets.

And then finally, the fourth explanation that I wanted to talk about is the possibility that narrow-sense heritability was overestimated in the first place. And I already alluded earlier in this talk to the possibility that twin-based estimates of narrow-sense heritability might be inflated due to shared environment. And there are some other complicated explanations whereby if you have G by G interaction (which is not supposed to be included in estimates of narrow-sense heritability, which is defined as including only additive effects and not G by G interaction effects), then according to Zuk et al. 2012, that could inflate your twin-based estimates. So, there are people out there who believe that the twin-based estimates may be inflated, despite the fact that, until recently at least, they’re the best thing we have.

I’d like to highlight this paper of Young et al. 2018, Nature Genetics, which introduced a new method called relatedness disequilibrium regression, which is predicated on having at your disposal a really large data set with a lot of related individuals, which those authors happened to have because they analyzed the deCODE Genetics data set from Iceland. And they claimed that they have an estimation procedure that is robust to these types of effects of shared environment. And they claim that the narrow-sense heritability of height is only about 0.55. And we have to take those claims seriously. They are, generally, fairly consistent with a couple earlier studies of Zaitlen et al. from our group. And I think people at this point in time tend to believe that 0.8 for height, for the twin-based estimates of narrow-sense heritability of height, that 0.8 probably was an overestimate, and the truth might actually be some number closer to about 0.6 or so. I think that’s what people, most people, believe.

And so, summing it all together, I think, in terms of heritability, the two explanations for missing heritability that are most prevalent are: Number one, you’ve got common variants of exceedingly low effect size that cannot be detected by GWAS as being genome-wide significant. And number two, the twin-based narrow-sense heritability estimates are somewhat too high. Obviously, a slightly overestimated narrow-sense heritability is not going to explain a difference between 0.03 and 0.64 for schizophrenia, which is a humongous difference, and that’s probably more dominated by number one.

So where does this leave us? Well, there’s a sort of fundamental question that was asked originally in a landmark paper of Yang et al. 2010, Nature Genetics, from Peter Visscher’s group. Maybe we can try to estimate the heritability explained by all the SNPs in the genome, or maybe all the common SNPs in the genome, not just the SNPs that are genome-wide significant, but in fact, all the SNPs collectively. Even if we don’t know which ones are the causal ones, we can still try to estimate the heritability jointly explained by all of those SNPs together. And so, that’s this concept of SNP heritability, which is really the main focus of this primer. And I will start to now delve into that. And so, the distinction between, on the one hand, narrow-sense heritability, and on the other hand, SNP heritability, the heritability explained by SNPs specifically, really sort of rests on the distinction between two important ideas:

IBD or identity by descent, and IBS or identity by state. So, identity by descent or IBD means two people are related; they have similar genetics. IBS or identity by state means two people who are not related may still, by chance, have genetics that are a little bit similar, and you might be able to do something with that to learn about complex trait architectures.

So let me delve into a little bit of detail. Let’s start with IBD. Let’s suppose you take two individuals who are related. So, the two individuals I depicted on this slide are brothers, and we could ask ourselves, what is the proportion of the genome that these two brothers share identical by descent, which means inherited from a recent common ancestor? The answer, well, the answer is not exactly 0.5. The answer is approximately 0.5 because it can vary a tiny bit from one SNP to the next, but in expectation it is 0.5. And you can imagine if you have sort of a large set of related individuals who you’re analyzing genetic data from, you can build an IBD matrix, quantifying the IBD for each pair of individuals. And for a pair of individuals who are brothers, the entry would be 0.5 because 0.5 is the IBD of those two brothers. And this K matrix or this IBD matrix, you could use it to estimate narrow-sense heritability.

And no worries if you don’t feel like sorting through all these equations, but this is what it would look like in terms of math. You have a vector Y of phenotypes, and you’re sort of decomposing it into the part coming from genetic effects, called U, and the part coming from environmental effects, called epsilon. The variance of U, the genetic effects, is proportional to the IBD matrix. And where are the environmental effects? Well, that’s captured in the epsilon: if we make a strong assumption of no shared environment, then the environmental variance is proportional to the identity matrix. That is saying we have no cross terms at all between distinct individuals. And then you sort of parameterize the overall phenotypic covariance V in that way, estimate these parameters, and estimate the narrow-sense heritability.
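The slide equations are not part of the transcript, but the model as described in words can be reconstructed as below. The symbols σ_g² and σ_e² are the two variance parameters the speaker refers to; writing K with ones on the diagonal and the no-shared-environment form of the noise term are the assumptions stated in the talk.

```latex
\begin{aligned}
y &= u + \varepsilon, \\
\operatorname{Var}(u) &= \sigma_g^2 K, \qquad
\operatorname{Var}(\varepsilon) = \sigma_e^2 I, \\
V = \operatorname{Var}(y) &= \sigma_g^2 K + \sigma_e^2 I, \\
h^2 &= \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}.
\end{aligned}
```

Here K is the IBD matrix (for example, 0.5 for a pair of brothers) and I is the identity matrix, so any shared environment between individuals would violate the second line.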

There are methods, you know, max likelihood methods or restricted maximum likelihood methods for estimating that parameter, or actually, you’re estimating the two parameters, Sigma square G and Sigma square E from the previous slide. I’m going to just kind of skip most of the mathematical details. You can read about it in these various papers. And you know, no worries if you’re not following all the math.

So, on the other hand, we could have two unrelated individuals, such as the two individuals depicted on this slide. And even though these two individuals are completely unrelated, it is possible that they might be just a little bit more genetically similar than average or just a little bit less genetically similar than average, just by statistical chance. And we can sort of quantify that using the equation on the right half of this slide, where we can compute something that is sometimes called the genetic relationship matrix, or could be called an IBS matrix. But basically you just compute the correlation across SNPs between these individuals’ genotypes, suitably normalized. Typically, if you have two unrelated individuals, you can practically guarantee that that number is going to be really, really close to zero. But in these standardized units, in which the mean is going to be zero on average, it might be a tiny bit bigger than zero, like 0.004, or it might be a tiny bit less than zero, like -0.004. And you might expect that if you have a heritable trait, then two individuals who just by chance are a tiny bit more genetically similar than average ought to have slightly more concordant phenotypes, whereas on the other hand, individuals who are slightly less genetically similar than average ought to have slightly less concordant phenotypes. And this is something that you can use to estimate the heritability explained by SNPs, or heritability explained by genotyped SNPs in the way that it was originally employed. And the real question here is, what is the set of SNPs? Well, the answer is related to the SNPs that you used to compute this genetic relationship matrix.
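The equation on the slide is not in the transcript. A standard way to write the normalized genotype correlation described here, following the form used in Yang et al. 2010 for pairs of distinct individuals, is shown below. The notation is an assumption: x_ik is the allele count (0, 1, or 2) of individual i at SNP k, p_k is the allele frequency at SNP k, and M is the number of SNPs.

```latex
A_{ij} = \frac{1}{M} \sum_{k=1}^{M}
  \frac{(x_{ik} - 2p_k)\,(x_{jk} - 2p_k)}{2\,p_k\,(1 - p_k)}
```

Each term is a product of two standardized genotypes, so for unrelated individuals the average is close to zero, with small positive or negative deviations by chance.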

If you only use SNPs on chromosome 1 to compute the genetic relationship matrix, then you’ll get an answer that has something to do with SNPs on chromosome 1. If you use all common SNPs to compute the genetic relationship matrix, then you’ll get an answer that has something to do with all common SNPs, and so on, and so on, and so forth. And so, once again, we can model the phenotypic covariance, V, as a sort of linear combination of this genetic relationship matrix or IBS matrix, which is the genetic part coming from SNPs, and then everything else, which is the environmental part, or strictly speaking, it’s everything except the genetic part coming from SNPs.
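As a toy illustration of this idea, the Python sketch below simulates unrelated individuals, builds a genetic relationship matrix from standardized genotypes, and estimates SNP heritability by regressing products of phenotypes on the matrix entries. Note that this uses Haseman-Elston regression, a simple moment estimator mentioned earlier in the talk, and not the maximum likelihood or REML procedures used in practice. All sizes, frequencies, and function names are assumptions chosen so that the pure-Python code runs quickly; the estimate is noisy at this scale.

```python
import random

def simulate_genotypes(n_ind, n_snps, rng):
    """Unrelated individuals: independent allele counts (0/1/2) per SNP."""
    freqs = [rng.uniform(0.1, 0.5) for _ in range(n_snps)]
    return [[int(rng.random() < p) + int(rng.random() < p) for p in freqs]
            for _ in range(n_ind)]

def standardize_snps(genotypes):
    """Center and scale each SNP using its in-sample allele frequency."""
    n_ind = len(genotypes)
    columns = []
    for k in range(len(genotypes[0])):
        p = sum(row[k] for row in genotypes) / (2.0 * n_ind)
        if p <= 0.0 or p >= 1.0:
            continue  # skip SNPs with no variation in this sample
        sd = (2.0 * p * (1.0 - p)) ** 0.5
        columns.append([(row[k] - 2.0 * p) / sd for row in genotypes])
    return [list(ind) for ind in zip(*columns)]  # back to individuals x SNPs

def genetic_relationship_matrix(z):
    """A[i][j] = average over SNPs of the product of standardized genotypes."""
    n_ind, n_snps = len(z), len(z[0])
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    for i in range(n_ind):
        for j in range(i, n_ind):
            value = sum(a * b for a, b in zip(z[i], z[j])) / n_snps
            grm[i][j] = grm[j][i] = value
    return grm

def he_regression(grm, y):
    """Slope from regressing y_i * y_j on A_ij over all pairs i < j."""
    xs, ys = [], []
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            xs.append(grm[i][j])
            ys.append(y[i] * y[j])
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)
true_h2 = 0.5            # hypothetical SNP heritability used in the simulation
n_ind, n_snps = 200, 100

z = standardize_snps(simulate_genotypes(n_ind, n_snps, rng))
n_used = len(z[0])
# Every SNP is causal, with effect variance h2 / M (infinitesimal-style model).
betas = [rng.gauss(0.0, (true_h2 / n_used) ** 0.5) for _ in range(n_used)]
y = [sum(g * b for g, b in zip(row, betas)) + rng.gauss(0.0, (1.0 - true_h2) ** 0.5)
     for row in z]
mean_y = sum(y) / n_ind
sd_y = (sum((v - mean_y) ** 2 for v in y) / n_ind) ** 0.5
y = [(v - mean_y) / sd_y for v in y]  # standardize the phenotype

grm = genetic_relationship_matrix(z)
h2_snp_estimate = he_regression(grm, y)
print(f"simulated SNP heritability: {true_h2}, HE regression estimate: {h2_snp_estimate:.2f}")
```

Restricting the genotype columns passed to `genetic_relationship_matrix` to a subset of SNPs, such as one chromosome, would change the matrix and therefore the quantity being estimated, which is the point made in the paragraph above.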

And then we can estimate this quantity called SNP heritability. Again, reverting to a little bit of math here, I feel that if we talk about estimating a quantity, we really want to define that quantity in the entire population. And this is a formal definition of that entire quantity in the population. We can skip the mathematical symbols here, and in words, this is just the maximum amount of phenotypic variance that you could explain using any linear combination of SNPs. And that’s the definition of SNP heritability. That’s a definition in the entire population. That definition does not depend on a particular sample, although it does depend on which set of SNPs you’re looking at. If you’re looking at, you know, just SNPs on chromosome 1, the answer will be smaller than if you’re looking at all SNPs in the genome. Or you might be looking at just a few hundred thousand genotyped SNPs or a large number of imputed SNPs or common SNPs or common and rare SNPs or whatever, and the answer will be different in each case. But this is a quantity that you can define in the entire population. And then, after you’ve defined it in the entire population, then you could obtain an estimate of that quantity, an estimate with noise, of that quantity in a finite sample.
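The formal definition on the slide is not in the transcript. One way to write the verbal definition given here, with notation that is an assumption (y is the phenotype, x_1, ..., x_M are the genotypes of the chosen SNP set, all centered, and expectations are taken over the population), is:

```latex
h_g^2 \;=\; \max_{\beta \in \mathbb{R}^M}
\left[\, 1 - \frac{\operatorname{E}\!\Big[\big(y - \sum_{j=1}^{M} x_j \beta_j\big)^2\Big]}
                  {\operatorname{Var}(y)} \,\right]
```

The maximum is over all linear combinations of the M SNPs, so enlarging the SNP set can only increase the value, which matches the remark that one chromosome gives a smaller answer than the whole genome.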

And this is what was done in this landmark paper by Yang et al. 2010, Nature Genetics. And this is the same math that I showed on an earlier slide for IBD. This is just the same math with the IBD matrix K replaced with the IBS or GRM matrix called capital A. And we’re now intuitively thinking about this in terms of unrelated individuals and SNP heritability, but mathematically, all the math is the same, and all the maximum likelihood or restricted maximum likelihood computations are the same.
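To make the idea concrete without the REML machinery, here is a sketch that uses Haseman-Elston regression instead. This is a deliberately simpler, swapped-in technique and not the restricted maximum likelihood fit used in the paper. It assumes the model Cov(y) = sigma_g^2 A + sigma_e^2 I with a standardized phenotype, so that the expected product of two individuals' phenotypes is h_g^2 times their GRM entry. All simulation settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, true_h2 = 2000, 500, 0.5                 # hypothetical simulation settings

# Simulate unrelated individuals at independent SNPs, then standardize each SNP.
freqs = rng.uniform(0.1, 0.9, size=m)
g = rng.binomial(2, freqs, size=(n, m)).astype(float)
z = (g - g.mean(axis=0)) / g.std(axis=0)

# Phenotype: every SNP causal with a small Gaussian effect, plus noise.
beta = rng.normal(0.0, np.sqrt(true_h2 / m), size=m)
y = z @ beta + rng.normal(0.0, np.sqrt(1.0 - true_h2), size=n)
y = (y - y.mean()) / y.std()

# GRM over the same SNP set, then regress phenotype products on GRM entries
# across all pairs of distinct individuals. The slope estimates h_g^2.
grm = z @ z.T / m
i, j = np.triu_indices(n, k=1)
a = grm[i, j]
yy = y[i] * y[j]
h2_estimate = np.sum((a - a.mean()) * (yy - yy.mean())) / np.sum((a - a.mean()) ** 2)
```

Pairs that happen to be slightly more similar genetically have, on average, slightly more similar phenotypes, and the slope of that relationship is the estimate. REML uses the same covariance model but is generally more precise.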

So, I’m going to choose to not focus on the math in this talk, and I’m gonna focus more on intuition. Now, we have these two quantities: H squared, that’s total narrow-sense heritability, and that corresponds to the question: How phenotypically similar are two relatives?

There’s a question?

Diane: Yes, thank you. I think the question is asking, or the question from Anna Lewis is asking, the following: When you say ‘the entire population,’ do you mean the entire human population?

Alkes: Okay, so this is a good point. This is something I’ve really glossed over, and you know if I wanted to delve deep into the population genetics, that could be a separate MPG primer. But generally, when people talk about a population, they’re talking about a population of a particular continental ancestry. One example: the population would be European Americans. Another example: the population would be the British ancestry individuals from the UK Biobank, and so on and so forth. Now, if I want to be a strict population geneticist, I might define a population as a set of panmictic, randomly mating individuals. Now, nobody believes that European Americans are a set of panmictic, randomly mating individuals. Nobody believes that British ancestry individuals from the UK, or you know, East Asians from Japan, or you know, Nigerians, or whatever population you’re talking about, nobody believes that that’s strictly speaking a panmictically mixing set of individuals. We might just choose to pretend that’s the case as a sort of approximation. And I mean, I mentioned earlier at the beginning of this talk that there are opportunities for all sorts of confounding. One sort of confounding is confounding due to population stratification, due to the differences in genome-wide ancestry amongst different individuals. That, again, could be a topic for another MPG primer. We should be aware of the possibility that if you are studying a population, such as European Americans or any population in which there are differences in genome-wide ancestry amongst different individuals in that population, we have to be a little bit careful, because there is the possibility for confounding due to population stratification. But the short answer is we’re thinking about a specific population of a specific continental ancestry.

All right, and so getting back to this slide, narrow-sense heritability corresponds to the question, you know, how phenotypically similar are two relatives? And it’s kind of implicit that two relatives could be phenotypically similar due to, like, whatever genetics they happen to be carrying. Maybe they’re carrying the same rare variant, maybe they’re carrying the same copy number variant, maybe they’re carrying the same common variant; it’s going to include all of that. On the other hand, we have a smaller quantity, SNP heritability. Originally, this was called the heritability explained by genotyped SNPs because people liked to estimate this just using genotyped SNPs before imputation became kind of universally popular. So that’s why you’ll see this terminology on some of these slides, but you can apply this to any set of SNPs.

And this corresponds to the idea: if I have two unrelated individuals and I use a specific set of SNPs (and the set of SNPs is important) to quantify how genetically similar they are just by chance, then how phenotypically similar will they be? And again, this is a function of a very specific set of SNPs. And because it’s only capturing the heritability causally explained by a very specific set of SNPs, in general, it’s expected to be lower than the total narrow-sense heritability, which is capturing additive effects from all possible genetic variants.

All right, and then finally, there’s a third quantity that has already appeared in this talk, and I’m going to call that H squared GWAS. H squared GWAS is the heritability explained by genome-wide significant SNPs. And I mean if you’re not worried about LD (linkage disequilibrium), then it’s basically just the sum of the variance contributed by each of your genome-wide significant SNPs in turn. But it’s a little bit more complicated if you have LD, and strictly I can define it as the maximum proportion of phenotypic variance that you can explain with any linear combination of genome-wide significant SNPs. So, just focusing on those SNPs that are genome-wide significant, and that’s the number that came in at 0.03 for schizophrenia in a particular study. And I have to be a little bit careful when I say that I’ve defined this; it’s not really a true population-level parameter because it’s a function of which SNPs pop up as being genome-wide significant in a particular GWAS at a particular sample size. So it’s really a function of a specific GWAS that identifies a specific set of genome-wide significant SNPs. It’s really a function of what the set of genome-wide significant SNPs is in a particular study.
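For the no-LD case mentioned here, the sum can be sketched directly. The per-SNP formula used below, 2p(1-p) times the squared per-allele effect, is the standard additive-model expression under Hardy-Weinberg equilibrium with the phenotype scaled to unit variance; it is an assumption added for illustration, and the three SNPs are made-up numbers rather than results from any study.

```python
def variance_explained(maf, beta):
    """Variance explained by one SNP, assuming an additive model,
    Hardy-Weinberg equilibrium, and a phenotype with unit variance."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2

# Hypothetical genome-wide significant SNPs: (minor allele frequency, per-allele effect).
significant_snps = [(0.30, 0.05), (0.10, 0.08), (0.45, 0.03)]

# With no LD between the SNPs, the contributions simply add up.
h2_gwas = sum(variance_explained(maf, beta) for maf, beta in significant_snps)
```

With LD between the significant SNPs, their contributions overlap and a plain sum would double count, which is why the stricter definition is phrased as the best linear combination.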

And so now we have sort of an inequality with three different quantities. We have H squared GWAS, that’s the smallest number, which is just the genome-wide significant SNPs. Then H squared G, that’s the SNP heritability, that’s all the SNPs in the genome. Maybe all the genotyped SNPs, maybe all the common SNPs, whatever flavor of SNPs you want to choose to study at a particular point in time, but it’s, at some level, the heritability explained by all the SNPs in the genome, including all the SNPs that are not genome-wide significant but might contain some signal. And then finally, the largest quantity, total narrow-sense heritability, that’s the additive heritability explained by all genetic variants, which includes not only every possible type of SNP but other types of variants as well, copy number variants, and so on and so forth. So this has been a little bit of a dry technical discussion.

Now, we might want to look at some real data to get some intuition for how this works in practice. I’m going to start with height, which is probably the trait most well-studied by geneticists. In terms of narrow-sense heritability, well, from the twin studies, we have 0.8 (as I mentioned earlier, that’s probably an overestimate, and we’re probably closer to 0.6, but for now, I’m just going to say 0.8 from the twin-based studies). Then we have SNP heritability. This is the quantity that Yang et al. (2010 Nature Genetics) in their landmark paper estimated at 0.45. Then just the heritability explained by GWAS SNPs, well, way back in the year 2010, that was at about 0.10, although it’s gone up a little bit with some studies (Wood et al., 2014 Nature Genetics and Yengo et al., I forget which year, 2018 or 2019, Human Molecular Genetics). It’s a little bit higher now, but you can see that this concept of SNP heritability can explain most, not all, but it can actually explain most of the missing heritability. And this goes back to explanation number one of my four explanations that I quoted earlier, where there are just a lot of SNPs with really small effects that GWAS do not detect as being genome-wide significant, and if you define and estimate this quantity H squared G, which quantifies the heritability explained by all SNPs or all SNPs of a particular category, such as genotyped SNPs, not just the ones that are genome-wide significant, then you get a much, much, much larger number, like 0.45, than just the genome-wide significant ones, such as 0.10.

I like to use the terminology “hidden heritability,” as do others, to explain this gap between H squared GWAS and H squared G. So, heritability, we know it’s there, we know it’s in those SNPs, we just don’t know which SNPs it is because we don’t have enough sample size to do an infinitely powered GWAS to figure out which are all the causal SNPs in the genome. But we know that heritability is there; it’s just kind of hiding. And then, on the other hand, there’s heritability that’s still missing, which is this gap between 0.45 and 0.8, although, as I said a moment ago, maybe it’s really a gap now between 0.45 and 0.6 if we believe 0.6 for height now, and that’s heritability that’s still missing, and that we don’t currently have an explanation for.
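The two gaps can be worked out from the height numbers quoted in the talk. This is just the arithmetic, with variable names chosen here for illustration.

```python
import math

# Height estimates quoted in the talk.
h2_twin = 0.80        # narrow-sense heritability from twin studies
h2_twin_alt = 0.60    # the lower value suggested as more realistic
h2_g = 0.45           # SNP heritability (Yang et al. 2010)
h2_gwas = 0.10        # genome-wide significant SNPs, circa 2010

hidden = h2_g - h2_gwas              # in the SNPs, but not yet pinned to specific SNPs
still_missing = h2_twin - h2_g       # not captured by the SNP set at all
still_missing_alt = h2_twin_alt - h2_g

# The ordering the three quantities are expected to satisfy.
ordering_holds = h2_gwas <= h2_g <= h2_twin
```

Under the 0.8 figure the hidden and still-missing parts are about 0.35 each; under the 0.6 figure the still-missing part shrinks to about 0.15.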

So, this is just a generalization published by Yang et al. 2011 to some other quantitative traits, and I’m not going to go over this in detail other than to say that, of course, these traits are less heritable than height, but qualitatively the story is pretty similar to what’s going on with height.

Diane: We do have one question in the Q&A that I think is maybe timely at this moment, which is from Matthew Worman who says, “Hey Alkes, if one were to use a p-value of 10 to the minus 5 instead of 10 to the minus 8 for GWAS, what fraction of SNPs exceeding this less stringent threshold are false positives?”

Alkes: Okay, so I do think it is of interest to say, “Hey, what happens if I choose some other threshold, let’s say 10 to the minus 5? And I could define a set of SNPs which are SNPs that come in at P less than 10 to the minus 5 in my GWAS”. And I guess I would like to say three different things about that.

Well, the first thing that I want to say is that, just as is the case with the set of genome-wide significant SNPs at five times ten to the minus eight, the nature and characteristics of the SNPs that would come in at ten to the minus five is very much a function of your sample size. I mean, it’s a function of a lot of things, like the genetic architecture of the trait and so on and so forth, but it’s really very much a function of your sample size. So that’s sort of like very dependent on a particular study, you know, the SNPs that come in at P less than ten to the minus five have a particular set of characteristics. That’s the first thing that I wanted to say.

The second thing that I wanted to say is that, although it might depend on the particular study, the particular genetic architecture, the sample size, and so on and so forth, it’s appropriate for us to think intuitively that most of the SNPs that come in at P less than ten to the minus five are going to be false positives. In other words, they’re not truly causal variants or they’re not truly tagging causal variants. In general, that is what I suggest is likely to be the case.

And the third thing that I want to say is that even if most of them are false positives, a very important fraction of them will be true positives. Those true positives will be contributing more heritability. So if I define something called H squared GWAS less than ten to the minus five, which is the heritability explained by a particular set of SNPs that comes in at P less than ten to the minus five in a particular GWAS, then that will lie somewhere in between H squared GWAS, which is just the sum of genome-wide significant ones, and H squared G, and it would probably be substantially greater than H squared GWAS, because even though, I would say that in general, we expect that most of those coming in at less than ten to the minus five would be false positives, a very, very important fraction of them would be true positives, and that would get you a lot closer to H squared G, the true SNP heritability. Although I’m still going to say that, in general, you’d expect that there’s still a lot more signal that is not even captured by P less than ten to the minus five, and you’d still have a gap between H squared GWAS ten to the minus five and H squared G. So, H squared GWAS ten to the minus five might lie somewhere in the middle between H squared GWAS and H squared G, with the details depending on how polygenic the architecture is and what the sample size is and so on and so forth.

Alright, very good. So, generally speaking, H squared G is less than H squared (it’s the narrow-sense heritability), that could be due to rare and low-frequency variants, as well as other types of variants like copy number variants, that are not captured by the SNP heritability of a very specific set of SNPs, usually either genotyped common SNPs or genotyped and imputed SNPs that are mostly common or something like that, that you’re estimating the SNP heritability of.

And then, on the other hand, we have a completely different phenomenon: H squared GWAS less than H squared G. So that’s H squared GWAS less than SNP heritability, where we need larger GWAS sample sizes, or as Matt alluded to, maybe we need just a less stringent genome-wide significance threshold, or whatever. I mean, H squared G is just sort of in the limit of, you know, any SNP with P less than or equal to 10 to the 0 in your GWAS, right? And so in the limit of large sample sizes, we might hope that as our GWAS gets hugely large, then we’ll identify all the associated SNPs, and H squared GWAS will approach H squared G. But there is a bit of a caveat here that even if you’ve got a humongous sample size, like 758,000 samples for blood pressure, if it’s a really polygenic trait like blood pressure, 758,000 samples is not enough, and it’s not even remotely close to being enough, and H squared GWAS at 0.06 may still be a lot lower than this SNP heritability of 0.21.

Alright, so I’ve been saying repeatedly that SNP heritability is a function of the set of SNPs, and that’s definitely true, but we should also keep in mind that it’s a function of the particular population that we’re looking at. And again, as I mentioned earlier, one population that some people sometimes study is British ancestry individuals from UK Biobank, which is mentioned on this slide. And another population that people sometimes study is European Americans, and that happens to be the population that was analyzed in reference two on this slide.

And it turns out that if you estimate SNP heritability in British ancestry UK Biobank samples, you consistently get higher numbers than you do if you estimate heritability in other cohorts, such as European American cohorts. There are different possible explanations for this, but probably the most likely explanation is that SNP heritability is truly higher in a British ancestry UK Biobank population than it is in a European American population. That might be due to just less environmental variance in a set of British ancestry individuals residing in the UK, or more precisely, less environmental variance in a set of British ancestry individuals residing in the UK that the UK Biobank captures, which is actually not a perfectly random subset of British individuals, it’s British individuals who choose to respond to surveys and might have higher than average SES and so on and so forth. And there seems to be less environmental noise and thus larger SNP heritability in that population than in, for example, a general European American population. So let’s just keep in mind that SNP heritability depends on the population studied as well as on the set of SNPs. And, of course, I should say that total narrow-sense heritability also depends on the particular population that you’re studying.

One question that comes up a lot is, which assumptions do we need for estimates of SNP heritability to be valid? And one thing that the world now understands very well is that LD (Linkage Disequilibrium) dependent architectures can lead to bias in estimates. And because I see that I’m running out of time, I’m going to choose to just gloss over the details of that. I want to say very carefully what I mean by LD-dependent architectures. LD-dependent architectures - we’re not talking about tagging here. Of course, we know that a SNP that’s not a causal SNP can tag a different SNP that is a causal SNP. But LD-dependent architectures mean that causal effects vary with the amount of LD a SNP has. This could be just due to minor allele frequency - that’s kind of trivial - that rarer SNPs explain less heritability per SNP because they’re rare and they also have less LD, that’s kind of trivial. But even if you condition on MAF (minor allele frequency), even in a specific fixed MAF, we know that low LD SNPs actually have larger causal effect sizes than high LD SNPs. The reasons for this are quite complicated, we believe they have something to do with negative selection, but it turns out that that violates some of the assumptions of some of these estimation methods and it can lead to biases in estimates. It’s something that we have to watch out for, and it’s something that the world now understands well and knows that we have to watch out for. I’m just going to leave it at that and gloss over some of these other slides.

And then a second question that comes up, which is the second bullet point now on this slide over here, is the question: Is it okay if effect sizes have a non-infinitesimal distribution? And so I haven’t really emphasized this point, but the estimation procedures that I’ve described, where you compute the genetic relationship matrix from unrelated individuals and then use restricted maximum likelihood to fit variance components to estimate heritability, rest on some assumptions about an infinitesimal architecture, where infinitesimal, as I defined earlier, refers to a genetic architecture in which all SNPs are causal with a normal, or Gaussian, distribution of causal effects. The methods assume that. But we now know that even though that assumption is clearly an incorrect assumption, that does not lead to any bias. It’s merely the case that you might be leaving some precision on the table, whereby sophisticated methods that account for the fact that the distribution of causal effects is a little bit more sparse than that can actually produce more precise estimates. But that’s an issue of precision, of getting a lower standard error, more precise estimate. It’s not an issue of bias, and this does not lead to bias, and that is well understood.

All right, um, I do want to mention at least briefly the important point that whereas I’ve been talking about the math in terms of quantitative traits like height, people are very interested in case-control traits because most of the disease traits that we’re really ultimately interested in terms of medical actionability and drug targets and treatment or whatever it is we’re aiming for, you know, even polygenic prediction, we’re generally aiming for it in the context of medically important disease traits like schizophrenia or type 2 diabetes or whatever disease you like to study. And we need a little bit more math to get this to work right, and the type of math that is most commonly used is called the liability threshold model. And again, I’m going to try to avoid going over the mathematical symbols here, but I want to provide the intuition because this is so important. The liability threshold model models that there’s some unobserved underlying continuous number called the liability, and if the liability is above some number, you have the disease. Fortunately, we have an example which is type 2 diabetes, which provides some intuition here. Where maybe you’re doing a GWAS of type 2 diabetes, and all you know is that somebody has told you whether or not they have type 2 diabetes or whether or not they’ve been told by a doctor that they have type 2 diabetes. But there’s an underlying continuous quantity which we call the liability, which is your fasting blood glucose level, which is one of the diagnostic criteria for type 2 diabetes, and we could think of it as, you know, we don’t get to see the liability, but if your fasting blood glucose is above some level, then you have type 2 diabetes. And so what we want to do is we want to do some math that operates on this unobserved continuous-valued liability, and that’s what liability threshold modeling is about.
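The liability threshold idea can be sketched with a small simulation. The prevalence, the heritability on the liability scale, and the split of liability into a genetic and an environmental normal component are all hypothetical choices made for illustration, not values from the talk.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(2)
n, prevalence, h2_liability = 100_000, 0.01, 0.5      # hypothetical values

# Unobserved continuous liability = genetic part + everything else (total variance 1).
genetic = rng.normal(0.0, np.sqrt(h2_liability), size=n)
environment = rng.normal(0.0, np.sqrt(1.0 - h2_liability), size=n)
liability = genetic + environment

# The threshold is set so that a fraction `prevalence` of the population lies above it.
threshold = NormalDist().inv_cdf(1.0 - prevalence)

# What a GWAS actually observes: case or control, not the liability itself.
is_case = liability > threshold
observed_prevalence = is_case.mean()
```

Only `is_case` would be available in a real case-control study. The modeling work consists of reasoning back from that binary label to variance components on the continuous liability scale.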

And I’m not going to go into the details, but all the work that’s been done in this space on SNP heritability has relied on the liability threshold model. And contingent on using the liability threshold model, it has generally produced a qualitatively similar story to the story that I communicated earlier for height. And I’m not going to go into those results in detail.

I do want to mention briefly that I’ve been talking mostly about methods that use individual-level genotypes. There’s been a lot of interest dating back to the year 2012 in methods that instead only require summary association statistics as input, and there are methods now that do input summary association statistics to learn about various flavors of heritability and SNP heritability.

I just want to provide one point of intuition that this is related to the fact, an important observation made by Yang et al. in 2011, that the average chi-square statistic in a GWAS is very closely related to SNP heritability. You should not be surprised if your average chi-square statistic in a GWAS is above 1; that does not imply that you have confounding, but rather, that is directly related to the amount of polygenic signal in that GWAS. An increasingly sophisticated set of methods can use that observation to estimate heritability from summary statistics, and I’m going to gloss over all of the details of that.
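A small simulation can illustrate why the average chi-square statistic exceeds 1 under polygenicity even with no confounding. It assumes independent SNPs (no LD), for which the expected chi-square is approximately 1 + N * h_g^2 / M, with N the sample size and M the number of SNPs. That simplified formula and every setting below are assumptions for the sketch, not the method of any particular summary-statistics tool.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, true_h2 = 2000, 500, 0.5                 # hypothetical simulation settings

# Unrelated individuals, independent SNPs, no population structure.
freqs = rng.uniform(0.1, 0.9, size=m)
g = rng.binomial(2, freqs, size=(n, m)).astype(float)
z = (g - g.mean(axis=0)) / g.std(axis=0)

# Polygenic phenotype: every SNP has a small causal effect.
beta = rng.normal(0.0, np.sqrt(true_h2 / m), size=m)
y = z @ beta + rng.normal(0.0, np.sqrt(1.0 - true_h2), size=n)
y = (y - y.mean()) / y.std()

# Marginal association statistics, one per SNP.
r = z.T @ y / n                                # correlation of each SNP with the phenotype
chi2 = n * r ** 2
mean_chi2 = chi2.mean()

# Invert the approximation E[chi2] = 1 + N * h2 / M to recover a heritability estimate.
h2_from_chi2 = (mean_chi2 - 1.0) * m / n
```

Here the expected mean chi-square is about 3 despite the absence of any confounding, and only the per-SNP statistics are needed to get back to a heritability estimate.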

So, conclusions: we can use family data, such as twin studies, to estimate the narrow-sense heritability of a trait. For even now, in the year 2020, most GWAS have the property that the genome-wide significant loci that they discover do not explain all the heritability, we have missing heritability, and there’s this concept of SNP heritability that can really close that gap.

So, these are some of the bonus topics, really within the heritability space, that I think are super interesting and that I have some bonus slides on, and, as expected, I’m not going to have the time to delve into in detail, but you can check out those bonus slides if you want to. As you know, the slides were sent out to the whole group, and I’d like to acknowledge all the members of my group who have figured all this stuff out and explained it to me. Also, I’d like to put in a brief advertisement for my course, “Advanced Population and Medical Genetics,” which will be offered in spring 2021. With that, I’ll take any additional questions that people may have.

Diane: Thank you very much. OK, as there are no questions in the Q&A box at the moment, please do post any further questions you have to the whole group.

Alkes: There was a question that was just asked.

Diane: Oh great, um, it’s from Matthew Worman who’s commenting: “Thanks, Alkes. I really appreciate the clarity of your presentation.” So I will second that, and thank you. “And I do have one question, which is, I’m curious about, with the identity-by-state definition, the concept of unrelated individuals. Does this require that the variant shared by the two individuals be of independent origin or …?”

Alkes: Okay, so thank you for raising that point. Unrelated individuals, this is another term, just like the term “population,” where it’s a little bit tricky to define it, and if you define it strictly, then it’s probably nothing like what you have in real life. So if I wanted to be really strict about it, I could say unrelated individuals is like, let’s think, I have a simulation, and in my simulation, I’ve got some minor allele frequencies, or haplotype frequencies, and I’m, you know, generating the genome of individual number one, and then completely independently generating the genome of individual number two, and the whole process is completely independent. But in real life, of course, it’s not like that. You know, you take any two people who think they’re unrelated to each other, you know, you could probably go back five or ten generations and find some common ancestor; that’s usually the way that it works. So the detailed answer to that question is, if you’ve got cryptic relatedness, you’ve got this set of individuals from a GWAS; they’re supposed to be unrelated to each other, you don’t think there are any siblings in there, but there’s probably some cryptic relatedness. I mean, I believe that Yang et al. (2010) Nature Genetics came up with some threshold that I think might have been 0.05, or maybe they later changed it to 0.025 or something like that. And, as long as you apply this threshold where you stringently check that there are no two individuals in the data set who are related at a level greater than 0.05, or greater than 0.025, then it is hypothesized that the method is sort of robust to that. And then, even though the people are not strictly unrelated, you should be okay.

Now, on the other hand, what if you have a dataset that consists of both related and unrelated individuals, and you don’t want to throw away half of them because then you’d be zapping your power? Then, what do you do? Well, it turns out you can fit two variance components to jointly estimate narrow-sense heritability and SNP heritability. I’m not going to go over the details; there’s a reference (Zaitlen et al., 2013, PLoS Genetics) from our group. So I think that, in summary, the shortest answer to your question is, as long as you make sure that any cryptically related people are only a little bit related, then you’re kind of okay, and everything that I said still kind of holds.

Diane: Thank you very much. There were three more questions posted, but I’m afraid we’re out of time because the MPG talk will be starting shortly. So, I think we’ll keep those questions in reserve and share them with you, Alkes, if you wish.

Alkes: All are welcome to email me offline at aprice@hsph.harvard.edu.

Diane: Great, thank you so much for your talk, and thanks to the audience for joining in today. So, thanks, Alkes.