Chapter 5: GWAS Analysis

Chapter goals:

  1. Understand what are the main analysis steps in a Genome-Wide Association Study (GWAS).
  2. Understand the GWAS quality control metrics and considerations for each step.
  3. Understand what imputation is and why it is useful for GWAS.
  4. Gain understanding of the statistical background to GWAS association testing and considerations for those association tests.
  5. Understand how meta-analyses are important and useful for GWAS, and various methods for implementing them.

5.1 Quality Control

Goals of this section:

  1. Understand the basis of genotyping quality control and why it is important.

  2. Understand the basic pre-imputation quality control steps (missingness, Hardy-Weinberg outliers, heterozygosity, relatedness, population stratification).

  3. Understand various considerations for each step of quality control, based on your data.

The first video by Drs. Grasby and Colodro Conde provides a brief introduction to the logic of the steps of QC (how genetic data are generated for subsequent QC, briefly what each step of the QC does, and why these steps are useful and necessary for future analyses of cleaned-up data. The second video by Dr. Coleman is a nice QC guide which provides overviews of dealing with missingness, relatedness, and population stratification. The third video from Dr. Demirkan provides key steps of QC for the GWAS data, similar to the first two videos, but it additionally discusses steps of quality controlling imputed data and factors affecting imputation (starting at 25:57).

Quality Control: Introduction

Title: Quality control

Presenter(s): Katrina Grasby, PhD (The Psychiatric Genetics Group, QIMR Berghofer Medical Research Institute)

Coauthor(s): Lucía Colodro Conde, PhD (The Psychiatric Genetics Group, QIMR Berghofer Medical Research Institute)

Level: Beginner friendly

Length: 16:34

Link to video transcript here. Tradução para o português disponível aqui.


Running Quality Control on Genotype Data

Title: How to run Quality Control on Genome-Wide Genotyping Data

Presenter(s): Jonathan Coleman, PhD (Social, Genetic, and Developmental Psychiatry Centre, King’s College London)

Level: Beginner friendly

Length: 16:19

Link to video transcript here. Tradução para o português disponível aqui.


Considerations for Genotyping QC

Title: Considerations for genotyping, quality control, and imputation in GWAS

Author: Ayşe Demirkan, PhD (School of Biosciences, Surrey Institute for People-Centred Artificial Intelligence, University of Surrey)

Level: Beginner friendly

Length: 33:21

Link to video transcript here. Tradução para o português disponível aqui.


5.2 Imputation

Goals of this section:

  1. Understand what imputation is and why it is important for genotyping
  2. Understand the steps of imputation, including the concept of phasing, reference panels, imputation servers, and imputation accuracy parameters.
  3. Understand some of the math behind imputation.

The first video by Dr. Mészáros gives an introduction to haplotypes and imputation and explains the concepts of phasing and imputation in an easily understandable way. The second video by Dr. Sarah Medland talks about the steps needed for imputation, including the imputation reference panels, common software for phasing and imputation, imputation servers, content in the output files from imputation, and brief explanation of imputation accuracy parameters (r2 or INFO). The third video by Dr. Browning discusses the details and math behind imputation: for example, why choose to perform imputation when you can sequence, as well as the Hidden Markov Model for imputation. Note: The imputation-related content ends at 32:07. After this, the speaker talks about best programming practices.

Imputation Introduction

Title: Haplotypes and Imputation

Presenter(s): Dr. Gábor Mészáros, PhD (Institute of Livestock Sciences (NUWI), University of Natural Resources and Life Sciences)

Level: Beginner friendly

Length: 19:25

Link to video transcript here. Tradução para o português disponível aqui.


Imputation Steps

Title: Imputation

Presenter(s): Sarah Medland, PhD (The Psychiatric Genetics Group, Queensland Institute of Medical Research)

Level: Beginner friendly

Length: 16:21

Link to video transcript here. Tradução para o português disponível aqui.


Imputation Deep-Dive

Title: An Introduction to Genotype Imputation

Presenter(s): Brian Browning, PhD (Department of Medicine, University of Washington)

Level: Intermediate/Advanced level.

Length: 44:37

Link to video transcript here. Tradução para o português disponível aqui.


5.3 Association Testing

This section highlights the statistical theory behind association testing in a GWAS, and discusses the considerations for evaluating associations between genetic variants and a phenotype. The first video by Dr. Neale discusses the background and history of genetic epidemiology, heritability, and the methods for uncovering genetic associations with traits. The second video by Dr. Verhulst gives an introduction to the statistics of GWAS, including association testing, effect sizes, statistical power, and how to calculate power.

The Biometrical Model & Basic Statistics

Title: Biometrical Model and Basic Statistics

Presenter(s): Benjamin Neale, PhD (Harvard Medical School; Analytic and Translational Genetics Unit, Massachusetts General Hospital)

Level: Beginner friendly

Length: 34:09

Link to video transcript here. Tradução para o português disponível aqui.


Hypothesis Testing, Effect Sizes, and Statistical Power

Title: Hypothesis Testing, Effect Sizes, and Statistical Power

Presenter(s): Brad Verhulst, PhD (Department of Psychiatry and Behavioral Sciences, Texas A&M University)

Level: Intermediate

Length: 35:52

Link to video transcript here. Tradução para o português disponível aqui.


5.4: Meta-Analysis

In this video, Dr. Peloso gives a comprehensive look into the different steps of GWAS and highlights best practices for GWAS and meta-analyses combining GWAS data across multiple studies. 

Title: Genome-wide association study design and interpretation

Presenter(s): Gina Peloso, PhD (Department of Biostatistics, Boston University)

Level: Beginner friendly

Length: 55:39

Link to video transcript here. Tradução para o português disponível aqui.