Chapter 9: Advanced Topics
Chapter goals:
- Gain a more extensive understanding of advanced genetic analyses.
- Understand what a copy number variant is and methods for defining CNVs in your data.
- Understand the concept behind Mendelian Randomization and methods for running MR.
- Gain a comprehensive understanding of Genomic Structural Equation Modeling (Genomic SEM) and methods for performing Genomic SEM.
- Understand gene x environment interactions and how they are measured.
- Gain understanding of Twin studies and how a twin-based analysis is run.
- Understand how drug target analyses and association tests are run on genetic data, and how this can help with the development of therapeutics for various disorders.
9.1 Copy Number Variation
This video by Dr. Howrigan explains copy number variation: what is a copy number variant, and how is it detected in genetic data. Additionally, examples of CNV analyses demonstrate what a CNV file format looks like, as well as output from CNV analyses, and how to perform CNV burden and association testing on that data.
The paper referenced in this talk:
Marshall CR, Howrigan DP, et al. Contribution of copy number variants to schizophrenia from a genome-wide study of 41,321 subjects. Nat Genet. 2017 Jan;49(1):27-35. doi: 10.1038/ng.3725. Epub 2016 Nov 21.
Title: How to run Copy Number Variation (CNV) analysis
Presenter(s): Daniel Howrigan, PhD (Broad Institute of Harvard and MIT)
Level: Intermediate
Length: 20:03
Link to video transcript here. Tradução para o português disponível aqui.
9.2 Mendelian Randomization
This video from Dr. Smith gives a very brief and simple explanation of Mendelian Randomization, and how it is an important tool in detecting causality.
Title: A two minute primer on Mendelian Randomization
Presenter(s): George Davey Smith, PhD (MRC Integrative Epidemiology Unit, University of Bristol)
Level: Beginner
Length: 2:16
Link to video transcript here. Tradução para o português disponível aqui.
9.3 Genomic Structural Equation Modeling
The following six videos by Drs. Grotzinger and Nivard give a comprehensive look at Genomic Structural Equation Modeling (Genomic SEM) for examining shared and distinct genetic relationships across multiple disorders. The first two videos give an introduction to the concept and basic usage of Genomic SEM, while the latter four videos dive deeper into the methods and analyses that can be performed with this tool.
Title: Genomic Structural Equation Modeling: A Brief Introduction
Presenter(s): Andrew Grotzinger, PhD (Institute for Behavioral Genetics, University of Colorado Boulder)
Level: Beginner
Length: 10:43
Link to video transcript here.
Title: Short Primer on Structural Equation Modeling (SEM) in Lavaan
Presenter(s): Andrew Grotzinger, PhD (Institute for Behavioral Genetics, University of Colorado Boulder)
Level: Intermediate
Length: 11:34
Link to video transcript here.
Title: GenomicSEM: Input/Explaining how S and V are estimated and what they are
Presenter(s): Michel Nivard, PhD (Deptartment of Biological Psychology, Vrije Universiteit Amsterdam)
Level: Intermediate
Length: 23:18
Link to video transcript here.
Title: Working through examples on the Genomic SEM wiki one by one: munge, ldsc, usermodel functions
Presenter(s): Michel Nivard, PhD (Deptartment of Biological Psychology, Vrije Universiteit Amsterdam)
Level: Intermediate
Length: 22:54
Link to video transcript here.
Link to tutorial GitHub here and Wiki here.
Title: Multivariate GWAS in Genomic SEM
Presenter(s): Andrew Grotzinger, PhD (Institute for Behavioral Genetics, University of Colorado Boulder)
Level: Intermediate
Length: 30:08
Link to video transcript here.
Title: Using Genomic SEM to Understand Psychiatric Comorbidity
Presenter(s): Andrew Grotzinger, PhD (Institute for Behavioral Genetics, University of Colorado Boulder)
Level: Intermediate
Length: 1:01:02
Link to video transcript here.
9.4 Interactions with Environmental Factors
In this first part of Section 9.4, Dr. Lambert gives an introduction to the concept of interaction terms for statistical analyses: understanding how and when to use them, and how to interpret these terms. The first video gives a brief introduction, and the second video gives an introduction to interaction terms when using continuous variables.
In the second part of this section, Dr. Westerman gives an introduction to gene-environment interactions in the context of psychiatric genetics, describing how GxE interactions can be detected in genetics data, how statistical power can impact the detection of GxE interactions, and describing the difference between additive and multiplicative interactions.
Title: Dummy variables: interaction terms explanation
Presenter(s): Ben Lambert, PhD (College of Engineering, Mathematics and Physical Sciences, University of Exeter)
Level: Beginner friendly
Length: 4:35
Link to video transcript here. Tradução para o português disponível aqui.
Title: Continuous variables: interaction term interpretation
Presenter(s): Ben Lambert, PhD (College of Engineering, Mathematics and Physical Sciences, University of Exeter)
Level: Beginner friendly
Length: 4:53
Link to video transcript here. Tradução para o português disponível aqui.
Title: Gene-environment interaction analysis
Presenter(s): Kenny Westerman, PhD (Broad Institute of Harvard and MIT)
Level: Intermediate (content will be easier to those with a statistical genetics background)
Length: 42:28
Link to video transcript here.
9.5 Family-Based Analysis
In this video Dr. Maes describes univariate (one phenotype) methods of Twin modeling to determine familial genetic and environmental traits using OpenMx software.
Title: Univariate/MonoPhenotype Twin Modeling in OpenMx
Presenter(s): Hermine H. Maes, PhD (Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University)
Level: Intermediate
Length: 18:21
Link to video transcript here.
Link to OpenMx software.
9.6 Therapeutic Implications
In the below video, Dr. Whirl-Carrillo gives a comprehensive overview of pharmacogenetics methods using the example of the Pharmacogenomics Knowledge Base (PharmGKB) database, which is a large database of clinical information, drug label annotations, and curated pathways for annotating genes. This talk describes how this database was put together, and how it can be used for performing pharmacogenomics analyses and augmenting genomics analyses.
Title: Pharmacogenomics knowledge for personalized medicine
Presenter(s): Michelle Whirl-Carrillo, PhD (Department of Biomedical Data Science, Stanford University School of Medicine)
Level: Intermediate
Length: 44:27
Link to video transcript here.
Link to the PharmGKB website.