Datasets

Large genetic and genomic datasets are increasingly being made publicly available for research use. In the following videos, Drs. Palotie, Liu, and O’Donnell-Luria introduce the large datasets available from the FinnGen, PsychENCODE, and gnomAD resources, respectively.

FinnGen is a large consortium of public and private institutions (universities, hospitals, biobanks, pharmaceutical companies, etc.) that collects and analyses data from over 500,000 Finnish biobank participants, with the ultimate goal of gaining novel insight into the prevention, diagnosis, and treatment of disease.

The PsychENCODE consortium is composed of many academic institutions whose goal is to generate large-scale genomic data (gene expression and regulation) across different brain regions from individuals with psychiatric disorders. This includes bulk and single-cell RNA sequencing, ChIP-seq, Hi-C, ATAC-seq, and other measures of chromatin activity for multiple brain regions across different developmental stages.

The Genome Aggregation Database, or gnomAD, is a large reference of human genetic variation built from whole exome and whole genome sequencing data, covering the allele frequency spectrum from rare to common variation.

FinnGen

Title: Intro to FinnGen

Presenter(s): Aarno Palotie, MD, PhD (Institute for Molecular Medicine Finland (FIMM), University of Helsinki)

Level: Beginner

Length: 19:24

Link to video transcript here.

Link to FinnGen website.


PsychENCODE

Title: Introduction to PsychENCODE

Presenter(s): Chunyu Liu, PhD (Department of Psychiatry, SUNY Upstate Medical University)

Level: Beginner

Length: 19:34

Link to video transcript here.

Link to PsychENCODE website and PsychSCREEN website.


gnomAD

Title: gnomAD: Using large genomic data sets to interpret human genetic variation

Presenter(s): Anne O’Donnell-Luria, PhD (Broad Institute of Harvard and MIT)

Level: Beginner

Length: 54:20

Link to video transcript here.

Link to gnomAD website.