Below are common terms and acronyms used within this book and across the field of Psychiatric Genetics.
A
| Admixture |
Genetic ancestry derived from two or more previously distinct populations. |
|
| Allele |
One of two or more variant forms of a gene or DNA sequence found at a specific position (locus) on a chromosome. |
|
| Antisense |
A molecule or strand of nucleic acid that is complementary to a specific RNA sequence, often used in gene regulation. |
|
| Area under the ROC curve (AUC) |
A measure of a classifier’s ability to distinguish between classes in a binary classification problem. AUC ranges from 0 to 1, where 0.5 corresponds to chance-level prediction and 1 to perfect discrimination. |
|
| Assembly (Genome) |
|
|
| Autosome |
Any chromosome that is not a sex chromosome, responsible for carrying genetic information unrelated to sex determination. |
|
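The AUC entry in this section can be read as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The following is a minimal sketch of that rank-based interpretation; the function name `auc` and the example scores are illustrative assumptions, not taken from this book.

```python
def auc(scores, labels):
    """Rank-based AUC: probability that a random case outscores a random control."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case (1) and one control (0)")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks for three cases (1) and three controls (0).
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))
```

Comparing every case with every control is quadratic in sample size, which is fine for illustration but slow for large datasets.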
B
| Base pair |
Two complementary nucleotide bases held together by hydrogen bonds; the basic unit of double-stranded DNA and of paired regions of RNA. |
|
| Biallelic (SNP) |
Single Nucleotide Polymorphism with two possible alleles at a specific genomic position. |
|
| Bonferroni adjustment |
Statistical correction for multiple hypothesis testing in which the significance threshold is divided by the number of tests performed (equivalently, each p-value is multiplied by the number of tests). |
|
| Build (Genome) |
|
|
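As a small worked illustration of the Bonferroni adjustment defined in this section, the sketch below divides the significance level by the number of tests and also reports adjusted p-values. The function name and the example p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold, adjusted p-values, and significance calls."""
    m = len(p_values)
    threshold = alpha / m                           # stricter per-test threshold
    adjusted = [min(1.0, p * m) for p in p_values]  # adjusted p-values are capped at 1
    significant = [p <= threshold for p in p_values]
    return threshold, adjusted, significant


# Hypothetical p-values from four independent tests.
p_values = [0.001, 0.02, 0.04, 0.5]
threshold, adjusted, significant = bonferroni(p_values)
print(threshold, adjusted, significant)
```

With four tests and alpha = 0.05, only the first p-value falls below the per-test threshold of 0.0125.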
C
| Candidate Gene Study |
Research focusing on specific genes to determine their association with a trait or disease. |
|
| Chromatin |
Complex of DNA, RNA, and proteins forming the chromosome’s structure. |
|
| Chromatin Immunoprecipitation Sequencing (ChIP-seq) |
Technique that uses antibodies to isolate DNA bound by a specific protein, followed by sequencing to identify the bound regions. |
|
| Chromosome |
Thread-like structure in cells containing DNA and genes. |
|
| Codon |
Three-nucleotide sequence in mRNA that encodes a specific amino acid during protein synthesis. |
|
| Common variant |
Frequently occurring genetic variant in a population. |
|
| Congenital |
Present at birth, often referring to medical conditions or traits. |
|
| Contig |
|
|
| Complementary DNA (cDNA) |
DNA synthesized from an mRNA template by reverse transcription, used in molecular biology research. |
|
| Copy Number Variant (CNV) |
Variation in the number of copies of a DNA segment between individuals. |
|
| Crossover |
Exchange of genetic material between homologous chromosomes during meiosis. |
|
D
| Deletion |
Removal of a segment of DNA, leading to the loss of genetic material. |
|
| Duplication |
Copying of a DNA segment, resulting in extra genetic material. |
|
E
| Effect estimate |
Statistical measurement of the impact of a factor on an outcome in research (e.g. odds ratio, beta value) |
|
| Elastic Net Regression |
Regularized regression method combining the Lasso (L1) and Ridge (L2) penalties to select relevant variables while handling correlated predictors. |
|
| Enhancer |
Regulatory DNA region that increases transcription of a target gene, often acting at a distance from the promoter. |
|
| Epidemiology |
Study of the patterns and causes of disease and health in populations. |
|
| Epigenetics |
Study of heritable changes in gene expression not involving alterations in DNA sequence. |
|
| Epigenome |
Complete set of epigenetic modifications across an organism’s genome. |
|
| Epigenome-Wide Association Study (EWAS) |
Investigation of epigenetic variations associated with traits or diseases across the genome. |
|
| Epistasis |
Interaction between different genetic variants affecting a trait’s expression. |
|
| Exome |
The portion of the genome made up of exons, in particular the protein-coding sequences. |
|
| Exon |
Part of a gene that is retained in the mature mRNA after splicing. Some exons are coding, with information for making a protein. |
|
F
| False Discovery Rate (FDR) |
Expected proportion of false positives among all results declared significant. Controlling this proportion at a chosen acceptable level is a method of correcting for multiple testing. |
|
| FASTA file |
Format for representing nucleotide or amino acid sequences in bioinformatics. |
|
| Frameshift mutation |
Insertion or deletion of a number of nucleotides not divisible by three, which shifts the reading frame during translation. |
|
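The False Discovery Rate entry in this section does not name a specific procedure. One widely used choice is the Benjamini-Hochberg step-up procedure, sketched below as an assumption about how FDR control might be applied; the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Flag tests rejected by the Benjamini-Hochberg procedure at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from eight tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_values))
```

Because the procedure is step-up, a p-value can be rejected even if it misses its own rank threshold, provided a larger p-value meets its threshold.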
G
| Gene-Environment Interaction (GxE) |
Situation in which the effect of a genetic factor on a trait depends on an environmental exposure, or vice versa. |
|
| Genetic Correlation |
Correlation between the genetic effects on two traits, reflecting the extent to which the traits share genetic causes. |
|
| Genome |
Complete set of genetic material in an organism. |
|
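| Genomic Inflation factor (lambda) |
Ratio of the median observed association test statistic to its expected median under the null hypothesis. Values well above 1 indicate inflation of test statistics, for example due to population stratification. |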
|
| Genotype |
Genetic makeup of an individual at a specific locus. |
|
| Genome-Wide Association Study (GWAS) |
Investigation of genetic variants across the genome to identify associations with traits. |
Chapter 1.2; Chapter 5; Software Tutorial: GWAS |
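The genomic inflation factor listed in this section is commonly computed as the median observed chi-square statistic divided by the median of a 1-degree-of-freedom chi-square distribution (about 0.455). The sketch below assumes two-sided p-values as input and uses only the Python standard library; the function name and simulated p-values are illustrative.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Estimate lambda from two-sided p-values (each must lie in (0, 1])."""
    norm = NormalDist()
    # Convert each two-sided p-value to a 1-df chi-square statistic: chi2 = z^2.
    # Extremely small p-values (below about 1e-16) would need a more careful transform.
    chi2 = [norm.inv_cdf(1 - p / 2) ** 2 for p in p_values]
    expected_median = norm.inv_cdf(0.75) ** 2  # median of chi-square with 1 df
    return median(chi2) / expected_median


# Evenly spaced p-values mimic the null (uniform) distribution, so lambda is near 1.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
print(genomic_inflation(null_p))

# Halving every p-value mimics systematic inflation, so lambda exceeds 1.
inflated_p = [p / 2 for p in null_p]
print(genomic_inflation(inflated_p))
```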
H
| Haplotype |
Set of alleles on a single chromosome inherited together. |
|
| Hardy-Weinberg Equilibrium (HWE) |
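Model predicting that allele and genotype frequencies remain constant across generations in a randomly mating, non-evolving population. For two alleles with frequencies p and q, the expected genotype frequencies are p², 2pq and q². |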
|
| Heterogeneity |
Variability or diversity in a population or dataset. |
|
| Heterozygosity |
Possessing different alleles at a specific genetic locus. |
|
| Histone modification |
Chemical alteration of histone proteins influencing gene expression. |
|
| Homozygosity |
Possessing identical alleles at a specific genetic locus. |
|
| Horizontal pleiotropy |
Situation in which a genetic variant affects multiple traits through separate, independent biological pathways. |
|
| Hyperparameters |
Parameters set before a machine learning algorithm is trained, rather than learned from the data (e.g. the penalty strength in Lasso regression). |
|
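To illustrate the Hardy-Weinberg Equilibrium entry in this section, the sketch below compares observed genotype counts at a biallelic locus with the counts expected under equilibrium (p², 2pq, q²) using a chi-square statistic. The function name and counts are hypothetical, and an exact test is often preferred in practice for small counts.

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Return allele frequency p, expected genotype counts, and a chi-square statistic."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele a
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    # Sum of (observed - expected)^2 / expected; compared against chi-square with 1 df.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi2


# Hypothetical sample of 100 individuals with a deficit of heterozygotes.
p, expected, chi2 = hwe_chi_square(30, 40, 30)
print(p, expected, chi2)
```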
I
| Identity-by-descent (IBD) |
Sharing of a genomic segment by two individuals because it was inherited from a common ancestor. |
|
| Imputation |
Statistical inference of unobserved genotypes from observed genotypes, typically using a reference panel of haplotypes. |
|
| INFO score |
Measure of imputation quality for genetic variants |
|
| Insertion |
Addition of a DNA segment into a genome. |
|
| Insertion/Deletion (Indel) |
Genetic variation involving insertions or deletions of nucleotides. |
|
| Intergenic |
DNA regions between genes. |
|
| Intron |
Non-coding region within a gene, located between exons and removed from the RNA transcript by splicing. |
|
K
| Kilobase (Kb) |
Unit of length used in molecular biology, equal to 1000 base pairs. |
|
| Kinship |
Genetic relatedness between individuals. |
|
L
| Lasso regression |
Regression method with an L1 penalty that shrinks some coefficients to exactly zero, thereby performing feature selection. |
|
| LD Score Regression (LDSC) |
Technique to estimate SNP heritability and genetic correlation from genome-wide summary statistics. |
|
| Linkage |
Tendency of genetic loci that lie close together on a chromosome to be inherited together. |
|
| Linkage disequilibrium (LD) |
Non-random association of alleles at two or more loci. |
|
| Locus |
Specific position on a chromosome. |
|
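Linkage disequilibrium, as defined in this section, is often quantified with the coefficient D and the squared correlation r² between alleles at two biallelic loci. The sketch below assumes phased haplotype counts are available; the function name and counts are hypothetical.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r^2) for two biallelic loci from counts of the four haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n           # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / n   # frequency of allele B at locus 2
    d = p_AB - p_A * p_B      # deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


# Hypothetical counts among 100 haplotypes.
d, r2 = ld_r2(40, 10, 10, 40)
print(d, r2)
```

An r² of 0 means the alleles are statistically independent, while an r² of 1 means each allele at one locus perfectly predicts the allele at the other.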
M
| Manhattan plot |
Graph showing results of association tests in a genome-wide analysis, where x-axis is genomic position, y-axis is -log10(p-value) |
|
| Mendelian Randomization |
Method using genetic variants as instrumental variables to infer the causal effect of an exposure on an outcome. |
|
| Mendelian Trait |
|
|
| Messenger RNA (mRNA) |
RNA molecule transcribed from DNA and carrying protein-coding information. |
|
| Methylation |
Addition of a methyl group to DNA, often affecting gene expression. |
|
| Microbiome |
Community of microorganisms in a specific environment. |
|
| Minor allele frequency |
Frequency of the less common allele at a genetic locus in a population. |
|
| Missense mutation |
Point mutation altering a codon, resulting in a different amino acid. |
|
| Mitochondrial DNA (mtDNA) |
DNA located in mitochondria, inherited maternally. |
|
| Mosaicism |
Presence of genetically distinct cell populations within an organism. |
|
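The minor allele frequency entry in this section can be computed directly from genotype data. The sketch below assumes diploid genotypes coded as 0, 1 or 2 copies of the alternative allele, with `None` marking a missing call; this coding and the function name are assumptions for illustration.

```python
def minor_allele_frequency(dosages):
    """Compute MAF from genotypes coded as 0/1/2 copies of the ALT allele."""
    called = [g for g in dosages if g is not None]  # drop missing genotype calls
    if not called:
        raise ValueError("no called genotypes")
    alt_freq = sum(called) / (2 * len(called))      # two alleles per diploid individual
    return min(alt_freq, 1 - alt_freq)              # the minor allele is the rarer one


# Hypothetical genotypes for four individuals.
print(minor_allele_frequency([0, 0, 1, 2]))
```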
N
| Nagelkerke R2 |
Pseudo-R2 measure of explained variance in logistic regression, scaled to range from 0 to 1. |
|
| Nonsense mutation |
Point mutation leading to a premature stop codon. |
|
| Nucleosome |
DNA wrapped around histone proteins, forming chromatin. |
|
| Null hypothesis |
Hypothesis stating no significant effect or difference. |
|
O
| Observational study |
Research investigating associations without intervention. |
|
| Odds ratio |
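Measure of association between an exposure and an outcome: the odds of the outcome in the exposed group divided by the odds in the unexposed group. Commonly used in case-control studies. |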
|
| Open Reading Frame |
Stretch of nucleotide sequence running from a start codon to a stop codon, potentially encoding a protein. |
|
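The odds ratio entry in this section can be illustrated with a 2x2 table from a case-control study, where the "exposure" might be carrying a risk allele. The function name and counts below are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    # A zero cell raises ZeroDivisionError; real analyses often apply a continuity correction.
    return odds_in_cases / odds_in_controls


# Hypothetical counts: 20 of 100 cases and 10 of 100 controls carry the risk allele.
print(odds_ratio(20, 80, 10, 90))
```

An odds ratio of 1 indicates no association, values above 1 indicate that exposure is more common among cases, and values below 1 indicate that it is more common among controls.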
P
| Pedigree |
Family tree showing genetic relationships. |
|
| Phasing |
Determining which alleles lie together on the same chromosome copy (haplotype), and therefore which were inherited together from the same parent. |
|
| Phenome |
Complete set of an individual’s traits and characteristics. |
|
| Phenome-wide Association Study (pheWAS) |
Study testing the association of one or a few genetic variants with a wide range of phenotypes. |
|
| Phenotype |
Observable traits and characteristics of an individual. |
|
| Pleiotropy |
Single gene or genetic variant influencing multiple traits. |
|
| Point mutation |
Single nucleotide change in DNA. |
|
| Polygenic Risk Score (PRS) |
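Per-individual score summarizing the combined effect of multiple genetic variants on a trait, calculated as the sum of risk alleles carried, each weighted by its effect estimate. |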
|
| Polygenic trait |
Trait influenced by multiple genes. |
|
| Polymorphism |
Genetic variation within a population. |
|
| Population stratification |
Population substructure leading to confounding in genetic studies. |
|
| Power (Genomic) |
Probability that a study detects a true effect, i.e. correctly rejects a false null hypothesis. |
|
| Power analysis |
Calculating the sample size required to achieve a desired statistical power, given the expected effect size and significance threshold. |
|
| Precision Medicine |
Tailoring medical treatment to an individual’s genetic and health characteristics. |
|
| Principal Component Analysis (PCA) |
Dimensionality reduction technique. In genetics, the leading principal components of genotype data are used as covariates to control for population stratification. |
|
| Proband |
Individual through whom a family is ascertained for a genetic study (the index case). |
|
| Promoter |
DNA region controlling gene transcription initiation. |
|
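A polygenic risk score, as defined in this section, is in its simplest form a weighted sum of allele counts. The sketch below assumes per-SNP weights (for example GWAS effect estimates) and an individual's allele dosages are both stored as dictionaries keyed by SNP identifier; the identifiers, weights, and function name are all hypothetical.

```python
def polygenic_score(dosages, weights):
    """Sum of effect-allele dosages multiplied by per-SNP weights."""
    shared = dosages.keys() & weights.keys()  # score only SNPs present in both inputs
    return sum(dosages[snp] * weights[snp] for snp in shared)


# Hypothetical effect estimates (e.g. log odds ratios) for three SNPs.
weights = {"snp1": 0.10, "snp2": -0.05, "snp3": 0.20}
# Hypothetical effect-allele counts for one individual; snp4 has no weight and is ignored.
dosages = {"snp1": 2, "snp2": 1, "snp3": 0, "snp4": 1}
print(polygenic_score(dosages, weights))
```

Practical pipelines add steps this sketch omits, such as matching effect alleles between datasets and accounting for linkage disequilibrium between SNPs.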
Q
| Quantile-quantile plot (qqplot) |
Plot comparing the observed distribution of test statistics (or p-values) in a genome-wide analysis with the distribution expected under the null hypothesis. |
|
R
| R2 |
Coefficient of determination indicating the proportion of variance explained. |
|
| Rare variant |
Infrequently occurring genetic variant in a population. |
|
| REF/ALT |
Reference and alternative alleles at a genetic locus. |
|
S
| Scaffold |
|
|
| Sensitivity |
True positive rate: the proportion of individuals with a condition whom a test correctly identifies as positive. |
|
| Single Nucleotide Polymorphism (SNP) |
Single base-pair variation in DNA sequence. |
|
| Single Nucleotide Variant (SNV) |
Single base-pair genetic variation, including SNPs. |
|
| SNP Heritability (h2SNP) |
Proportion of trait variance explained by SNPs. |
|
| Specificity |
True negative rate: the proportion of individuals without a condition whom a test correctly identifies as negative. |
|
| Structural variant |
Genetic variation involving large DNA segments, such as deletions, duplications, inversions and translocations. |
|
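The sensitivity and specificity entries in this section follow directly from the four cells of a confusion matrix. A minimal sketch, with a hypothetical function name and counts:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positives among all actual positives
    specificity = tn / (tn + fp)  # true negatives among all actual negatives
    return sensitivity, specificity


# Hypothetical test results: 100 affected and 100 unaffected individuals.
sens, spec = sensitivity_specificity(tp=80, fn=20, tn=90, fp=10)
print(sens, spec)
```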
T
| Transcriptome |
Complete set of RNA transcripts expressed in a cell, tissue or organism. |
|
| Transcriptomic Imputation |
Prediction of gene expression levels from genotype data, using models trained in reference datasets that contain both genotype and expression measurements. |
|
| Transcriptomic Risk Score (TRS) |
A score calculated from transcriptomic data to assess the risk of a specific outcome. |
|
| Type I error |
False positive result in hypothesis testing, where a null hypothesis is incorrectly rejected. |
|
| Type II error |
False negative result in hypothesis testing, where a false null hypothesis is not rejected. The type II error rate equals 1 - power. |
|
V
| Variant |
A specific form of a genetic locus differing from the reference sequence. |
|
| Variance explained |
The portion of trait variability accounted for by a given factor, often a genetic variant. |
|
| Variant Call Format (VCF) |
A standard file format for storing genetic variant information. |
|
| Vertical pleiotropy |
Situation in which a genetic variant affects multiple traits through a shared biological pathway, with one trait lying on the causal path to another. |
|
W
| Whole Exome Sequencing (WES) |
Technique for sequencing only the protein-coding regions of the genome. |
|
| Whole Genome Sequencing (WGS) |
Method for sequencing an individual’s entire genome. |
|
XYZ
| X-linked |
Describes a gene or trait located on the X chromosome, leading to sex-specific inheritance patterns. |
|
*This Glossary was constructed with the help of ChatGPT.
OpenAI. (2023). ChatGPT (Mar 14 version) [Large language model]. https://chat.openai.com/chat
For other glossaries of genetic terms, please see: