Glossary

Below are common terms and acronyms used within this book and across the field of Psychiatric Genetics.


A

Term Definition Reference chapter
Admixture Genetic ancestry that comes from distinct populations
Allele One of the variant forms of a gene or DNA sequence found at a specific position on a chromosome.
Antisense A molecule or strand of nucleic acid that is complementary to a specific RNA sequence, often used in gene regulation.
Area under the ROC curve (AUC) A measure of a classifier’s ability to distinguish between classes in a binary classification problem. AUC ranges from 0 to 1: 0.5 corresponds to chance-level prediction and 1 to perfect prediction.
Assembly (Genome)
Autosome Any chromosome that is not a sex chromosome, responsible for carrying genetic information unrelated to sex determination.
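
To illustrate the AUC entry above, the following minimal sketch computes AUC as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The function name `auc_from_scores` and the example data are hypothetical and chosen only for illustration.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a random case outranks a random control.

    labels: 1 for cases, 0 for controls.
    scores: predicted risk, where higher means more case-like.
    Ties between a case and a control count as half a correct ordering.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical example: three cases followed by three controls.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auc_from_scores(labels, scores)  # 8 of 9 case-control pairs are ordered correctly
```

The pairwise loop is quadratic in sample size, so it suits small examples; rank-based formulas give the same value more efficiently on large datasets.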

B

Base pair Two complementary nucleotides on opposite strands held together by hydrogen bonds; the basic unit of double-stranded DNA (and of paired regions of RNA), also used as a unit of sequence length.
Biallelic (SNP) Single Nucleotide Polymorphism with two possible alleles at a specific genomic position.
Bonferroni adjustment Statistical correction for multiple hypothesis testing in which the significance threshold is divided by the number of tests performed (equivalently, each p-value is multiplied by the number of tests).
Build (Genome)
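
As a worked illustration of the Bonferroni adjustment entry above, this sketch shows both forms of the correction. The function names are hypothetical, and the figure of one million tests is only an illustrative assumption for the number of independent tests.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Illustrative assumption: one million independent tests at alpha = 0.05.
threshold = bonferroni_threshold(0.05, 1_000_000)  # 5e-8
adjusted = bonferroni_adjust([0.001, 0.02, 0.5])   # about [0.003, 0.06, 1.0]
```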

C

Candidate Gene Study Research focusing on specific genes to determine their association with a trait or disease.
Chromatin Complex of DNA, RNA, and proteins forming the chromosome’s structure.
Chromatin Immunoprecipitation Sequencing (ChIPseq) Technique to identify DNA regions bound by specific proteins using antibodies.
Chromosome Thread-like structure in cells containing DNA and genes.
Codon Three-nucleotide sequence in mRNA that encodes a specific amino acid during protein synthesis.
Common variant Frequently occurring genetic variant in a population.
Congenital Present at birth, often referring to medical conditions or traits.
Contig
Complementary DNA (cDNA) DNA synthesized from an mRNA template, used in molecular biology research.
Copy Number Variant (CNV) Variation in the number of copies of a DNA segment between individuals.
Crossover Exchange of genetic material between homologous chromosomes during meiosis.

D

Deletion Removal of a segment of DNA, leading to the loss of genetic material.
Duplication Copying of a DNA segment, resulting in extra genetic material.

E

Effect estimate Statistical measurement of the impact of a factor on an outcome in research (e.g. odds ratio, beta value)
Elastic Net Regression Statistical learning method combining Lasso and Ridge regressions to select relevant variables
Enhancer DNA region influencing gene transcription.
Epidemiology Study of the patterns and causes of disease and health in populations.
Epigenetics Study of heritable changes in gene expression not involving alterations in DNA sequence.
Epigenome Overall epigenetic modifications in an organism’s DNA.
Epigenome-Wide Association Study (EWAS) Investigation of epigenetic variations associated with traits or diseases across the genome.
Epistasis Interaction between different genetic variants affecting a trait’s expression.
Exome The portion of the genome made up of exons, the protein-coding regions of genes.
Exon Part of a gene that will form mRNA. Some exons are coding, with information for making a protein.

F

False Discovery Rate (FDR) Expected proportion of false positives among the results declared significant. FDR-based methods correct for multiple testing by controlling this proportion at an acceptable level.
FASTA file Format for representing nucleotide or amino acid sequences in bioinformatics.
Frameshift mutation Insertion or deletion altering the reading frame during translation.
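
One common way to control the FDR described above is the Benjamini-Hochberg procedure. The glossary does not name a specific method, so the choice of procedure here is an assumption, as is the function name `benjamini_hochberg`. A minimal sketch:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never
    # decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Tests whose adjusted value falls below the chosen FDR level (for example 0.05) are declared significant.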

G

Gene-Environment Interaction (GxE) Joint effect of genetic and environmental factors on traits.
Genetic Correlation Proportion of variance that two traits share from common genetic causes
Genome Complete set of genetic material in an organism.
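Genomic Inflation factor (lambda) Ratio of the median observed test statistic to its expected median under the null hypothesis. Values above 1 indicate inflation of test statistics, for example due to population stratification.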
Genotype Genetic makeup of an individual at a specific locus.
Genome-Wide Association Study (GWAS) Investigation of genetic variants across the genome to identify associations with traits. Chapter 1.2; Chapter 5; Software Tutorial: GWAS
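
The genomic inflation factor defined above can be sketched as follows, assuming two-sided p-values from tests with 1 degree of freedom. The function name `genomic_inflation` is hypothetical, and the example p-values are synthetic.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Lambda: median observed 1-df chi-square over its expected null median.

    Assumes every p-value lies strictly between 0 and 1. P-values too small
    to distinguish 1 - p/2 from 1 in floating point would need a
    higher-precision conversion than the one used here.
    """
    nd = NormalDist()
    # A two-sided p-value maps to a 1-df chi-square via the squared z-score.
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Synthetic, evenly spaced p-values that mimic a null distribution.
null_p = [(i + 0.5) / 1001 for i in range(1001)]
lambda_null = genomic_inflation(null_p)                       # close to 1
lambda_inflated = genomic_inflation([p * p for p in null_p])  # above 1
```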

H

Haplotype Set of alleles on a single chromosome inherited together.
Hardy-Weinberg Equilibrium (HWE) Model predicting that allele and genotype frequencies remain constant across generations in a non-evolving population.
Heterogeneity Variability or diversity in a population or dataset.
Heterozygosity Possessing different alleles at a specific genetic locus.
Histone modification Chemical alteration of histone proteins influencing gene expression.
Homozygosity Possessing identical alleles at a specific genetic locus.
Horizontal pleiotropy Situation in which a genetic variant affects multiple traits independently.
Hyperparameters Parameters set before a machine learning algorithm runs.
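
As an illustration of the Hardy-Weinberg Equilibrium entry above, this sketch compares observed genotype counts at a biallelic locus with the counts expected under HWE (p², 2pq, q²) using a chi-square statistic. The function names and the genotype counts are hypothetical.

```python
def hwe_expected_counts(n_aa, n_ab, n_bb):
    """Expected genotype counts under HWE for a biallelic locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    return p * p * n, 2 * p * q * n, q * q * n


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed and expected genotype counts.

    Assumes both alleles are present, so no expected count is zero.
    """
    observed = (n_aa, n_ab, n_bb)
    expected = hwe_expected_counts(n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts that match HWE exactly (allele A frequency 0.6).
chi2_in_hwe = hwe_chi_square(360, 480, 160)
# Hypothetical counts with no heterozygotes at all, a strong departure from HWE.
chi2_out_of_hwe = hwe_chi_square(500, 0, 500)
```

A large statistic suggests departure from HWE, which in quality control is often treated as a sign of genotyping error.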

I

Identity-by-descent (IBD) Shared ancestry for a specific genomic region between individuals.
Imputation Predicting missing genetic information based on known data.
INFO score Measure of imputation quality for genetic variants
Insertion Addition of a DNA segment into a genome.
Insertion/Deletion (Indel) Genetic variation involving insertions or deletions of nucleotides.
Intergenic DNA regions between genes.
Intron Region of DNA between the exons in a gene

J

K

Kilobase (Kb) Unit of length used in molecular biology, equal to 1000 base pairs.
Kinship Genetic relatedness between individuals.

L

Lasso regression Regression method promoting sparsity in feature selection.
LD Score Regression (LDSC) Technique to estimate SNP heritability and genetic correlation from genome-wide summary statistics.
Linkage Physical proximity of genetic loci on a chromosome.
Linkage disequilibrium (LD) Non-random association of alleles at two or more loci.
Locus Specific position on a chromosome.

M

Manhattan plot Graph showing results of association tests in a genome-wide analysis, where x-axis is genomic position, y-axis is -log10(p-value)
Mendelian Randomization Method using genetic variants as instrumental variables to infer causality.
Mendelian Trait
Messenger RNA (mRNA) RNA molecule transcribed from DNA and carrying protein-coding information.
Methylation Addition of a methyl group to DNA, often affecting gene expression.
Microbiome Community of microorganisms in a specific environment.
Minor allele frequency Frequency of the less common allele at a genetic locus in a population.
Missense mutation Point mutation altering a codon, resulting in a different amino acid.
Mitochondrial DNA (mtDNA) DNA located in mitochondria, inherited maternally.
Mosaicism Presence of genetically distinct cell populations within an organism.
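
The minor allele frequency entry above can be illustrated with a short calculation from genotype counts at a biallelic locus. The function name and the counts below are hypothetical.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the less common allele, given genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    # Each AA individual carries two A alleles, each AB individual carries one.
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


# Hypothetical counts: 640 AA, 320 AB, 40 BB individuals.
maf = minor_allele_frequency(640, 320, 40)  # allele B is the minor allele, frequency about 0.2
```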

N

Nagelkerke R2 Pseudo-R2 measure of explained variance in logistic regression.
Nonsense mutation Point mutation leading to a premature stop codon.
Nucleosome DNA wrapped around histone proteins, forming chromatin.
Null hypothesis Hypothesis stating that there is no effect or difference.

O

Observational study Research investigating associations without intervention.
Odds ratio Ratio of the odds of an outcome in one group to the odds in another; a common measure of association in case-control studies.
Open Reading Frame DNA sequence potentially encoding a protein.
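
As a worked example for the odds ratio entry above, the sketch below computes it from a 2x2 table of exposure by case-control status. The function name and the counts are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls.

    Assumes no cell of the 2x2 table is zero.
    """
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Hypothetical study: 30 of 100 cases and 10 of 100 controls carry the risk allele.
or_estimate = odds_ratio(30, 70, 10, 90)  # (30 * 90) / (70 * 10), about 3.86
```

An odds ratio of 1 indicates no association; values above 1 indicate that the exposure is more common among cases.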

P

Pedigree Family tree showing genetic relationships.
Phasing Determining which alleles at neighbouring loci lie together on the same parental chromosome (haplotype) in an individual.
Phenome Complete set of an individual’s traits and characteristics.
Phenome-wide Association Study (pheWAS) Exploration of genetic variants across multiple traits.
Phenotype Observable traits and characteristics of an individual.
Pleiotropy Single gene influencing multiple traits.
Point mutation Single nucleotide change in DNA.
Polygenic Risk Score (PRS) Weighted sum of an individual’s alleles across many genetic variants, summarizing their combined effect on a trait.
Polygenic trait Trait influenced by multiple genes.
Polymorphism Genetic variation within a population.
Population stratification Population substructure leading to confounding in genetic studies.
Power (Genomic) Probability that a study detects a true effect of a given size.
Power analysis Calculating the sample size required to reach a desired statistical power, given an estimated effect size.
Precision Medicine Tailoring medical treatment to an individual’s genetic and health characteristics.
Principal Component Analysis (PCA) Dimensionality reduction technique, used to control for population stratification.
Proband Index individual in a genetic study.
Promoter DNA region controlling gene transcription initiation.
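
To make the Polygenic Risk Score entry above concrete, a score can be sketched as a weighted sum of allele dosages, where the weights are effect estimates from a GWAS. The function name, variant identifiers, weights and dosages below are all hypothetical.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages.

    dosages: mapping from variant ID to the number of effect alleles (0 to 2).
    weights: mapping from variant ID to its effect estimate (e.g. beta or log odds ratio).
    Variants without a dosage are skipped, a simplifying assumption of this sketch.
    """
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)


# Hypothetical variant IDs, effect estimates and genotypes for one individual.
weights = {"variant_1": 0.10, "variant_2": -0.05, "variant_3": 0.20}
dosages = {"variant_1": 2, "variant_2": 1, "variant_3": 0}
prs = polygenic_risk_score(dosages, weights)  # 0.10*2 - 0.05*1 + 0.20*0, about 0.15
```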

Q

Quantile-quantile plot (qqplot) Plot showing how observed test statistics in a genome-wide analysis depart from expected distribution.

R

R2 Coefficient of determination indicating the proportion of variance explained.
Rare variant Infrequently occurring genetic variant in a population.
REF/ALT Reference and alternative alleles at a genetic locus.

S

Scaffold
Sensitivity True positive rate in diagnostic testing.
Single Nucleotide Polymorphism (SNP) Single base-pair variation in DNA sequence.
Single Nucleotide Variant (SNV) Single base-pair genetic variation, including SNPs.
SNP Heritability (h2SNP) Proportion of trait variance explained by SNPs.
Specificity True negative rate in diagnostic testing.
Structural variant Genetic variation involving larger DNA segments.
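
The sensitivity and specificity entries above can be illustrated from the four cells of a confusion matrix. The function name and the counts below are hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)  # true positive rate
    specificity = true_neg / (true_neg + false_pos)  # true negative rate
    return sensitivity, specificity


# Hypothetical test results for 100 affected and 100 unaffected individuals.
sens, spec = sensitivity_specificity(true_pos=80, false_neg=20, true_neg=90, false_pos=10)
```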

T

Transcriptome Complete set of RNA transcripts present in a cell, tissue, or organism.
Transcriptomic Imputation The process of predicting missing gene expression data using available information.
Transcriptomic Risk Score (TRS) A score calculated from transcriptomic data to assess the risk of a specific outcome.
Type I error False positive result in hypothesis testing, where a null hypothesis is incorrectly rejected.
Type II error False negative result in hypothesis testing, where a false null hypothesis is not rejected. The Type II error rate equals 1 - power.

U

V

Variant A specific form of a genetic locus differing from the reference sequence.
Variance explained The portion of trait variability accounted for by a given factor, often a genetic variant.
Variant Call Format (VCF) A standard file format for storing genetic variant information.
Vertical pleiotropy Situation in which a genetic variant affects one trait, which in turn influences another trait along the same causal pathway.

W

Whole Exome Sequencing (WES) Technique for sequencing only the protein-coding regions of the genome.
Whole Genome Sequencing (WGS) Method for sequencing an individual’s entire genome.

XYZ

X-linked Describes a gene or trait located on the X chromosome, leading to sex-specific inheritance patterns.

*This Glossary was constructed with the help of ChatGPT.

OpenAI. (2023). ChatGPT (Mar 14 version) [Large language model]. https://chat.openai.com/chat

For other glossaries of genetic terms, please see: