A canonical correlation analysis-based approach to identify causal genes in atherosclerosis

University essay from Högskolan i Skövde/Institutionen för biovetenskap

Abstract: Genome-wide association studies (GWASs) have identified hundreds of loci that are strongly associated with coronary artery disease and its risk factors. However, the causal variants and genes remain unknown for the vast majority of the identified loci. Zebrafish model systems coupled with clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) mutagenesis have made it possible to systematically characterize candidate genes in GWAS-identified loci. In this thesis, canonical correlation analysis (CCA) was used to efficiently identify putative causal genes in multiplexed genetic screens for atherogenic traits in zebrafish larvae. The two datasets used in this thesis contained genes and phenotypes obtained through sequencing and high-throughput imaging of fish larvae. Dataset 1 comprised 7 genes and 11 phenotypes (n = 384), and dataset 2 comprised 4 genes and 11 phenotypes (n = 384). In dataset 1, the CCA of multiple genes versus multiple phenotypes identified the genes met, pepd, timd4 and vegfa as associated with total cholesterol, triglycerides, glucose and corrected lipid deposition, as well as with co-localization of macrophages and lipid deposition, of neutrophils and lipid deposition, and of macrophages and neutrophils. In dataset 2, CCA recovered the previously reported correlations of the genes apobb1 and apoea with total cholesterol, low-density lipoprotein and triglycerides, as well as with co-localization of neutrophils and lipids. Compared with a hierarchical linear model, CCA represents a powerful and promising tool for identifying causal genes for cardiovascular diseases in data from zebrafish model systems.
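To illustrate the kind of analysis the abstract describes, the following is a minimal sketch of canonical correlation analysis between a gene matrix and a phenotype matrix. It is not the thesis's implementation. The data are synthetic, the encoding of genes as mutant allele counts (0, 1, 2) is an assumption, and the function name `canonical_correlations` is hypothetical. Only the matrix shapes (384 larvae, 7 genes, 11 phenotypes) are taken from the description of dataset 1. The sketch uses the standard QR-plus-SVD formulation of CCA and assumes both centered matrices have full column rank.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and Y (rows are samples).

    Assumes both centered matrices have full column rank.
    """
    # Center each variable so that inner products correspond to covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two variable sets.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations (descending).
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Clip to [0, 1] to absorb floating-point round-off.
    return np.clip(s, 0.0, 1.0)


# Synthetic stand-in data (assumption): 384 larvae, 7 genes, 11 phenotypes.
rng = np.random.default_rng(0)
genes = rng.integers(0, 3, size=(384, 7)).astype(float)  # assumed allele counts
phenos = rng.normal(size=(384, 11))
phenos[:, 0] += genes[:, 0]  # plant one gene-phenotype association

rho = canonical_correlations(genes, phenos)
print("canonical correlations:", np.round(rho, 3))
```

The number of canonical correlations equals the smaller of the two variable counts, here 7. In this synthetic setup the first one is dominated by the planted association, while the remaining ones reflect sampling noise; assessing which correlations are meaningful in real data would require a significance test that is not shown here.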
