Date Published: April 30, 2019
Publisher: Public Library of Science
Author(s): Alicia D. Howard, Xiaochun Wang, Megana Prasad, Avinash Das Sahu, Radhouane Aniba, Michael Miller, Sridhar Hannenhalli, Yen-Pei Christy Chang, Arnar Palsson.
For most complex traits, the majority of SNPs identified through genome-wide association studies (GWAS) reside within noncoding regions that have no known function. However, these regions are enriched for regulatory enhancers specific to the cells relevant to the trait. Indeed, many of the GWAS loci that have been functionally characterized lie within enhancers that regulate expression levels of key genes. In order to identify polymorphisms with potential allele-specific regulatory effects, we developed a bioinformatics pipeline that harnesses epigenetic signatures as well as transcription factor (TF) binding motifs to identify putative enhancers containing a SNP with potential allele-specific TF binding in linkage disequilibrium (LD) with a GWAS-identified SNP. We applied the approach to GWAS findings for blood lipids, revealing 7 putative enhancers harboring associated SNPs, 3 of which lie within the introns of LCAT and ABCA1, genes that play crucial roles in cholesterol biogenesis and lipoprotein metabolism. All 3 enhancers demonstrated allele-specific in vitro regulatory activity in liver-derived cell lines. We demonstrated that these putative enhancers are in close physical proximity to the promoters of their respective genes, in situ, likely through chromatin looping. In addition, the associated alleles altered the likelihood of transcription activator STAT3 binding. Our results demonstrate that, with our approach, the LD blocks that contain GWAS signals, often hundreds of kilobases in size with multiple SNPs serving as statistical proxies for the true functional site, can yield an experimentally testable hypothesis for the underlying regulatory mechanism linking genetic variants to complex traits.
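The filtering logic described above (SNPs in LD with a GWAS signal, located in a putative enhancer, and predicted to change TF binding between alleles) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the data structures, the function names, the thresholds `r2_min` and `delta_min`, the toy position probability matrix, and the SNP identifiers are all hypothetical, and only the forward strand is scanned for simplicity.

```python
from dataclasses import dataclass
from math import log2


@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int              # 1-based genomic position
    ref: str
    alt: str
    r2_with_lead: float   # LD (r^2) with the GWAS-identified SNP


@dataclass(frozen=True)
class Enhancer:
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


def in_enhancer(snp, enhancers):
    """True if the SNP falls inside any putative enhancer interval."""
    return any(e.chrom == snp.chrom and e.start <= snp.pos <= e.end
               for e in enhancers)


def pwm_score(seq, pwm, background=0.25):
    """Log-odds score of seq under a position probability matrix."""
    return sum(log2(col[base] / background) for base, col in zip(seq, pwm))


def best_score(left, allele, right, pwm):
    """Best forward-strand motif score over windows overlapping the SNP."""
    width = len(pwm)
    seq = left + allele + right
    if len(seq) < width:
        raise ValueError("flanking sequence is shorter than the motif")
    snp_idx = len(left)
    first = max(0, snp_idx - width + 1)
    last = min(snp_idx, len(seq) - width)
    return max(pwm_score(seq[s:s + width], pwm) for s in range(first, last + 1))


def prioritize(snps, enhancers, flanks, pwm, r2_min=0.8, delta_min=1.0):
    """Keep SNPs passing the LD, enhancer-overlap, and motif-change filters.

    r2_min and delta_min are illustrative assumptions, not values from the study.
    """
    hits = []
    for snp in snps:
        if snp.r2_with_lead < r2_min or not in_enhancer(snp, enhancers):
            continue
        left, right = flanks[snp.rsid]
        delta = (best_score(left, snp.alt, right, pwm)
                 - best_score(left, snp.ref, right, pwm))
        if abs(delta) >= delta_min:
            hits.append((snp.rsid, delta))
    return hits


# Toy motif of width 3 that prefers "TTC" (hypothetical, not a real TF matrix).
toy_pwm = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]
enhancers = [Enhancer("chr1", 1000, 2000)]
snps = [
    Snp("snp_a", "chr1", 1500, "G", "C", 0.95),  # passes all three filters
    Snp("snp_b", "chr1", 1600, "A", "T", 0.30),  # weak LD with the lead SNP
    Snp("snp_c", "chr1", 5000, "G", "C", 0.90),  # outside any enhancer
]
flanks = {"snp_a": ("ATT", "GA"), "snp_b": ("ATT", "GA"), "snp_c": ("ATT", "GA")}

hits = prioritize(snps, enhancers, flanks, toy_pwm)
print(hits)
```

In this toy input only `snp_a` survives, because its alternate allele creates a stronger match to the motif. A realistic analysis would also scan the reverse strand, use curated TF matrices, and derive enhancer intervals from epigenetic data, as the text describes.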
Genome-wide association studies (GWAS) have thus far identified polymorphic loci underlying hundreds of diseases and traits. For blood lipid levels alone, approximately 180 loci have been reported for traits such as low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), total cholesterol (TC), and triglycerides (TG) [1–9]. With ever-increasing sample sizes from multiple studies of trans-ethnic participants, meta-analyses of previous GWAS, and even GWAS based on electronic health records, these loci collectively account for an increasing proportion of the variability and estimated heritability of these traits. However, the molecular mechanisms underlying these associations remain largely unknown.
The most important finding in the present study is the identification of allele-specific enhancers in ABCA1 and LCAT that may effectively modulate HDL metabolism. Our study, like many others (recently reviewed and discussed by Catarino and Stark), showed that not all predicted or known enhancers will consistently demonstrate enhanced transcription when tested ectopically in a reporter gene system. In this study, the putative enhancer near GALNT2 containing rs17315646 did not demonstrate in vitro enhancer activity in Huh-7 and HepG2 cells, even though the construct containing the G allele showed substantial activity (nearly 10-fold higher than the promoter-only construct) in HEK293 cells (S3 Fig). Another study, using a similar approach, found that a segment within the sequence we characterized, containing rs4846913, rs2144300, and rs6143660, and a nearby segment containing SNP rs2281721 both showed allele-specific increases in enhancer activity (S8 Fig). Sequences containing 2 of these SNPs, rs4846913 and rs2281721, also showed possible binding to the nuclear proteins CEBPB and USF1, respectively. In fact, that work demonstrated that multiple SNPs in this HDL-C associated locus act synergistically to influence GALNT2 expression. In addition, these data point to the difficulties in predicting the boundaries of a regulatory element.