Research Article: Family-based exome-wide association study of childhood acute lymphoblastic leukemia among Hispanics confirms role of ARID5B in susceptibility

Date Published: August 17, 2017

Publisher: Public Library of Science

Author(s): Natalie P. Archer, Virginia Perez-Andreu, Ulrik Stoltze, Michael E. Scheurer, Anna V. Wilkinson, Ting-Nien Lin, Maoxiang Qian, Charnise Goodings, Michael D. Swartz, Nalini Ranjit, Karen R. Rabin, Erin C. Peckham-Gregory, Sharon E. Plon, Pedro A. de Alarcon, Ryan C. Zabriskie, Federico Antillon-Klussmann, Cesar R. Najera, Jun J. Yang, Philip J. Lupo, Jeffrey S Chang.


We conducted an exome-wide association study of childhood acute lymphoblastic leukemia (ALL) among Hispanics to confirm known variants and identify novel variants associated with disease risk in this population. We used a case-parent trio design, which, unlike the more commonly used case-control design, is robust to population stratification bias in this at-risk ethnic group. Among 710 individuals from 323 Guatemalan and US Hispanic families, two inherited SNPs in ARID5B reached genome-wide significance: rs10821936 (RR = 2.31, 95% CI = 1.70–3.14, p = 1.7×10^−8) and rs7089424 (RR = 2.22, 95% CI = 1.64–3.01, p = 5.2×10^−8). Similar results were observed when analyses were restricted to the B-ALL subtype: ARID5B rs10821936 (RR = 2.22, 95% CI = 1.63–3.02, p = 9.63×10^−8) and ARID5B rs7089424 (RR = 2.13, 95% CI = 1.57–2.88, p = 2.81×10^−7). Notably, the effect sizes observed for rs7089424 and rs10821936 in our study were more than 20% higher than those reported among non-Hispanic white populations in previous genetic association studies. Our results confirm the role of ARID5B in childhood ALL susceptibility among Hispanics; however, our assessment did not reveal any strong novel inherited genetic risk factors for ALL in this ethnic group.
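
The reason a case-parent trio design sidesteps population stratification is that each case is compared with the alleles its own parents did not transmit, so the "controls" are ancestry-matched by construction. The sketch below illustrates this with the classic transmission disequilibrium test (TDT). It is an assumption for illustration only: this excerpt does not state which test statistic the authors used, and the function names and genotype counts are hypothetical. Genotypes are coded as the number of risk alleles (0, 1, or 2), and only complete trios are handled.

```python
import math


def count_transmissions(trios):
    """Count risk alleles transmitted / not transmitted by heterozygous parents.

    trios: iterable of (father, mother, child) risk-allele counts (0, 1 or 2).
    Raises ValueError when a trio is inconsistent with Mendelian inheritance.
    """
    transmitted = 0
    untransmitted = 0
    for father, mother, child in trios:
        het_parents = (father == 1) + (mother == 1)
        hom_risk_parents = (father == 2) + (mother == 2)
        # Each homozygous-risk parent always passes on one risk allele, so the
        # remainder of the child's risk alleles came from heterozygous parents.
        from_het = child - hom_risk_parents
        if not 0 <= from_het <= het_parents:
            raise ValueError(f"Mendelian error in trio {(father, mother, child)}")
        transmitted += from_het
        untransmitted += het_parents - from_het
    return transmitted, untransmitted


def tdt(transmitted, untransmitted):
    """McNemar-style TDT: returns (chi-square with 1 df, p-value, T/U ratio)."""
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no heterozygous parents, so the test is undefined")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    ratio = transmitted / untransmitted if untransmitted else math.inf
    return chi2, p_value, ratio


# Hypothetical toy data, not taken from the study.
toy_trios = [(1, 0, 1), (1, 1, 2), (1, 2, 1), (0, 1, 0)]
t, u = count_transmissions(toy_trios)
print(t, u, tdt(t, u))
```

Under the null hypothesis a heterozygous parent transmits either allele with probability 1/2, regardless of the family's ancestry, which is why the comparison is not confounded by population structure. The Mendelian consistency check in `count_transmissions` is the same kind of check that underlies family-level exclusions for Mendelian errors.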

Partial Text

Acute lymphoblastic leukemia (ALL) is the most common malignancy among children, with B-cell ALL (B-ALL) accounting for the majority (80% to 85%) of cases [1,2]. Genome-wide association studies (GWAS) have identified several inherited genetic variants associated with childhood or adolescent ALL risk, including but not limited to single nucleotide polymorphisms (SNPs) in ARID5B, IKZF1, CEBPE, CDKN2A, PIP4K2A, GATA3, LHPP, and ELK3 [3–11]. Another important risk factor for childhood ALL is Hispanic ethnicity. Children of Hispanic ethnic background have a 10% to 30% higher incidence of ALL than do non-Hispanic whites, and a rate almost two times higher than among non-Hispanic blacks [12,13]. Hispanic children with ALL also have a lower 5-year survival rate and a higher incidence of relapse than do non-Hispanic whites [14,15].

The study cohort consisted of 733 individuals from 332 families: 287 families (628 individuals) from Unidad Nacional de Oncología Pediátrica (UNOP) in Guatemala, and 45 families (105 individuals) from Texas Children’s Hospital (TCH) in the United States. We excluded nine families from analysis because they did not meet Mendelian error rate criteria, leaving a full analytic cohort of 710 individuals from 323 families. For 24 of these families, parental genetic data were used in the analysis, but genetic information was missing for the ALL cases themselves. Demographic and clinical characteristics of the 299 ALL cases included in this study are shown in Table 1.

Of the 237,436 exonic SNPs available for analysis from the Illumina chip, 32,175 SNPs met the MAF and call rate criteria and were subsequently used in analyses (S1 File). The vast majority (99.6%) of excluded SNPs were omitted because they had a MAF of <1%.

In this family-based exome-wide association study (EXWAS) of childhood ALL among those of Hispanic ancestry, we observed two inherited SNPs in ARID5B, rs10821936 and rs7089424, that were associated with childhood ALL. These two SNPs are in strong linkage disequilibrium with one another (r^2 = 0.84, D’ = 0.92), most likely representing a single susceptibility locus (Fig 2). Both ARID5B rs7089424 and rs10821936 have been identified in previous ALL GWAS [3–5,7,8], where they have also shown some of the strongest and most statistically significant associations with ALL [4,7,8]. We reviewed whole-exome sequence data from a subset of the cases described here (n = 41 from TCH) and did not identify any recurring ARID5B coding variants in patients with the risk allele.
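
For readers unfamiliar with the two linkage disequilibrium measures quoted for rs10821936 and rs7089424, the following minimal sketch shows how D, D’ and r^2 are conventionally computed from haplotype and allele frequencies. The function name and the frequencies in the usage example are hypothetical; the study's own haplotype frequencies are not given in this excerpt, so the example does not attempt to reproduce the reported values.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must be strictly between 0 and 1")
    # D is the excess of the AB haplotype over its expectation under independence.
    d = p_ab - p_a * p_b
    # D' rescales D by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # r^2 is the squared correlation between the alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies, for illustration only.
d, d_prime, r2 = ld_stats(p_ab=0.38, p_a=0.40, p_b=0.45)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

D’ close to 1 indicates that little historical recombination has separated the two alleles, while r^2 close to 1 additionally requires similar allele frequencies, which is why r^2 is the more relevant measure when asking whether two SNPs tag the same association signal.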


