Date Published: February 26, 2019
Publisher: Public Library of Science
Author(s): Simon W. M. Eng, Florence A. Aeschlimann, Mira van Veenendaal, Roberta A. Berard, Alan M. Rosenberg, Quaid Morris, Rae S. M. Yeung, Aziz Sheikh
Abstract: Background: Joint inflammation is the common feature underlying juvenile idiopathic arthritis (JIA). Clinicians recognize patterns of joint involvement that are currently not part of the International League of Associations for Rheumatology (ILAR) classification. Using unsupervised machine learning, we sought to uncover data-driven joint patterns that predict clinical phenotype and disease trajectories. Methods and findings: We analyzed prospectively collected clinical data, including joint involvement using a standard 71-joint homunculus, for 640 discovery patients with newly diagnosed JIA enrolled in a Canada-wide study who were followed serially for five years, treatment-naïve except for nonsteroidal anti-inflammatory drugs (NSAIDs) and diagnosed within one year of symptom onset. Twenty-one patients had systemic arthritis, 300 oligoarthritis, 125 rheumatoid factor (RF)-negative polyarthritis, 16 RF-positive polyarthritis, 37 psoriatic arthritis, 78 enthesitis-related arthritis (ERA), and 63 undifferentiated arthritis. At diagnosis, we observed global hierarchical groups of co-involved joints. To characterize these patterns, we developed sparse multilayer non-negative matrix factorization (NMF). Model selection by internal bi-cross-validation identified seven joint patterns at presentation, to which all 640 discovery patients were assigned: pelvic girdle (57 patients), fingers (25), wrists (114), toes (48), ankles (106), knees (283), and indistinct (7). Patterns were distinct from clinical subtypes (P < 0.001 by χ² test) and reproducible through external data set validation on a 119-patient, prospectively collected independent validation cohort (reconstruction accuracy Q² = 0.55 for patterns; 0.35 for groups). Some patients matched multiple patterns.
To determine whether their disease outcomes differed, we further subdivided the 640 discovery patients into three subgroups by degree of localization (the percentage of their active joints aligning with their assigned pattern): localized (≥90%; 359 patients), partially localized (60%–90%; 124), or extended (<60%; 157). Localized patients more often maintained their baseline patterns (P < 0.05 for five groups by permutation test) than nonlocalized patients (P < 0.05 for three groups by permutation test) over a five-year follow-up period. We modelled time to zero active joints in the discovery cohort using a multivariate Cox proportional hazards model considering joint pattern, degree of localization, and ILAR subtype. Despite receiving more intense treatment, 50% of nonlocalized patients had zero joints at one year compared to six months for localized patients. Overall, localized patients required less time to reach zero joints (partial: P = 0.0018 versus localized by log-rank test; extended: P = 0.0057). Potential limitations include the requirement for patients to be treatment naïve (except NSAIDs), which may skew the patient cohorts towards milder disease, and the size of the validation cohort, which precluded multivariate analyses of disease trajectories. Conclusions: Multilayer NMF identified patterns of joint involvement that predicted disease trajectory in children with arthritis. Our hierarchical unsupervised approach identified a new clinical feature, degree of localization, which predicted outcomes in both cohorts. Detailed assessment of every joint is already part of every musculoskeletal exam for children with arthritis. Our study supports both the continued collection of detailed joint involvement and the inclusion of patterns and degrees of localization to stratify patients and inform treatment decisions. This will advance pediatric rheumatology from counting joints to realizing the potential of using data available from uncovering patterns of joint involvement.
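The degree-of-localization rule in the abstract can be illustrated with a short sketch. This is not the authors' code: it assumes, for simplicity, that a pattern can be treated as a plain set of joints (in the paper, patterns come from NMF factors), and the joint names, the `KNEES_PATTERN` set, and both function names are hypothetical. Only the thresholds (≥90%, 60%–90%, <60%) come from the abstract.

```python
def degree_of_localization(active_joints, pattern_joints):
    """Percentage of a patient's active joints that fall inside the assigned pattern."""
    active = set(active_joints)
    if not active:
        raise ValueError("patient has no active joints")
    return 100.0 * len(active & set(pattern_joints)) / len(active)


def localization_subgroup(percent):
    """Map a percentage to the three subgroups named in the abstract."""
    if percent >= 90:
        return "localized"
    if percent >= 60:
        return "partially localized"
    return "extended"


# Hypothetical pattern and patient, for illustration only.
KNEES_PATTERN = {"left_knee", "right_knee"}
patient = {"left_knee", "right_knee", "left_ankle"}

percent = degree_of_localization(patient, KNEES_PATTERN)
print(round(percent, 1), localization_subgroup(percent))
```

Here two of the patient's three active joints lie in the assigned pattern, so the patient falls in the middle band. How the authors handled weighted pattern membership is not stated in this record, so the set intersection above is an assumption.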
Partial Text: Juvenile idiopathic arthritis (JIA) is the most common chronic inflammatory rheumatic disease in childhood, characterized by joint inflammation of unknown etiology lasting at least six weeks starting at <16 years of age. The International League of Associations for Rheumatology (ILAR) criteria classify JIA into seven subtypes. Our analysis, outlined as a flow chart in S1 Fig, can be divided into the following phases: data collection and filtering, dimensionality reduction, clustering, and postclustering analysis. As indicated in our transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) checklist (S1 Checklist), our postclustering analysis plan was developed after we completed the clustering and observed that some patients defied identification with a single joint grouping. This observation prompted us to include a variable describing this in our multivariate prediction model. We explored patterns of articular involvement in JIA using unsupervised data-driven pattern recognition techniques. We initially observed joints co-occurring in logical and localized groupings without same-side skewing (Figs 1 and 2 and S2 Fig). To better understand these signals in a clinically applicable manner, we conducted a modified version of multilayer NMF, identifying seven high-level groupings of joints (S6 Fig and Fig 3) describing arthritis foci anchored by distinct subsets of joints. The resulting seven patient groups subdivided the ILAR subtypes into distinct subgroups based on patterns of arthritis (Fig 4). Patients with localized involvement often remained in the same patient group after baseline visit (Fig 5A, 5B and 5C) and reached zero joint involvement faster than patients with nonlocalized involvement (Fig 5D). These patterns generalized to an independent validation cohort, supporting their applicability beyond the discovery cohort (S12 Fig and S13 Fig).
Source: http://doi.org/10.1371/journal.pmed.1002750
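To give a concrete feel for the factorization step, the sketch below runs plain single-layer NMF with the standard multiplicative updates on a toy patients-by-joints matrix and assigns each patient to the pattern with the largest weight. It is a deliberately simpler stand-in, not the sparse multilayer NMF the authors developed, and it omits bi-cross-validation for choosing the number of patterns. The toy matrix, the function names, and the argmax assignment rule are all assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor V (patients x joints) into non-negative W (patients x patterns)
    and H (patterns x joints) using multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def assign_patterns(w):
    """Assign each patient (row of W) to the pattern with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]


# Hypothetical toy data: rows are patients, columns are joints (1 = active).
toy = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
w, h = nmf(toy, rank=2)
print(assign_patterns(w), frobenius_error(toy, w, h))
```

Rows of `H` play the role of joint patterns and rows of `W` give each patient's loading on them; in the study the corresponding model was sparse, multilayer, and fitted to a 71-joint homunculus.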