Date Published: February 14, 2019
Publisher: Public Library of Science
Author(s): P. J. Moore, T. J. Lyons, J. Gallacher, Stephen D Ginsberg.
Time-dependent data collected in studies of Alzheimer’s disease usually contain missing and irregularly sampled data points. For this reason, time series methods which assume regular sampling cannot be applied directly to the data without a pre-processing step. In this paper we use a random forest to learn the relationship between pairs of data points at different time separations. The input vector is a summary of the time series history, and it includes both demographic variables and non-time-varying variables such as genetic data. To test the method we use data from the TADPOLE grand challenge, an initiative which aims to predict the evolution of subjects at risk of Alzheimer’s disease using demographic, physical and cognitive input data. The task is to predict diagnosis, ADAS-13 score and normalised ventricles volume. While the competition proceeds, forecasting methods may be compared using a leaderboard dataset selected from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and standard metrics for measuring accuracy. For diagnosis, we find a multiclass area under the curve (mAUC) of 0.82 and a balanced classification accuracy (BCA) of 0.73, compared with a benchmark SVM predictor which gives mAUC = 0.62 and BCA = 0.52. The results show that the method is effective and comparable with other methods.
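The pairing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `make_pairs`, the choice of history summary (latest value plus running mean) and the toy visit data are all assumptions made for illustration. Each pair takes a summary of the history up to an earlier visit, the time separation to a later visit and the static variables as input, and the later value as the target, so irregular sampling and missing visits need no interpolation.

```python
def make_pairs(visits, static, min_points=4):
    """Build (features, target) pairs from one subject's irregular visits.

    visits: list of (time_in_months, value); value is None when missing.
    static: non-time-varying features (e.g. demographic or genetic codes).
    The history summary used here (latest value, running mean) is an
    assumption for illustration, not the summary used in the paper.
    """
    # Drop missing observations and order the remainder by time.
    observed = sorted((t, v) for t, v in visits if v is not None)
    if len(observed) < min_points:
        return []  # too short a series to train on

    pairs = []
    for i, (t0, v0) in enumerate(observed):
        history = [v for _, v in observed[: i + 1]]
        mean_so_far = sum(history) / len(history)
        # Pair the earlier visit with every later visit.
        for t1, v1 in observed[i + 1:]:
            features = [v0, mean_so_far, t1 - t0, *static]
            pairs.append((features, v1))
    return pairs


# Toy subject (hypothetical values): one visit at month 6 is missing.
visits = [(0, 30.0), (6, None), (12, 32.0), (24, 35.0), (36, 40.0)]
static = [1, 72]  # hypothetical static features
pairs = make_pairs(visits, static)
```

The resulting feature vectors could then be passed to any regressor or classifier that accepts tabular input, such as a random forest.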
Alzheimer’s disease (AD) is an irreversible brain disorder which progressively affects cognition and behaviour, and results in an impairment in the ability to perform daily activities. It is the most common form of dementia in older people, affecting about 6% of the population aged over 65, and it increases in incidence with age. The initial stage of AD is characterised by memory loss, and this is the usual presenting symptom. Memory loss is one constituent of mild cognitive impairment (MCI), which can be an early sign of Alzheimer’s disease. MCI is diagnosed on the basis of complaints of subjective memory loss (preferably corroborated by a close associate or partner of the individual), impairment of memory function, and unimpaired general cognition and behaviour, with no evidence of dementia. MCI does not always progress to dementia or to a diagnosis of Alzheimer’s disease, but those with amnestic MCI, the type of MCI characterised by memory impairment, are more likely to develop dementia than those without this diagnosis. In cases where an individual does develop Alzheimer’s disease, the phase of MCI ends with a marked decline in cognitive function lasting two to five years, in which semantic memory (the recall of facts and general knowledge) and implicit memory (the long-term, nonconscious memory evidenced by priming effects) also become degraded.
We trained the random forest only on time series with at least 4 points; this minimum was determined during training. The optimum number of trees was 60, and the optimum minimum number of data points in a leaf was 5. The average cross-validation accuracy on the evaluation set for diagnosis was AUC = 0.82 (SD 0.09). For ADAS-13 prediction the mean absolute error was MAE = 5.56 (0.58), and for VENTS-ICV it was MAE = 0.0021 (0.001).
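As a rough illustration of how a figure of the form "MAE = mean (SD)" can be obtained from cross-validation, the sketch below computes the mean absolute error for each fold and then summarises the folds. The function names and the toy numbers are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def summarise_folds(fold_errors):
    """Mean and sample standard deviation of per-fold errors (assumed estimator)."""
    return mean(fold_errors), stdev(fold_errors)


# Toy example (hypothetical values, not results from the paper).
fold_mae = mean_absolute_error([10, 20, 30], [12, 17, 30])
avg, sd = summarise_folds([5.0, 6.0, 7.0])
```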