Research Article: Predicting neurological recovery with Canonical Autocorrelation Embeddings

Date Published: January 28, 2019

Publisher: Public Library of Science

Author(s): Maria De-Arteaga, Jieshi Chen, Peter Huggins, Jonathan Elmer, Gilles Clermont, Artur Dubrawski, Xiao Hu.


Early prediction of the potential for neurological recovery after resuscitation from cardiac arrest is difficult but important. Currently, no clinical finding or combination of findings is sufficient to accurately predict or preclude favorable recovery of comatose patients in the first 24 to 48 hours after resuscitation. Thus, life-sustaining therapy is often continued for several days in patients whose irrecoverable injury is not yet recognized. Conversely, early withdrawal of life-sustaining therapy increases mortality among patients who otherwise might have gone on to recover. In this work, we present Canonical Autocorrelation Analysis (CAA) and Canonical Autocorrelation Embeddings (CAE), novel methods suitable for identifying complex patterns in high-resolution multivariate data often collected in highly monitored clinical environments such as intensive care units. CAE embeds sets of datapoints onto a space that characterizes their latent correlation structures and allows direct comparison of these structures through the use of a distance metric. The methodology may be particularly suitable when the unit of analysis is not just an individual datapoint but a dataset, as for instance in patients for whom physiological measures are recorded over time, and where changes of correlation patterns in these datasets are informative for the task at hand.

Partial Text

Cardiac arrest is the most common cause of death in high-income nations [1]. In the United States alone, over 350,000 people suffer out-of-hospital cardiac arrest each year [2]. Despite advances in care, only a minority of those who are resuscitated and survive to hospital admission are discharged alive, and even fewer enjoy a favorable neurological recovery [2, 3]. Among non-survivors, the most common proximate cause of death is withdrawal of life-sustaining therapy based on perceived poor neurological prognosis [3, 4]. This decision may be motivated by the rarity of favorable recovery, the emotional and financial hardship placed on families faced with the prospect of even a few days of intensive care, or fear of survival with severe disability.

Canonical Correlation Analysis (CCA) is a statistical method, first introduced by [19], for exploring relationships between two sets of variables. It is used in machine learning, with applications to medicine, biology and finance, e.g., [20, 21, 22, 23]. Sparse CCA, an ℓ1 variant of CCA, was proposed by [23, 24]. This method adds constraints to guarantee sparse solutions, which limits the number of features being correlated. Given two matrices X ∈ R^(n×p) and Y ∈ R^(n×q), CCA aims to find linear combinations of their columns that maximize the correlation between them. Usually, X and Y are two disjoint matrix representations of one set of objects, so that each matrix uses a strictly different set of variables to describe them. Assuming X and Y have been standardized, the constrained optimization problem is shown in Eq 1. When the constraint parameters c1 and c2 are small, solutions will be sparse and thus only a few features are correlated.
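
To make the idea of "linear combinations that maximize correlation" concrete, a minimal NumPy sketch of the first canonical pair for plain (non-sparse) CCA might look as follows. This is an illustration only, not the authors' implementation: it omits the ℓ1 constraints of sparse CCA, and the function names `standardize` and `first_canonical_pair`, as well as the small `ridge` term added for numerical stability, are assumptions made for this example.

```python
import numpy as np

def standardize(M):
    """Center each column and scale it to unit variance."""
    return (M - M.mean(axis=0)) / M.std(axis=0)

def first_canonical_pair(X, Y, ridge=1e-8):
    """Return (u, v, rho) for the first canonical pair of X and Y.

    u and v are weight vectors over the standardized columns of X and Y;
    rho is the correlation between the projections X u and Y v.
    `ridge` is a hypothetical stabilizer, not part of the original text.
    """
    n = X.shape[0]
    X, Y = standardize(X), standardize(Y)
    Sxx = X.T @ X / n + ridge * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + ridge * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n
    # Whiten both blocks with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # K = Lx^{-1} Sxy Ly^{-T}; its top singular value is the canonical correlation.
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    U, s, Vt = np.linalg.svd(K)
    # Map the singular vectors back to weights on the standardized columns.
    u = np.linalg.solve(Lx.T, U[:, 0])
    v = np.linalg.solve(Ly.T, Vt[0])
    return u, v, s[0]
```

In this unconstrained form every feature generally receives a nonzero weight; the sparse variant described above additionally bounds the ℓ1 norms of the weight vectors so that only a few features participate.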

Our principal goal is to help improve care given to comatose survivors of cardiac arrest through a decision support system that can boost the accuracy and timeliness of prognostication. To do so, we propose a new way to characterize patients using their latent multivariate correlation structures, and use the resulting featurization of data to build predictive models. The results presented in this section leverage data collected over a five-year period at an academic medical center to provide a proof of concept of the proposed methodology.

Recall that the proposed decision-support system makes recommendations only when it has strong indications that the patient is likely to have a positive neurological recovery, and defers in all other cases. We therefore evaluate the algorithm at the thresholds at which it would make recommendations, which we characterize through low false positive rates (FPR). The true positive rate (TPR) at a given FPR indicates what portion of positives would be retrieved while ensuring that no more than the given rate of negatives is incorrectly labeled as positive.
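
The evaluation criterion described above can be sketched in a few lines of Python. The helper name `tpr_at_fpr`, and the convention that a higher score means "more likely to recover" with a positive prediction issued when the score is at or above a threshold, are assumptions made for this illustration rather than details taken from the study.

```python
import numpy as np

def tpr_at_fpr(y_true, scores, max_fpr):
    """Largest TPR reachable by a score threshold whose FPR is <= max_fpr.

    y_true: 1/True for positive recovery, 0/False otherwise.
    scores: model outputs, higher meaning more likely positive (assumed).
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    best = 0.0
    # Try every observed score as a threshold: predict positive if score >= t.
    for t in np.unique(scores):
        pred = scores >= t
        fpr = (pred & ~y_true).sum() / n_neg
        tpr = (pred & y_true).sum() / n_pos
        if fpr <= max_fpr:
            best = max(best, tpr)
    return best
```

For instance, with labels `[1, 1, 1, 0, 0, 0, 0, 1]` and scores `[0.9, 0.8, 0.7, 0.75, 0.3, 0.2, 0.1, 0.4]`, the function returns 0.5 when no false positives are tolerated (`max_fpr=0.0`) and 1.0 when one in four negatives may be mislabeled (`max_fpr=0.25`).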

Cardiac arrest is a leading cause of death around the world, coma after cardiac arrest is common, and good neurological recovery is rare. Every day, clinicians are tasked with making a prediction that determines whether or not they will continue life-sustaining therapies for their comatose patients. Motivated by the emphasis clinicians place on the potential informativeness of the correlation structures in EEG data, we have proposed a way to characterize and compare patients based on the latent structures of multivariate correlations, and to use such information to predict positive neurological recovery. To do so, we have proposed a new formulation of Canonical Autocorrelation Analysis (CAA), a method that automatically finds subsets of features of data that form strong multiple-to-multiple correlations. We have also introduced Canonical Autocorrelation Embeddings (CAE) to enable the comparison of discovered correlation structures. CAE makes powerful, well-established machine learning methods that rely on distance metrics applicable to the task at hand.



