Research Article: Topological data analysis of high resolution diabetic retinopathy images

Date Published: May 24, 2019

Publisher: Public Library of Science

Author(s): Kathryn Garside, Robin Henderson, Irina Makarenko, Cristina Masoller, Christophe Letellier.


Diabetic retinopathy is a complication of diabetes that produces changes in the blood vessel structure in the retina, which can cause severe vision problems and even blindness. In this paper, we demonstrate that by identifying topological features in very high-resolution retinal images, we can construct a classifier that discriminates between healthy patients and those with diabetic retinopathy using summary statistics of these features. Topological data analysis identifies the features as connected components and holes in the images and describes the extent to which they persist across the image. These features are encoded in persistence diagrams, summaries of which can be used to discriminate between diabetic and healthy patients. The method has the potential to be an effective automated screening tool, with high sensitivity and specificity.

Partial Text

The world population is aging, which carries a high risk of eye diseases that significantly affect quality of life. Considerable effort is therefore focused on developing reliable data analysis tools that can improve the early diagnosis and follow-up of eye diseases. Diabetic retinopathy (DR) is a particularly relevant disease. In 2013, 382 million people had diabetes and this number is expected to rise to 592 million by 2035 [1]. DR is a complication of diabetes and the most frequent cause of new cases of blindness in developed countries among adults aged 20–74 years [2]. Timely identification is crucial for the success of the treatment. DR identification is based on inspection of the color fundus image. In the early stage of the disease the image shows no visible symptoms, but as DR advances the structure of the blood vessels changes, and microaneurysms, exudates and new blood vessels can be seen. Depending on the population size, among other factors, remote screening for the detection of the disease is not always cost-effective [3], partly because of the cost of the eye physicians who remotely analyse the images [4].

Variables for two of the optimal models identified via the LASSO method are presented in Table 1, and the corresponding sensitivity and specificity results are given in Table 2. The two LASSO-informed models were selected as the one giving the minimum error and one whose error is within one standard error of that minimum. Other SVM models were tested using alternative combinations of the summary statistics, for example using only landscape or only convex peel statistics. However, they did not match the LASSO-informed models in terms of sensitivity and specificity. For the two LASSO-informed models, perfect classification was achieved under leave-one-out cross validation, but we note that this is a small dataset and there is a risk of over-fitting. Therefore, to provide confidence in the methods and results, we performed further cross validations using test sets of sizes 2, 3, 4 and 5. Sensitivity and specificity decreased as the test sets grew and the training sets correspondingly shrank, but a high level of accuracy was maintained.
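The cross-validation scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the classifier is a simple nearest-centroid rule standing in for the SVM, each subject is reduced to a single made-up feature value, and the names `nearest_centroid_predict`, `leave_p_out` and `data` are hypothetical. Every possible test set of size p is held out in turn, and sensitivity and specificity are pooled over all held-out predictions.

```python
from itertools import combinations


def nearest_centroid_predict(train, x):
    """Stand-in classifier (not an SVM): pick the class whose training mean is closer."""
    means = {}
    for label in (0, 1):
        values = [v for v, y in train if y == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(x - means[label]))


def leave_p_out(data, p):
    """Return (sensitivity, specificity) pooled over every test set of size p.

    Assumes each class has more than p members, so no training set is
    ever left without examples of one class.
    """
    tp = fn = tn = fp = 0
    for test_idx in combinations(range(len(data)), p):
        held_out = set(test_idx)
        train = [d for i, d in enumerate(data) if i not in held_out]
        for i in test_idx:
            x, y = data[i]
            pred = nearest_centroid_predict(train, x)
            if y == 1:            # diseased subject
                tp += pred == 1
                fn += pred == 0
            else:                 # healthy subject
                tn += pred == 0
                fp += pred == 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up (feature, label) pairs, not values from the paper: 0 = healthy, 1 = DR.
data = [(0.10, 0), (0.15, 0), (0.20, 0), (0.22, 0), (0.30, 0), (0.35, 0),
        (0.60, 1), (0.65, 1), (0.70, 1), (0.80, 1), (0.85, 1), (0.90, 1)]

results = {p: leave_p_out(data, p) for p in (1, 2, 3)}
```

Setting p = 1 gives leave-one-out cross validation. Larger p leaves fewer subjects for training, which is why performance can be expected to drop as the test set grows.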

We propose an application of supervised SVM learning using summary statistics of retinal fundus image persistence diagrams as an effective method of diabetic retinopathy detection. We recommend, however, that these methods be explored further once larger datasets of high-resolution images become available.



