Research Article: Weakly supervised lesion localization for age-related macular degeneration detection using optical coherence tomography images

Date Published: April 5, 2019

Publisher: Public Library of Science

Author(s): Hyun-Lim Yang, Jong Jin Kim, Jong Ho Kim, Yong Koo Kang, Dong Ho Park, Han Sang Park, Hong Kyun Kim, Min-Soo Kim, Alfred S Lewin.


Age-related macular degeneration (AMD) is the main cause of irreversible blindness among the elderly; it requires early diagnosis and careful treatment to prevent vision loss. Optical coherence tomography (OCT), the most commonly used retinal imaging method for diagnosing AMD, is usually interpreted by a clinician on the basis of the relevant diagnostic criteria, but these judgments can be somewhat subjective. We propose an algorithm for the detection of AMD based on a weakly supervised convolutional neural network (CNN) model to support computer-aided diagnosis (CAD) systems. Our main contributions are threefold. (1) We propose a concise CNN model for OCT images, which outperforms the existing large CNN models using VGG16 and GoogLeNet architectures. (2) We propose an algorithm called Expressive Gradients (EG) that extends the existing Integrated Gradients (IG) algorithm so as to exploit not only the input-level attribution map, but also the high-level attribution maps. Owing to these enriched gradients, EG can highlight suspicious regions for the diagnosis of AMD better than the guided-backpropagation method and IG. (3) Our method provides two visualization options, overlay and top-k bounding boxes, which would be useful for CAD. Through experimental evaluation using 10,100 clinical OCT images from AMD patients, we demonstrate that our EG algorithm outperforms the IG algorithm in terms of localization accuracy and also outperforms the existing object detection methods in terms of class accuracy.

Partial Text

Deep learning is of growing importance in many applications, such as image recognition and image localization. A number of efforts have been made to classify medical images with regard to disease using deep learning models, and accordingly, the explainability of such models has become an important topic of research in the medical imaging field. Because of the critical nature of medical applications, not all decisions should be left to models. As in previous research such as [1–3], it is safer to use the models only to support clinicians’ decisions.

In this section, we present our CNN model for predicting the presence of AMD from OCT images. Since our EG algorithm relies solely on the weights and gradients of the CNN model to explain the lesions, a concise and highly accurate CNN model is important for better explainability. However, most of the existing CNN models for OCT images are built on CNN architectures designed for general image datasets such as ImageNet, so they tend to be very large and to contain weights and features that are unnecessary for explaining the lesions of AMD. Thus, we propose a concise and accurate CNN model for OCT images.

EG is a fully weakly supervised localization algorithm for finding suspected AMD lesions in OCT images. The conventional guided-backpropagation method [14] and the IG algorithm [15] exploit the backpropagation of gradients, in particular, the gradients with respect to the input image. However, this approach tends to lose a considerable amount of gradient information during backpropagation as a neural network model gains more ReLU and max-pooling layers or becomes deeper. ReLU solves the vanishing gradient problem of the sigmoid activation function, but has a dying ReLU problem [19], where a neuron no longer learns once its value becomes zero. The zero values of dead neurons are propagated to the next layers, so their gradients are not available during backpropagation, which can degrade the explainability of guided-backpropagation and IG, both of which exploit only the gradients with respect to the input image. The low quality of OCT images, that is, a relatively small amount and variety of information, worsens this tendency. To alleviate this problem, our EG algorithm exploits not only the gradients with respect to the input image, but also the gradients with respect to all the intermediate feature maps. This approach can be very useful for retaining as much of the backpropagated gradient information as possible, even for low-quality medical images.
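The input-level step that EG builds on can be illustrated with a minimal sketch of standard Integrated Gradients. This is not the authors' EG implementation: the tiny bias-free ReLU network, its weights, and the helper names (`model`, `grad_input`, `integrated_gradients`) are all assumptions made up for illustration. It also shows how an input that activates no ReLU unit receives a zero input gradient, which is the kind of gradient loss described above.

```python
import numpy as np

# Toy bias-free ReLU network F(x) = w2 . relu(W1 x).
# All weights are made up for illustration; this is not the paper's CNN.
W1 = np.array([[1.0, -1.0, 0.5],
               [0.5, 2.0, -1.0]])
w2 = np.array([1.0, -2.0])

def model(x):
    """Scalar model output F(x)."""
    return float(w2 @ np.maximum(W1 @ x, 0.0))

def grad_input(x):
    """Analytic gradient of F with respect to the input x."""
    active = (W1 @ x > 0).astype(float)  # ReLU passes gradient only where active
    return W1.T @ (w2 * active)

def integrated_gradients(x, baseline, steps=50):
    """Approximate IG with a midpoint Riemann sum along the straight path."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_input(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros_like(x)  # all-zero baseline, analogous to a black image
attributions = integrated_gradients(x, baseline)

# An input on which every hidden unit is inactive: the input gradient vanishes.
dead_grad = grad_input(np.array([-1.0, 0.0, 0.0]))
```

In this toy case the attributions sum to F(x) − F(baseline), the completeness property of IG. Computing attributions at intermediate feature maps as well, as EG does, is not shown here.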

We set b in Eq (2) to a black image and set the hyperparameters {βi} (0 ≤ i ≤ 6) to the values [1, 0.166, 0.166, 0.166, 0.166, 0.166, 1]. If we give a high weight to the input attribution map, i.e., β0, the overlay visualization option tends to highlight broader areas of low intensity, so it becomes difficult to find clear and distinct bounding boxes. By contrast, giving a high weight to the high-level attribution map, e.g., β6, results in overlay images that highlight biased, intense areas and in large bounding boxes. We give high weights to both β0 and β6 since this setting shows good overall results. We leave the optimization of the hyperparameters for future work.
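The role of the weights {βi} can be sketched as follows. The exact fusion rule is not given in this excerpt, so the sketch assumes a normalized weighted sum of per-layer attribution maps, each upsampled to the input resolution. The function names, the map sizes, and the random maps are hypothetical; only the β values come from the text.

```python
import numpy as np

# Weights from the text: beta_0 (input-level map) and beta_6 (highest-level map)
# are 1, and the five intermediate maps share 0.166 each.
BETAS = [1.0, 0.166, 0.166, 0.166, 0.166, 0.166, 1.0]

def upsample_nearest(att, size):
    """Nearest-neighbour upsampling of a square map to (size, size).
    Assumes size is an integer multiple of the map's side length."""
    factor = size // att.shape[0]
    return np.repeat(np.repeat(att, factor, axis=0), factor, axis=1)

def normalize(att):
    """Scale absolute attributions to [0, 1]; an all-zero map stays zero."""
    att = np.abs(att)
    peak = att.max()
    return att / peak if peak > 0 else att

def combine_maps(maps, betas, size):
    """Assumed fusion rule: weighted sum of normalized, upsampled maps,
    divided by the total weight so the result stays in [0, 1]."""
    total = np.zeros((size, size))
    for att, beta in zip(maps, betas):
        total += beta * upsample_nearest(normalize(att), size)
    return total / sum(betas)

# Stand-in attribution maps at made-up resolutions (input level first).
rng = np.random.default_rng(0)
sizes = [16, 16, 8, 8, 4, 4, 2]
maps = [rng.normal(size=(s, s)) for s in sizes]
heatmap = combine_maps(maps, BETAS, size=16)
```

Under this assumed rule, raising β0 lets the fine-grained input map dominate the heatmap, while raising β6 lets the coarse 2×2 map dominate, which is consistent with the broad low-intensity areas versus large intense regions described above.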

In this paper, we have proposed a weakly supervised deep learning-based method for predicting the class of AMD and locating its lesions in OCT images. Our proposed CNN model for OCT images achieves a higher accuracy for AMD detection than the existing large CNN models. The compactness of our model benefits gradient-based methods such as the EG algorithm, since it reduces the loss of gradients during backpropagation. Our EG algorithm outperforms the conventional guided-backpropagation method and the IG algorithm in terms of coverage and hit rate owing to its exploitation of high-level attribution maps. Our method can also localize lesions using only class labels, without ground-truth bounding boxes. To the best of our knowledge, ours is the first method to localize AMD lesions in OCT images in a weakly supervised manner. It has the advantage of low cost over the existing object detection methods, which explicitly require ground-truth bounding boxes that can be very expensive to prepare. Since the number of ground-truth bounding boxes is usually limited by this high cost, object detection methods tend to perform poorly in terms of class accuracy and cannot detect any lesions in some cases. We have shown that our model outperforms object detection methods such as SSD and Faster R-CNN, trained using 1,057 bounding boxes annotated by ophthalmologists for the dry AMD and wet AMD cases, in terms of predicting the class labels. For future work, we will investigate the optimization of the hyperparameters {βi} for better performance and a real-time EG algorithm for supporting real-time CAD systems.