Research Article: PPCD: Privacy-preserving clinical decision with cloud support

Date Published: May 29, 2019

Publisher: Public Library of Science

Author(s): Hui Ma, Xuyang Guo, Yuan Ping, Baocang Wang, Yuehua Yang, Zhili Zhang, Jingxian Zhou, Lixiang Li.


With the rise of machine learning and cloud computing, meaningful information can be mined from massive electronic medical data to help physicians make proper disease diagnoses for patients. However, using patients’ medical data and disease information frequently raises privacy concerns. In this paper, based on the single-layer perceptron, we propose a scheme for privacy-preserving clinical decision with cloud support (PPCD), which securely conducts disease model training and prediction for the patient. Each party learns nothing about the other’s private information. In PPCD, a lightweight secure multiplication is presented and used to improve model training. Security analysis and experimental results on real data confirm that the proposed PPCD achieves high disease prediction accuracy without the risk of privacy disclosure.

Partial Text

With the sharp growth of electronic data, machine learning has affected people’s lifestyles by predicting human behavior and future trends [1], [2], [3]. To overcome the limitations of storage and computing resources, outsourcing expensive machine learning tasks to the Cloud has attracted increasing attention. For instance, client data can be transmitted to the Cloud for either model training or prediction [4], [5], [6]. As a popular machine learning algorithm, the single-layer perceptron (SLP) is simple yet efficient and has been widely used in disease prediction [7], [8], [9]. It is better suited to real-time disease prediction than more complex techniques such as naïve Bayes [10], decision trees [2], and support vector machines (SVMs) [11], [12]. Clinical decision support systems (CDSS), which use various data mining techniques to help physicians make proper disease diagnoses and provide health services for patients, have received considerable attention [7], [13], [14], [15]. However, because of privacy concerns, users do not want to submit their medical data to an unauthorized institution [16], [17], [18]. At the same time, because the classifier is considered an asset of the medical service provider, exposing the prediction model to a third party is also a risk: a third party could use the model to make disease predictions for patients, which could harm the medical service provider. Therefore, the confidentiality of both the medical data and the disease model is crucial for the CDSS. Achieving secure disease prediction without compromising the accuracy of the result is a challenging issue.

In this section, brief overviews of the Paillier cryptosystem, SLP, and secure multiplication (SM) are given. Table 1 summarizes the key notations.
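To make the additive homomorphism of the Paillier cryptosystem concrete, the following is a minimal toy sketch in Python. It is not the implementation used in PPCD: the function names, the tiny primes, and the choice g = n + 1 are illustrative assumptions, and parameters this small offer no security. It shows only the two properties relevant here: multiplying ciphertexts adds the plaintexts, and raising a ciphertext to a known constant multiplies the plaintext by that constant.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy key generation. The small primes are an assumption for illustration;
    real deployments use primes of 1024 bits or more."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                # common simplification of the generator
    mu = pow(lam, -1, n)     # modular inverse (Python 3.8+); valid for g = n + 1
    return (n, g), (lam, mu)


def encrypt(pk, m):
    """Encrypt an integer 0 <= m < n under public key pk."""
    n, g = pk
    n2 = n * n
    while True:
        r = random.randrange(1, n)   # toy randomness, not cryptographically secure
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pk, sk, c):
    """Recover the plaintext using L(u) = (u - 1) / n."""
    n, _ = pk
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pk, c1, c2):
    """Enc(m1) * Enc(m2) mod n^2 decrypts to m1 + m2 mod n."""
    n, _ = pk
    return (c1 * c2) % (n * n)


def mul_by_constant(pk, c, k):
    """Enc(m)^k mod n^2 decrypts to k * m mod n, for a plaintext constant k."""
    n, _ = pk
    return pow(c, k, n * n)


if __name__ == "__main__":
    pk, sk = keygen()
    c_sum = add_encrypted(pk, encrypt(pk, 12), encrypt(pk, 30))
    print(decrypt(pk, sk, c_sum))                          # 42
    c_scaled = mul_by_constant(pk, encrypt(pk, 7), 6)
    print(decrypt(pk, sk, c_scaled))                       # 42
```

Note that the scheme supports adding two encrypted values and scaling by a known constant, but not multiplying two encrypted values directly, which is why a separate secure multiplication protocol is needed when both operands are encrypted.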

In this section, we analyze the correctness and security of the proposed PPCD scheme. In particular, we focus on how PPCD preserves the privacy of both the patient’s medical information and the disease model.

Without sufficient storage, computation, or clinical decision expertise, clients frequently prefer outsourcing their data to the Cloud for model training and disease prediction. Ledley and Lusted [24] first proposed a clinical decision support system that can help physicians solve diagnostic problems. Later, a large number of disease prediction systems based on various data mining techniques were presented. For example, a fast disease prediction system based on SVM was proposed in [25] to predict the risk of progression of adolescent idiopathic scoliosis. Wang et al. [26] gave a risk assessment for individuals with a family history of pancreatic cancer using Bayesian classification. Using SVM, Huang et al. [27] designed a prediction model for breast cancer diagnosis, while Barakat et al. [28] focused on the diagnosis of diabetes mellitus. For heart disease analysis, Anooj et al. [29] used specific fuzzy rules. Although various prediction models have been developed, the privacy of patients’ medical information has not been taken into account, which will impede the further progress of CDSS.

In this paper, we propose a privacy-preserving disease prediction system based on SLP, which can help physicians make a proper disease diagnosis and provide health services for patients anytime and anywhere in a privacy-preserving way. In PPCD, DP’s historical medical data are used to train the SLP in ED, and the hospital uses the trained model to predict diseases for a UP. To ease DP’s privacy concerns, we adopt an additively homomorphic encryption scheme, chosen also for its simplicity and generality. The unavoidable multiplications in SLP motivate us to introduce LSM into PPCD. As a result, users’ medical information and the trained model are kept secret from the cloud. PPCD achieves results comparable to those of plain SLP, which suggests that sacrificing some data precision to improve efficiency is feasible in practical use.
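The precision trade-off mentioned above can be illustrated with a small plaintext sketch. It assumes that the precision loss comes from mapping real-valued features and weights to integers, which additively homomorphic schemes such as Paillier require; the classic perceptron update rule, the scale factor of 100, and all function names are illustrative assumptions rather than details taken from PPCD. No encryption is performed here; the sketch only compares floating-point and fixed-point predictions.

```python
def predict(weights, bias, x):
    """Floating-point SLP prediction with a sign activation (+1 or -1)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Classic perceptron rule (an assumption): update only on misclassification."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if predict(weights, bias, x) != y:
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias


def quantize(values, scale=100):
    """Map real values to integers by scaling and rounding (loses precision)."""
    return [round(v * scale) for v in values]


def predict_fixed_point(weights, bias, x, scale=100):
    """Integer-only prediction. Weights and features are each scaled by `scale`,
    so the bias is scaled by scale**2 to stay in the same units."""
    wq = quantize(weights, scale)
    xq = quantize(x, scale)
    bq = round(bias * scale * scale)
    score = sum(w * xi for w, xi in zip(wq, xq)) + bq
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Hypothetical, linearly separable toy data (not from the paper).
    samples = [(2.0, 1.0), (1.5, 2.0), (-1.0, -1.5), (-2.0, -0.5)]
    labels = [1, 1, -1, -1]
    w, b = train_perceptron(samples, labels)
    for x, y in zip(samples, labels):
        print(x, y, predict(w, b, x), predict_fixed_point(w, b, x))
```

On this toy data the integer-only predictions match the floating-point ones. In general, agreement depends on the scale factor and on how close samples lie to the decision boundary.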