Date Published: January 29, 2019
Publisher: Public Library of Science
Author(s): Dhruv Grover, Sebastian Bauhoff, Jed Friedman, Ilya Safro.
Independent verification is a critical component of performance-based financing (PBF) in health care, in which facilities are offered incentives to increase the volume of specific services; the same incentives, however, may lead them to over-report. We examine alternative strategies for targeted sampling of health clinics for independent verification. Specifically, we empirically compare several methods of random sampling and predictive modeling on data from a Zambian PBF pilot that contains reported and verified performance on quantity indicators for 140 clinics. Our results indicate that machine learning methods, particularly Random Forest, outperform the other approaches and can increase the cost-effectiveness of verification activities.
Performance-based financing (PBF) is a contracting mechanism that aims to increase the performance and quality of service providers. PBF programs for health care services in low- and middle-income countries typically offer financial incentives to health care facilities for the provision of a defined set of services, with an adjustment to the bonus payment based on a broad measure of quality. For example, a PBF program may offer a bonus payment of $5 for each delivery in a clinic, and scale down the bonus if the clinic's quality is found to be low. In recent years, PBF has generated substantial interest among policy-makers in low- and middle-income countries. Donors and international organizations are actively engaged in supporting countries in developing, implementing and evaluating PBF programs; for instance, as of 2015 a dedicated trust fund at the World Bank alone supported 36 PBF programs.
We assess the performance of the four classifiers used in this study across the following five performance metrics: prediction accuracy, recall, precision, F1-score and area under the receiver-operating-characteristic (ROC) curve (AUC). If the classifier is required to perform equally well on both the positive and negative classes, ROC curves that plot the true positive rate (TPR) against the false positive rate (FPR) serve well for visualization and evaluation. However, if there is greater interest in the model's performance on the positive class, ensuring that every positive prediction is correct (precision) and that as many positive cases as possible are identified (recall), a precision-recall curve is better suited. The F1-score measures the accuracy of a dichotomous classifier and is the harmonic mean of precision and recall. Algebraically: F1 = 2 × (precision × recall) / (precision + recall).
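The five metrics above can be computed directly with scikit-learn. The labels and predicted probabilities below are synthetic placeholders, not data from the Zambian pilot; they only illustrate how the metrics relate to one another.

```python
# Sketch of the five evaluation metrics using scikit-learn on toy data.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Hypothetical labels: 1 = verification found a discrepant report, 0 = accurate.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]  # classifier probabilities
y_pred = [1 if s >= 0.5 else 0 for s in y_score]     # threshold at 0.5

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)  # AUC uses the scores, not the labels

# F1 is the harmonic mean of precision and recall.
assert abs(f1 - 2 * precision * recall / (precision + recall)) < 1e-9
```

Note that accuracy, precision, recall and F1 depend on the chosen classification threshold, while the AUC summarizes performance across all thresholds.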
Fig 1A shows the ROC curves and their corresponding AUC values for random forest, SVM, logistic regression and Naïve Bayes classifiers. Fig 1B shows the corresponding Precision-Recall curves for the same classifiers.
Verification of reported performance is a crucial issue in PBF in health, given that the payment incentives may encourage not only increased performance but also over-reporting. Independent verification serves to deter over-reporting and to ensure that payments reflect actual performance.
Overall our findings suggest that supervised learning approaches, such as Random Forest, could substantially improve the prediction accuracy of counter-verification in PBF and thus increase the cost-effectiveness of verification. These methods are operationally feasible, especially in settings with electronic routine reporting systems.
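A supervised approach of this kind could be operationalized as follows. This is an illustrative sketch, not the paper's actual pipeline: the features, labels and data are all simulated, and only the general idea, training a Random Forest on clinic-level reporting data and ranking clinics by predicted risk to target verification visits, reflects the study.

```python
# Illustrative sketch: rank clinics for verification by predicted
# over-reporting risk using a Random Forest. All data are simulated.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 140  # the Zambian pilot covered 140 clinics

# Hypothetical clinic-level features (e.g. reported service volume,
# month-to-month change, catchment population); values are simulated.
X = rng.normal(size=(n, 3))
# Simulated label: 1 = verification found material over-reporting.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0.8).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Rank held-out clinics by predicted risk; verify the highest-risk ones first.
risk = clf.predict_proba(X_test)[:, 1]
priority = np.argsort(-risk)
```

In a deployment, the features would come from the routine electronic reporting system and the labels from past verification rounds, so the model can be retrained as new verification results accumulate.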