Research Article: Effect of model updating strategies on the performance of prevalent diabetes risk prediction models in a mixed-ancestry population of South Africa

Date Published: February 7, 2019

Publisher: Public Library of Science

Author(s): Katya L. Masconi, Tandi E. Matsha, Rajiv T. Erasmus, Andre P. Kengne, Eric Brian Faragher.


Prediction model updating methods are aimed at improving the prediction performance of a model in a new setting. This study sought to critically assess the impact of updating techniques when applying existent prevalent diabetes prediction models to a population different to the one in which they were developed, evaluating the performance in the mixed-ancestry population of South Africa.

The study sample consisted of 1256 mixed-ancestry individuals from the Cape Town Bellville-South cohort, of whom 173 were excluded due to previously diagnosed diabetes and 162 had undiagnosed diabetes. The primary outcome, undiagnosed diabetes, was based on an oral glucose tolerance test. Model updating techniques and prediction models were identified via recent systematic reviews. Model performance was assessed using the C-statistic and the expected/observed (E/O) event rate ratio.
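The two performance measures named above can be illustrated with a minimal sketch. It is not the authors' code: the function names and the toy outcomes and risks are hypothetical, chosen only to show how a C-statistic (discrimination) and an E/O ratio (calibration-in-the-large) are typically computed from observed outcomes and predicted risks.

```python
def c_statistic(outcomes, risks):
    """Share of case/non-case pairs in which the case has the higher predicted risk.

    Ties in predicted risk count as half a concordant pair.
    """
    cases = [r for o, r in zip(outcomes, risks) if o == 1]
    non_cases = [r for o, r in zip(outcomes, risks) if o == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0
            elif case_risk == non_case_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


def expected_observed_ratio(outcomes, risks):
    """Sum of predicted risks (expected events) divided by the observed event count."""
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed events, ratio undefined")
    return sum(risks) / observed


# Hypothetical toy data (1 = undiagnosed diabetes, 0 = no diabetes); not study data.
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_risks = [0.10, 0.30, 0.35, 0.40, 0.80, 0.70]

toy_c = c_statistic(toy_outcomes, toy_risks)
toy_eo = expected_observed_ratio(toy_outcomes, toy_risks)
print(f"C-statistic: {toy_c:.3f}, E/O ratio: {toy_eo:.3f}")
```

An E/O ratio below 1 indicates that the model predicts fewer events than were observed (underprediction), and a ratio above 1 indicates overprediction; a C-statistic of 0.5 corresponds to no discrimination.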

Intercept adjustment and logistic calibration improved calibration across all five models (Cambridge, Kuwaiti, Omani, Rotterdam and Simplified Finnish diabetes risk models). Calibration was improved further by model revision, where likelihood ratio tests showed that the effects of body mass index, waist circumference and family history of diabetes required additional adjustment (Omani, Rotterdam and Finnish models). However, discrimination was poor following internal validation of these models. Re-estimation of the regression coefficients did not increase performance, while the addition of new variables resulted in the best combination of discrimination and calibration in the models for which it was undertaken.
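The two simplest updating methods mentioned above can be sketched as follows. The source does not give the fitting details, so this assumes the usual formulation: intercept adjustment re-estimates only the intercept while keeping the original linear predictor's slope fixed at 1, and logistic calibration estimates both an intercept and a slope on that linear predictor. The function `fit_recalibration` and the toy data are hypothetical and are not taken from the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_recalibration(linear_predictor, outcomes, update_slope, n_iter=50):
    """Fit logit(risk) = a + b * linear_predictor by Newton-Raphson.

    update_slope=False keeps b fixed at 1 (intercept adjustment);
    update_slope=True estimates a and b (logistic calibration).
    """
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        p = [sigmoid(a + b * x) for x in linear_predictor]
        w = [pi * (1.0 - pi) for pi in p]
        # Score (gradient) and information for the intercept.
        g_a = sum(y - pi for y, pi in zip(outcomes, p))
        h_aa = sum(w)
        if not update_slope:
            a += g_a / h_aa
            continue
        # Score and information terms involving the slope.
        g_b = sum((y - pi) * x for y, pi, x in zip(outcomes, p, linear_predictor))
        h_ab = sum(wi * x for wi, x in zip(w, linear_predictor))
        h_bb = sum(wi * x * x for wi, x in zip(w, linear_predictor))
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system explicitly.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


# Hypothetical toy data: original model's linear predictor and observed outcomes.
toy_lp = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
toy_y = [0, 0, 1, 0, 0, 1, 0, 1]

intercept_only, slope_fixed = fit_recalibration(toy_lp, toy_y, update_slope=False)
intercept_cal, slope_cal = fit_recalibration(toy_lp, toy_y, update_slope=True)

risks_intercept = [sigmoid(intercept_only + slope_fixed * x) for x in toy_lp]
risks_calibrated = [sigmoid(intercept_cal + slope_cal * x) for x in toy_lp]
print("expected events after intercept adjustment:", round(sum(risks_intercept), 4))
print("expected events after logistic calibration:", round(sum(risks_calibrated), 4))
```

Under this formulation, both methods make the expected number of events match the observed number in the updating sample. Neither changes the ranking of individuals when the calibration slope is positive, which is why such methods would be expected to affect calibration rather than discrimination.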

While the discriminatory performance of the original existent models during external validation was higher, calibration was poor. The highest performing models, based on discrimination and calibration, were the Omani diabetes model following model revision, and the Cambridge diabetes risk model following the addition of waist circumference as a predictor. However, while more extensive methods incorporating development population information were superior to simpler methods, the increase in model performance was not great enough for recommendation.

Partial Text

Predictive performance is often decreased when a model is tested in a population different to that in which the model was developed. To limit the number of models redeveloped in smaller datasets due to poor performance of existent models, updating methods aim to improve the prediction performance of a model in a new setting [1]. The updating of an existent model is encouraged as it allows for the information captured during the development of the model to be incorporated with the characteristics of the validation population [1–5].

The aim of this study was to compare the effects of different updating techniques on the performance of existent diabetes risk prediction models. The performance of the existent models in their original format was not considered sufficient to recommend implementation, and the updating methods were intended to improve the fit of these models. While discrimination increased when the updating methods were implemented, this gain was lost during internal validation. However, calibration was greatly improved, and the improvement held following validation. To determine the maximum predictive ability in this population using the available predictors, model development was undertaken to serve as a comparator. The performance in the development dataset was good, with a number of updating methods matching this performance in development; however, overfitting resulted in a 0.18 drop in the C-statistic on internal validation.



