Research Article: A Bounded Integer Model for Rating and Composite Scale Data

Date Published: June 6, 2019

Publisher: Springer International Publishing

Author(s): Gustaf J. Wellhagen, Maria C. Kjellsson, Mats O. Karlsson.


Rating and composite scales are commonly used to assess treatment efficacy. The two main strategies for modelling such endpoints are to treat them as a continuous variable (CV) or as an ordered categorical variable (OC). Both strategies have disadvantages, including making assumptions that violate the integer nature of the data (CV) and requiring many parameters for scales with many response categories (OC). We present a method, called the bounded integer (BI) model, which utilises the probit function with fixed cut-offs to estimate the probability of a certain score through a latent variable. This method was successfully implemented to describe six data sets from four different therapeutic areas: Parkinson’s disease, Alzheimer’s disease, schizophrenia, and neuropathic pain. Five scales were investigated, ranging from 11 to 181 categories. The fit (likelihood) was better for the BI model than for corresponding OC or CV models (ΔAIC range 11–1555) in all cases but one (ΔAIC −63), while the number of parameters was the same or lower. Markovian elements were successfully implemented within the method. The performance in external validation, assessed through cross-validation, was also in favour of the new model (ΔOFV range 22–1694) except in one case (ΔOFV −70). A residual for diagnostic purposes is discussed. This study shows that the BI model respects the integer nature of data and is parsimonious in terms of number of estimated parameters.
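
The latent-variable idea in the abstract can be illustrated with a minimal sketch. It assumes that the fixed cut-offs are the standard-normal quantiles at 1/n, 2/n, ..., (n-1)/n for a scale with n categories, and that the latent variable is normal with a given mean and standard deviation. The function name `bounded_integer_pmf` and its parameters are hypothetical and are not taken from the article.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit and its inverse


def bounded_integer_pmf(n_categories, mean, sd):
    """Return a list of category probabilities, lowest category first.

    Assumption (not confirmed by the source text): the fixed cut-offs are the
    standard-normal quantiles at k / n_categories, so a latent variable with
    mean 0 and sd 1 gives every category the same probability.
    """
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if sd <= 0:
        raise ValueError("sd must be positive")

    # Fixed cut-offs on the latent (probit) scale; none of these are estimated.
    cutoffs = [_STD_NORMAL.inv_cdf(k / n_categories)
               for k in range(1, n_categories)]

    # Cumulative probability of the latent variable falling below each cut-off,
    # padded with 0 and 1 so the outermost categories are bounded.
    cumulative = [0.0]
    cumulative += [_STD_NORMAL.cdf((c - mean) / sd) for c in cutoffs]
    cumulative.append(1.0)

    # Probability of a category is the mass between neighbouring cut-offs.
    return [cumulative[k + 1] - cumulative[k] for k in range(n_categories)]


if __name__ == "__main__":
    # Hypothetical 11-category scale with a latent mean shifted upwards.
    for index, p in enumerate(bounded_integer_pmf(11, mean=0.5, sd=1.0)):
        print(index, round(p, 4))
```

In this sketch only the latent mean and standard deviation would be estimated, however many categories the scale has, which is the parsimony argument made in the abstract. Every outcome is an integer category inside the scale bounds by construction.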

Partial Text

Many clinical trial endpoints are measured with rating scales or composite scales. Rating scales, such as the Likert scale, are typically based on a single assessment or question (e.g. “How much pain do you feel?”), while composite scales consist of several assessments or questions that generate a total score. The nature of such scale-based data is complex, and there is no fully satisfying modelling approach when the number of possible categories is large.

The BI model had fewer parameters than the published implementation of an ordered categorical model for the Likert data set (13 versus 18). The ΔAIC was 1555 in favour of the BI model. When ΔOFV was calculated from cross-validated analyses of the two models, the difference was 1694 in favour of the BI model. When the analysis was performed without Markov elements in either the BI or CV model, the ΔAIC was 810 in favour of the BI model.

The bounded integer model provides a good description of rating and composite scale data, in terms of both fit and performance in external validation. Across multiple data sets, it has shown better fit and better performance in external validation than models treating the same data as either an ordered categorical or a continuous variable. Simulations from the model provide realistic data that need no rounding, truncation, or transformation. In addition, Markov elements can easily be added to the model.



