Research Article: Predictive Modeling for Diagnostic Tests with High Specificity, but Low Sensitivity: A Study of the Glycerol Test in Patients with Suspected Menière’s Disease

Date Published: November 18, 2013

Publisher: Public Library of Science

Author(s): Bernd Lütkenhöner, Türker Basel, Xiaofeng Wang.


A high specificity does not ensure that the expected benefit of a diagnostic test outweighs its cost. Problems arise, in particular, when the investigation is expensive, the prevalence of a positive test result is relatively low among the candidate patients, and the sensitivity of the test is low so that the information provided by a negative result is virtually negligible. The consequence may be that a potentially useful test does not gain broader acceptance. Here we show how predictive modeling can help to identify patients for whom the ratio of expected benefit to cost reaches an acceptable level, so that testing these patients is reasonable even though testing all patients might be considered wasteful. Our application example is based on a retrospective study of the glycerol test, which is used to corroborate a suspected diagnosis of Menière’s disease. Using the pretest hearing thresholds at up to 10 frequencies, predictions were made by K-nearest neighbor classification or logistic regression. Both methods estimate, based on results from previous patients, the posterior probability that performing the considered test in a new patient will have a positive outcome. The quality of the prediction was evaluated using leave-one-out cross-validation, making various assumptions about the costs and benefits of testing. With reference to all 356 cases, the probability of a positive test result was almost 0.4. For subpopulations selected by K-nearest neighbor classification, which was clearly superior to logistic regression, this probability could be increased to about 0.6. Thus, the odds of a positive test result were more than doubled.
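The closing statement about the odds can be checked with a short calculation. The sketch below uses the rounded probabilities quoted in the abstract (0.4 and 0.6); the exact study values are not given here, so the resulting ratio is only approximate.

```latex
\[
\text{odds}(p) = \frac{p}{1-p}, \qquad
\text{odds}(0.4) = \frac{0.4}{0.6} \approx 0.67, \qquad
\text{odds}(0.6) = \frac{0.6}{0.4} = 1.5,
\]
\[
\frac{\text{odds}(0.6)}{\text{odds}(0.4)} = \frac{1.5}{0.667} \approx 2.25 .
\]
```

With these rounded values the odds increase by a factor of roughly 2.25, which is consistent with "more than doubled".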

Partial Text

An ideal diagnostic test combines a high sensitivity with a high specificity. However, tests that are used in daily routine rarely conform to this ideal. Instead, it is often necessary to find a reasonable trade-off between sensitivity and specificity: a high sensitivity is achieved by accepting a relatively low specificity and, conversely, a high specificity is reached by compromising sensitivity. Specific tests are typically used to confirm (or “rule in”) a suspected diagnosis [1]. Specificity is synonymous with true-negative rate, $\mathrm{TNR}$, which is related to the false-positive rate, $\mathrm{FPR}$, as $\mathrm{TNR} = 1 - \mathrm{FPR}$. A high specificity implies that a positive result is unlikely in a patient who does not have the disease tested for. Thus, a positive outcome of a specific test is quite informative and can have substantial therapeutic consequences. A negative result, by contrast, is of little value if the sensitivity of the test is low. This follows from the fact that sensitivity is synonymous with true-positive rate, $\mathrm{TPR}$, which is related to the false-negative rate, $\mathrm{FNR}$, as $\mathrm{TPR} = 1 - \mathrm{FNR}$. Because a low sensitivity implies a high false-negative rate, a negative outcome of an insensitive test is a likely event that does not give a compelling reason to rule out the disease tested for.
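The asymmetry between a positive and a negative result can be illustrated with Bayes' rule. The following is a minimal sketch, not taken from the study: the function name `post_test_probabilities` and the numerical values (pretest probability 0.5, sensitivity 0.4, specificity 0.95) are hypothetical and chosen only to represent a specific but insensitive test.

```python
def post_test_probabilities(prevalence, sensitivity, specificity):
    """Return (P(disease | positive), P(disease | negative)) via Bayes' rule."""
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1.0 - sensitivity)
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    true_neg = (1.0 - prevalence) * specificity
    p_given_pos = true_pos / (true_pos + false_pos)
    p_given_neg = false_neg / (false_neg + true_neg)
    return p_given_pos, p_given_neg


# Hypothetical values for illustration only (not from the study).
p_given_pos, p_given_neg = post_test_probabilities(
    prevalence=0.5, sensitivity=0.4, specificity=0.95
)
print(f"P(disease | positive) = {p_given_pos:.2f}")  # about 0.89
print(f"P(disease | negative) = {p_given_neg:.2f}")  # about 0.39
```

Under these assumed values, a positive result raises the probability of disease from 0.5 to about 0.89, whereas a negative result lowers it only to about 0.39, which is far from ruling the disease out.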

The performance of a diagnostic test is commonly characterized in terms of its sensitivity and specificity. However, these two well-known measures are not necessarily sufficient for assessing how useful a test is in daily practice. To reach a well-founded conclusion, it is indispensable to consider additional factors such as the cost of performing the test and the respective benefits of a positive and a negative test result. Also of crucial importance is the prevalence of the condition tested for, which, by definition, depends on the population of patients. Thus, the utility of a test may depend crucially on how the tested patients are selected. So far, the selection has typically been based on relatively coarse criteria; the patients may, for example, be suspected of having a certain disease. Predictive modeling opens up the possibility of reaching individualized decisions, with the consequence that a diagnostic test may be considered useful for a particular patient even when testing a broader population of patients with the same suspected diagnosis is controversial. Moreover, if alternative tests are available, predictive modeling may help to choose the one that is most promising for the particular patient.
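The idea of an individualized decision can be sketched in a few lines of code. The example below is an illustration under simplifying assumptions, not the authors' implementation: it estimates the probability of a positive test result as the fraction of positive outcomes among the K most similar previous patients (Euclidean distance on pretest measurements), evaluates this by leave-one-out, and recommends testing when the expected benefit exceeds the cost. All names (`knn_posterior`, `leave_one_out_posteriors`, `worth_testing`), the toy data, and the decision rule, which assigns a benefit to positive results only, are hypothetical.

```python
import math


def knn_posterior(train_x, train_y, query, k):
    """Fraction of positive outcomes among the k nearest previous patients."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    return sum(y for _, y in by_distance[:k]) / k


def leave_one_out_posteriors(xs, ys, k):
    """Estimate each patient's posterior using all other patients only."""
    posteriors = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        posteriors.append(knn_posterior(rest_x, rest_y, xs[i], k))
    return posteriors


def worth_testing(p_positive, benefit_positive, cost):
    """Assumed rule: test if the expected benefit exceeds the cost.

    A negative result is taken to have negligible benefit.
    """
    return p_positive * benefit_positive > cost


# Toy data (hypothetical): two pretest measurements per patient,
# outcome 1 = positive test result, 0 = negative test result.
xs = [(10, 15), (12, 18), (50, 40), (55, 45), (52, 38), (11, 14)]
ys = [0, 0, 1, 1, 1, 0]

posteriors = leave_one_out_posteriors(xs, ys, k=2)
decisions = [worth_testing(p, benefit_positive=1.0, cost=0.5) for p in posteriors]
print(posteriors)
print(decisions)
```

In this sketch, benefit and cost are expressed in the same arbitrary unit, so only their ratio matters: a patient is selected when the estimated probability of a positive result exceeds `cost / benefit_positive`. A realistic application would need far more previous patients and a careful choice of K and of the distance measure.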