Research Article: Validation of the Symptom Pattern Method for Analyzing Verbal Autopsy Data

Date Published: November 20, 2007

Publisher: Public Library of Science

Author(s): Christopher J. L Murray, Alan D Lopez, Dennis M Feehan, Shanon T Peter, Gonghuan Yang, Peter Byass

Abstract: Background: Cause of death data are a critical input to formulating good public health policy. In the absence of reliable vital registration data, information collected after death from household members, called verbal autopsy (VA), is commonly used to study causes of death. VA data are usually analyzed by physician-coded verbal autopsy (PCVA). PCVA is expensive and its comparability across regions is questionable. Nearly all validation studies of PCVA have allowed physicians access to information collected from the household members’ recall of medical records or contact with health services, thus exaggerating accuracy of PCVA in communities where few deaths had any interaction with the health system. In this study we develop and validate a statistical strategy for analyzing VA data that overcomes the limitations of PCVA. Methods and Findings: We propose and validate a method that combines the advantages of methods proposed by King and Lu, and Byass, which we term the symptom pattern (SP) method. The SP method uses two sources of VA data. First, it requires a dataset for which we know the true cause of death, but which need not be representative of the population of interest; this dataset might come from deaths that occur in a hospital. The SP method can then be applied to a second VA sample that is representative of the population of interest. From the hospital data we compute the properties of each symptom; that is, the probability of responding yes to each symptom, given the true cause of death. These symptom properties allow us first to estimate the population-level cause-specific mortality fractions (CSMFs), and to then use the CSMFs as an input in assigning a cause of death to each individual VA response. Finally, we use our individual cause-of-death assignments to refine our population-level CSMF estimates. The results from applying our method to data collected in China are promising.
At the population level, SP estimates the CSMFs with 16% average relative error and 0.7% average absolute error, while PCVA results in 27% average relative error and 1.1% average absolute error. At the individual level, SP assigns the correct cause of death in 83% of the cases, while PCVA does so for 69% of the cases. We also compare the results of SP and PCVA when both methods have restricted access to the information from the medical record recall section of the VA instrument. At the population level, without medical record recall, the SP method estimates the CSMFs with 14% average relative error and 0.6% average absolute error, while PCVA results in 70% average relative error and 3.2% average absolute error. For individual estimates without medical record recall, SP assigns the correct cause of death in 78% of cases, while PCVA does so for 38% of cases. Conclusions: Our results from the data collected in China suggest that the SP method outperforms PCVA, both at the population and especially at the individual level. Further study is needed on additional VA datasets in order to continue validation of the method, and to understand how the symptom properties vary as a function of culture, language, and other factors. Our results also suggest that PCVA relies heavily on household recall of medical records and related information, limiting its applicability in low-resource settings. SP does not require that additional information to adequately estimate causes of death.
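The abstract reports CSMF accuracy as an average absolute error and an average relative error across causes. The excerpt does not give the exact formulas, so the following is only a minimal sketch under one plausible reading: absolute error is the per-cause gap between estimated and true fractions, relative error is that gap divided by the true fraction, and both are averaged over causes. The function name `csmf_errors` and the three-cause fractions are hypothetical and are not taken from the study.

```python
def csmf_errors(true_csmf, est_csmf):
    """Return (average absolute error, average relative error) across causes.

    Assumed definitions (not confirmed by the source text):
      absolute error for cause j = |estimated_j - true_j|
      relative error for cause j = |estimated_j - true_j| / true_j
    Causes with a true fraction of zero are skipped for the relative error.
    """
    abs_errs = [abs(est_csmf[c] - true_csmf[c]) for c in true_csmf]
    rel_errs = [
        abs(est_csmf[c] - true_csmf[c]) / true_csmf[c]
        for c in true_csmf
        if true_csmf[c] > 0
    ]
    return sum(abs_errs) / len(abs_errs), sum(rel_errs) / len(rel_errs)


# Hypothetical fractions for illustration only.
true_csmf = {"stroke": 0.50, "injury": 0.30, "lri": 0.20}
est_csmf = {"stroke": 0.45, "injury": 0.33, "lri": 0.22}

avg_abs, avg_rel = csmf_errors(true_csmf, est_csmf)
print(f"average absolute error: {avg_abs:.3f}")
print(f"average relative error: {avg_rel:.3f}")
```

Under these definitions a small absolute error can still be a large relative error for a rare cause, which is one reason the two figures are reported side by side.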

Partial Text: Cause of death data are a critical input into debates on public health, assessment of epidemiological patterns and trends, and guidance on allocation of scarce resources for public health and medical care [1]. Cause of death data for these purposes are equally important for high-income and low-income countries, but the challenge of comparable measurement is different. For high-income countries, the main task is to minimize systematic differences in assignment of causes of death over time and across populations. For developing countries, where vital registration systems are generally not complete, and progress toward higher levels of coverage and certification has been unacceptably slow, the challenge is how to make better use of alternative cause-of-death measurement approaches to reliably estimate population-level mortality patterns. This paper presents a new method for assigning causes of death based on an interview of members of the decedent’s household.

From a validation dataset in which the true cause of death is known, we can calculate, for each variable included in the VA instrument (henceforth called a symptom), the probability of household members responding in each response category for each cause of death. To simplify the explanation, let us assume all items in the VA instrument are dichotomous. Let Sij be the probability that household members say yes to symptom i for cause of death j. If one considers symptom i like a diagnostic test, Sij is the sensitivity of symptom i for disease j. Note that unlike most of the literature on VA, we are not referring to the sensitivity of the entire VA instrument subject to physician review, but rather to the sensitivity of a single symptom in the instrument.
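As a rough illustration of the definition above, Sij can be estimated from validation records as the share of deaths from cause j for which respondents answered yes to symptom i. The sketch below assumes dichotomous items and a simple record layout of (cause, answers); the function name `estimate_symptom_properties` and the toy records are hypothetical and do not reflect the China VA data.

```python
from collections import defaultdict


def estimate_symptom_properties(records, symptoms):
    """Estimate S[symptom][cause] = P(yes to symptom | true cause).

    records: list of (cause, answers), where answers maps symptom -> bool.
    A symptom missing from answers is treated as "no" (an assumption).
    """
    deaths = defaultdict(int)                     # deaths per cause
    yes = defaultdict(lambda: defaultdict(int))   # yes counts per symptom and cause
    for cause, answers in records:
        deaths[cause] += 1
        for symptom in symptoms:
            if answers.get(symptom, False):
                yes[symptom][cause] += 1
    return {
        symptom: {cause: yes[symptom][cause] / n for cause, n in deaths.items()}
        for symptom in symptoms
    }


# Toy validation records, invented for illustration.
records = [
    ("injury", {"fever": False, "injury": True}),
    ("injury", {"fever": True, "injury": True}),
    ("lri", {"fever": True, "wheeze": True}),
    ("lri", {"fever": True, "wheeze": False}),
]
S = estimate_symptom_properties(records, ["fever", "wheeze", "injury"])
for symptom, by_cause in S.items():
    print(symptom, by_cause)
```

With real data each cause would need enough validated deaths for these proportions to be stable; the toy records here only show the bookkeeping.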

Table 1 provides the Sij values from the China VA study for the 47 main symptoms. We have excluded from this analysis information on the duration of symptoms. We undertook extensive analysis using symptom duration data as further Sij values, but this did not improve the performance of the method over and above inclusion of just the symptoms. We can think of each entry in the table as the properties of the symptom (row) for the cause of death (column). For example, the entry at the fifth row and third column is 44%, meaning that 44% of VA respondents for deaths from lower respiratory infections responded yes when asked if the deceased wheezed. There is clearly considerable information content in the individual symptoms even without information on the sequencing, duration, and clustering of symptoms. There is also clear evidence that, as with any survey instrument, there is significant background noise for each symptom. Injuries, whose occurrence can be expected to be uncorrelated with many symptoms, provide insight into these background rates. For example, among deaths due to injuries, 21% of household members reported fever, 21% poor appetite, 17% weight loss, and 15% skin disease; only 73% reported that the decedent suffered an injury. The latter surprising finding has also been observed in South Africa [28]. Some symptoms that medical knowledge would suggest are highly specific are nonetheless widely reported: paralysis on one side, for example, is present for more than 10% of decedents for seven of 27 causes, suggesting that this symptom has a different cultural interpretation in China.
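To see how noisy symptom properties can still separate causes, the sketch below combines Sij values with assumed CSMFs through Bayes' rule. It treats symptoms as conditionally independent given the cause, which is a simplifying assumption made here for illustration and is not stated in the text as the SP method's actual procedure. The values 0.21 (fever given injury), 0.73 (injury reported given injury), and 0.44 (wheeze given lower respiratory infection) come from the text; every other number, the prior, and the function name `posterior_cause` are hypothetical.

```python
import math


def posterior_cause(responses, S, prior):
    """Posterior P(cause | responses) under a naive independence assumption.

    responses: symptom -> bool (yes/no answers for one death)
    S:         S[symptom][cause] = P(yes | cause), each strictly between 0 and 1
    prior:     cause -> assumed cause-specific mortality fraction
    """
    log_post = {}
    for cause, p in prior.items():
        lp = math.log(p)
        for symptom, said_yes in responses.items():
            s = S[symptom][cause]
            lp += math.log(s if said_yes else 1.0 - s)
        log_post[cause] = lp
    # Normalize in log space to avoid underflow with many symptoms.
    top = max(log_post.values())
    weights = {c: math.exp(v - top) for c, v in log_post.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


S = {
    "fever": {"injury": 0.21, "lri": 0.70},   # 0.70 is hypothetical
    "wheeze": {"injury": 0.05, "lri": 0.44},  # 0.05 is hypothetical
    "injury": {"injury": 0.73, "lri": 0.02},  # 0.02 is hypothetical
}
prior = {"injury": 0.5, "lri": 0.5}           # hypothetical CSMFs

responses = {"fever": True, "wheeze": True, "injury": False}
post = posterior_cause(responses, S, prior)
print(max(post, key=post.get), post)
```

Even though one in five injury respondents reports fever, the combination of answers shifts almost all of the posterior weight toward one cause in this toy case, which is the sense in which individual symptoms carry information despite background noise.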

Overall, the SP method does remarkably well. At the population level, it provides more accurate estimates of cause-specific mortality fractions than PCVA; indeed, at least in China, PCVA is not a viable method for estimating the cause composition of mortality at the population level. At the individual level, it clearly outperforms PCVA in assigning causes of death to individual responses, using the same information. The difference is most striking when the two strategies are not allowed access to information from household recall of medical records, which is the test most relevant to implementing verbal autopsy in a resource-poor setting. Based on this analysis of data from China, the superiority of the SP method for estimating CSMFs and assigning causes of death at the individual level does not appear to be affected by changing the background cause of death composition in the population sample. In contrast, in China PCVA appears to be extremely sensitive to the availability of household recall of medical records.


