Abhi Datta, of the Department of Biostatistics at Johns Hopkins Bloomberg School of Public Health, will present:
“Mortality Estimation Using Predicted Cause-of-Death Data from Verbal Autopsies”
Abstract: In many low- and middle-income countries (LMICs), verbal autopsy (VA), a systematic interview of household members of the deceased about the deceased's health history and symptoms, is the primary source of mortality data. Computer-coded verbal autopsy (CCVA) algorithms are automated classifiers that predict a cause of death (COD) from the high-dimensional symptom data reported in a VA. These predicted CODs are then aggregated to generate national and regional estimates of cause-specific mortality fractions (CSMFs): the proportion of individuals dying of each cause.
Treating predicted COD data from CCVA classifiers as raw data is perilous because it ignores the imperfections in the classifiers used to predict COD. Prevalence estimation using predicted data from a classifier is known as quantification learning. Current quantification methods assume that the sensitivities and specificities of the classifier are either perfect or transportable from the training population to the test population. Both assumptions are inappropriate under dataset shift, when the misclassification rates differ between the training and test populations.
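To make the problem concrete, the following is a minimal two-class sketch of the standard misclassification correction, not the GBQL method presented in the talk. The function names, the simulated data, and every numeric rate are assumptions chosen purely for illustration. It shows that simply counting predicted causes is biased when the classifier is imperfect, and that correcting with sensitivities and specificities carried over from a different population remains biased.

```python
import numpy as np

def classify_and_count(pred):
    """Naive prevalence estimate: the fraction of cases predicted positive."""
    return float(np.mean(pred))

def adjusted_count(pred, sensitivity, specificity):
    """Correct the naive estimate for assumed misclassification rates.

    Solves q = sens * p + (1 - spec) * (1 - p) for the true prevalence p,
    where q is the fraction predicted positive.
    """
    q = classify_and_count(pred)
    denom = sensitivity + specificity - 1.0
    if abs(denom) < 1e-12:
        raise ValueError("uninformative classifier: sensitivity + specificity = 1")
    return float(np.clip((q - (1.0 - specificity)) / denom, 0.0, 1.0))

# Simulated test population (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 100_000
true_p = 0.10                    # true fraction dying of the cause of interest
test_sens, test_spec = 0.80, 0.90

truth = rng.random(n) < true_p
# A true positive is flagged with probability sens, a true negative with 1 - spec.
pred = np.where(truth, rng.random(n) < test_sens, rng.random(n) < 1 - test_spec)

naive = classify_and_count(pred)
# Correction using the rates that actually hold in the test population.
calibrated = adjusted_count(pred, test_sens, test_spec)
# Correction using rates carried over from a different (training) population.
transported = adjusted_count(pred, 0.95, 0.95)

print(f"true={true_p:.3f} naive={naive:.3f} "
      f"calibrated={calibrated:.3f} transported={transported:.3f}")
```

In this simulation the naive estimate is about 0.17 in expectation (0.80 × 0.10 + 0.10 × 0.90), well above the true 0.10. Only the correction that uses the test population's own misclassification rates recovers the truth. In practice those rates are unknown and must be estimated, which is why a small labeled sample from the test population is valuable.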
We first present a parsimonious generalized Bayes approach to quantification learning (GBQL) under dataset shift, using a small amount of local labeled data from the test population. The approach is distribution-free, allows either single-category or probabilistic (compositional) predictions from classifiers, accommodates inputs from multiple classifiers, weighting them by their accuracy, and is amenable to shrinkage priors that stabilize estimation in data-scarce settings. We establish asymptotic and finite-sample guarantees for the method. The empirical performance of GBQL is demonstrated through simulations. GBQL is then used to calibrate estimates of cause-specific mortality rates from verbal autopsies in datasets with evident dataset shift.
All are welcome.