
YSPH Biostatistics Seminar: “Generalized Bayes Calibration of Compositional Cause-specific Mortality Data from Verbal Autopsies"

October 19, 2023
  • 00:00<v ->And welcome.</v>
  • 00:02Today, it is my pleasure to introduce Professor Abhi Datta
  • 00:09from Johns Hopkins University in Baltimore, Maryland.
  • 00:13Professor Datta earned his BS and MS
  • 00:15from the Indian Statistical Institute
  • 00:17in 2008 and 2010 respectively,
  • 00:20and PhD from the University of Minnesota in 2016.
  • 00:25In addition to being a well-cited researcher
  • 00:27with one publication that has almost 600 citations,
  • 00:30which is pretty nice,
  • 00:32he's also an award-winning educator,
  • 00:35having repeatedly won an excellence in teaching award
  • 00:37from his institution.
  • 00:39So let's welcome Dr. Datta.
  • 00:44<v ->Thank you, Robert,</v>
  • 00:45for the invitation to come here and give the seminar,
  • 00:48and for the very nice introduction.
  • 00:50Thank you everyone for coming.
  • 00:52My talk is about improving cause-specific mortality data
  • 00:56in low and middle-income countries
  • 00:58where the main tool to collect data
  • 01:00is something called verbal autopsies.
  • 01:02And the way I do it
  • 01:03is using a statistical approach called generalized Bayes.
  • 01:07If you have not heard
  • 01:08of verbal autopsies or generalized Bayes,
  • 01:11I can tell you that I hadn't heard of either of those things
  • 01:14when I started working on the project,
  • 01:17so don't worry about that,
  • 01:18I try to give an introduction.
  • 01:20'Cause I mostly work on spatial and spatio-temporal data
  • 01:24and this was a project that came along,
  • 01:27which is very different from what I used to work on.
  • 01:29But over the years, there's been a nice body of work
  • 01:31developed in this project.
  • 01:35So this is a joint work
  • 01:39with many different institutes and collaborators.
  • 01:44The top row is the Hopkins biostats team,
  • 01:46which included my former students,
  • 01:48Jacob Fiksel and Brian Gilbert,
  • 01:51and my current postdoc, Sandi,
  • 01:53and my colleague, Scott Zeger, and I
  • 01:56lead the biostats part of the team.
  • 02:00Agbessi is the PI of the project in Mozambique
  • 02:03that sort of drove the developments for this work.
  • 02:07And there are a lot of colleagues
  • 02:09from the International Health Department
  • 02:10that helped to collaborate.
  • 02:12And then Li is the PI of a new project
  • 02:16where we're going to apply our methodology
  • 02:17to produce mortality estimates for the WHO.
  • 02:22So we're collaborating with Li there as well.
  • 02:25And then a couple of people outside Hopkins,
  • 02:27Dianna at CDC and Emory University,
  • 02:31as the director of the CHAMPS project.
  • 02:35And Ivalda, in the government body in Mozambique,
  • 02:39is currently doing the work there.
  • 02:44So this is funded by three grants from the Gates Foundation.
  • 02:49The first one was the grant that kind of started things.
  • 02:52And then we have a grant that is kind of developing more
  • 02:55on the methods side of the work.
  • 02:59So, many low and middle-income countries
  • 03:05often lack high-quality data on causes of death.
  • 03:08Often for most deaths,
  • 03:10there is no sort of medical certification
  • 03:13or like an autopsy done.
  • 03:16And without kind of high-quality data
  • 03:19on what people are dying of,
  • 03:21it's kind of hard to estimate the disease burden
  • 03:23in these countries.
  • 03:25And specifically, the quantity of interest
  • 03:27is the cause-specific mortality fraction,
  • 03:29which is basically the percentage of deaths in an age group
  • 03:34that can be attributable to a given cause.
  • 03:38So cause-specific mortality fractions
  • 03:40are key pieces of information
  • 03:42in determining the global burden of disease,
  • 03:44which in turn dictates policy,
  • 03:47as well as resource allocations
  • 03:49for programs operating in these countries.
  • 03:54So verbal autopsy is an alternate way
  • 03:57to count deaths and attribute causes
  • 03:59without actually doing a clinical autopsy.
  • 04:02So verbal autopsy is basically
  • 04:04a sort of a systematic interview
  • 04:07of the household members of the deceased.
  • 04:08So the government or the program has a set of field workers
  • 04:12who go out and go from household to household
  • 04:15and ask if anyone died in their household
  • 04:17within the last several months.
  • 04:18And if they died, what were the symptoms?
  • 04:20And the set of questions they ask is now standardized
  • 04:23by the WHO.
  • 04:24Some example questions are here.
  • 04:27Most of the questions would have binary answers
  • 04:29like yes, no, but there are some questions
  • 04:32that have more like continuous responses.
  • 04:38So as I said, the WHO has standardized
  • 04:41the verbal autopsy tool.
  • 04:43The 2016 version has around 200 to 350 questions,
  • 04:47depending on the age group.
  • 04:48There are separate sections of the questionnaire
  • 04:50for neonatal, child, and adult deaths.
  • 04:54And if you're interested in more information
  • 04:56about verbal autopsy, there's a page on the WHO website about it.
  • 05:02So a verbal autopsy, of course,
  • 05:04doesn't give you a cause of death,
  • 05:05it just gives you a bunch of yes-no responses
  • 05:08to various questions related to the symptoms.
  • 05:14So a verbal autopsy is basically a survey questionnaire.
  • 05:17So you can pass that survey through a computer software
  • 05:20and that can give a predictive cause of death.
  • 05:23And so there are a bunch
  • 05:24of different computer software available.
  • 05:27InSilicoVA, developed by Tyler McCormick
  • 05:31and Richard Li, who was a postdoc here,
  • 05:34and published in "JASA" in 2016,
  • 05:36is, I think, one of the
  • 05:37most statistically-principled approaches to do it.
  • 05:40But there are other approaches and then you can,
  • 05:43this is basically a classification problem.
  • 05:45So you're basically given your data on symptoms,
  • 05:48you're kind of classifying the cause of death
  • 05:50as one of several causes.
  • 05:51So you can use standard classifiers
  • 05:54and machine learning approaches as well.
  • 05:58OpenVA is an excellent resource
  • 05:59to learn about verbal autopsies.
  • 06:00Again, openVA is,
  • 06:04I think Richard is one of the maintainers
  • 06:06and creators of openVA.
  • 06:11So for the COMSA project in Mozambique,
  • 06:14one of the main goals was to generate
  • 06:17these cause-specific mortality fractions
  • 06:21for neonates and under-five children
  • 06:24for the country of Mozambique.
  • 06:26And the data that we collected was a large dataset
  • 06:30of verbal autopsy records
  • 06:32for different households that were surveyed.
  • 06:34This is a map of Mozambique
  • 06:38and the green regions show
  • 06:41where the data was collected
  • 06:43as part of the COMSA project.
  • 06:44So in statistical terms, the data just has the symptoms,
  • 06:49it doesn't have the true cause of death,
  • 06:51so we call it the unlabeled data.
  • 06:57So how do we go from unlabeled data to labeling
  • 07:00the causes of death
  • 07:01and then estimating these cause fractions?
  • 07:04This is the standard procedure that is typically done
  • 07:08and this is what we were supposed to do as well,
  • 07:10which is simply take each record,
  • 07:12pass it through the computer software
  • 07:14and get a cause of death.
  • 07:16And once you get a cause of death,
  • 07:18then you can sort of simply aggregate.
  • 07:19So in this toy example,
  • 07:21three out of the six cases were assigned to be from HIV.
  • 07:25And so the cause-specific mortality fraction for HIV
  • 07:27would be 50% and similar for malaria and sepsis and so on.
  • 07:32So that's the basic template
  • 07:35of how to get a cause-specific mortality fractions
  • 07:38from verbal autopsies.
  • 07:39The question is, can we trust these estimates?
  • 07:41Because these are not true causes of death
  • 07:43as determined by a doctor or by a clinical procedure.
  • 07:46These are cause of death predicted by an algorithm
  • 07:48based on just surveying the household members
  • 07:52of the deceased.
  • 07:57So it turns out machine learning has a name
  • 08:00for this type of problem,
  • 08:01it's called quantification learning,
  • 08:04which is basically estimating population prevalence
  • 08:07using predicted labels instead of true labels
  • 08:10and the predictions are coming from a classifier.
  • 08:13And so there has been some work in quantification learning
  • 08:16in the machine learning literature.
  • 08:19So when we were working on this problem,
  • 08:21we realized that estimating
  • 08:22cause-specific mortality fractions
  • 08:24using predicted cause of death data from verbal autopsy
  • 08:27is an example of quantification learning.
  • 08:31So just a sort of an overview of terms that we'll be using
  • 08:35and the corresponding statistical notation.
  • 08:37So our true cause of death is y which we do not observe.
  • 08:42We want to estimate the
  • 08:43population prevalence of y,
  • 08:45so y is a categorical variable.
  • 08:49And so probability of y or p
  • 08:51is our cause-specific mortality fraction,
  • 08:53which is the estimand.
  • 08:55We observe the verbal autopsy, which is,
  • 08:57think of this as a high dimensional
  • 09:00or a long list of yes-no answers
  • 09:02to the verbal autopsy questions, so that is x,
  • 09:06and this x is passed through a software
  • 09:08to give a predicted label, which is a of x, or simply a.
  • 09:17So what we have in the COMSA project
  • 09:21is simply an unlabeled dataset
  • 09:25which uses these verbal autopsy responses,
  • 09:28passed through a software to get the predicted labels.
  • 09:34We do not observe the true labels, y,
  • 09:37we may or may not retain the verbal autopsy responses
  • 09:40because those are identifiable data
  • 09:42and those are often not released,
  • 09:43so often, just the predicted cause of death is available.
  • 09:47So even these covariates, x, may or may not be available.
  • 09:50And then we are interested in estimating the probability
  • 09:53that y belongs to one of the C many cause categories,
  • 09:58so that's a quantity of interest.
  • 10:05For some reason, there is a conditional sign
  • 10:07that's missing there.
  • 10:09But you can use the law of total probability
  • 10:13to write the probability of the predicted cause of death,
  • 10:16which is the a,
  • 10:18probability of a, as a sum of probability of a given y
  • 10:22times probability of y.
  • 10:24So there's a conditional sign missing here,
  • 10:26I don't know what's going on here.
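
For the record, the equation being described, with the missing conditional signs restored, is the law of total probability:

$$P(a = j) \;=\; \sum_{i=1}^{C} P(a = j \mid y = i)\,P(y = i), \qquad j = 1, \dots, C,$$

where C is the number of cause categories.
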
  • 10:32But in the COMSA data,
  • 10:33we only get information on the left-hand side, right?
  • 10:36And we want to infer the quantity probability of y,
  • 10:41which would be the true CSMFs.
  • 10:44So there is only one known quantity
  • 10:46with which you can estimate the left-hand side.
  • 10:48There are two unknown quantities on the right-hand side.
  • 10:50So without making assumptions, you cannot really identify
  • 10:54probability of y, right?
  • 10:56So any quantification learning methods
  • 10:59need to either estimate those conditional probabilities,
  • 11:02probability of a given y,
  • 11:04or make some assumptions on it.
  • 11:08So again, all the conditional signs are missing.
  • 11:16One of the most common approaches,
  • 11:19and this is what is used in the verbal autopsy world
  • 11:22is called classify and count,
  • 11:25which is you simply predict the cause of death
  • 11:28and then aggregate.
  • 11:29So you're simply claiming that probability of a
  • 11:33is the same as probability of y, which is equivalent to claiming
  • 11:36that this misclassification rate matrix
  • 11:39is an identity matrix, right?
  • 11:41Because you're saying that the left hand quantity
  • 11:44is the same as the rightmost quantity, which would be true
  • 11:48if there is no misclassification by the algorithm
  • 11:51and if the predicted cause of death
  • 11:53is always the true cause of death.
  • 11:56And that's what is typically done
  • 11:58in this cause-specific mortality fraction estimates.
  • 12:02But it's a very strong assumption, right?
  • 12:04Because it says assuming perfect sensitivity and specificity
  • 12:07of the algorithm.
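
A minimal sketch of classify and count (the data are hypothetical, mirroring the toy example from earlier):

```python
import numpy as np

# Hypothetical predicted causes from a VA algorithm, one per death record.
predicted = np.array(["HIV", "HIV", "HIV", "malaria", "malaria", "sepsis"])

# Classify and count: the CSMF estimate is just the relative frequency of
# each predicted cause. This implicitly assumes the misclassification
# matrix M is the identity, i.e. perfect sensitivity and specificity.
causes, counts = np.unique(predicted, return_counts=True)
csmf_cc = dict(zip(causes, counts / counts.sum()))
print(csmf_cc)  # {'HIV': 0.5, 'malaria': 0.333..., 'sepsis': 0.166...}
```
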
  • 12:10So let's look at how perfect the algorithms are.
  • 12:12So these are two algorithms,
  • 12:13Tariff and InSilicoVA,
  • 12:16PHMRC data is a benchmark dataset from four countries
  • 12:20that has both the verbal autopsy data
  • 12:22as well as a gold standard cause of death diagnosis.
  • 12:26And you can see the accuracies of either method
  • 12:30are around 30%, so they're far from being
  • 12:33like fully accurate.
  • 12:36So there are large misclassification rates
  • 12:39for these algorithms, and if you don't kind of adjust
  • 12:42for these misclassifications,
  • 12:44the burden estimates
  • 12:46of the cause-specific mortality fractions you get
  • 12:48are likely going to be very biased.
  • 12:54So this is where the CHAMPS project comes into play.
  • 12:58So the CHAMPS is an ongoing project
  • 13:00in like seven or eight countries including Mozambique,
  • 13:05which is collecting data on both verbal autopsy
  • 13:07and a more comprehensive cause of death procedure
  • 13:11called minimally invasive tissue sampling.
  • 13:14So it basically takes a sample of your tissue
  • 13:17of the deceased person and then runs a bunch
  • 13:20of pathological tests and imaging analysis
  • 13:23and then gives a cause of death.
  • 13:25And the MITS cause of death assignments
  • 13:30have been shown to be quite accurate when you compare
  • 13:33to like a full diagnostic autopsy.
  • 13:36So MITS is being done in a bunch
  • 13:38of different countries including Mozambique.
  • 13:41And for the cases where MITS is being done,
  • 13:43the verbal autopsies are also collected.
  • 13:46So what you get from this CHAMPS data
  • 13:48is a labeled or paired dataset
  • 13:50where you have both the verbal autopsy
  • 13:52as well as the MITS cause of death
  • 13:54and you can pass the verbal autopsy to the software
  • 13:58to get the verbal autopsy predicted cause of death.
  • 14:00And then you can cross tabulate the two
  • 14:02and get an estimate of the misclassification rates, right?
  • 14:04Like you can say like,
  • 14:06"Oh okay, so there are 10 cases
  • 14:08that the MITS cause of death was HIV,
  • 14:11out of those 10 cases,
  • 14:12seven of them were correctly assigned to HIV
  • 14:15by verbal autopsy.
  • 14:16So then the sensitivity would be 70%
  • 14:20and the false positive would be 30%, so on."
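
A minimal sketch of that cross-tabulation on hypothetical paired data (row-normalizing the table gives the misclassification rates):

```python
import pandas as pd

# Hypothetical CHAMPS-style paired records: MITS cause vs. VA-predicted cause.
mits = ["HIV"] * 10 + ["malaria"] * 8
va = ["HIV"] * 7 + ["malaria"] * 3 + ["malaria"] * 6 + ["HIV"] * 2

# Rows are the true (MITS) cause i, columns the VA-predicted cause j;
# normalize="index" converts counts into row-wise rates, i.e. M[i, j].
M_hat = pd.crosstab(pd.Series(mits, name="MITS"),
                    pd.Series(va, name="VA"),
                    normalize="index")
print(M_hat)
# Sensitivity for HIV = M_hat.loc["HIV", "HIV"] = 0.7, as in the example.
```
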
  • 14:27So this is the broad idea of the methodology.
  • 14:29So for the COMSA data, which is the unpaired data,
  • 14:32you get only the verbal autopsy record
  • 14:34so you can get an estimate of the predicted cause of deaths
  • 14:37from the verbal autopsy.
  • 14:39From the CHAMPS data, which is the paired data,
  • 14:41you can get an estimate of the misclassification rates.
  • 14:44And then the only unknown is the probabilities
  • 14:48of the causes of death
  • 14:50if you were able to do the MITS autopsy for every death.
  • 14:54So then this is an equation with two knowns and one unknown,
  • 14:58and you can solve for it and get the calibrated estimates.
  • 15:01So that's the broad idea and we do it in a model-based way.
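
A minimal sketch of the plug-in version of this idea (the approach in the talk is model-based and Bayesian; the plug-in solve below, with hypothetical numbers, just illustrates the "two knowns, one unknown" structure):

```python
import numpy as np

q = np.array([0.5, 0.3, 0.2])   # known: P(a), predicted-cause fractions (COMSA side)
M = np.array([[0.7, 0.2, 0.1],  # known: M[i, j] = P(a = j | y = i) (CHAMPS side)
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7]])

# Solve q = M^T p for the unknown true CSMFs p, then project back onto the
# simplex, since a raw solve can go slightly negative with noisy inputs.
p = np.linalg.solve(M.T, q)
p = np.clip(p, 0.0, None)
p = p / p.sum()
print(p)
```
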
  • 15:09So here's the formal model.
  • 15:11So for the COMSA dataset, with the unlabeled data indexed by U,
  • 15:15we have the predicted labels, ar,
  • 15:21and for the CHAMPS data,
  • 15:22we have both the predicted labels from verbal autopsy, ar,
  • 15:26as well as the MITS-determined labels, yr.
  • 15:29And our quantity of interest is the probabilities of yr
  • 15:34belonging to the different causes.
  • 15:41There's a conditional sign missing here.
  • 15:44But if the conditional probabilities
  • 15:48are denoted by Mij, which is if the MITS cause is i,
  • 15:52what is the probability that the VA-predicted cause is j?
  • 15:57Then you can use a law of total probability
  • 15:59to write down the marginal distribution
  • 16:02of the VA-predicted cause.
  • 16:03So that would be in terms of the misclassification rates
  • 16:07and the marginal cause distribution of the MITS-COD.
  • 16:10So that's the whole idea.
  • 16:11So you can write this in terms of a matrix vector notation
  • 16:15as probability of a as M transpose p
  • 16:18where M is the misclassification rate matrix,
  • 16:21p is the unknown quantity of interest,
  • 16:24which is probability that the cause of death
  • 16:27is coming from a given cause.
  • 16:31So the data model is very simple,
  • 16:34for the unlabeled data,
  • 16:36it follows multinomial with this probability
  • 16:38which is coming from this law of total probability.
  • 16:41And then for the label data,
  • 16:43this is ar given yr equal to i,
  • 16:46it follows multinomial with the i-th row
  • 16:48of the misclassification matrix.
  • 16:49So if the MITS-COD is i,
  • 16:51the misclassification rates are given by the i-th row
  • 16:53of the misclassification matrix,
  • 16:55so it's multinomial with that probability.
  • 16:59And then we've put priors on M and p
  • 17:01and then we can get estimates of both M and p.
  • 17:04M is a nuisance parameter, p is the parameter of interest.
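
Written out in the talk's notation (conditional signs restored), the data model is

$$a_r \sim \text{Multinomial}(1,\; M^{\top} p) \quad \text{for the unlabeled (COMSA) records,}$$

$$a_r \mid y_r = i \;\sim\; \text{Multinomial}(1,\; M_{i\cdot}) \quad \text{for the labeled (CHAMPS) records,}$$

where M_{i·} is the i-th row of the misclassification matrix. The talk only says priors are put on M and p; Dirichlet priors on p and on the rows of M would be the natural conjugate choice, but that detail is an assumption here.
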
  • 17:10Just to carefully go over what are the assumptions here.
  • 17:13The main assumption is that the misclassification rates
  • 17:18of verbal autopsy given MITS
  • 17:20are the same in your labeled data
  • 17:23as they would be in your unlabeled data.
  • 17:25This is not verifiable because we don't have
  • 17:28any true cause of death in the unlabeled data,
  • 17:30so it's an assumption.
  • 17:33Given that the verbal autopsy
  • 17:35is a function of your symptoms,
  • 17:37the assumption is essentially that given a true cause,
  • 17:42the probabilities of the symptoms are going to be the same
  • 17:44in your unlabeled dataset as in your labeled dataset.
  • 17:49And it's a reasonable assumption:
  • 17:50if you have a given cause of death,
  • 17:53it's likely that certain symptoms will appear
  • 17:56and certain other symptoms will not appear.
  • 17:59And that is true regardless of whether the data is coming
  • 18:02from the labeled set or the unlabeled set.
  • 18:08We do not assume that the marginal distribution
  • 18:12of the causes in the labeled CHAMPS data
  • 18:16is representative of the population,
  • 18:17because it is not:
  • 18:20the CHAMPS project is done
  • 18:21at specific hospitals in the country,
  • 18:24and the distribution of causes in hospitals
  • 18:28is typically not the same as the distribution
  • 18:30of causes in the community.
  • 18:31And we are interested
  • 18:32in the cause distribution in the population.
  • 18:34So there is no assumption
  • 18:37that the marginal distribution of y in the labeled data
  • 18:40is the same as the marginal distribution of y in the unlabeled data,
  • 18:43which is our quantity of interest.
  • 18:45And the reason there is no assumption
  • 18:47is we only model a given y in the labeled data.
  • 18:51We never model y in the labeled data.
  • 18:54So we only model the conditional,
  • 18:56and the assumption is that the conditional
  • 18:57misclassification rates are transportable
  • 19:00from the labeled to the unlabeled side.
  • 19:06So that's the main idea.
  • 19:07And this was the first work we did,
  • 19:09we just used this top cause prediction.
  • 19:13But many of these algorithms
  • 19:15are actually probabilistic in nature in the sense
  • 19:17that if you look at their outputs,
  • 19:18they won't give a single cause of death,
  • 19:20but they will give scores to each cause.
  • 19:22So for example,
  • 19:24this would be a typical output of an algorithm
  • 19:26for, like, say six persons.
  • 19:28So for the first person, it will say
  • 19:3370% HIV, 20% malaria, 10% sepsis and so on.
  • 19:38And the standard procedure is to take the top cause,
  • 19:41so for the first person, it would be HIV,
  • 19:44for the second person, it will be malaria and so on.
  • 19:48So that's how you get a single cause
  • 19:50from a probabilistic prediction.
  • 19:53So that essentially ignores sort of the scores
  • 19:57assigned to the second most likely cause,
  • 20:01the third most likely cause and so on.
  • 20:04And if you ignore those, you can end up with a biased estimate.
  • 20:09So you can see these are the CSMF estimates
  • 20:12using the top cause,
  • 20:14these are the CSMF estimates
  • 20:15using the exact scores that are assigned
  • 20:17and those are different, right?
  • 20:18So when we kind of change this probabilistic output
  • 20:22to a single cause output, we discard information.
  • 20:30So we wanted to extend the work
  • 20:32to kind of use the full set of scores and the set of scores
  • 20:36can be thought of as compositional data in the sense
  • 20:38that the scores sum up to one
  • 20:40because it assigns 100% probability across all causes
  • 20:45and then they're each non-negative.
  • 20:48The issue is that for the categorical data,
  • 20:51our model is based on multinomial distribution.
  • 20:53And then for compositional data,
  • 20:55the models are typically like Dirichlet
  • 20:57or log ratio based models,
  • 20:59which are very different from the multinomial distribution.
  • 21:03So if we have some cases
  • 21:05for which we have categorical output,
  • 21:07for some, we have compositional output,
  • 21:09this would lead to different models
  • 21:11for different parts of the dataset.
  • 21:15These Dirichlet or log-ratio models
  • 21:17also do not allow zeros in the data.
  • 21:20So if you have zeros or ones in the composition,
  • 21:22they don't allow that.
  • 21:23And these are very specific models for the data,
  • 21:27which are subject to model misspecification.
  • 21:29So if the data distribution does not look like a Dirichlet,
  • 21:33assuming a Dirichlet
  • 21:34would lead to kind of wrong results.
  • 21:41So how do we extend the multinomial framework we had
  • 21:46for categorical data to compositional data?
  • 21:51Again, there would be a conditional sign here.
  • 21:56But the basic assumption that we had
  • 21:58for the multinomial case was probability of a given y
  • 22:02is the i-th row of the misclassification matrix, right?
  • 22:05And for categorical data, a probability statement
  • 22:10is the same as an expectation statement, right?
  • 22:12So we can equivalently write this
  • 22:14as expectation of a given y
  • 22:16is the i-th row of M.
  • 22:19The advantage of the expectation statement
  • 22:20is that it's more generally applicable.
  • 22:23It will not be just for categorical data, right?
  • 22:27So for categorical data, they're equivalent.
  • 22:30For other data types, this statement can be valid
  • 22:33even though the previous statement may not be applicable.
  • 22:37So we kind of write this as our model
  • 22:41for the compositional data and we make no other assumptions
  • 22:45about this distribution.
  • 22:46So only a first moment conditional expectation statement
  • 22:53without any full distributional specification.
  • 22:59So what do we do?
  • 23:00So we have expectation of a given y
  • 23:03is the i-th row of the misclassification matrix.
  • 23:08We can use something called
  • 23:10the Kullback-Leibler divergence
  • 23:12or the cross entropy loss
  • 23:14between a and its model expectation.
  • 23:17Again, all the conditional signs are missing here.
  • 23:22So basically a is the data we observe,
  • 23:26this is the modeled expectation,
  • 23:29which is basically the i-th row
  • 23:30of the misclassification matrix
  • 23:31and we use the cross entropy loss,
  • 23:34the Kullback-Leibler loss between the two.
  • 23:37What's the advantage?
  • 23:38So first of all,
  • 23:39the Kullback-Leibler loss allows zeroes in the composition.
  • 23:42So it is well-defined even if you have zeroes or ones.
  • 23:45If you take the negative loss and exponentiate it,
  • 23:48it's exactly the multinomial likelihood.
  • 23:50So if your data is indeed multinomial,
  • 23:52you get back your likelihood that you're using
  • 23:54for your single class model.
  • 23:57But if your data is not multinomial,
  • 24:00you get a pseudo likelihood that you can work with.
  • 24:04If you take the derivative of the loss function
  • 24:07and take the expectation under the true parameter,
  • 24:10you'll see that it's a valid score function
  • 24:13in the sense that you get an unbiased estimating equation
  • 24:16for your misclassification rate matrix, M,
  • 24:19based on just the first moment assumption.
  • 24:23And then similarly, you can do the same thing
  • 24:25for the unlabeled data.
  • 24:27The probability statement becomes an expectation statement
  • 24:30and then we have the Kullback-Leibler loss.
  • 24:32This is an unbiased estimating equation for both M and p.
  • 24:36And again,
  • 24:38if the data is truly multinomial and not compositional,
  • 24:41this becomes exactly the multinomial likelihood.
  • 24:43If the data is compositional,
  • 24:45it becomes a pseudo likelihood.
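
A minimal sketch of this pseudo log-likelihood as I read it (names are mine; zeros in the composition are harmless because 0 · log(·) terms drop out):

```python
import numpy as np

def pseudo_loglik_labeled(A, Y, M, eps=1e-12):
    """Negative KL / cross-entropy pseudo log-likelihood, labeled data.

    A: (n, C) compositional VA scores (rows sum to 1, zeros allowed)
    Y: (n,) integer true (MITS) causes
    M: (C, C) misclassification matrix with M[i, j] = E[a_j | y = i]
    Returns sum over records of sum_j A[r, j] * log M[Y[r], j]; for
    one-hot rows of A this is exactly the multinomial log-likelihood.
    """
    logM = np.log(M[Y] + eps)  # row Y[r] of M for each record r
    return float(np.sum(A * logM))

def pseudo_loglik_unlabeled(A, M, p, eps=1e-12):
    # Same construction with E[a] = M^T p in place of the i-th row of M.
    q = M.T @ p
    return float(np.sum(A * np.log(q + eps)))
```
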
  • 24:50Okay, so how do we do Bayes analysis
  • 24:52with pseudo likelihoods?
  • 24:54So this is where this idea of generalized Bayes
  • 24:57or model-free Bayesian inference comes in
  • 24:59and there have been parallel developments
  • 25:01in both computer science, econometrics and statistics
  • 25:04without much communication among the three fields
  • 25:07for the last 30, 40 years.
  • 25:10Basically, if you're given a loss function
  • 25:13but not a full likelihood for the data,
  • 25:15you can take the negative of that loss function
  • 25:18multiplied by some tuning parameter, alpha,
  • 25:22exponentiate it and treat it as a pseudo likelihood
  • 25:26and apply your priors
  • 25:27and then your posterior is going to be proportional to this
  • 25:30as long as the normalization constant exists.
  • 25:33And there has been a lot of work that has shown
  • 25:35that this is a valid posterior,
  • 25:38it is a generalization of the Bayesian posterior,
  • 25:41like if this is an actual likelihood,
  • 25:42this is the Bayesian posterior,
  • 25:44but if it's not an actual likelihood,
  • 25:48it has been shown that it basically minimizes
  • 25:49the Bayes risk for that loss function.
  • 25:54It has nice asymptotic properties
  • 25:56shown by Victor Chernozhukov in this paper
  • 25:59and then this JRSS paper in 2016, I think,
  • 26:04showed that if you're given a loss function
  • 26:06and a prior,
  • 26:07this is the only coherent way you can get a posterior.
  • 26:12So there's now been a lot of work and it's been called
  • 26:15by different names like Gibbs posteriors,
  • 26:17pseudo posterior, Laplace-type estimators
  • 26:20and quasi-Bayesian estimators along with generalized Bayes.
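
In symbols, the generalized Bayes (Gibbs) posterior being described is

$$\pi_\alpha(\theta \mid \text{data}) \;\propto\; \exp\{-\alpha\, L(\theta; \text{data})\}\;\pi(\theta),$$

where L is the loss, π(θ) the prior, and α > 0 the tuning (learning-rate) parameter; when L is the negative log-likelihood and α = 1, this reduces to the usual Bayesian posterior.
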
  • 26:25So for our case, we have the pseudo likelihood
  • 26:28for the labeled data.
  • 26:29We have the pseudo likelihood for the unlabeled data.
  • 26:32We put priors.
  • 26:33If all of our data were categorical,
  • 26:35this reduces to that multinomial model we had
  • 26:38for the categorical data.
  • 26:39But if some of the data is compositional,
  • 26:41then this becomes generalized Bayes,
  • 26:44so we call it generalized Bayes quantification learning.
  • 26:47It allows sparsity of the outputs in the sense
  • 26:50that if some of the data have zeroes and ones in them,
  • 26:54this is well-defined.
  • 26:56It's the same pseudo likelihood
  • 26:58for categorical and compositional predictions.
  • 27:01And then it also allows
  • 27:02a nice Gibbs sampler using conjugacy.
  • 27:11One final sort of data aspect we had
  • 27:15was that this minimally invasive tissue sampling
  • 27:18was also sometimes inconclusive in the sense
  • 27:21that they gave two causes.
  • 27:22Like often, they were ambiguous between HIV and tuberculosis
  • 27:29and they would give one as the immediate cause
  • 27:31and one as the underlying cause.
  • 27:32So sometimes, even the true cause of death is compositional.
  • 27:36So your predicted cause of death is compositional,
  • 27:39your true cause of death is also compositional
  • 27:41and we call it like b, which represents the belief.
  • 27:45And you can show that if you're only given b
  • 27:49instead of a single cause of death,
  • 27:53your conditional expectation becomes M transpose b
  • 27:56instead of the i-th row of the M matrix.
  • 27:59And you can do the same thing
  • 28:01using the compositional true cause of death
  • 28:05instead of the actual true cause of death.
  • 28:08And all the conditional signs are missing here
  • 28:10but you can just formulate the Kullback-Leibler loss
  • 28:14to generate the pseudo likelihood.
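
That is, with a compositional true cause b (a belief vector over the causes), the first-moment assumption becomes

$$E[a \mid b] \;=\; M^{\top} b,$$

which reduces to the i-th row of M when b puts all of its mass on a single cause i.
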
  • 28:19So this kind of gave rise to a digression
  • 28:22where we looked at, basically,
  • 28:25your true cause of death as a compositional covariate
  • 28:28and your predicted cause of death as a compositional outcome.
  • 28:31So we kind of looked at regression
  • 28:33of a compositional outcome on compositional predictors.
  • 28:36So this was kind of an offshoot paper
  • 28:40where we just developed this piece
  • 28:42and if you look at compositional regression,
  • 28:45most of the work has been done using Dirichlet models
  • 28:50or log ratio transformations.
  • 28:52So this was a different approach to that in the sense
  • 28:55that it's both transformation free
  • 28:57and it doesn't specify a whole distribution
  • 28:59like the Dirichlet,
  • 29:00it just uses a first moment assumption.
  • 29:02And we have an R package to do
  • 29:07composition-on-composition regression, called codalm.
  • 29:12But going back to the verbal autopsy work,
  • 29:16we have the loss functions
  • 29:17for the labeled and unlabeled data,
  • 29:20we form the pseudo likelihoods from the negative losses,
  • 29:23put priors on the parameters and we get posterior inference.
  • 29:28One last extension of the methodology
  • 29:31was that there are multiple different
  • 29:34verbal autopsy algorithms and there are papers
  • 29:36where every new algorithm comes out and they say
  • 29:39they're better than all the previous algorithms.
  • 29:41And in practice, you never know which is the best algorithm.
  • 29:44So we developed an ensemble method that takes in predictions
  • 29:49from multiple algorithms, estimates
  • 29:54algorithm-specific misclassification rates,
  • 29:57and then these are connected to the unknown estimand.
  • 30:00So we can show that it gives more weight
  • 30:04to the more accurate algorithm in a data-driven way.
  • 30:07And then you don't have to make the choice
  • 30:12of which is the best algorithm in advance.
  • 30:14If you have multiple candidates,
  • 30:15you can use multiple algorithms together.
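
Schematically, with K algorithms the ensemble stacks one moment condition per algorithm, all sharing the same estimand p:

$$P\big(a^{(k)}\big) \;=\; M_k^{\top}\, p, \qquad k = 1, \dots, K,$$

where M_k is the misclassification matrix of algorithm k, estimated from the paired data. An algorithm whose M_k is closer to the identity is more informative about p, which is (informally) the sense in which more accurate algorithms get more weight.
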
  • 30:23So we looked at some theoretical properties of the method.
  • 30:26We have two loss functions, one for the labeled data,
  • 30:29one for the unlabeled data.
  • 30:31The labeled-data loss
  • 30:32doesn't even feature the estimand, which is p,
  • 30:36so, on its own, it cannot identify p.
  • 30:39The unlabeled-data loss only uses p through this quantity,
  • 30:43M transpose p.
  • 30:44So again, for different combinations of M and p,
  • 30:48as long as this product is the same,
  • 30:50it will never be able to identify p on its own.
  • 30:53So each loss function on its own
  • 30:54cannot identify the true parameters.
  • 30:57But using both the loss functions together,
  • 30:59you can identify the estimand, p,
  • 31:02and we were able to show that the posterior has nice properties
  • 31:06in terms of asymptotic normality,
  • 31:08well-calibrated interval estimates,
  • 31:11and near-parametric concentration rates.
  • 31:13And the theory also extends to the ensemble method
  • 31:16and we use some approximations in the Gibbs sampler
  • 31:19and the theory holds for that.
  • 31:24Some empirical validations,
  • 31:27since we're estimating a probability vector,
  • 31:32the common metric that is used is called
  • 31:34the chance-corrected normalized absolute accuracy,
  • 31:38which is basically a scaled L1 error,
  • 31:42centered by the L1 error you would get if you had predicted
  • 31:46the cause of death randomly.
  • 31:47So this is the error if you predict randomly
  • 31:50and then we look at how much improvement we get
  • 31:52over random predictions.
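
A minimal sketch of this metric (the exact normalization in the paper may differ; this version uses the common CSMF-accuracy scaling plus a Monte Carlo random-guess baseline):

```python
import numpy as np

rng = np.random.default_rng(0)

def csmf_accuracy(p_hat, p):
    # Scaled L1 error: 1 is a perfect estimate, 0 the worst possible one.
    return 1.0 - np.abs(p_hat - p).sum() / (2.0 * (1.0 - p.min()))

def ccnaa(p_hat, p, n_draws=10_000):
    # Center by the average accuracy of random (uniform Dirichlet) guesses,
    # so 0 means "no better than guessing" and 1 means a perfect estimate.
    C = len(p)
    random_acc = np.mean([csmf_accuracy(rng.dirichlet(np.ones(C)), p)
                          for _ in range(n_draws)])
    return (csmf_accuracy(p_hat, p) - random_acc) / (1.0 - random_acc)
```
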
  • 31:57So this is an illustration of what happens if the data
  • 32:01is not Dirichlet and you use Dirichlet distribution.
  • 32:03So on the left-hand side,
  • 32:05the data is generated from Dirichlet
  • 32:08and we use both our method and the Dirichlet-based model
  • 32:12and they both do well.
  • 32:14On the right-hand side,
  • 32:15the data is from an overdispersed Dirichlet
  • 32:17and we use both the Dirichlet model and our model.
  • 32:20And because our model doesn't specify a distribution,
  • 32:22it just uses a first moment specification,
  • 32:25it's much more robust and has much higher accuracy
  • 32:29than the Dirichlet model, which becomes misspecified.
  • 32:35And then we also did a bunch of evaluations
  • 32:37using the PHMRC data.
  • 32:38So what we did was we trained the classifiers
  • 32:42on three of the countries leaving one country out
  • 32:44and then used a slice of data from that left out country
  • 32:47to estimate the misclassification rates,
  • 32:50and then we apply our method.
  • 32:55The green one is our method
  • 32:56and the x axis is the sample size of the dataset
  • 33:02used from the left out country
  • 33:04to estimate the misclassification rates.
  • 33:07The blue one is sort of the uncalibrated one,
  • 33:11the red one is the one that is calibrated
  • 33:13using the training data.
  • 33:14So you can see that our method does better than both of them
  • 33:18and the higher the sample size we use
  • 33:20from the left out country of interest
  • 33:23to estimate the misclassifications, the more accurate it is.
  • 33:30And also one interesting aspect
  • 33:31was that we looked at calibration
  • 33:33using individual algorithms and the calibration
  • 33:36using the ensemble one.
  • 33:37And more often than not, the ensemble one,
  • 33:40which is the orange one,
  • 33:42tends to perform similarly to the best performing algorithm,
  • 33:46and the best performing algorithm can be very different
  • 33:48across different countries.
  • 33:50For example, in Mexico,
  • 33:51InSilicoVA is one of the best performing algorithms,
  • 33:54but in Tanzania, InSilicoVA was doing very poorly
  • 33:57and then InterVA was one
  • 33:59of the better performing algorithms.
  • 34:00So the ensemble always tends to give more weight
  • 34:03to more accurate algorithms.
  • 34:07So this is an overview of what we did for Mozambique.
  • 34:10So we had the unlabeled data with only verbal autopsies.
  • 34:14We passed it through two algorithms,
  • 34:16InSilicoVA and Expert VA, to get the uncalibrated estimates.
  • 34:21Then we had the labeled data with the MITS cause of death
  • 34:23with which we estimated the misclassifications
  • 34:25of those two algorithms
  • 34:28and then we combined them in the ensemble method
  • 34:30to get calibrated estimates.
  • 34:38Some results from Mozambique.
  • 34:40We have two age groups,
  • 34:42neonatal deaths, first four weeks,
  • 34:45and child deaths, under five years.
  • 34:48Two algorithms, seven causes of death for children,
  • 34:52five causes of death for neonates.
  • 34:55I'm going to just show the neonatal results here.
  • 34:57So these are the misclassification matrices for neonates.
  • 35:01And ideally, you would want the matrices
  • 35:03to have large numbers on the diagonals
  • 35:05because those are the correct matches
  • 35:07and then small numbers on the off diagonals.
  • 35:09But you don't see that,
  • 35:10you see quite a bit of large numbers on the off diagonals.
  • 35:14One thing that stands out is that
  • 35:17if you look at prematurity, it has a very high sensitivity,
  • 35:20close to 90%,
  • 35:22which means that if the true cause is prematurity,
  • 35:25the verbal autopsy correctly diagnoses it.
  • 35:28But then it also has high false positives
  • 35:31in the sense that if the true cause is infection,
  • 35:3420% of the time, it is assigned as prematurity.
  • 35:37If the true cause is intrapartum related events,
  • 35:40almost 30% of the time,
  • 35:41it's assigned to be prematurity and so on.
  • 35:43So it tends to overcount a lot of deaths
  • 35:46from different causes as prematurity.
  • 35:48So the result after calibration
  • 35:52is that the percentage of prematurity comes down.
  • 35:54So this is the uncalibrated estimate of prematurity.
  • 35:58This is the calibrated estimate of prematurity.
  • 36:01You can see that it comes down
  • 36:02because we can see in the data that there is a lot
  • 36:05of over counting of prematurity deaths.
  • 36:09So after calibration, it tends to come down quite a bit.
  • 36:17And also, we looked at the model estimated sensitivities
  • 36:22using both the single cause
  • 36:24and the compositional cause of death data.
  • 36:27So this is the difference in the sensitivities
  • 36:29and you can see that using the compositional cause of death,
  • 36:33you'll always get a higher match because it kind of uses
  • 36:36information from multiple causes instead of
  • 36:39just considering the top cause.
  • 36:41And so it generally leads to better matching
  • 36:43between the verbal autopsy and the minimal tissue sampling.
  • 36:49Some ongoing work.
  • 36:51So when we did this for Mozambique,
  • 36:53there was very little paired data.
  • 36:57So even though the data was for seven countries,
  • 36:59we kind of merged them together
  • 37:01and estimated the misclassification rates.
  • 37:04Now we have more data coming in for those countries
  • 37:07so we have a chance to assess
  • 37:08whether the misclassification rates vary by country
  • 37:12because if they do,
  • 37:12we should model the misclassification rates
  • 37:17in a way that's specific to each country.
  • 37:21So these are the misclassification rates now
  • 37:26resolved by country.
  • 37:27So there are six countries, Bangladesh, Ethiopia,
  • 37:30Kenya, Mali, Mozambique and Sierra Leone.
  • 37:35You can see the estimates.
  • 37:36These are the empirical estimates
  • 37:37and the confidence intervals for each country.
  • 37:40And the horizontal black line
  • 37:42is what the pooled estimate looks like.
  • 37:44So you can see that for some causes, like here,
  • 37:49there is not much variability across countries.
  • 37:51But then for some other cause pairs, like say here,
  • 37:56there's quite a bit of variability across countries.
  • 38:00And so now that we are getting more data,
  • 38:03the next step for the project
  • 38:05is to estimate country-specific misclassification rates.
  • 38:09The issue however is that even with more data,
  • 38:12there is, I think, around 600 cases here for six countries,
  • 38:17which is approximately 100 cases per country.
  • 38:20And there are 25 cells of the misclassification matrix.
  • 38:23So that's like four cases per cell,
  • 38:25so that's clearly not enough to do separate
  • 38:27country specific models.
  • 38:30So we'd have to kind of do
  • 38:32a sort of a borrowing of information
  • 38:35both across the rows and columns of the matrix
  • 38:38but also across different countries.
  • 38:42So first, we kind of borrow information
  • 38:45across the rows and columns of the matrix.
  • 38:49And to do this,
  • 38:52instead of an unstructured misclassification matrix
  • 38:55where we estimated each cell separately,
  • 38:57we start with a structured misclassification matrix
  • 39:00using two basic mechanisms.
  • 39:02So we say that a classifier operates using two mechanisms,
  • 39:07for a given cause, it can either match that cause
  • 39:12and we call that an intrinsic accuracy
  • 39:15and that matching probability will be different
  • 39:18for different causes, so there are three causes here,
  • 39:20and you can see
  • 39:21that the matching probability can be different.
  • 39:24If it doesn't match the true cause,
  • 39:26then it randomly distributes its prediction
  • 39:29to the other causes
  • 39:31and that random distribution will also have some weights,
  • 39:36and those we call the systematic bias
  • 39:38or the pull of the classifier.
  • 39:40So if it's not matching,
  • 39:42we saw that it'll often assign a cause to prematurity
  • 39:46regardless of what the true cause is.
  • 39:48So that's kind of the basis for this model.
  • 39:51And if you have this model,
  • 39:52we kind of rearrange these three bars here
  • 39:57and then we put in the circle from there.
  • 39:59And these will give you the misclassification probabilities.
  • 40:03So we can write each of the misclassification probabilities
  • 40:08in terms of just these six parameters and we can do the same
  • 40:13for the green cause and for the blue cause.
  • 40:17And so basically, these are the nine misclassification rates
  • 40:22written in terms of the six parameters.
  • 40:23So this is not that much of a dimension reduction
  • 40:26if there are three causes,
  • 40:27but if there are in general C causes,
  • 40:32this model for misclassification matrix will only have
  • 40:342C - 1 parameters as opposed to C square parameters.
  • 40:39So in practice, we use seven causes for children
  • 40:43and five causes for neonates,
  • 40:44so this leads to a lot of dimension reduction.
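
One parameterization consistent with this intrinsic-accuracy-plus-systematic-bias description (the paper's exact formulation may differ; this sketch has C accuracies plus a C-vector of pull weights with one sum-to-one constraint, i.e. 2C - 1 free parameters):

```python
import numpy as np

def structured_M(accuracy, pull):
    """Build a C x C misclassification matrix from 2C - 1 free parameters.

    accuracy: length-C vector; accuracy[i] = P(predicted = i | true = i),
              the classifier's intrinsic accuracy for cause i.
    pull:     length-C probability vector (the systematic bias); when the
              classifier misses cause i, its prediction is spread over the
              other causes j != i proportionally to pull[j].
    """
    accuracy = np.asarray(accuracy, float)
    pull = np.asarray(pull, float)
    C = len(accuracy)
    M = np.empty((C, C))
    for i in range(C):
        off = pull.copy()
        off[i] = 0.0
        off /= off.sum()                 # renormalize over j != i
        M[i] = (1.0 - accuracy[i]) * off
        M[i, i] = accuracy[i]
    return M                             # each row sums to 1

# e.g. a classifier that is very accurate on cause 0 and tends to dump
# its misses onto cause 0 (prematurity-like behavior):
print(structured_M(accuracy=[0.9, 0.4, 0.5], pull=[0.6, 0.2, 0.2]))
```

Under this form, the odds M[i, j] / M[i, k] equal pull[j] / pull[k] for any true cause i, which is exactly the pattern checked next.
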
  • 40:49And one of the justifications
  • 40:53for this dimension-reduced model
  • 40:54is that if this model is true then the misclassification
  • 40:59into different causes,
  • 41:01the odds of misclassification into two causes, j and k,
  • 41:05will not depend on what the true cause is.
  • 41:08And we do see that in the data.
  • 41:10So these are different cause pairs, j and k,
  • 41:13and these are the odds for the different true causes.
  • 41:17So we are plotting the misclassification rates,
  • 41:20mij over mik.
  • 41:22So this is j and k
  • 41:24and the colors here give you i.
  • 41:26So you do see that they do not vary
  • 41:28for different choices of i,
  • 41:30it only is specific to j and k,
  • 41:32and that's an equivalent characterization
  • 41:36of that systematic preference
  • 41:39and intrinsic accuracy model that we have,
  • 41:41so we do see that reflected in the data.
  • 41:44But we don't take that as the fixed model.
  • 41:49So this is the base model.
  • 41:51We allow some deviation from it, with shrinkage towards it,
  • 41:54and there's a tuning parameter.
  • 41:56So then we get the homogeneous model
  • 41:58and then we have a deviation from the homogeneous model
  • 42:01to get the country-specific model.
  • 42:03So that's the broad idea,
  • 42:04I won't go into the modeling details.
  • 42:07And these are the predictions
  • 42:09using the country specific model.
  • 42:13I won't go into details here, but there are many cases,
  • 42:15for example, take it here,
  • 42:17the star is the empirical rate,
  • 42:19the triangle is the heterogeneous model.
  • 42:24And you can see it does much better
  • 42:26than the horizontal line, which is the homogeneous model.
  • 42:30And we do see it throughout the misclassification rates.
  • 42:36These are the estimates for Bangladesh.
  • 42:38So the red density is the pooled estimate
  • 42:41from the homogeneous model.
  • 42:43The blue density is the Bangladesh specific estimate.
  • 42:48The dotted vertical line
  • 42:50is the empirical estimate for Bangladesh
  • 42:52and the solid vertical line
  • 42:53is the pooled empirical estimate.
  • 42:56So you can see that as we get
  • 42:59more and more data from Bangladesh,
  • 43:01the country specific estimate moves away
  • 43:03from the pooled estimate
  • 43:04towards the empirical country-specific estimate.
  • 43:06So that's basically the hope: going forward,
  • 43:12we will have much more data within each country
  • 43:14and we'll have estimates that are much closer
  • 43:16to the dotted lines than the solid lines.
  • 43:22So that's the summary.
  • 43:23So in general, these cause of death classifiers
  • 43:26are super inaccurate.
  • 43:28So we need to calibrate for that and we have limited data
  • 43:31to estimate their inaccuracy,
  • 43:32so we calibrate them in a Bayesian way.
  • 43:36The methods give probabilistic cause of death
  • 43:39instead of categorical cause of death.
  • 43:40So we develop a generalized Bayes approach
  • 43:43that is equivalent to a multinomial model
  • 43:45if the data is categorical.
  • 43:47But if it's not categorical, it becomes a pseudo likelihood
  • 43:50Bayesian approach for compositional data
  • 43:54and that allows zeroes and ones in the data
  • 43:57and is not kind of dependent on the model specification.
  • 44:02And then it kind of led to this independent development
  • 44:05of the composition on composition regression.
  • 44:09Some papers and software.
  • 44:10So the single cause paper was the first one,
  • 44:13then we extended it to compositional data
  • 44:17and developed the theory for it.
  • 44:19The package for calibration is available on GitHub
  • 44:22and then the composition on composition regression
  • 44:25was a separate piece
  • 44:26and we have the codalm package for it on CRAN.
  • 44:30And then we use this approach
  • 44:32to produce calibrated estimates
  • 44:36for neonatal and child deaths in Mozambique
  • 44:39which were published in the last three papers.
  • 44:41Thank you.
  • 44:51<v ->Questions? Yes.</v>
  • 44:53<v ->So I just had a quick question 'cause you were saying</v>
  • 44:55the model basically looks at the symptoms
  • 44:58that'll be able to predict which it would be.
  • 45:00Does it also factor in what diseases and stuff
  • 45:04are most common in those areas or does it kind of just-
  • 45:07<v ->Oh, very good question.</v>
  • 45:09It does factor it in but in a very crude way
  • 45:12in the sense that the models have some settings
  • 45:14called like high malaria, low malaria or high HIV, low HIV.
  • 45:18So depending on which country you're running it,
  • 45:21you will set the setting to like high HIV country
  • 45:24or low HIV country, the same for malaria,
  • 45:27but it doesn't do anything beyond that,
  • 45:30so only at a very crude level.
  • 45:34<v ->Causes of death or.</v>
  • 45:37<v ->So the ICD-10 classification</v>
  • 45:40will have around 30 plus causes of death
  • 45:42for children and neonates,
  • 45:44I think much more for adults.
  • 45:47There are no MITS for adults.
  • 45:48MITS was only done for children and neonates,
  • 45:51only now adult MITS are being started,
  • 45:54but we have to kind of group them into broader categories
  • 45:57because if you have 30 causes,
  • 45:59your misclassification matrix will be 30 times 30.
  • 46:02So we don't have the data to do estimation
  • 46:05at that fine resolution.
  • 46:06So we group them into broader categories.
  • 46:08So seven for children, five for neonates.
  • 46:11<v ->Is one of the categories, I have no idea,</v>
  • 46:14it is totally unknown.
  • 46:15And if so, is that different from the uniform distribution
  • 46:18across causes of death?
  • 46:21<v ->That would be the uniform distribution.</v>
  • 46:23There is no category which is, I have no idea,
  • 46:25but it'll be probably reflected in a score that is very flat
  • 46:28across the causes.
  • 46:30<v ->If you think there are seven causes of death</v>
  • 46:32and I'm working with the same dataset
  • 46:34and I think there are 100 causes of death,
  • 46:36will there be substantial differences in our marginal
  • 46:39estimates of probability?
  • 46:41Because our uniform posteriors
  • 46:45place such different amounts of mass across the say
  • 46:4830 versus 100 causes of death.
  • 46:51<v ->Yes, there will be differences</v>
  • 46:54and even when we are aggregating from the 30 causes
  • 46:58to seven causes, the assumption is that within each category
  • 47:02the misclassification rates are homogeneous
  • 47:04within the finer category.
  • 47:05So that is an assumption that we're working with.
  • 47:08So definitely, there will be differences.
  • 47:11<v ->Thank you.</v>
  • 47:16<v ->I have one more question.</v>
  • 47:22I'll ask a philosophical question
  • 47:23if I may. <v ->Sure, yeah.</v>
  • 47:24<v ->You commented,</v>
  • 47:26I don't know, about halfway through,
  • 47:27about how statisticians are working on a thing.
  • 47:32Computer scientists are working on the same thing.
  • 47:34There's a third group I forget.
  • 47:37And nobody talks to each other.
  • 47:40Now, many of us are,
  • 47:42many of the students here
  • 47:44are within the data science track of biostatistics.
  • 47:49By the way, love your Twitter handle.
  • 47:52But yeah, so how do we bridge those things
  • 47:56that we take advantage of these things
  • 47:57and it's not three separate versions of the same thing?
  • 48:01<v ->I don't know if there's a systematic way.</v>
  • 48:04Honestly, I came to know about much of the literature
  • 48:08going through the revisions
  • 48:09and one of the reviewers or associate editors said
  • 48:11there is a lot of work here in the econometrics literature,
  • 48:14you should take a look.
  • 48:15And that's kind of the value
  • 48:16of the peer review system I guess.
  • 48:17And so we looked at it and yes, there was a lot of work
  • 48:20and they just called it different things
  • 48:22and so I had no idea
  • 48:23when I was searching for that in the literature.
  • 48:26And we did see the Victor Chernozhukov paper
  • 48:29I think is in "Journal of Economics,"
  • 48:30but it's basically an asymptotic statistics paper.
  • 48:33It kind of shows that these generalized Bayes stuff,
  • 48:36which they call Laplace-type estimators,
  • 48:38has all these nice properties
  • 48:40that a standard Bayesian posterior will have.
  • 48:43But yeah, I think talking to more people
  • 48:46and like interacting and telling about your work
  • 48:49will kind of help,
  • 48:50and someone will say, oh yeah, I do something similar.
  • 48:52You should look at this paper,
  • 48:55it's probably. <v ->Hopefully Twitter helps.</v>
  • 48:57<v ->Sorry?</v>
  • 48:58<v ->Hopefully Twitter helps.</v>
  • 48:58<v ->Yeah, yeah, definitely.</v>
  • 49:00Engagement through any like in-person
  • 49:02or social media platform would be useful, yeah.
  • 49:08<v ->All right, well thanks so much.</v>
  • 49:08I think we're out of time so we'll stop it there.
  • 49:12(attendant muttering indistinctly)
  • 49:15Hope everybody has a wonderful fall break.
  • 49:17See you next week.