Skip to Main Content

YSPH Biostatistics Seminar: “Marginal Structural Models for Causal Inference with Continuous-Time Treatment and Censored Survival Outcomes"

December 08, 2020
  • 00:00- Welcome to the last Biostatistics seminar
  • 00:03for the fall series.
  • 00:05It's my great pleasure to welcome our speaker,
  • 00:07Dr. Liangyuan Hu.
  • 00:09Dr. Hu is an Assistant Professor of Biostatistics
  • 00:13in the Department of Population Health Sciences and Policy
  • 00:16at Mount Sinai School of Medicine.
  • 00:19She received her PhD in Biostatistics from Brown University.
  • 00:23Her methods research focuses on causal inference
  • 00:25with complex longitudinal and survival data
  • 00:28and Bayesian machine learning.
  • 00:30Her independent research has been funded by NIH
  • 00:33and Patient Centered Outcomes Research Institute.
  • 00:36And her paper in Biometrics has been selected to receive
  • 00:39the 2019 Outstanding Statistical Application Award
  • 00:45by the American Statistical Association.
  • 00:48Today, she's going to share with us her recent work
  • 00:50on developing continuous-time marginal structural models
  • 00:54for complex survival outcomes.
  • 00:56Liangyuan, the floor is yours.
  • 00:58- Well, thank you, Fan Li.
  • 01:00Thank you so much, Fan, for your introduction,
  • 01:02for the invite also.
  • 01:05Let me just share my slides full screen.
  • 01:08I'm really excited to be here today
  • 01:10to talk about some of the projects I've been working on
  • 01:15in the causal inference field,
  • 01:17namely, how do we use marginal structural models
  • 01:21for more complex comparative effectiveness
  • 01:26research questions involving continuous-time treatment
  • 01:30and censored survival outcomes.
  • 01:32So I'd like to first acknowledge my colleagues,
  • 01:35especially Doctors Hogan and Daniels
  • 01:37who had been instrumental to me
  • 01:40during the time I was working on this project.
  • 01:44And let me just shift to the bar a little if I can.
  • 01:50Okay.
  • 01:51So this is just for those who aren't very familiar
  • 01:56with causal inference,
  • 01:58a simple slide to introduce some key concepts.
  • 02:02Suppose we are interested in estimating the causal effect
  • 02:06of a binary treatment A on some outcome Y.
  • 02:11Using the potential outcomes framework,
  • 02:13we can define the average treatment effect
  • 02:16as the difference between the mean
  • 02:19of the two sets of potential outcomes.
  • 02:22So, Y1 here is the potential outcome
  • 02:26that would have been observed
  • 02:27had everyone in the population received the treatment.
  • 02:30Similarly, Y0 here is the potential outcome
  • 02:33that would have been observed
  • 02:34had no one in the population received the treatment.
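In symbols, the estimand described here is the usual average treatment effect under the potential outcomes framework (a standard formulation added for clarity, not a slide from the talk):

\[
\text{ATE} = E[Y(1)] - E[Y(0)],
\]

where \(Y(1)\) and \(Y(0)\) denote the potential outcomes had everyone, or no one, received the treatment.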
  • 02:38To estimate the causal effect,
  • 02:39the gold standard is the randomized controlled trial.
  • 02:43So in an RCT, we would randomly allocate patients
  • 02:48to receive either treatment or the control or placebo,
  • 02:53the randomization would make the two groups of patients
  • 02:57more or less very similar in terms of their characteristics.
  • 03:01So in a sense that these two groups are exchangeable,
  • 03:08so that an individual's potential outcome
  • 03:11to either treatment or control
  • 03:14would not depend on which treatment group
  • 03:16this person was assigned to.
  • 03:18But just depends on how the treatment works.
  • 03:21And this way we can simply look at the difference
  • 03:24in the means of the observed outcomes
  • 03:30between the two treatment groups
  • 03:31to estimate the causal effect.
  • 03:34But in many, many situations,
  • 03:36we cannot conduct an RCT
  • 03:39and we have to rely on observational data
  • 03:42to get the causal inference about treatment effects.
  • 03:45So in these situations,
  • 03:47the independence between the potential outcome
  • 03:49and treatment assignment would no longer hold.
  • 03:53Because there might exist a confounder
  • 03:56that is predictive of the outcome,
  • 03:58such that the probability of receiving the treatment
  • 04:01depends on the confounder.
  • 04:03So for example, age might be such a confounder.
  • 04:06For example, younger patients may be more likely
  • 04:10to receive the treatment.
  • 04:13So in this case,
  • 04:13if you take the difference in the average
  • 04:16of the observed outcome between the two groups,
  • 04:20then this estimate would not bear a causal interpretation
  • 04:24because the difference might be confounded by age.
  • 04:29So we would have to use specialized
  • 04:31causal inference techniques to remove the confounding.
  • 04:35And there are just many, many techniques out there,
  • 04:37but today I'm just gonna focus on the marginal structural model,
  • 04:41because it is simple to implement.
  • 04:46It has good statistical properties,
  • 04:49and it is versatile enough
  • 04:51to accommodate many, many complications
  • 04:54posed by observational data that I'll talk about later.
  • 04:58So we can propose a marginal structural model
  • 05:00relating the potential outcome to the treatment assignment.
  • 05:04And here theta one would capture the causal effect.
  • 05:08But in reality,
  • 05:09we can only fit a model to the observed data.
  • 05:12And as I talked earlier,
  • 05:16the parameter estimator beta one here
  • 05:19would not bear a causal interpretation,
  • 05:23it just measures association.
  • 05:26But we can get to causation
  • 05:28by solving the weighted estimating equation,
  • 05:32using the weight W, the inverse of the conditional probability
  • 05:40of treatment assignment given the measured covariates.
  • 05:43And this works because the IP weighting
  • 05:46or inverse probability weighting
  • 05:47removes confounding by the measured covariates X
  • 05:51in the weighted pseudo-population.
  • 05:54So that's just a simple example
  • 05:57to illustrate the use of a marginal structural model.
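As a concrete illustration of this weighted estimating-equation idea, here is a minimal sketch of inverse-probability weighting for a binary treatment (my own example, not code from the talk; the data frame, column names, and the use of statsmodels are all assumptions):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# df is assumed to hold a binary treatment A, an outcome Y, and confounders X1, X2.
def ipw_ate(df: pd.DataFrame) -> float:
    X = sm.add_constant(df[["X1", "X2"]])
    # Propensity score model: P(A = 1 | X)
    ps = sm.Logit(df["A"], X).fit(disp=0).predict(X)
    # Inverse-probability weights: 1 / P(A = a | X)
    w = np.where(df["A"] == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    # Linear marginal structural model E[Y^a] = theta0 + theta1 * a,
    # fit by weighted least squares in the weighted pseudo-population.
    msm = sm.WLS(df["Y"], sm.add_constant(df["A"]), weights=w).fit()
    return msm.params["A"]  # estimate of theta1, the causal effect
```

In practice one would often use stabilized weights (marginal probability of the received treatment in the numerator) to reduce weight variability, but the basic mechanics are the same.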
  • 06:03And traditionally,
  • 06:06treatment is assigned at baseline and is time-fixed.
  • 06:10So it means that the treatment doesn't change over time,
  • 06:14but with increased availability of healthcare data sets,
  • 06:21there are increased demands for more refined
  • 06:23causal inference methods to evaluate complex
  • 06:28treatment regimens.
  • 06:29So one example is that treatment initiation
  • 06:34can actually depend on time, so it changes over time.
  • 06:40In this case,
  • 06:41it would just be impractical to conduct RCTs
  • 06:44because there are just simply too many
  • 06:46treatment initiation time points.
  • 06:48So I'm going to use two motivating examples in this talk.
  • 06:54The first example is about timing of treatment initiation
  • 07:00for patients who present both HIV and TB, tuberculosis.
  • 07:05For these patients,
  • 07:07TB treatment will be initiated immediately
  • 07:09after the diagnosis, but during the TB treatment,
  • 07:13when is the optimal time to initiate the HIV treatment
  • 07:17or ART, anti-retroviral therapy?
  • 07:20That is a very important question to answer,
  • 07:22because if you initiate the treatment too early,
  • 07:25there might be drug interactions, drug toxicity,
  • 07:29but if you delay the treatment too much,
  • 07:31then there's also increased the mortality
  • 07:33associated with AIDS.
  • 07:36The second example is timing of HIV treatment
  • 07:39for adolescents.
  • 07:42The timing now is defined with respect
  • 07:45to the evolving value of a biomarker CD4.
  • 07:49And this is also an important question to answer
  • 07:51because the WHO guideline is in the form of
  • 07:57treat this person when the person's CD4 cell count
  • 08:01drops below 350, for example,
  • 08:04and for the population of adolescents
  • 08:06currently there's no concrete evidence
  • 08:10for supporting the optimal threshold.
  • 08:15So to statistically formulate these two motivating examples,
  • 08:20the first one,
  • 08:21when is the best time to initiate a treatment?
  • 08:25So this is actually a static treatment regimen
  • 08:27with respect to time,
  • 08:32and the initiation can occur on the continuous timescale.
  • 08:35And second example is actually a dynamic treatment regimen.
  • 08:39It's dynamic because it depends on the evolving history
  • 08:45of treatment and a biomarker,
  • 08:48but initiation can also occur on the continuous timescale.
  • 08:54So marginal structural models are suitable
  • 08:56for addressing a time dependent treatment,
  • 08:59but in order to use the models,
  • 09:02we have to overcome some statistical challenges.
  • 09:05The first challenge is that we need
  • 09:08to estimate the causal effect of the actual timing,
  • 09:11not compare protocols defined by some specific intervals,
  • 09:15which is a lot of existing studies did.
  • 09:21And also a lot of RCT reported these kinds of results.
  • 09:25Because as I said earlier,
  • 09:27it's just impractical for RCTs
  • 09:30to report continuous time causal effects.
  • 09:34We would also need to address complications
  • 09:36posed by observational data.
  • 09:39This is something I'll talk about later.
  • 09:41And also we are dealing with censored survival outcomes
  • 09:45that adds another layer of complexity.
  • 09:49So these are four censoring patterns observed in our data.
  • 09:53So our goal is to estimate the causal effect of A,
  • 09:57the treatment initiation time, on T, the death time.
  • 10:01And we have almost 5,000 patients
  • 10:05and only a very small proportion of patients
  • 10:07have both observed A and T.
  • 10:10A lot of patients don't have observed T.
  • 10:13So their death time is censored by C.
  • 10:16And we have about 20% of our patients,
  • 10:18they don't even have observed A.
  • 10:20Their treatment initiation time
  • 10:22can be censored by death time or censored by C, dropout,
  • 10:27for example.
  • 10:28So our goal is to estimate effect of A on T,
  • 10:32but only about 300 patients
  • 10:35have complete information.
  • 10:36For most of the patients, we have incomplete information
  • 10:40on either A or T or both.
  • 10:43How do we properly use this incomplete information
  • 10:46to draw causal inference about A on T,
  • 10:49the effect of A on T,
  • 10:51that's a problem we solve in this project.
  • 10:56So three challenges.
  • 10:59First one, treatment initiation time,
  • 11:02this is observational data, so it's not randomly allocated.
  • 11:06We don't know the actual functional form of causal effect
  • 11:10of initiation timing on the mortality rate.
  • 11:13And as we've seen, there's incomplete information
  • 11:17on either exposure or outcome or both.
  • 11:22The general solutions we proposed
  • 11:26that we first formulate a flexible structural
  • 11:29causal hazard model
  • 11:31that can capture the effects of both timing and duration
  • 11:36of the treatment.
  • 11:37And then we can derive methods
  • 11:39to consistently estimate the model parameters
  • 11:44under non random allocation and complex censoring patterns.
  • 11:48Using the model outputs we can estimate the functional form
  • 11:53of the causal relationship between our initiation timing
  • 11:56and mortality.
  • 11:59So some notation before we introduce our approach,
  • 12:02note that we have three time to events in our study,
  • 12:06we have treatment initiation time, death time,
  • 12:09censoring time.
  • 12:10We'll use T sub cap A to denote death time
  • 12:13associated with the actual treatment time.
  • 12:17And potential outcomes T sub small A,
  • 12:20this is the death time.
  • 12:21If treatment initiated at time A,
  • 12:24and we use T infinity to denote death time
  • 12:28if treatment is initiated beyond sometime point
  • 12:31of our interest.
  • 12:33Because of all the censoring,
  • 12:36all the three time to events can be censored by one another.
  • 12:40We use T star to denote the minimum of T and C.
  • 12:44Delta T is a corresponding event indicator.
  • 12:47So A star is the minimum of the three time to events.
  • 12:51Delta A is the corresponding event indicator.
  • 12:54Adopting the convention in the causal inference literature,
  • 12:58we use overbar to denote history.
  • 13:00So overbar L of T here is a covariate history
  • 13:06up to a time T.
  • 13:08Putting everything together,
  • 13:09we have a set of observed data.
  • 13:12Now back to the censoring patterns.
  • 13:15In case one, we observed both A and T.
  • 13:18So we would observe A, we would observe T sub A.
  • 13:23Case two T is censored by C,
  • 13:25so we observe A, we just know TA
  • 13:28is going to be greater than C.
  • 13:29Case three,
  • 13:31we will observe A,
  • 13:32but we know A is greater than TA.
  • 13:35And case four we don't observe A, we don't observe T
  • 13:38but we know A is greater than C and TA is greater than C.
  • 13:43Okay.
  • 13:44So now we propose a structural causal
  • 13:46proportional hazards model
  • 13:49to capture the survival effect of treatment initiation time.
  • 13:54Lambda AT here is a hazard function
  • 13:56for the potential outcome T sub A,
  • 13:59we start from lambda infinity T right here.
  • 14:01This is a reference hazard for T infinity.
  • 14:05So we start from here.
  • 14:07Once the treatment is initiated at A,
  • 14:10there is an instantaneous effect of treatment initiation
  • 14:14captured by the G1 function here,
  • 14:17and the effect of staying on the treatment
  • 14:20at any given time point T,
  • 14:22is captured by the G2 function of ART duration.
  • 14:27And the G3 function here captures the interaction
  • 14:30between treatment initiation and treatment duration.
  • 14:35So we leave this structural model relatively flexible.
  • 14:41First, the reference hazard is left unspecified
  • 14:44and the three G functions, we also leave them
  • 14:47as unspecified smooth functions
  • 14:49of treatment initiation time, duration, and their interaction.
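In rough symbols, the structural model just described can be sketched as follows (my reconstruction from the verbal description; the exact notation on the slides may differ):

\[
\lambda_a(t) = \lambda_\infty(t)\,\exp\{\,g_1(a) + g_2(t - a) + g_3(a,\, t - a)\,\}, \qquad t \ge a,
\]

where \(\lambda_\infty(t)\) is the unspecified reference hazard for \(T_\infty\), \(g_1\) captures the instantaneous effect of initiating treatment at time \(a\), \(g_2\) the effect of treatment duration \(t - a\), and \(g_3\) their interaction.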
  • 14:54So now we can parametrize these three functions
  • 14:58using natural cubic splines,
  • 15:00and by rewriting the risk function of our structural model,
  • 15:07we can use this parameter beta
  • 15:09to encode the causal effects of ART initiation time
  • 15:13on mortality hazard.
  • 15:15The problem now is,
  • 15:17how do we obtain a consistent
  • 15:20estimate of beta using the observed data?
  • 15:24Once we have obtained that
  • 15:26we can use beta hat to estimate the 3G functions,
  • 15:30to understand the relative contribution of timing
  • 15:33versus duration and interactions.
  • 15:37And we could also estimate the causal dose-response
  • 15:41of initiation time versus mortality
  • 15:43by relating the survival function to the hazard function.
  • 15:46We can derive this from our structural model.
  • 15:52And now we can also estimate the model-based
  • 15:55optimal initiation time
  • 15:57that will lead to the maximal survival probability
  • 16:01at say 52 weeks after diagnosis.
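As one possible way to set up the natural cubic spline parametrization mentioned above, the basis columns for the three G functions could be built like this (a sketch under my own assumptions; patsy's cr() basis, the 3 degrees of freedom, and the toy numbers are illustrative choices, not the authors' specification):

```python
import numpy as np
from patsy import dmatrix

# a: treatment initiation times; d: ART duration (t - a) at the evaluation times.
a = np.array([2.0, 6.0, 10.0, 24.0, 40.0])
d = np.array([50.0, 46.0, 42.0, 28.0, 12.0])

# Natural cubic spline bases for g1 (initiation time) and g2 (duration).
g1_basis = dmatrix("cr(a, df=3) - 1", {"a": a}, return_type="dataframe")
g2_basis = dmatrix("cr(d, df=3) - 1", {"d": d}, return_type="dataframe")

# Interaction (g3) columns: element-wise products of the two bases.
g3_basis = np.einsum("ij,ik->ijk", g1_basis.values, g2_basis.values).reshape(len(a), -1)

# Stacking the three bases gives the design matrix whose coefficient
# vector beta enters the (weighted) partial likelihood of the structural model.
design = np.hstack([g1_basis.values, g2_basis.values, g3_basis])
```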
  • 16:07Okay, how to obtain a consistent estimate of beta.
  • 16:10So first let's assume A is randomly allocated
  • 16:15and both A and T are observed,
  • 16:17then we can write the partial likelihood score function
  • 16:22of our structural model.
  • 16:24And this sample average of the score function
  • 16:28is an unbiased estimator of the expectation
  • 16:32of the score function.
  • 16:33So E sub R here is the expectation
  • 16:37under the randomized treatment assignment.
  • 16:40So this would be an unbiased estimating function,
  • 16:47and solving this unbiased estimating equation
  • 16:50would give us a consistent estimator of beta.
  • 16:55Now, if A is still randomly allocated,
  • 16:58but T can occur before A,
  • 17:02so A may be censored by T.
  • 17:05In this case,
  • 17:06we would need to break the mean
  • 17:08of an individual score contribution into two parts.
  • 17:11In one part A is observed.
  • 17:13The second part is A is not observed.
  • 17:16And then we can apply the law of total expectation
  • 17:19to the second part.
  • 17:21The inner expectation would be conditioning
  • 17:24on the observed information.
  • 17:26Then using this strategy and taking into account
  • 17:30the survival hazard structure,
  • 17:32we can revise the estimating equation.
  • 17:37And by solving this to obtain a consistent estimate of beta.
  • 17:42In the case of non random allocation of treatment,
  • 17:46then if we want to estimate the causal effect of A on T,
  • 17:50then we would have to make a key assumption,
  • 17:56the ignorability assumption.
  • 17:57Essentially the assumption says
  • 17:59that the initiation of treatment at any given time T
  • 18:04is sequentially randomized in the sense
  • 18:06that as a potential outcome beyond this time
  • 18:09is independent of treatment initiation.
  • 18:12Conditioning on all covariate history up to T.
  • 18:16So with this assumption,
  • 18:19we will be able to use observed data
  • 18:21to derive the causal effect.
  • 18:23So say PR is the data distribution under randomized A,
  • 18:27and PO is the data distribution
  • 18:30under non-random allocation of A.
  • 18:33Note that in both settings,
  • 18:35there is a same set of observed data.
  • 18:39And as long as the observed data under PR
  • 18:43is absolutely continuous with respect to the observed data under PO.
  • 18:48Now we can derive a Radon-Nikodym derivative.
  • 18:52And so Murphy's 2001 paper
  • 18:55developed a version of R-N derivative
  • 18:58that connects the distribution of the observed data
  • 19:01under PR and under PO for discrete time
  • 19:05and ordinary GEE score.
  • 19:07Johnson's 2005 paper extended this version of R-N derivative
  • 19:12to continuous time still for ordinary GEE score.
  • 19:15In this paper we extended the R-N derivative
  • 19:20for time to event setting.
  • 19:23So this is a version of R-N derivative
  • 19:26for survival data.
  • 19:28The reason why we wanted to use R-N derivative
  • 19:32is that we can then use it
  • 19:34to derive an unbiased estimating equation
  • 19:37using some weighted version of the observed data.
  • 19:41So we can estimate the causal effect.
  • 19:43So now we want to use this R-N derivative for survival data.
  • 19:49We want to apply that to Cox score
  • 19:51and to derive a weighted estimating equation.
  • 19:55That's a little bit more complex than the GEE score,
  • 20:00but we can observe that the Cox score
  • 20:02can essentially be represented in three averages.
  • 20:06The one in blue,
  • 20:08the one in orange and the whole average.
  • 20:13And each average converges to its expectation.
  • 20:17And as I showed earlier,
  • 20:19we can always break the expectation into two parts.
  • 20:23In one part A is observed,
  • 20:25second part is not observed.
  • 20:27For the second part,
  • 20:28we can apply the law of total expectation,
  • 20:35and recognizing the survival structure
  • 20:40to derive the second part.
  • 20:43And then we can apply the R-N derivative for survival data
  • 20:46to each piece separately,
  • 20:48to construct the unbiased score equation.
  • 20:53So after some derivation, we would arrive at the weights
  • 20:59and actually the weights come down in a very neat form.
  • 21:03Essentially, it suggests that for patients
  • 21:06who have initiated treatment by time T,
  • 21:10we would weight them by the marginal density function
  • 21:13of A divided by the conditional density of A
  • 21:18given their covariate history after time T.
  • 21:23And for those who are censored,
  • 21:25so not initiated by the time T,
  • 21:28we would weight them by some survival function
  • 21:31of the treatment initiation process.
  • 21:36And then by applying this weighting scheme,
  • 21:39we will be able to derive a weighted estimating equation.
  • 21:43And just a note that we have to apply
  • 21:46the same weighting scheme to the people
  • 21:49who are still in the risk set at any time T.
  • 21:54And so now that said, previously we have assumed
  • 21:58there's no censoring.
  • 21:59Now with censoring,
  • 22:00we need to assume another similar assumption,
  • 22:06similar to the ignorability assumption,
  • 22:09and then using the similar strategy
  • 22:12to derive another set of weight for censoring.
  • 22:16For those who stay, remain in the study,
  • 22:18we would weight them by the survival function
  • 22:23for censoring.
  • 22:24And this would lead to the final modification
  • 22:27of the estimating equation for beta.
  • 22:29So censoring contributes information about the parameter
  • 22:33in two ways,
  • 22:35If C is observed, so the person is actually censored,
  • 22:40it contributes to the risk set up to C.
  • 22:42If C is not observed, so C is censored by T
  • 22:46because death occurred,
  • 22:47then it contributes to the individual partial likelihood
  • 22:50through the weight for C, evaluated at the death time.
  • 22:55Okay, now we know how to weight.
  • 22:56Back to the four censoring patterns.
  • 22:59The first one, both A and T are observed.
  • 23:02We would weight them by the first set of weight for A
  • 23:06evaluated at A,
  • 23:08T occurred, so we use the weight for C evaluated at T.
  • 23:13Second case, T is not observed,
  • 23:17A is observed.
  • 23:19So first set of weight for A evaluated at A
  • 23:23and C just contributes information to the risk set.
  • 23:28Third case, A is not observed,
  • 23:31so second weight for A evaluated at T.
  • 23:35And weight for C, censoring evaluated at T.
  • 23:40The fourth case or final case, A is not observed,
  • 23:44again, second set of weight for A,
  • 23:47but evaluated at C, and C also contributes to the risk set.
  • 23:51Okay, so now we know how to weight.
  • 23:54We would have to estimate the weights.
  • 24:00The approach we used in the paper
  • 24:02is that we model the intensity processes
  • 24:06associated with the two counting processes,
  • 24:09one for A, one for C.
  • 24:12And then when we fit Cox proportional hazards models
  • 24:15for the two intensity processes,
  • 24:17we use fitted hazard to estimate the weights.
  • 24:21We use empirical cumulative hazards
  • 24:24to estimate the conditional density and survival functions.
  • 24:27And for the marginal density function,
  • 24:29we use a nonparametric Nelson-Aalen estimator,
  • 24:32and we estimate the weights for censoring in a similar fashion.
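A rough sketch of how pieces of such weights might be estimated in practice (my illustration only; the lifelines library, the column names, and the stabilized survival-ratio form are assumptions, not the authors' code):

```python
import numpy as np
from lifelines import CoxPHFitter, NelsonAalenFitter

# df: one row per subject, with time to ART initiation or censoring (a_star),
# the initiation indicator (delta_a), and baseline covariates x1, x2.
def initiation_weight_pieces(df):
    # Conditional intensity of treatment initiation given covariates (Cox model).
    cox = CoxPHFitter()
    cox.fit(df[["a_star", "delta_a", "x1", "x2"]],
            duration_col="a_star", event_col="delta_a")
    cond_cumhaz = np.array([
        cox.predict_cumulative_hazard(df.iloc[[i]], times=[t]).values[0, 0]
        for i, t in enumerate(df["a_star"])
    ])

    # Marginal cumulative hazard of initiation (nonparametric Nelson-Aalen).
    naf = NelsonAalenFitter()
    naf.fit(df["a_star"], event_observed=df["delta_a"])
    marg_cumhaz = naf.cumulative_hazard_at_times(df["a_star"]).values

    # Survival functions of the initiation process, evaluated at each a_star.
    cond_surv = np.exp(-cond_cumhaz)
    marg_surv = np.exp(-marg_cumhaz)

    # For subjects not yet initiated, the stabilized weight is the ratio of
    # survival functions; for initiators the density ratio additionally
    # involves the hazard increments at A (omitted here for brevity).
    return marg_surv / cond_surv
```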
  • 24:36Then we apply our methods to the AMPATH data.
  • 24:39AMPATH is a large HIV care program based in West Kenya,
  • 24:44our data has almost 5,000 patients
  • 24:47and for covariates, we have demographic information
  • 24:51and some disease-specific information.
  • 24:54Some of them are time-varying, like weight and CD4.
  • 24:59We categorize baseline CD4 into three subgroups.
  • 25:06The below-50 group
  • 25:08is the highest risk group,
  • 25:11since for CD4, the higher, the better.
  • 25:16And the patients between 200 and 350
  • 25:19are relatively healthy.
  • 25:21The reason we categorize them into three groups
  • 25:23is because the program guidelines
  • 25:26are based on these subgroups
  • 25:28and RCTs reported results for the below-50 group.
  • 25:33We want to compare our results to the RCT findings.
  • 25:37So this plot shows the three estimated G functions.
  • 25:42The G1 A here suggests that the instantaneous effect
  • 25:47of a treatment initiation has a U shape,
  • 25:50achieving maximum benefit, or the lowest mortality hazard
  • 25:53at just about 10 weeks.
  • 25:56And after that, the longer the treatment is delayed,
  • 26:00the less the benefit of the treatment initiation.
  • 26:03And this is the effect of duration,
  • 26:06in general, it says that the longer
  • 26:08you stay on the treatment, the more benefit you get.
  • 26:11There's an upward trend for the interaction effect.
  • 26:15Essentially suggesting that delayed treatment initiation
  • 26:19would reduce the benefit associated
  • 26:22with long ART duration.
  • 26:27And so the net causal effect of treatment initiation
  • 26:31is summarized in this plot.
  • 26:33Top panel shows the mortality rate at one year
  • 26:38versus treatment initiation time.
  • 26:40Bottom panel compares immediate initiation
  • 26:44versus delayed initiation at A.
  • 26:48So we can see that the benefit of early initiation
  • 26:53is most pronounced for the CD4 below 50 group,
  • 26:57or the highest risk group.
  • 26:58And the curves here are pretty flat,
  • 27:01suggesting that there's not much benefit
  • 27:03of early initiation for relatively healthy patients.
  • 27:09Several advantages for this approach.
  • 27:12It's easy to get optimal initiation time
  • 27:17based on the model outputs.
  • 27:20And we could also use the model outputs
  • 27:22to emulate comparisons between regimens reported in RCTs.
  • 27:27So we could mimic random allocation
  • 27:32of treatment initiation time to specific intervals
  • 27:36by assuming a distribution for A,
  • 27:39for treatment initiation time A,
  • 27:41that is independent of covariates and outcome
  • 27:44and compare interval specific mortality rates
  • 27:49and draw inferences about treatment initiation.
  • 27:53But with the continuous time marginal structure model,
  • 27:56we'll also be able to conduct a higher resolution analysis
  • 28:00that can potentially generate new insights
  • 28:03in relation to a randomized control trial.
  • 27:19For the sake of time,
  • 27:20I'm just gonna briefly talk about the simulation.
  • 28:14We conduct simulation to examine
  • 28:15the finite-sample properties of weighted estimators,
  • 28:24we evaluate sensitivity of our estimators
  • 28:27to the violations of the ignore ability,
  • 28:30or no unmeasured confounding assumption,
  • 28:32but we only considered confounding at baseline.
  • 28:35So the sensitivity analysis strategy
  • 28:39for time-varying confounding,
  • 28:41especially with the censored survival outcome
  • 28:44is kind of very complex topic,
  • 28:48and we were still working on this project right now,
  • 28:51but in this paper we just consider confounding at baseline.
  • 28:56Under random allocation of treatment,
  • 28:59our estimator produced near-zero bias
  • 29:02and nominal coverage probability,
  • 29:05in the presence of measured confounding,
  • 29:07it eliminated nearly all the biases
  • 29:09and provided close to nominal coverage probability,
  • 29:13but in the presence of unmeasured confounding,
  • 29:16there was bias in our estimator.
  • 29:19And the biases were in proportion
  • 29:22to the degree of unmeasured confounding.
  • 29:26Okay,
  • 29:27so moving to the second example,
  • 29:29this is a continuous time dynamic treatment regimen
  • 29:34of the form,
  • 29:34initiate treatment when a biomarker crosses a threshold.
  • 29:40It's dynamic treatment regimen
  • 29:42because it depends on evolving history of treatment
  • 29:45and a tailoring variable.
  • 29:47So in our case, CD4 is a tailoring variable.
  • 29:50That means we make our treatment decision
  • 29:53based on this variable.
  • 29:55A little bit different from our previous motivating example.
  • 30:00The outcome of interest is different.
  • 30:04This is pediatric data.
  • 30:05So for the kids, the mortality rate is very low
  • 30:09and our data I think it's around 3%.
  • 30:12And for kids, we're also interested
  • 30:14in their CD4 measurements,
  • 30:17because CD4 is an important marker of immune system function
  • 30:21and both outcomes, both mortality rate and CD4
  • 30:24are sparsely measured in our data,
  • 30:27but we are interested in both.
  • 30:30Other than that, we also have complications
  • 30:33posed by observational data.
  • 30:36So this is a picture of nine randomly selected individuals
  • 30:41from our data,
  • 30:43X axis here, follow-up time in days,
  • 30:46Y axis here square root of CD4,
  • 30:49purple line is end of follow-up,
  • 30:53two gray lines here mark one year
  • 30:57and two years post diagnosis.
  • 31:00Empty circles here mean that the patient
  • 31:04has not been treated.
  • 31:06Solid circles, mean that they're on the treatment.
  • 31:09So we can see that there's a lot of variability
  • 31:12in terms of the treatment initiation time.
  • 31:16And some people are followed much longer
  • 31:20than some other patients.
  • 31:22And the follow-up time is pretty irregularly spaced
  • 31:29and overall the CD4 measurements are quite sparse,
  • 31:34and there's also incomplete information
  • 31:36for example, these two they either died
  • 31:41or were lost to follow up
  • 31:44before they even got a chance to be treated.
  • 31:47So there's also a lot of complication in the data.
  • 31:51There's a continuous time measurement
  • 31:54of the treatment initiation.
  • 31:55It just happens all over the place.
  • 31:58The longitudinal outcome of interest are sparsely measured,
  • 32:03leading to incomplete data.
  • 32:05There's also a censoring due to dropout or deaths.
  • 32:09So our general solution is that we'll use weighting
  • 32:11to handle time-varying confounding.
  • 32:14And will show how to derive a continuous time versions
  • 32:17of the weights.
  • 32:19For the missing outcomes
  • 32:21that is caused by sparse measurement and censoring
  • 32:24we'll use imputations from a model of the joint distribution
  • 32:28of CD4 and mortality.
  • 32:30And because we're interested in both mortality status
  • 32:33and CD4, we'll develop a composite outcome.
  • 32:38So our general approach is to emulate a randomized trial
  • 32:42in which we would randomize individuals
  • 32:45to follow specific DTR Q.
  • 32:48And Q equals zero means never treated,
  • 32:51because CD4 can never drop below zero.
  • 32:54Now, Q equals infinity means treat immediately.
  • 32:58So after randomization,
  • 32:59all the individuals will be followed
  • 33:02for a fixed amount of time,
  • 33:05at which point, say T star,
  • 33:07their mortality status will be assessed.
  • 33:10And among those who are alive at T star,
  • 33:14their CD4 count will be assessed.
  • 33:18So we define a composite outcome XQ,
  • 33:21which is the product of the survival indicator
  • 33:25and the potential CD4 count.
  • 33:28So the cumulative distribution of this composite outcome
  • 33:32is a useful measure of treatment utility,
  • 33:36because it has a point mass at zero
  • 33:39corresponding to mortality rate.
  • 33:41Thereby capturing both mortality status
  • 33:45and CD4 count among survivors at T star.
  • 33:52So for example, the probability of a positive XQ,
  • 33:56that's the survival fraction,
  • 33:58and the probability of XQ greater than X,
  • 34:02that's the fraction of survivors with CD4 above X.
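A tiny numeric sketch of this composite outcome (my own illustration; the array values are made up): X_Q is zero for those who died by T star and equals the CD4 count at T star for survivors, so summaries of its distribution read off both quantities.

```python
import numpy as np

# Hypothetical data for ten subjects under a regimen Q:
alive_at_tstar = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])     # 1 = alive at t*
cd4_at_tstar   = np.array([420, 310, 0, 505, 280, 610, 0, 350, 470, 390.0])

# Composite outcome: point mass at 0 for deaths, CD4 count for survivors.
x_q = alive_at_tstar * cd4_at_tstar

mortality_fraction       = np.mean(x_q == 0)    # P(X_Q = 0)
survival_fraction        = np.mean(x_q > 0)     # P(X_Q > 0)
frac_survivors_above_350 = np.mean(x_q > 350)   # P(X_Q > 350)
median_x_q               = np.median(x_q)       # median of the composite outcome
```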
  • 34:08Okay, so similar to the first motivating example,
  • 34:12we again have three time-to-events.
  • 34:16Death time, censoring time, treatment initiation time.
  • 34:19And now we have a tailoring variable, CD4 count.
  • 34:23So the CD4 process is defined in continuous time,
  • 34:27but it's just measured at discrete times.
  • 34:30And we also have a P by one covariate process.
  • 34:35Using a convention in the DTR literature,
  • 34:38we assume that the treatment decision
  • 34:40is always made after observing the covariate history
  • 34:45and the CD4 count.
  • 34:48Putting everything together,
  • 34:50we have a history information indicator.
  • 34:55For each individual, we'll have an observed data process.
  • 35:00And just note that each person
  • 35:02can have a different length of follow-up
  • 35:04at different time points.
  • 35:08Our goal is to evaluate the effect of DTRs,
  • 35:12but we're dealing with observational data,
  • 35:14so we'll have to map the observed treatment regimen
  • 35:18to specific DTRs that we are interested in evaluating.
  • 35:22Essentially we'll follow the deterministic function
  • 35:28to create the mapping.
  • 35:30Essentially there are three rules.
  • 35:32First rule says not to treat the person
  • 35:34if the person has not yet initiated treatment
  • 35:38and their CD4 has not fallen below Q,
  • 35:41or has not been observed.
  • 35:44Second rule says, treat this person if, at time T, their
  • 35:47CD4 has fallen below Q for the very first time.
  • 35:51Once treated, always treat them.
  • 35:54Following these three rules,
  • 35:56we'll be able to create a regimen specific compliant process
  • 36:00for each individual in the data.
  • 36:02So essentially if the rule says treat,
  • 36:05and if the person is actually treated by the time T,
  • 36:09then this person is compliant at time T.
  • 36:13If the rule says do not treat,
  • 36:14and the person was not treated at the time T,
  • 36:17so this person is still compliant to the rule.
  • 36:20And so we'll be able to observe a compliant process
  • 36:24for each person.
  • 36:25Here a simple example to show you how to create the mapping.
  • 36:29For example, we're interested in Q equals 350.
  • 36:33This person came in at baseline,
  • 36:35had a measurement 400 above the threshold.
  • 36:39The rule says do not treat,
  • 36:40the person was not treated.
  • 36:41At this point, it's compliant with the rule.
  • 36:44Next visit, no new CD4 observation.
  • 36:48So the rule says do not treat,
  • 36:50the person's still not treated,
  • 36:52still compliant at this point.
  • 36:53Third visit, the person's CD4 drops to 330,
  • 36:58which is below the threshold for the very first time,
  • 37:01the rule says start treating this person,
  • 37:05the person was actually treated.
  • 37:06So compliant at this point.
  • 37:10Next visit the rule says once treated always treat them,
  • 37:14the person kept being treated.
  • 37:16So this person was compliant with the rule 350
  • 37:20all throughout his or her followup.
  • 37:23Next example, the first two rows are the same.
  • 37:27The third visit, the person's CD4 jumps to 450,
  • 37:33which is above the threshold.
  • 37:35The rule says do not treat,
  • 37:37but on the contrary, the person was actually treated
  • 37:40and kept being treated.
  • 37:42So from this time point onward,
  • 37:46the person was not compliant with this rule.
  • 37:51Okay, so that's just some simple example
  • 37:55to show how to create the mapping.
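As a hedged sketch of this mapping logic (my own implementation of the three rules as described; the variable names and visit-record structure are assumptions, not the authors' code), the compliance process for a threshold Q could be computed as:

```python
def compliance_process(visits, q):
    """Return a list of compliance indicators, one per visit.

    visits: list of dicts with keys 'cd4' (float, or None if not measured)
            and 'on_treatment' (bool), in chronological order.
    q: CD4 threshold defining the dynamic regimen "treat when CD4 drops below q".
    """
    compliance = []
    rule_says_treat = False          # becomes True the first time CD4 < q
    for visit in visits:
        cd4 = visit["cd4"]
        if not rule_says_treat and cd4 is not None and cd4 < q:
            rule_says_treat = True   # rule 2: treat at first crossing; rule 3: keep treating
        if rule_says_treat:
            compliant = visit["on_treatment"]          # should be treated
        else:
            compliant = not visit["on_treatment"]      # rule 1: should not be treated
        compliance.append(compliant)
    return compliance

# First worked example from the talk: compliant throughout under Q = 350.
person_1 = [
    {"cd4": 400, "on_treatment": False},
    {"cd4": None, "on_treatment": False},
    {"cd4": 330, "on_treatment": True},
    {"cd4": None, "on_treatment": True},
]
print(compliance_process(person_1, 350))   # [True, True, True, True]

# Second example: treated while CD4 is above the threshold, so non-compliant onward.
person_2 = [
    {"cd4": 400, "on_treatment": False},
    {"cd4": None, "on_treatment": False},
    {"cd4": 450, "on_treatment": True},
    {"cd4": None, "on_treatment": True},
]
print(compliance_process(person_2, 350))   # [True, True, False, False]
```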
  • 37:58With missing outcomes for those alive
  • 38:01at the target measurement time T star,
  • 38:05the observed outcome XI is the CD4 measurement at T star.
  • 38:11But because CD4 is sparsely measured
  • 38:14and irregularly spaced,
  • 38:17Z of T star is directly observed
  • 38:19only when the person has a follow-up measurement exactly at T star.
  • 38:25So in this case, it is pretty common
  • 38:27to predefine an interval and capture the CD4 that is measured
  • 38:37at the time closest to the target measurement time.
  • 38:41But even using this strategy,
  • 38:43there's still a possibility that there is no measurement
  • 38:48in the predefined interval.
  • 38:51Then we say this person has a missing outcome.
  • 38:54And it's also possible that the person dropped out
  • 38:57before T star.
  • 38:59And so in this case, the outcome is also missing.
  • 39:04For these missing outcomes, our general strategy
  • 39:08is to use multiple imputation.
  • 39:10So we would specify and fit a model
  • 39:12for the joint distribution of the CD4 process
  • 39:16and the mortality process.
  • 39:18For those known to be alive,
  • 39:20but without a CD4 measurement,
  • 39:23we would impute the CD4 count from the fitted CD4 sub-model.
  • 39:28And for those missing the CD4,
  • 39:31because of right censoring,
  • 39:33we would calculate the mortality probability
  • 39:37from the fitted survival sub-model,
  • 39:39and then impute the death indicator
  • 39:42from the Bernoulli distribution
  • 39:44with this calculated probability.
  • 39:46If the death indicator was imputed to be zero,
  • 39:49then we further impute a CD4 count for this person.
  • 39:52Otherwise we'll set X to be zero.
  • 39:56And again, we would have to assume
  • 39:59some standard causal inference assumptions
  • 40:02in order to draw causal inferences about the DTR Q
  • 40:08using observational data.
  • 40:09And we can estimate and compare DTRs along a continuum.
  • 40:15We can formulate a causal model
  • 40:17for the smooth effect of Q on the tau-th quantile of XQ.
  • 40:22This is our composite outcome with separate parameters
  • 40:25capturing the effect of treat immediately,
  • 40:29and the effect of never treat.
  • 40:31And then we can parametrize the model using splines of Q
  • 40:35for the third term here, to gain statistical efficiency.
  • 40:41And we can obtain a consistent estimator of effect of Q
  • 40:47by solving the weighted quantile regression
  • 40:50estimating equation.
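For intuition, a minimal weighted quantile regression can be solved by minimizing a weighted check loss (a generic sketch under my own assumptions, taking the weights w as given for now; this is not the authors' estimating-equation solver):

```python
import numpy as np
from scipy.optimize import minimize

def weighted_quantile_reg(y, X, w, tau=0.5):
    """Minimize the weighted check loss sum_i w_i * rho_tau(y_i - x_i' beta)."""
    def loss(beta):
        resid = y - X @ beta
        # rho_tau(u) = u * (tau - 1{u < 0})
        return np.sum(w * resid * (tau - (resid < 0)))
    beta0 = np.zeros(X.shape[1])
    return minimize(loss, beta0, method="Nelder-Mead").x

# Illustrative use: the median of the composite outcome as a smooth function of Q,
# with design columns for an intercept, "treat immediately", "never treat",
# and spline terms of Q, as described above.
```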
  • 40:53So what should be the weights?
  • 40:57First, we assume there's no dropout or death
  • 40:59prior to the target measurement time.
  • 41:02In the discrete time setting with common time points,
  • 41:07the form of the weights has already been derived
  • 41:10in several papers.
  • 41:14Essentially, the denominator of the weight
  • 41:17is this conditional probability.
  • 41:19It's the conditional probability of the person being compliant
  • 41:22all throughout the follow up, given the covariate history.
  • 41:29So if we have a common set of discrete time points,
  • 41:34it's a cumulative product of the conditional probabilities
  • 41:38of this person being compliant at every time point.
  • 41:42And essentially if the rule says treat,
  • 41:46it's the conditional probability of the person
  • 41:48actually being treated at this time point,
  • 41:51if the rule says not treat,
  • 41:53it's the conditional probability of this person not being treated
  • 41:57by this time point.
  • 41:58So in order to estimate this probability,
  • 42:02we just need to model the observed
  • 42:04treatment initiation process among those regimen compliers,
  • 42:09but this is for discrete time setting.
  • 42:12What would be the continuous time weights?
  • 42:14We note that the occurrence of treatment initiation
  • 42:18in a small time interval from T to T plus dT
  • 42:21is actually a Bernoulli trial with outcome dN_A(T).
  • 42:28So then we can rewrite this probability,
  • 42:31this probability here,
  • 42:33in the form of individual partial likelihood
  • 42:37for the counting process of A.
  • 42:40And now we note that when DT becomes smaller and smaller,
  • 42:45this finite product approaches a product integral.
  • 42:50So then this finite product can be rewritten
  • 42:54as a finite product over the jump times of the counting process
  • 42:58for A times the survival function.
  • 43:02And then by recognizing that each individual
  • 43:05had at most one jump at exactly AI.
  • 43:10Now we can further reduce this probability to this form.
  • 43:15Which suggests a weighting scheme.
  • 43:18Essentially it says for those who have been treated
  • 43:23by a T star, we would weight them
  • 43:25by the conditional density function of A.
  • 43:28For those who haven't been treated by the time T star,
  • 43:32we would weight them by the survival function of A.
  • 43:36So if you recall the weighting scheme
  • 43:39for the first motivating example, this is exactly the same,
  • 43:43the same weighting scheme,
  • 43:45but we took different approaches.
  • 43:47The first example,
  • 43:48we used the Radon-Nikodym derivative
  • 43:51to derive the weighting scheme.
  • 43:52In the second project, we derived the limit
  • 43:58of the finite product,
  • 44:00but using different approaches,
  • 44:02we arrive at the same weighting scheme.
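Written out, the weighting scheme described in both examples amounts to the following (my reconstruction of the verbal description; the exact stabilization used on the slides may differ):

\[
w_i(t^*) \;\propto\;
\begin{cases}
\dfrac{1}{f_A\!\left(A_i \mid \bar{L}_i(A_i)\right)}, & \text{if treatment was initiated by } t^*,\\[2ex]
\dfrac{1}{S_A\!\left(t^* \mid \bar{L}_i(t^*)\right)}, & \text{if treatment was not initiated by } t^*,
\end{cases}
\]

where \(f_A\) and \(S_A\) are the conditional density and survival function of the treatment initiation process given covariate history, with a marginal density or survival function appearing in the numerator when stabilized weights are used.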
  • 44:06And so similarly we modeled the intensity process
  • 44:10of treatment initiation.
  • 44:12We estimate the weights.
  • 44:15So if there was a censoring or death
  • 44:18prior to target measurement time,
  • 44:20we would have to assume once lost to follow up
  • 44:23at a time prior to T star,
  • 44:25the treatment and regimen status remain constant.
  • 44:28And this way we will just estimate the weights
  • 44:31up to a time point CI, and if the person died before T star,
  • 44:37then we would only evaluate compliance
  • 44:40and treatment initiation processes up to time TI.
  • 44:45Okay, so for missing outcomes,
  • 44:49we propose a joint modeling approach.
  • 44:52We specify a two-level model for the observed CD4 process.
  • 44:56The first level,
  • 44:57the observed CD4 process is a true CD4 trajectory
  • 45:02plus some error process.
  • 45:04The second level,
  • 45:05we relate the true CD4 trajectory
  • 45:07to baseline characteristics and treatment initiation time,
  • 45:13and some subject specific random effects,
  • 45:16capturing subject-specific deviations
  • 45:19from the mean trajectories.
  • 45:22And now we propose a hazard model for deaths
  • 45:26that uses the true CD4 trajectory as a covariate,
  • 45:30linking the two processes,
  • 45:33linking the death process with the CD4 process.
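In rough symbols, the two-level model plus linked hazard just described might look like this (a generic reconstruction of mine, not the exact specification in the paper; all notation here is assumed):

\[
Z_i(t) = \mu_i(t) + \varepsilon_i(t), \qquad
\mu_i(t) = X_i^\top\beta + f(t, A_i) + b_i^\top\phi(t), \qquad
\lambda_i(t) = \lambda_0(t)\exp\{\gamma\,\mu_i(t) + V_i^\top\alpha\},
\]

where \(Z_i(t)\) is the observed CD4, \(\mu_i(t)\) the true trajectory driven by baseline covariates \(X_i\), treatment initiation time \(A_i\), and subject-specific random effects \(b_i\) on basis functions \(\phi(t)\), and the hazard of death depends on the current true trajectory.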
  • 45:38Now we use the joint model
  • 45:39to impute the missing outcomes
  • 45:43and estimate the variance of the target estimator
  • 45:46using Rubin's combination rule.
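For concreteness, Rubin's combination rule for M imputed analyses looks like this (standard multiple-imputation machinery written as code, not code from the paper; the numbers in the example are made up):

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Combine point estimates and variances from M imputed analyses."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()                 # combined point estimate
    ubar = variances.mean()                 # within-imputation variance
    b = estimates.var(ddof=1)               # between-imputation variance
    total_var = ubar + (1 + 1 / m) * b      # Rubin's total variance
    return qbar, total_var

# Example with made-up numbers from M = 5 imputations:
qbar, tvar = rubins_rules([0.42, 0.45, 0.40, 0.44, 0.43],
                          [0.010, 0.012, 0.009, 0.011, 0.010])
```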
  • 45:50So we applied this method to the IeDEA dataset.
  • 45:54IeDEA is another HIV consortium based in West Kenya.
  • 46:00So we have almost 2,000 patients.
  • 46:03We see that the CD4 is pretty sparsely measured
  • 46:07and the death rate is low, around 3 to 4%.
  • 46:12Most patients have been treated by one year,
  • 46:16and we have a set of covariates.
  • 46:21Some of them are time varying, some of them are time fixed.
  • 46:26We proposed three target estimators,
  • 46:30so first we're interested in mortality proportion.
  • 46:34We're also interested in the median of the distribution
  • 46:38of the composite outcome XQ.
  • 46:41We also looked at CD4 among survivors,
  • 46:45but this estimator does not have a causal interpretation
  • 46:49because it conditions on having survived to T star.
  • 46:53So it only measures association,
  • 46:55but the first two estimators have causal interpretations.
  • 47:03So we first look at the effectiveness
  • 47:05of five specific regimens for both one year and two years
  • 47:10after diagnosis.
  • 47:12We can see that the immediate treatment initiation
  • 47:18leads to a significantly lower mortality rate
  • 47:21and significantly higher median values
  • 47:24of the composite outcome compared to delayed treatment.
  • 47:30And the never-treat regimen
  • 47:32will lead to a significantly higher mortality probability.
  • 47:37And for those who do survive to T star,
  • 47:41their CD4 count is higher.
  • 47:44resulting in a higher theta Q2 and a higher theta Q3
  • 47:48compared to the delayed treatment regimens.
  • 47:54So this may suggest that those who do survive
  • 47:57to T-star without any treatment,
  • 48:00maybe they are relatively healthier
  • 48:02at the beginning of the followup.
  • 48:05Okay, and then we also plot the dose response curve
  • 48:10of the median value of the composite outcome
  • 48:14versus DTR Q,
  • 48:17which also suggests that immediate treatment
  • 48:22would lead to significantly higher median values of XQ,
  • 48:27and also, as an illustration
  • 48:29of the statistical efficiency gained
  • 48:30by modeling the smooth effect of Q on the quantile of XQ,
  • 48:37The variance in the one year outcome
  • 48:40associated with Q equals 350, achieved about 15% reduction
  • 48:46compared to that from the regimen specific estimates.
  • 48:51So we gain a bit of our statistical efficiency
  • 48:55by modeling the smooth effect.
  • 49:00So there are several strengths of the continuous-time
  • 49:02marginal structural model.
  • 49:04We see that we can derive, using different approaches,
  • 49:08closed-form weights for continuous-time treatment.
  • 49:12It can handle complex datasets on their own terms
  • 49:16without having to artificially align measurement times,
  • 49:20which could possibly lead to loss of information.
  • 49:24It is amenable to many different outcomes.
  • 49:27We've used the survival outcomes,
  • 49:28we've used composite outcomes.
  • 49:31It can also handle many data complications
  • 49:34introduced by various censoring patterns
  • 49:37within the same marginal structural model.
  • 49:41So these are the strengths,
  • 49:43but there are also limitations with this approach of course.
  • 49:46One notable limitation is extreme weights,
  • 49:50which could possibly lead to unstable estimates.
  • 49:54So how to address this issue,
  • 49:57especially for time varying confounding
  • 50:00with censored outcome, this would be a challenging task,
  • 50:04but if we can solve this issue,
  • 50:06it might be a very important contribution to the field.
  • 50:10So this is something my colleagues and I
  • 50:14have been thinking about and working on for some time.
  • 50:18Another limitation is that we know
  • 50:22that weighting-based estimator is less efficient
  • 50:25than the so-called G methods.
  • 50:28The G computation, G estimation,
  • 50:30and both G methods require integrating
  • 50:32over the space of longitudinal confounders.
  • 50:35So the G methods are computationally
  • 50:38much, much more expensive
  • 50:40than the marginal structural model-based methods.
  • 50:44And as far as I know,
  • 50:48currently there's no continuous time version
  • 50:50of the G computation methods.
  • 50:53Judith Lok has a paper, back in 2008.
  • 50:56She developed theory for continuous time G-estimation,
  • 51:00but I have yet to see a practical implementation
  • 51:03of this method.
  • 51:05So this could be another avenue for future research,
  • 51:11how to increase efficiency of the continuous time
  • 51:15weighting-based methods.
  • 51:18And here's some key references.
  • 51:22Thank you.
  • 51:24- Thank you Liangyuan for this very interesting
  • 51:25and comprehensive presentation.
  • 51:29Let's see if we have any questions from the audience.
  • 51:32If there's any questions,
  • 51:34please feel free to unmute yourself and speak
  • 51:36or type in the chat.
  • 51:43- [Donna] Thanks, it was a very interesting talk.
  • 51:45This is Donna Spiegelman.
  • 51:47- Hi, Donna.
  • 51:48- Yeah, hi.
  • 51:49I was wondering I might've missed it,
  • 51:51but did you say much about estimating the variance?
  • 51:55I see you have (indistinct) around the curve,
  • 51:58so you must derive the variance.
  • 52:01So I'm wondering if you could say a little bit about that
  • 52:03or a little more about that
  • 52:05if I missed what you did say.
  • 52:07- Sure, sure, sure.
  • 52:09So for this one, this is the second example,
  • 52:12for this one we have multiple imputation
  • 52:15and we also have weighting.
  • 52:20So for the weighting part,
  • 52:22the variance was estimated using bootstrap
  • 52:26for multiple imputation, and then we combined,
  • 52:29so it's a bootstrap nested within multiple imputation.
  • 52:33So then we use Rubin's combination rule
  • 52:35to estimate the total variance.
  • 52:38For the first example, we actually used a bootstrap,
  • 52:45and the coverage probability was actually okay.
  • 52:50It's good for the estimator.
  • 52:52- Did you think about asymptotic variance derivations?
  • 52:56- I did.
  • 52:58It was a very difficult task,
  • 53:03there's a story about our first paper
  • 53:05around that.
  • 53:10It was first submitted to JASA
  • 53:12and then they asked about the asymptotic variance
  • 53:16of the estimator.
  • 53:18And it's quite complex because it involves the splines
  • 53:22and involves the survival data.
  • 53:25And we have already proved its consistency,
  • 53:34and it also involves optimization.
  • 53:38So it's just comes to-
  • 53:40- What's the optimization piece.
  • 53:43- Oh, it's the model based optimal treatment initiation time
  • 53:47that will lead to the maximum survival
  • 53:49at predefined time points.
  • 53:54Right, so they are interested in the optimization.
  • 53:58So the inference about the optimized
  • 54:00treatment initiation time.
  • 54:02We did some empirical evidence
  • 54:04for like the largest sample convergence rate,
  • 54:08but we weren't successful at deriving the asymptotic variance.
  • 54:14So that's another piece, I think maybe,
  • 54:18I don't know.
  • 54:19We had this discussion among colleagues
  • 54:21and also my advisor at the time,
  • 54:24we're just not sure whether it's worth the effort
  • 54:28to go down that route.
  • 54:30- It's probably way more complex
  • 54:32than just the usual derivation.
  • 54:34'Cause you do have like two weighting models,
  • 54:36which are also survival models,
  • 54:40and also the derivation of these variances
  • 54:42sometimes can be specific to the choice
  • 54:44of these (indistinct) models.
  • 54:46And so if you have a variance under the Cox model,
  • 54:49it does not apply to other forms of models,
  • 54:52I guess it's really a trade-off right?
  • 54:54- Yeah, it is a trade off.
  • 54:58It's still an open question and nobody had done it yet,
  • 55:03but just, whether you think it's worth the effort
  • 55:06just to devote a couple of years to work on that.
  • 55:10- So was bootstrap time consuming for these datasets,
  • 55:15for this data analysis, or they're pretty manageable.
  • 55:18- They're pretty manageable.
  • 55:20And it looks complicated because we have to weight everybody
  • 55:23that had an event.
  • 55:24We also have to weight everyone in the risk set
  • 55:27at any time point.
  • 55:28So it looks pretty complex, but still manageable.
  • 55:35Another reason is because we use parametric models.
  • 55:38If we wanted to,
  • 55:41I'm not aware of any machine learning algorithm
  • 55:46that can handle survival data,
  • 55:48but also with time varying covariates,
  • 55:52that's something I'm also thinking about.
  • 55:54Like, if we used those algorithms,
  • 55:56it might be more time consuming,
  • 55:59but with just a parametric models, it's pretty manageable.
  • 56:03- And when you're bootstrapped,
  • 56:04you go back to the weight models
  • 56:06and refit the weight models every time?
  • 56:08- Yeah.
  • 56:10- But the variable is pre-determined.
  • 56:12So that's what you mentioned, machine learning.
  • 56:15So the variables are predetermined
  • 56:17and their functional forms in the model,
  • 56:19but the coefficients that correspond to them
  • 56:22are re-estimated for each bootstrap.
  • 56:24- Re-estimated.
  • 56:26Right, right, right.
  • 56:27Exactly.
  • 56:28Yeah.
  • 56:29- Great question.
  • 56:30- Yeah.
  • 56:31So a lot of open questions still.
  • 56:35- So any other questions from the audience?
  • 56:41- I have another comment.
  • 56:44So by getting back to this,
  • 56:47that you re estimated the coefficients
  • 56:49for the weight models.
  • 56:50So in sort of the standard marginal structural model,
  • 56:54the variability due to those weight models is ignored.
  • 56:59And the robust variance is used
  • 57:01and said to be an overestimate,
  • 57:02implying that if you took that variation into account,
  • 57:06you'd get a smaller variance
  • 57:08and you might see the same thing here with your bootstraps.
  • 57:11If you took the weight models as fixed,
  • 57:15you might find that you have a less efficient estimator,
  • 57:18which is kind of interesting
  • 57:20just in terms of say a methods paper to show,
  • 57:23because there's different ways to do bootstraps,
  • 57:26but here you're automatically taking the estimation
  • 57:30of the weight models into account,
  • 57:32which is not saying that say the classic paper
  • 57:35by Hernan in epidemiology,
  • 57:38that's ignored and the robust variance is recommended.
  • 57:42- Hmm.
  • 57:43It's a very great comment.
  • 57:46Something I have to think about.
  • 57:47So you're saying that in each bootstrap,
  • 57:52when we estimate the weight model, we fix the weight model.
  • 57:57So the coefficients from the weight model stay fixed-
  • 58:01- Yeah, so you don't even do a bootstrap for that.
  • 58:03You basically hold the weight model as a constant,
  • 58:06and then you'd-
  • 58:07- Robust variance.
  • 58:09- Yeah, you use the robust variance,
  • 58:11which I guess it's a little tricky
  • 58:13because now you don't have the robust variance
  • 58:15because you're not using it,
  • 58:16but it seems the bootstrap analog of the approach taken
  • 58:21would be to just fit the weight model once,
  • 58:24treat that fixed unknown,
  • 58:27and then only bootstrap on the outcome model.
  • 58:31- Right, right.
  • 58:32Yeah.
  • 58:33- [Fan Li] Totally. Yeah.
  • 58:35- Interesting.
  • 58:36Take that in as a note.
  • 58:39- So I do have a question as well.
  • 58:42I think Liangyuan you had presented two applications
  • 58:44at the HIV observational studies,
  • 58:47do you see the application that these new methods
  • 58:51to other areas as well
  • 58:54to solve the other questions? - Yeah.
  • 58:57Yeah, actually this is not pertaining to HIV area.
  • 59:02It's actually in the public health areas.
  • 59:06A lot of questions are involving
  • 59:13this statistical formulation.
  • 59:16So for example,
  • 59:18I've been collaborating with an epidemiologist at Columbia.
  • 59:23They are doing cardiovascular research.
  • 59:27So one research question is that,
  • 59:31I think it's blood pressure lowering intervention.
  • 59:37So blood pressure lowering intervention is very useful
  • 59:40for preventing cardiovascular diseases,
  • 59:45but they don't know.
  • 59:47And there also a lack of randomized control trials.
  • 59:50What is the optimal threshold
  • 59:53to start giving the blood lowering treatment?
  • 59:57So this is exactly the same form
  • 59:59as our second motivating example.
  • 01:00:01Like what is the optimal CD4 threshold
  • 01:00:04to start the HIV treatment?
  • 01:00:06And their question is what is the optimal threshold
  • 01:00:09to start the blood pressure lowering treatment?
  • 01:00:13So I think there's a lot of possibility
  • 01:00:20to apply these kinds of methods
  • 01:00:22in other health research areas.
  • 01:00:24- Yeah, it's a huge controversy
  • 01:00:26in terms of the treatment of hypertension,
  • 01:00:28what's the optimal blood pressure
  • 01:00:31to start antihypertensives.
  • 01:00:33And I think there was a very large trial
  • 01:00:35that showed that it was better to start it
  • 01:00:38at a much earlier threshold than what current practice is.
  • 01:00:42And it's very troublesome for people around the world
  • 01:00:47because these medicines are expensive.
  • 01:00:49And if you see now,
  • 01:00:51like another like 40% of the population
  • 01:00:54should now be initiated on antihypertensive medication,
  • 01:00:58well, most countries can't even afford that.
  • 01:01:01So the implications of these different thresholds
  • 01:01:05is a very big topic of sort of substantive research
  • 01:01:09and debate right now.
  • 01:01:10- Well, that's great to know,
  • 01:01:12there's urgent need for that.
  • 01:01:14(indistinct)
  • 01:01:16- Totally.
  • 01:01:17All right, I think we are at the hour,
  • 01:01:19so thanks Liangyuan again for your great presentation
  • 01:01:25and if the audience has any questions,
  • 01:01:28I'm sure Liangyuan is happy to take any questions offline
  • 01:01:31by emails.
  • 01:01:33And I think this is the final seminar of our fall series,
  • 01:01:38and I hope to see everyone next spring,
  • 01:01:40have a good holiday.
  • 01:01:42Thank you.
  • 01:01:43- Thank you.
  • 01:01:44- Bye. - Bye.