Demo: Enabling end-to-end causal inference at scale

by Diligejy 2022. 6. 29.

https://www.youtube.com/watch?v=7z7jUF4Clok&ab_channel=MicrosoftResearch 

 

#1 - What is Causal Inference?

1. Implicit Assumption

    a. Many of the early advances in machine learning were designed for prediction problems.

    b. When we use prediction models, we're making an implicit assumption that things will basically continue as they have been.

 

2. Reality is more Complex

a. When we change our behavior (that is, when we intervene), the implicit assumption behind prediction models is violated.

b. Past sales may be a good predictor of future sales. But if we did something to artificially inflate sales today, we wouldn't necessarily expect that sales would increase a lot in the future.
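
The sales example can be sketched as a small simulation. This is a toy model, not something shown in the talk: it assumes a hidden `demand` variable drives sales on both days, and `boost` is a hypothetical artificial inflation of today's sales.

```python
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def mean(values):
    return sum(values) / len(values)

def simulate(n, boost=0.0, seed=0):
    """Assumed toy model: hidden demand drives sales on both days.

    `boost` inflates today's sales only; it does not touch demand.
    """
    rng = random.Random(seed)
    today, tomorrow = [], []
    for _ in range(n):
        demand = rng.gauss(100, 20)                    # hidden common cause
        today.append(demand + rng.gauss(0, 5) + boost)
        tomorrow.append(demand + rng.gauss(0, 5))
    return today, tomorrow

# Observational world: today's sales predict tomorrow's sales well.
today, tomorrow = simulate(5000)
corr = pearson(today, tomorrow)

# Interventional world: inflate today's sales by 30 units.
boosted_today, boosted_tomorrow = simulate(5000, boost=30.0)
lift_today = mean(boosted_today) - mean(today)
lift_tomorrow = mean(boosted_tomorrow) - mean(tomorrow)

print(f"correlation today vs tomorrow: {corr:.2f}")
print(f"lift today: {lift_today:.1f}, lift tomorrow: {lift_tomorrow:.1f}")
```

In this model the correlation is high, so a prediction model looks accurate, yet the intervention raises today's sales without moving tomorrow's at all.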

 

3. Many data science questions are causal questions

a. A/B experiments: If I change the algorithm, will it lead to a higher success rate?

b. Policy decisions: If we adopt this treatment/policy, will it lead to a healthier patient, more revenue, etc.?

c. Policy evaluation: Knowing what I know now, did my policy help or hurt?

d. Credit attribution: Are people buying because of the recommendation algorithm? Would they have bought anyway?

 

In fact, causal questions form the basis of almost all scientific inquiry.

 

4. Two fundamental challenges for causal inference

a. Data is not enough. Causal inference analysis needs domain assumptions 

=> Formally express and test assumptions

-> DoWhy(https://github.com/py-why/dowhy)

 

+

 

b. Datasets are increasingly high-dimensional, and causal effects vary across subpopulations.

=> Estimate conditional effects over complex data

-> EconML(https://github.com/microsoft/EconML)
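
To make challenge (a) concrete, here is a minimal sketch in plain Python (not the DoWhy API). It assumes a hypothetical confounder `loyal` that makes customers both more likely to see a recommendation and more likely to buy; the true effect of the recommendation is set to +0.10 by construction. The data alone cannot tell us that `loyal` is the variable to adjust for; that is the domain assumption.

```python
import random

def simulate(n=40000, seed=1):
    """Assumed toy data: rows of (loyal, treated, bought)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        loyal = rng.random() < 0.5
        # Loyal customers are far more likely to be shown the recommendation.
        treated = rng.random() < (0.8 if loyal else 0.2)
        # True causal effect of the recommendation is +0.10 for everyone.
        p_buy = 0.1 + (0.4 if loyal else 0.0) + (0.1 if treated else 0.0)
        bought = rng.random() < p_buy
        rows.append((loyal, treated, bought))
    return rows

def diff_in_means(rows):
    """Purchase rate of treated rows minus purchase rate of untreated rows."""
    treated = [b for _, t, b in rows if t]
    control = [b for _, t, b in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

def backdoor_adjusted(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    effect = 0.0
    for level in (True, False):
        stratum = [r for r in rows if r[0] == level]
        effect += len(stratum) / len(rows) * diff_in_means(stratum)
    return effect

rows = simulate()
naive = diff_in_means(rows)
adjusted = backdoor_adjusted(rows)
print(f"naive estimate:    {naive:.3f}")
print(f"adjusted estimate: {adjusted:.3f}  (true effect is 0.10)")
```

The naive comparison credits the recommendation with purchases that loyal customers would have made anyway; stratifying on the assumed confounder recovers an estimate close to the true effect.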

 

#2 - How to do causal inference?

1. Scalable Solutions to Causal Questions

a. Causal inference requires domain knowledge and expertise.

 

b. DoWhy and EconML transform how these questions are answered

    i. Reduce the steps that require expert decision making 

 

 

2. Easing Each Step of Causal Inference

a. Framing

    i. Articulate assumptions and formulate the correct causal estimand with DoWhy

b. Estimation

    i. Estimate personalized causal effects and confidence intervals automatically with EconML's cutting-edge methods

c. Evaluation

    i. Interpret and present causal effects with EconML

    ii. Test assumptions with DoWhy

 

 

3. Framing the causal problem

a. DoWhy + EconML let you create a formal causal graph based on domain knowledge.

b. Multiple identification methods automatically tell you which technique can be used to estimate the causal effect.
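
As a rough illustration of what "graph in, adjustment set out" means, the sketch below encodes a causal graph as an edge list and uses one simple, well-known shortcut: if every parent of the treatment is observed, adjusting for those parents satisfies the backdoor criterion. This is a simplification in plain Python, not DoWhy's identification algorithm, and the variable names (`customer_tenure`, `income`, `discount`, `demand`) are hypothetical.

```python
def parents(edges, node):
    """Direct causes of `node` in a graph given as (cause, effect) pairs."""
    return sorted({src for src, dst in edges if dst == node})

def backdoor_set_from_parents(edges, treatment, observed):
    """Return the treatment's parents as an adjustment set.

    Returns None when a parent is unobserved, because this shortcut
    then gives no answer (other identification strategies might).
    """
    candidate = parents(edges, treatment)
    if any(p not in observed for p in candidate):
        return None
    return candidate

# Hypothetical causal graph, written down from domain knowledge.
edges = [
    ("customer_tenure", "discount"),
    ("customer_tenure", "demand"),
    ("income", "discount"),
    ("income", "demand"),
    ("discount", "demand"),
]

adjustment_set = backdoor_set_from_parents(
    edges, "discount", observed={"customer_tenure", "income", "demand"}
)
print(adjustment_set)
```

If `income` were not recorded in the data, the function would return None, which mirrors the point that identification depends on the assumptions in the graph and not on the data alone.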

 

4. Estimating the causal effect

a. DoWhy + EconML support state-of-the-art methods for estimating conditional treatment effects.

b. Can estimate varying effects on different subpopulations.
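
A minimal sketch of "varying effects on different subpopulations", assuming a randomized treatment and two hypothetical segments whose true effects (`TRUE_EFFECT`) are fixed by construction. EconML uses machine-learning estimators for this; the per-segment difference in means below is only a simple stand-in for the idea.

```python
import random
from collections import defaultdict

# Assumed ground truth for the simulation, not taken from the talk.
TRUE_EFFECT = {"new": 5.0, "returning": 1.0}

def simulate(n=20000, seed=2):
    """Rows of (segment, treated, outcome) with randomized treatment."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        segment = rng.choice(["new", "returning"])
        treated = rng.random() < 0.5
        outcome = 10.0 + TRUE_EFFECT[segment] * treated + rng.gauss(0, 2)
        rows.append((segment, treated, outcome))
    return rows

def effect_by_segment(rows):
    """Difference in mean outcome (treated minus control) per segment."""
    # Per segment: [treated sum, treated count, control sum, control count]
    stats = defaultdict(lambda: [0.0, 0, 0.0, 0])
    for segment, treated, outcome in rows:
        s = stats[segment]
        if treated:
            s[0] += outcome
            s[1] += 1
        else:
            s[2] += outcome
            s[3] += 1
    return {seg: s[0] / s[1] - s[2] / s[3] for seg, s in stats.items()}

effects = effect_by_segment(simulate())
for segment, effect in sorted(effects.items()):
    print(f"{segment}: estimated effect {effect:.2f}")
```

A single average effect would hide the fact that one segment responds several times more strongly than the other.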

 

5. Interpreting and evaluating the robustness of the estimate

a. Interpret the estimate

    i. Interpret the role of confounders in driving treatment and outcome with SHAP

    ii. Map personalized treatment effects with tree interpreters.

b. Test the estimator

    i. Bootstrap Refuter

    ii. Data Subset Refuter

c. Test all steps at once.

    i. Placebo Treatment Refuter

    ii. Dummy Outcome Refuter

    iii. Random Cause Refuter

    iv. Sensitivity Analysis
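
The logic behind two of these refuters can be sketched in plain Python. This is an illustration of the idea under assumed toy data (randomized treatment, true effect of 2.0), not DoWhy's implementation: a placebo treatment should produce an effect near zero, and re-estimating on random subsets should give roughly the same answer as the full data.

```python
import random

def diff_in_means(treatment, outcome):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for d, y in zip(treatment, outcome) if d]
    control = [y for d, y in zip(treatment, outcome) if not d]
    return sum(treated) / len(treated) - sum(control) / len(control)

def placebo_refute(treatment, outcome, runs=50, seed=0):
    """Replace the treatment with a shuffled copy and re-estimate.

    Shuffling breaks any link to the outcome, so the average
    placebo effect should be close to zero.
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(runs):
        placebo = treatment[:]
        rng.shuffle(placebo)
        effects.append(diff_in_means(placebo, outcome))
    return sum(effects) / len(effects)

def subset_refute(treatment, outcome, fraction=0.8, runs=50, seed=0):
    """Re-estimate on random subsets; the result should stay stable."""
    rng = random.Random(seed)
    n = len(treatment)
    effects = []
    for _ in range(runs):
        idx = rng.sample(range(n), int(fraction * n))
        effects.append(
            diff_in_means([treatment[i] for i in idx], [outcome[i] for i in idx])
        )
    return sum(effects) / len(effects)

# Assumed toy data: randomized treatment with a true effect of 2.0.
rng = random.Random(3)
treatment = [rng.random() < 0.5 for _ in range(5000)]
outcome = [2.0 * d + rng.gauss(0, 1) for d in treatment]

estimate = diff_in_means(treatment, outcome)
placebo_effect = placebo_refute(treatment, outcome)
subset_effect = subset_refute(treatment, outcome)
print(f"estimate: {estimate:.2f}")
print(f"placebo:  {placebo_effect:.2f}")
print(f"subsets:  {subset_effect:.2f}")
```

An estimate that survives such checks is not proven correct, but a placebo effect far from zero or an unstable subset estimate is a strong sign that something in the pipeline is wrong.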

 

#3 - Case Study

 

1. Customer Segmentation

a. A media company would like to offer small targeted discounts to customers.

    i. Prefer to use historical data rather than a new experiment.

    ii. Large data with many customer features.

b. What is the causal effect of these discounts on demand?

c. Which customers are most responsive to discounts?
