https://www.youtube.com/watch?v=7z7jUF4Clok&ab_channel=MicrosoftResearch
#1 - What is Causal Inference?
1. Implicit Assumption
a. Many of the early advances in machine learning were really designed to think about prediction problems.
b. When we use prediction models, we're making an implicit assumption that things will basically continue as they have been.
2. Reality is more Complex
a. When we change our behavior, that implicit assumption of prediction models is violated.
b. Past sales may be a good predictor of future sales. But if we did something to artificially inflate sales today, we wouldn't necessarily expect that sales would increase a lot in the future.
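The sales example above can be sketched with a small simulation. This is only an illustration under assumed numbers: the latent `demand` variable, the noise levels, and the `BOOST` size are all hypothetical. Both days' sales are driven by the same demand, so today's sales predict tomorrow's well, yet an artificial bump to today's sales leaves tomorrow's sales unchanged.

```python
import random

def simulate(n, boost=0.0, seed=0):
    """Return (today, tomorrow) sales pairs driven by a shared latent demand.

    `boost` is an artificial bump added to today's sales only (hypothetical).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        demand = rng.gauss(100, 20)          # latent demand, never observed
        today = demand + rng.gauss(0, 5) + boost
        tomorrow = demand + rng.gauss(0, 5)  # does not depend on `today`
        rows.append((today, tomorrow))
    return rows

def ols_slope(rows):
    """Slope of a least-squares line predicting tomorrow from today."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    return cov / var

def mean_tomorrow(rows):
    return sum(y for _, y in rows) / len(rows)

BOOST = 50.0
observed = simulate(5000)
slope = ols_slope(observed)

# Same seed, so demand and noise are identical; only today's sales are inflated.
boosted = simulate(5000, boost=BOOST)

predicted_gain = slope * BOOST  # what the prediction model would expect
actual_gain = mean_tomorrow(boosted) - mean_tomorrow(observed)

print(f"slope of tomorrow on today: {slope:.2f}")
print(f"gain predicted by the model: {predicted_gain:.1f}")
print(f"gain under the intervention: {actual_gain:.1f}")
```

The prediction model is accurate as long as nothing changes, but it answers a different question from "what happens if we intervene?".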
3. Many data science Qs are causal Qs
a. A/B Experiments : If I change the algorithm, will it lead to a higher success rate?
b. Policy decisions : If we adopt this treatment/policy, will it lead to a healthier patient/more revenue/etc?
c. Policy evaluation : knowing what I know now, did my policy help or hurt?
d. Credit attribution : are people buying because of the recommendation algorithm? Would they have bought anyway?
In fact, causal questions form the basis of almost all scientific inquiry.
4. Two fundamental challenges for causal inference
a. Data is not enough. Causal inference analysis needs domain assumptions
=> Formally express and test assumptions
-> DoWhy(https://github.com/py-why/dowhy)
+
b. Datasets are increasingly high-dimensional, and causal effects vary across subpopulations.
=> Estimate conditional effects over complex data
-> EconML(https://github.com/microsoft/EconML)
#2 - How to do causal inference?
1. Scalable Solutions to Causal Questions
a. Causal inference requires domain knowledge and expertise.
b. DoWhy and EconML transform how these questions are answered
i. Reduce the steps that require expert decision making
2. Easing Each Step of Causal Inference
a. Framing
i. Articulate assumptions and formulate the correct causal estimand with DoWhy
b. Estimation
i. Estimate cutting-edge, personalized causal effects and confidence intervals automatically with EconML
c. Evaluation
i. Interpret and present causal effects with EconML
ii. Test assumptions with DoWhy
3. Framing the causal problem
a. DoWhy + EconML lets you create a formal causal graph based on domain knowledge.
b. Multiple identification methods automatically tell you which techniques can be used to estimate the causal effect.
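To make the framing step concrete, here is a hand-rolled sketch of one identification strategy, backdoor adjustment. It is not DoWhy's API. It assumes a hypothetical graph with a single binary confounder W (W -> T, W -> Y, T -> Y) and simulated data whose true effect is 2.0. Given that graph, the effect is identified as the sum over w of P(w) * (E[Y | T=1, w] - E[Y | T=0, w]).

```python
import random

def mean(values):
    return sum(values) / len(values)

def simulate(n, seed=1):
    """Simulate (w, t, y) rows from the assumed graph W -> T, W -> Y, T -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0                   # confounder
        t = 1 if rng.random() < (0.8 if w else 0.2) else 0   # W drives treatment
        y = 2.0 * t + 5.0 * w + rng.gauss(0, 1)              # true effect of T is 2.0
        rows.append((w, t, y))
    return rows

def naive_difference(rows):
    """Difference in mean outcome, ignoring the confounder."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

def backdoor_adjusted(rows):
    """Stratify on W, then average the within-stratum differences by P(w)."""
    effect = 0.0
    for w_value in (0, 1):
        stratum = [(t, y) for w, t, y in rows if w == w_value]
        treated = [y for t, y in stratum if t == 1]
        control = [y for t, y in stratum if t == 0]
        effect += (len(stratum) / len(rows)) * (mean(treated) - mean(control))
    return effect

rows = simulate(20000)
naive = naive_difference(rows)
adjusted = backdoor_adjusted(rows)

print(f"naive difference:  {naive:.2f}")     # biased upward by W
print(f"backdoor adjusted: {adjusted:.2f}")  # close to the true effect of 2.0
```

The adjustment is only valid because the graph says W is the sole common cause; that is the domain assumption the data alone cannot supply.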
4. Estimating the causal effect
a. DoWhy + EconML supports state-of-the-art methods for estimating conditional treatment effects.
b. Can estimate varying effects on different subpopulations.
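A minimal sketch of what "varying effects on different subpopulations" means, assuming simulated data with a randomized treatment and three hypothetical customer segments. EconML's estimators are far more flexible than this; the sketch only groups by segment and takes a difference in means within each group.

```python
import random
from collections import defaultdict

# Hypothetical true effects per segment, used only to generate the data.
TRUE_EFFECT = {"new": 3.0, "regular": 1.0, "loyal": 0.0}

def mean(values):
    return sum(values) / len(values)

def simulate(n, seed=2):
    """Simulate (segment, t, y) rows with a randomized treatment."""
    rng = random.Random(seed)
    segments = sorted(TRUE_EFFECT)
    rows = []
    for _ in range(n):
        segment = rng.choice(segments)
        t = rng.randint(0, 1)  # randomized, so no confounding to adjust for
        y = 10.0 + TRUE_EFFECT[segment] * t + rng.gauss(0, 1)
        rows.append((segment, t, y))
    return rows

def conditional_effects(rows):
    """Estimate the treatment effect separately within each segment."""
    outcomes = defaultdict(list)
    for segment, t, y in rows:
        outcomes[(segment, t)].append(y)
    return {
        segment: mean(outcomes[(segment, 1)]) - mean(outcomes[(segment, 0)])
        for segment in TRUE_EFFECT
    }

rows = simulate(30000)
effects = conditional_effects(rows)
average_effect = mean(
    [y for _, t, y in rows if t == 1]
) - mean([y for _, t, y in rows if t == 0])

for segment, effect in effects.items():
    print(f"{segment:8s} estimated effect: {effect:.2f}")
print(f"average  estimated effect: {average_effect:.2f}")
```

The single average effect hides the fact that one segment responds strongly and another not at all, which is the motivation for conditional estimates.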
5. Interpreting and evaluating the robustness of the estimate
a. Interpret the estimate
i. Interpret the role of confounders in driving treatment and outcome with SHAP
ii. Map personalized treatment effects with tree interpreters.
b. Test the estimator
i. Bootstrap Refuter
ii. Data Subset Refuter
c. Test all steps at once.
i. Placebo Treatment Refuter
ii. Dummy Outcome Refuter
iii. Random Cause Refuter
iv. Sensitivity Analysis
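The idea behind two of the refuters listed above can be sketched by hand. This is not DoWhy's implementation; it assumes simulated data with one binary confounder and a true effect of 2.0, and a simple stratified estimator. The placebo check replaces the real treatment with a shuffled copy, so the estimate should fall to roughly zero. The bootstrap check re-estimates on resampled data, so the estimates should stay close to the original.

```python
import random

def mean(values):
    return sum(values) / len(values)

def simulate(n, seed=3):
    """Simulate (w, t, y) rows; W confounds T and Y, true effect of T is 2.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        t = 1 if rng.random() < (0.8 if w else 0.2) else 0
        y = 2.0 * t + 5.0 * w + rng.gauss(0, 1)
        rows.append((w, t, y))
    return rows

def adjusted_effect(rows):
    """Stratify on W and average the within-stratum differences by P(w)."""
    effect = 0.0
    for w_value in (0, 1):
        stratum = [(t, y) for w, t, y in rows if w == w_value]
        treated = [y for t, y in stratum if t == 1]
        control = [y for t, y in stratum if t == 0]
        effect += (len(stratum) / len(rows)) * (mean(treated) - mean(control))
    return effect

def placebo_treatment_check(rows, seed=0):
    """Re-estimate after replacing the treatment with a random permutation."""
    rng = random.Random(seed)
    fake_treatment = [t for _, t, _ in rows]
    rng.shuffle(fake_treatment)
    placebo_rows = [(w, fake, y) for (w, _, y), fake in zip(rows, fake_treatment)]
    return adjusted_effect(placebo_rows)

def bootstrap_check(rows, n_rounds=50, seed=0):
    """Re-estimate on datasets resampled with replacement."""
    rng = random.Random(seed)
    return [adjusted_effect(rng.choices(rows, k=len(rows))) for _ in range(n_rounds)]

rows = simulate(5000)
original = adjusted_effect(rows)
placebo = placebo_treatment_check(rows)
bootstrap = bootstrap_check(rows)

print(f"original estimate:       {original:.2f}")
print(f"placebo estimate:        {placebo:.2f}")  # should be near 0
print(f"bootstrap mean estimate: {mean(bootstrap):.2f}")
print(f"bootstrap range:         {min(bootstrap):.2f} to {max(bootstrap):.2f}")
```

A placebo estimate far from zero, or bootstrap estimates that swing widely, would be a reason to distrust the original estimate.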
#3 - Case Study
1. Customer Segmentation
a. A media company would like to offer small targeted discounts to customers.
i. Prefer to use historical data rather than a new experiment.
ii. Large data with many customer features.
b. What is the causal effect of these discounts on demand?
c. Which customers are most responsive to discounts?