Everything you learned about causal inference in academia is true. It’s also not enough, and most of us doing applied causal inference experience it.
What’s different outside academia is the gravity of the decisions that lean on the analysis: not every decision deserves the same level of evidence. Match the rigour of your causal inference to the gravity of the decision, or waste resources.
Take product discovery. Before building and shipping, many assumptions need validation at multiple steps. Aiming to nail each answer with perfect causal inference; for what? Moving up one square on a board of many related, even necessary, but on their own insufficient decisions. The risk is already spread, hedged, over many decisions, thanks to a process that values incremental evidence, learning and iterations.
At the same time, causal inference comes with a material opportunity cost: the rigour it requires delays time-to-impact, while there could have been a project waiting for you where this rigour was actually needed to improve the decision quality (reduce risk, improve accuracy and reliability).
Final vs. constructive decisions is my go-to framing to make this idea simple:
- Constructive decisions move you forward in a process. “Should we explore this feature further?”, “Is this user problem worth investigating?” Getting it wrong costs you a sprint, maybe two, while getting it right doesn’t change the company, yet.
- Final decisions commit resources or change direction, and getting it wrong is expensive or hard to reverse: “Should we invest $2M in building this out?”, “Should we kill this product line?”, “Should we allocate more marketing budget to this or that channel?”
In tech, the volume and pace of decisions is unparalleled. Sometimes, these are final decisions. But far more common are constructive decisions.
As data scientists we’re involved in both kinds, and failing to recognise when we are dealing with one or the other leads to posing the wrong questions or chasing the wrong answers, and ultimately to wasted resources.
In this article I want to surface three rules that I keep coming back to when embarking on causal inference projects:
- Start with the problem, not with the answer
- If you can solve it more easily without causal inference, do it
- Do 80/20 on your causal inference project too
Rules rarely sound fun. But these helped me improve my impact by tons, really.
Let’s unpack that.
1. Start with the problem, not the answer
Every causal inference project starts with the problem you’re trying to solve; not with the identification strategy and the estimator. It’s the perfect example of doing the right thing over doing things right. Your methods can be on point, but what’s the value if you’re solving for the wrong thing? Nudge yourself to kick off a project with a crystal-clear business problem backing it up, and 50% of the work is done before you even start.
If you’re highly technical, chances are you know the anatomy of a causal inference project: from DAG to model, to inference, to sensitivity analysis, and answers.
But do you know the anatomy of problem solving in organisations?
The problem behind the problem
Big problems get broken down into smaller ones. That’s simply more workable for a team that needs to find solutions. And it allows us to mobilise multiple teams to solve different parts of the bigger (sub) problem. The same goes across roles within one team: you’re estimating churn drivers; your PM needs that to decide whether to invest in retention or acquisition.
That’s the challenge: the problem you, the data scientist, are solving is often not the endgame.
Your problem is nested inside someone else’s. Other people, around you and above you, need your answer as one input to their solution. Recognise that dependency, and you can tailor your causal inference to what actually matters upstream. The wins are concrete: tighter alignment on the causal estimand of interest, or quicker discarding of causal inference altogether. Bottom line: shorter time-to-insight.
At one point I was into network theory (Markov Random Fields were what made me understand DAGs back in 2018). Everything was a network in my head. So I went and built a network of the usage of our internal BI capability. All dashboards were nodes, and they would have thicker edges between them when they were used by the same users. I calculated all kinds of centrality metrics; I identified influential dashboards: dashboards that brought departments together; and much more. I built an entire story around it, but actions never followed. The issue was that I had never paid attention to the problem my stakeholders were trying to solve. Perhaps I assumed the decision was of the final kind, while it was a constructive one all along. A simple count of dashboard usage could’ve done the job, but I treated it as a research project.
That was me then. And it wasn’t the last time something like that happened. But the lesson learned is to start with the problem, not with the answers.
The anti-rule: looking at the wrong problems
If you want a quick way to throw away money, then go solve the wrong problems. Not only will the solutions have no material outcome, but the opportunity cost of not solving the right problem in that time will also add up.
So, in being eager to find the problem behind the problem, be critical about whether it’s the right one to begin with, once you find it.
In that sense, starting with the answers does give you the cure. But it goes slightly differently. Ask yourself:
- If we do get these answers, what do we know that we didn’t know before?
- If we know that, then so what?
If the answer to the so-what question makes a lot of sense, not only to you, but also to your manager and their manager (presumably), then you’re on the right problem.
Magical.
2. If you can solve it more easily without causal inference, then do it
There’s no cookie-cutter causal inference. Methods become canonical because we’ve mapped their assumptions well; not because using them is mechanical. Every situation can violate these assumptions in its own way, and each one deserves full rigour.
The issue with that, though, is that we can’t justify doing so for all of them, resource-wise.
That’s when applying causal inference becomes an economic exercise: how much of our resources shall we put in, so that we reach the desired outcome with the necessary level of confidence?
Ask yourself that question next time.
Luckily, not every analysis needs to be as rigorous as a full causal inference project to make the return on investment tip over to the positive side.
The alternatives (common sense, domain knowledge, and associative analysis) yield good-enough answers too.
It definitely hurts a bit to say this; principled and rigorous me hates me now. But I’ve learned that it pays to approach the trade-off as a strategic choice.
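One way to make that return-on-investment question concrete is a back-of-the-envelope calculation. The sketch below is only an illustration under assumptions of mine: the function name, the error rates and the cost figures are hypothetical, and only the $2M figure echoes the example of a final decision from earlier.

```python
def net_value_of_rigour(cost_of_wrong_decision, error_rate_quick,
                        error_rate_rigorous, cost_of_analysis):
    """Expected loss avoided by the rigorous analysis, minus what it costs."""
    avoided_loss = (error_rate_quick - error_rate_rigorous) * cost_of_wrong_decision
    return avoided_loss - cost_of_analysis


# All inputs below are hypothetical placeholders, not measured values.
final_decision = net_value_of_rigour(
    cost_of_wrong_decision=2_000_000,  # e.g. "Should we invest $2M in building this out?"
    error_rate_quick=0.30,             # assumed chance a quick read gets it wrong
    error_rate_rigorous=0.10,          # assumed chance after a full causal analysis
    cost_of_analysis=40_000,           # assumed cost of a month of analyst time
)
constructive_decision = net_value_of_rigour(
    cost_of_wrong_decision=30_000,     # assumed cost of a wasted sprint
    error_rate_quick=0.30,
    error_rate_rigorous=0.10,
    cost_of_analysis=40_000,
)

print(f"final decision:        {final_decision:,.0f}")
print(f"constructive decision: {constructive_decision:,.0f}")
```

With these made-up inputs the full analysis pays for itself on the final decision and loses money on the constructive one. The exact numbers matter less than the habit of writing them down before picking a method.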
Here’s an example to bring it home:
The question is: should we invest further in feature A? Now, I can easily turn this around to: what’s the impact of feature A on user acquisition/retention? (a very common angle to take in a SaaS setting; and a causal question at its heart)
If it’s high, then we invest in it, otherwise not.
That word impact alone puts me straight into causal inference mode, because impact ≠ association. But we know that’s costly. Is the problem worth it? What’s the alternative?
One approach is to understand how many users are using this feature at all. How frequently do they use it, given that they chose to use it? That indicates how valuable a feature could be, and signals whether we should invest further in it. No diff-in-diff, nor IPSW, nor A/B test: but if those answers come back negative, would a precise causal inference still matter?
The truth may be in the middle; answers to those questions may be more indicative than decisive, and the main question may still feel open. But surely less open than when you started: if those answers ignite deeper research, then the product team is in motion, and likely in the right direction. Perhaps more rigorous causal inference follows.
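The two descriptive questions in this example (how many users touch the feature, and how often adopters use it) take only a few lines to answer. A minimal sketch, assuming a usage log with one entry per feature use; the user IDs, variable names and function name are made up for illustration:

```python
from collections import Counter

# Hypothetical data: one entry per time a user used feature A.
feature_events = ["u1", "u1", "u1", "u2", "u4", "u4"]
# Hypothetical list of all active users in the same period.
active_users = ["u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"]


def adoption_summary(events, users):
    """Return (share of users who used the feature, average uses per adopter)."""
    uses_per_user = Counter(events)
    adopters = [u for u in users if uses_per_user[u] > 0]
    adoption_rate = len(adopters) / len(users) if users else 0.0
    uses_per_adopter = (
        sum(uses_per_user[u] for u in adopters) / len(adopters) if adopters else 0.0
    )
    return adoption_rate, uses_per_adopter


adoption_rate, uses_per_adopter = adoption_summary(feature_events, active_users)
print(f"adoption rate:    {adoption_rate:.1%}")
print(f"uses per adopter: {uses_per_adopter:.1f}")
```

None of this is causal. It only tells you whether there is enough usage to make the causal question worth asking.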
The anti-rule: skipping causal inference is dangerous
Say the product team picks up the signals from your analysis and makes some material “improvements” to the feature. The sample size is low and they are short on time, so they skip the A/B test and launch it directly.
Enthusiastic experimenters lose it at this point. I think it could very well be the right decision, if somebody did the math and concluded there is more at stake in experimenting than in not doing so. Of course, I kept the case so generic that no one can actually defend either side. That would go beyond the point.
But then, while the team jumps onto the next sprint, product management still stresses how important it is to learn something from what they launched previously. They still want to a) get a feeling for the impact, and b) learn whether some segments were impacted more or less than others.
You’re happy because learnings -> iterations is exactly the mentality you are trying to foster. But you’re also in pain for at least three reasons:
- Lack of exchangeability: you know that the users who went on to use the feature are a highly self-selected set. Contrasting them against non-users. Really?
- Interacting effects: assume that one segment was indeed impacted more than others. Now recall the first point: we’re conditioning on highly engaged users. It could be that that segment displayed a higher impact simply because its users were also highly engaged. The same segments may not show that differential impact when we consider less engaged users. But you can’t know. Your working data is skewed towards highly engaged users only.
- Collider bias: in a worse case, conditioning on high engagement may flip around the relationship between segments and the outcome of interest. The analysis would steer the team in the wrong direction.
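The first and third of these worries are easy to reproduce in a toy simulation. The sketch below is not a model of any real product: the data-generating process, effect sizes and the engagement threshold are all assumptions I picked so that the biases are visible.

```python
import random
from statistics import mean

random.seed(7)  # fixed seed so the sketch is reproducible


def diff_in_means(rows, group_key, value_key):
    """Mean of value_key where group_key is 1, minus the mean where it is 0."""
    ones = [r[value_key] for r in rows if r[group_key] == 1]
    zeros = [r[value_key] for r in rows if r[group_key] == 0]
    return mean(ones) - mean(zeros)


# 1) Lack of exchangeability: engagement drives both adoption and retention.
TRUE_FEATURE_EFFECT = 0.05  # assumed true effect of the feature on retention
users = []
for _ in range(20_000):
    engagement = random.random()
    adopted = 1 if random.random() < engagement else 0  # engaged users adopt more
    retention = (0.2 + 0.5 * engagement
                 + TRUE_FEATURE_EFFECT * adopted
                 + random.gauss(0, 0.1))
    users.append({"adopted": adopted, "retention": retention})

naive_feature_effect = diff_in_means(users, "adopted", "retention")

# 2) Collider bias: engagement is a common effect of segment and outcome.
TRUE_SEGMENT_EFFECT = 0.2  # assumed true gap between segment 1 and segment 0
population = []
for _ in range(20_000):
    segment = random.randint(0, 1)
    outcome = TRUE_SEGMENT_EFFECT * segment + random.gauss(0, 1)
    engagement = 2 * segment + outcome + random.gauss(0, 0.5)
    population.append({"segment": segment, "outcome": outcome,
                       "engagement": engagement})

# Keep only highly engaged users, as the post-launch analysis implicitly does.
highly_engaged = [r for r in population if r["engagement"] > 2.5]
segment_gap_everyone = diff_in_means(population, "segment", "outcome")
segment_gap_engaged = diff_in_means(highly_engaged, "segment", "outcome")

print(f"feature effect: true {TRUE_FEATURE_EFFECT}, naive {naive_feature_effect:.3f}")
print(f"segment gap: everyone {segment_gap_everyone:.3f}, "
      f"highly engaged only {segment_gap_engaged:.3f}")
```

Under these assumptions the users-vs-non-users contrast overstates the feature effect several times over, and the segment gap changes sign once the data is restricted to highly engaged users. In a real project you rarely get to compare against the full population like this, which is exactly the problem.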
3. Do 80/20 on your causal inference project too
The title is a false friend. I’m not saying half-bake your analysis: when the question demands full rigour, give it. The 80/20 is about where your effort goes across a decision, not how deep you drill into the causal piece.
Recall the nested problems idea. Your causal inference project often sits inside a larger business decision, and it is rarely the only dimension that matters. The stakeholder has to weigh cost, timing, strategic fit and reversibility alongside your estimate. Causal inference is not everything we need to know.
If your causal answer carries 30% of the weight in that decision, treating it like 100% is a waste. Worse: it’s a waste with an opportunity cost, because the other 70% sits unanswered.
This is where the final-vs-constructive framing earns its keep. For constructive decisions, spreading effort across dimensions almost always beats drilling into one. For final decisions, the causal dimension often is the core, and the math tips the other way.
Rules 1, 2, and 3 overlap but they aren’t the same. Rule 1 asked whether you’re tackling the right problem. Rule 2 asked whether you need causal inference at all. Rule 3 assumes you’ve cleared both. Now the question is: within the project, are you answering the right questions, plural, and letting causal inference carry only the weight that’s actually on it?
Ship the decision, not the estimate
A recent project: estimate the effect of a new pricing tier on revenue per user. Instinctively, I reached for the cleanest identification strategy I could deploy. Difference-in-differences with parallel-trends sensitivity, placebo tests, maybe a synth control for good measure. A month’s work, easily.
But when I zoomed out, the PM had three open questions, not one:
- What’s the effect on revenue per user? (causal)
- Are we cannibalising the existing tier? (causal, different outcome)
- How reversible is this if it tanks? (not causal; an ops and product question)
Spending a month on question 1 would have left 2 and 3 half-answered. The decision needed all three to be roughly right, not one to be precisely right. So: a tighter diff-in-diff on question 1 in two weeks, with explicit caveats, and the remaining time on 2 and 3. The stakeholder walked into the decision meeting with a balanced picture rather than one number and two shrugs.
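For readers who haven’t met it, the core of a diff-in-diff fits in a few lines. This is the plain 2x2 version on group means, not the analysis from that project; the revenue numbers are made up, and I’m assuming "treated" means users exposed to the new tier and "control" means comparable users who were not.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences: change in treated minus change in control."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical revenue per user, before and after the new tier launched.
treated_pre = [10.0, 12.0, 11.0]
treated_post = [13.0, 15.0, 14.0]
control_pre = [9.0, 10.0, 11.0]
control_post = [10.0, 11.0, 12.0]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"estimated effect on revenue per user: {effect:.2f}")
```

The estimate is only as good as the parallel-trends assumption behind it, which this sketch does not check. Those are the explicit caveats that travel with the number into the decision meeting.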
The anti-rule: when the causal question is the decision
If you 80/20 a causal inference project where the causal estimate is the whole decision, you’ve hollowed out the analysis.
This is the final-decision scenario. “Should we invest $2M in this channel?” “Does this treatment cause a meaningful reduction in churn?” When the other dimensions are either already nailed down or genuinely secondary, the causal estimate is not one of many inputs; it is the input. Cutting corners there to free up time for work that doesn’t change the decision inverts the original rule: now you’re misallocating the other way.
The skill is knowing which scenario you’re in. A quick test: if you can’t list three dimensions your stakeholder needs besides your estimate, your causal answer probably is the decision. Don’t 80/20 that one.
So, what now?
These rules apply across all analytical work, not just causal inference. But causal inference is where I’ve felt it the hardest in my past roles.
Every time I feel the pull of a clean synth control for a question nobody asked, these three rules are the reminders I tape to my own forehead.
The methods come from studying them. That’s something I won’t stop. But out there, on the battlefield, let’s be sharp on when applying them does good, and when it doesn’t.
If one of these rules saves you a sprint next time, or an argument with a PM, that’s already a win; and these wins compound. Rigour shows up when it matters. The rest of your time goes to things that also matter.
I’d be glad to have a healthy debate with you about all of the above. Connect with me on LinkedIn, or follow my personal website for content like this!

