
"Data-Driven Decisions" - Good Alliteration, Bad Motto.


Whoever invokes "data-driven decisions" probably has good intentions but implicitly commits a fallacy of omission! The problem is that the motto fails to state what is really required to make data useful for decision-making.


Don't Mistake the Map for the Territory!

First of all, we should not be interested in data per se (unless we are computer scientists). Rather, we should concern ourselves with the aspect of the real world that is manifested in the data. It is the original problem domain, not its manifestation, in which we need to engage in the pursuit of our goals.

So, let's assume we have data that is meaningful in the above sense. On its own, it is still useless for decision-making. We need to use the data to shape an abstract, typically mathematical, representation of the characteristics and the dynamics of the problem domain.

Nowadays, this can be much easier than in the past. For instance, we can leverage machine-learning techniques that help us find well-fitting, parsimonious models in an automated fashion. Assuming we have done our work correctly, we can now claim that we have a model of our problem domain that fits our data.


No Interventions Without Causal Assumptions!

However, what we have so far is still useless for many types of decision-making. Why? Decision-making generally requires us to anticipate the consequences of actions we have not yet taken. We want to know what would happen if we were to intervene in our domain (as opposed to just observing the domain).

Unfortunately, our data-driven model can't help us with that. What's missing? Causality is missing from our model! A model that merely fits observed data describes which values tend to occur together; it does not tell us what changes when we force one of them to a new value. We must use our domain expertise and provide causal assumptions, i.e., a causal structure. No amount of data, clever statistical techniques, or machine learning can replace human causal assumptions! No, not even Watson. Sorry, IBM!
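To make the gap between observing and intervening concrete, here is a minimal simulation sketch. Everything in it is a made-up assumption for illustration: a hidden common cause z drives both x and y, while x has no effect on y at all. The function names (simulate, fitted_slope_of, mean_y) and the coefficients are hypothetical, not taken from any real dataset or library.

```python
import random


def simulate(n, set_x_to=None, seed=0):
    """Draw n (x, y) pairs from a toy world in which z causes both x and y.

    If set_x_to is given, we intervene: x is forced to that value
    regardless of z. Otherwise we merely observe.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)                # hidden common cause
        if set_x_to is None:
            x = z + rng.gauss(0, 1)        # observed x depends on z
        else:
            x = set_x_to                   # intervention cuts the z -> x link
        y = 2.0 * z + rng.gauss(0, 1)      # y depends on z only, never on x
        rows.append((x, y))
    return rows


def fitted_slope_of(rows):
    """Least-squares slope of y on x: the purely data-driven model."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y_ = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y_) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    return cov / var


def mean_y(rows):
    return sum(y for _, y in rows) / len(rows)


# Observation: the fitted model suggests y rises by about 1 per unit of x.
fitted_slope = fitted_slope_of(simulate(20000))

# Intervention: actually setting x to 1 instead of 0 barely moves y.
intervention_effect = (
    mean_y(simulate(20000, set_x_to=1.0, seed=1))
    - mean_y(simulate(20000, set_x_to=0.0, seed=2))
)

print(f"slope fitted to observed data: {fitted_slope:.2f}")
print(f"effect of setting x from 0 to 1: {intervention_effect:.2f}")
```

In this toy world the fitted model is a perfectly good description of the data, yet acting on it would be a mistake: the slope reflects the shared cause z, not anything x does to y. Nothing in the (x, y) pairs alone distinguishes this world from one in which x really does drive y.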

Only if we have data (1) plus a formal model (2) that also contains the correct causal structure (3) will we be able to simulate our potential actions and their outcomes. Only then can we make a decision.
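As a rough sketch of how the three ingredients combine, suppose our domain expertise tells us that z is a common cause of x and y, and that z is recorded in the data. The toy world below is again entirely hypothetical (the true effect of x on y is set to 0.5 by assumption), and the helper names observe, naive_slope, and adjusted_slope are invented for this illustration. Adjusting for z is one standard way to use such a causal assumption; it is valid only if the assumed structure is correct.

```python
import random


def observe(n, seed=0):
    """Observational data (z, x, y) from a toy world: z -> x, z -> y, x -> y."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = z + rng.gauss(0, 1)
        y = 0.5 * x + 2.0 * z + rng.gauss(0, 1)   # assumed true effect of x: 0.5
        data.append((z, x, y))
    return data


def _centered_sums(data):
    """Return the centered sums of squares and cross-products we need."""
    n = len(data)
    mz = sum(z for z, _, _ in data) / n
    mx = sum(x for _, x, _ in data) / n
    my = sum(y for _, _, y in data) / n
    sxx = sum((x - mx) ** 2 for _, x, _ in data)
    szz = sum((z - mz) ** 2 for z, _, _ in data)
    sxz = sum((x - mx) * (z - mz) for z, x, _ in data)
    sxy = sum((x - mx) * (y - my) for _, x, y in data)
    szy = sum((z - mz) * (y - my) for z, _, y in data)
    return sxx, szz, sxz, sxy, szy


def naive_slope(data):
    """Data plus model only: regress y on x and ignore the causal structure."""
    sxx, _, _, sxy, _ = _centered_sums(data)
    return sxy / sxx


def adjusted_slope(data):
    """Data plus model plus causal assumption: coefficient of x in y ~ x + z."""
    sxx, szz, sxz, sxy, szy = _centered_sums(data)
    return (sxy * szz - szy * sxz) / (sxx * szz - sxz ** 2)


data = observe(20000)
naive = naive_slope(data)
adjusted = adjusted_slope(data)

print(f"naive estimate of the effect of x on y:    {naive:.2f}")
print(f"estimate after adjusting for z (assumed):  {adjusted:.2f}")
```

Both estimates come from the same data and the same kind of model. Only the one that encodes the causal assumption recovers the effect that was built into the simulation, and it would be just as wrong as the naive one if the assumed structure were not the true one.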

So, the next time someone uses the buzzword "data-driven" in support of their decisions, you need to ask for the rest of the story.
