Can causality be determined from a data set or a set of observations?


It is not possible to determine causality from a data set alone. To derive causality, you need to look outside data science.

Correlations can be a helpful tool for understanding a data set intuitively. Thinking about what a correlation actually means in reality can help you understand the data better. For example, if data science tells me that the “being thirsty” variable is correlated with the “not drinking” variable, common sense tells me which way the causality runs: not drinking makes you thirsty. From there, you can make more informed decisions about what to do next.
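To make the thirst example concrete, here is a minimal Python sketch on made-up data. The variable names, the 0.8 coefficient, and the noise level are all assumptions chosen for illustration, not measurements. It shows that a correlation coefficient is the same whichever variable you put first, so the number by itself cannot say which one is the cause.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Synthetic, hypothetical data: we *build in* the causal direction
# "not drinking -> thirst". Real data never comes with this label.
rng = random.Random(0)
hours_without_drinking = [rng.uniform(0, 12) for _ in range(500)]
thirst = [0.8 * h + rng.gauss(0, 1) for h in hours_without_drinking]

r_xy = pearson(hours_without_drinking, thirst)
r_yx = pearson(thirst, hours_without_drinking)

# Both orderings give the same value: the statistic is symmetric.
print(f"corr(not drinking, thirst) = {r_xy:.3f}")
print(f"corr(thirst, not drinking) = {r_yx:.3f}")
```

The two printed values match even though the data was generated with a one-way dependence. Knowing that not drinking causes thirst, rather than the reverse, comes from outside the data.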

Ultimately, what’s real is what matters. Data science is mostly for corroboration and for surfacing new insights that we cannot see directly in the real world. Once data science has been applied, those plausible insights make more sense when they align with reality.