Today’s deep learning models excel at prediction but lack the human ability to reason about cause and effect. While we’ve witnessed significant progress in machine learning applied to big-data problems – such as computer vision, natural language processing, and consumer segmentation – some researchers argue that data alone is insufficient to develop machine intelligence on the level of human cognition. This viewpoint draws attention to the human brain’s proficiency in causal thinking and motivates the development of causal models for both virtual and physical intelligent systems.
Causal machine learning, also known as causal inference, is the study of cause-and-effect relationships within data. A simple example illustrates the idea. Consider two sets of schools, one that provides tablets to its students and one that does not. In comparing average grades, it becomes apparent that schools with tablets tend to outperform those without. A traditional machine learning model might interpret this data as a correlation, or association, between tablets and academic performance, and conclude that providing students with tablets will increase academic performance. However, we intuitively recognize that there are other factors, called confounding variables: schools with tablets might be better funded, located in more affluent areas, or provide superior teaching staff and resources. Simply put, once these external factors are accounted for, providing tablets to students doesn’t automatically translate to improved academic results. This example demonstrates the key difference between association and causation, emphasizing the importance of causal inference.
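The tablet example can be made concrete with a minimal simulation. This is a sketch with made-up numbers: funding is the hidden confounder that drives both tablet adoption and grades, and tablets have no real effect at all. Comparing schools as observed still shows a large grade gap, while assigning tablets at random, as a randomized trial would, shows essentially none.

```python
import random

random.seed(0)

def simulate_school(assign_tablets=None):
    """Simulate one school. `funding` is the hypothetical confounder:
    it drives both tablet adoption and grades, while tablets themselves
    have zero direct effect on grades in this toy model."""
    funding = random.random()                        # 0 (low) to 1 (high)
    if assign_tablets is None:
        tablets = funding > 0.5                      # richer schools buy tablets
    else:
        tablets = assign_tablets                     # assigned by the experimenter
    grade = 60 + 30 * funding + random.gauss(0, 3)   # grades depend on funding only
    return tablets, grade

# Observational comparison: schools that happen to have tablets vs. not.
obs = [simulate_school() for _ in range(10_000)]
with_t = [g for t, g in obs if t]
without_t = [g for t, g in obs if not t]
print(sum(with_t) / len(with_t) - sum(without_t) / len(without_t))   # large gap

# Randomized assignment: tablets handed out by coin flip, breaking the
# link to funding. The apparent "tablet effect" vanishes.
do_t = [simulate_school(assign_tablets=True)[1] for _ in range(10_000)]
do_no = [simulate_school(assign_tablets=False)[1] for _ in range(10_000)]
print(sum(do_t) / len(do_t) - sum(do_no) / len(do_no))               # near zero
```

The first comparison reproduces the association a conventional model would learn; the second isolates the causal effect, which here is zero by construction.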
Two key concepts that separate causal inference from standard deep learning models are intervention and counterfactuals. The former refers to interacting with and changing a system rather than relying solely on past observed data. By making deliberate changes to a system, we can observe their causal effects directly. Counterfactuals pose the question, “what if?” They compare the observed world to a hypothetical one, asking what might have happened under different circumstances. Even as children, we are constantly performing causal reasoning to learn about the world. For example, consider a child learning to water a plant:
1. Association
a. Action: The child notices that after the plant is watered, the withering leaves appear revitalized.
b. Causal Reasoning: The child associates watering with improving the plant’s appearance.
2. Intervention
a. Action: The child decides to water the plant themselves once they see the leaves begin to wither.
b. Causal Reasoning: The child learns a cause-and-effect relationship. The intervention (watering) consistently results in an effect (a revived appearance).
3. Counterfactual
a. Action: The child wonders, “What would happen to the plant if it was never watered?”
b. Causal Reasoning: Without direct testing, the child can infer that if they do not water the plant, its health would continue to decline, and it would die. This counterfactual thinking reinforces the child’s understanding of the importance of watering for the plant’s well-being.
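The three rungs above can be sketched with a toy structural causal model (my own illustration, with invented numbers, not from the post). The key idea is that counterfactual reasoning keeps the plant’s unobserved background state fixed while changing only the action taken:

```python
import random

random.seed(1)

# Toy structural model: the plant's health depends on an unobserved
# background state (its "vigor") plus the effect of watering.
def plant_health(watered, base_vigor):
    return base_vigor + (5 if watered else -5)

# Rung 2, intervention: the child waters the plant and sees the result.
vigor = random.gauss(0, 1)    # the actual, unobserved background state
observed = plant_health(watered=True, base_vigor=vigor)

# Rung 3, counterfactual: "what if it had NOT been watered?" We reuse the
# SAME background state and change only the action, then recompute.
counterfactual = plant_health(watered=False, base_vigor=vigor)

# The difference is the causal effect of watering in this model: 10.
print(observed - counterfactual)
```

Holding the background state fixed is what distinguishes a counterfactual from simply simulating a new, unrelated plant: we are asking about this plant, in this situation, under a different choice.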
This post has introduced the concept and basic elements of causal inference. Here we have presented how conventional machine learning models are limited to association, while causal learning extends to intervention and counterfactuals. In my next post, I will extend this topic to robotics, discussing its importance for intelligent robotic systems and its integration into robotic control.
Image based on illustration by Maayan Harel from The Book of Why, by Judea Pearl, 2018.