Module 3.1: Causal Effect Estimation

Duration: 25 min
Prerequisites: Core Skills modules

What You'll Learn

  1. Difference between causal discovery and effect estimation
  2. Using the CausalEffects class
  3. Estimating direct, total, and mediated effects
  4. Answering "what if" intervention questions

Discovery vs. Effect Estimation

Question                          Tool                 Output
"Does X cause Y?"                 PCMCI (Discovery)    Graph (yes/no)
"How MUCH does X affect Y?"       CausalEffects        Number (effect size)
"What if we increase X by 10?"    CausalEffects        Predicted change in Y

Discovery finds the structure. Effect estimation quantifies the strength.

Setup: Import Libraries

import numpy as np
import matplotlib.pyplot as plt
from tigramite import data_processing as pp
from tigramite import plotting as tp
from tigramite.pcmci import PCMCI
from tigramite.independence_tests.parcorr import ParCorr
from tigramite.causal_effects import CausalEffects
from tigramite.toymodels import structural_causal_processes as toys
from sklearn.linear_model import LinearRegression

Create Data with Known Effects

# Create a system with KNOWN causal effects
np.random.seed(42)

def lin_f(x): return x

# True coefficients:
# X0(t-1) → X0(t) with strength 0.5
# X0(t-1) → X1(t) with strength 0.6 (the effect we want to estimate!)
# X1(t-1) → X1(t) with strength 0.4
# X1(t-1) → X2(t) with strength 0.7

true_links = {
    0: [((0, -1), 0.5, lin_f)],
    1: [((1, -1), 0.4, lin_f), ((0, -1), 0.6, lin_f)],  # X0 → X1 = 0.6
    2: [((2, -1), 0.3, lin_f), ((1, -1), 0.7, lin_f)],  # X1 → X2 = 0.7
}

T = 2000
data, _ = toys.structural_causal_process(true_links, T=T, seed=42)
var_names = ['Advertising', 'WebTraffic', 'Sales']
dataframe = pp.DataFrame(data, var_names=var_names)

# Scenario: Marketing Analysis
# - Advertising (X0) affects Web Traffic (X1)
# - Web Traffic (X1) affects Sales (X2)
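
To make the link dictionary concrete, here is a minimal NumPy sketch of the process it describes. It assumes each entry ((i, -1), c, f) on variable j means that X_j(t) receives c * f(X_i(t-1)), plus independent standard normal noise. The helper names links_to_matrix and simulate are illustrative and not part of Tigramite, and the library's own generator may differ in details such as how it discards the initial transient.

```python
import numpy as np

def lin_f(x):
    return x

# Same link dictionary as in the tutorial
true_links = {
    0: [((0, -1), 0.5, lin_f)],
    1: [((1, -1), 0.4, lin_f), ((0, -1), 0.6, lin_f)],
    2: [((2, -1), 0.3, lin_f), ((1, -1), 0.7, lin_f)],
}

def links_to_matrix(links):
    """Return A with A[j, i] = strength of the link X_i(t-1) -> X_j(t)."""
    n = len(links)
    A = np.zeros((n, n))
    for j, parents in links.items():
        for (i, lag), coeff, _func in parents:
            if lag != -1:
                raise ValueError("this sketch only handles lag-1 links")
            A[j, i] = coeff
    return A

def simulate(A, T, burn_in=200, seed=42):
    """Simulate X(t) = A @ X(t-1) + noise and drop the burn-in samples."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros((T + burn_in, n))
    for t in range(1, T + burn_in):
        x[t] = A @ x[t - 1] + rng.standard_normal(n)
    return x[burn_in:]

A = links_to_matrix(true_links)
data_sketch = simulate(A, T=2000)
print(A)
print(data_sketch.shape)
```

Reading the rows of A gives the "ground truth" that the later effect estimates are compared against.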

Step 1: Discover the Causal Graph

# First, discover the causal structure
parcorr = ParCorr(significance='analytic')
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=parcorr, verbosity=0)
results = pcmci.run_pcmciplus(tau_max=3, pc_alpha=0.05)

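# Print and visualize
# Pass the PCMCIplus graph so the printed links match the plotted graph,
# and keep alpha_level consistent with pc_alpha
pcmci.print_significant_links(
    p_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'],
    graph=results['graph'],
    alpha_level=0.05
)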

tp.plot_graph(
    graph=results['graph'],
    val_matrix=results['val_matrix'],
    var_names=var_names,
    figsize=(8, 5)
)
plt.show()
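
Before handing a discovered graph to effect estimation, it can help to confirm that every link is oriented, since a graph declared as a DAG should not contain ambiguous edges. The sketch below assumes the graph is a string array of shape (N, N, tau_max + 1) whose entries are '', '-->', '<--', or markers such as 'o-o' (unoriented) and 'x-x' (conflicting). The function find_unoriented_links and the toy graph are hypothetical illustrations, not Tigramite API.

```python
import numpy as np

def find_unoriented_links(graph):
    """List (i, j, tau, mark) for every entry that is not empty or directed."""
    allowed = {"", "-->", "<--"}
    problems = []
    n_vars, _, n_lags = graph.shape
    for i in range(n_vars):
        for j in range(n_vars):
            for tau in range(n_lags):
                mark = str(graph[i, j, tau])
                if mark not in allowed:
                    problems.append((i, j, tau, mark))
    return problems

# Toy graph: one lagged directed link and one unoriented contemporaneous link
toy_graph = np.full((3, 3, 2), "", dtype="<U3")
toy_graph[0, 1, 1] = "-->"   # variable 0 at t-1 -> variable 1 at t
toy_graph[1, 2, 0] = "o-o"   # unoriented link between 1 and 2 at lag 0
toy_graph[2, 1, 0] = "o-o"   # the symmetric entry of the same link

problems = find_unoriented_links(toy_graph)
print(problems)
```

If the list is non-empty, the ambiguous links need to be resolved (for example with domain knowledge) before the graph is treated as a DAG.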

Step 2: Estimate Causal Effects

Now we'll quantify: "If we increase Advertising by 1 unit, how much does Web Traffic change?"

# Define the causal question
# X = cause (Advertising at lag 1)
# Y = effect (Web Traffic at current time)

X = [(0, -1)]  # Advertising at t-1
Y = [(1, 0)]   # Web Traffic at t

# Initialize CausalEffects with the discovered graph.
# The maximum lag is read from the graph's shape, so no tau_max argument is passed.
causal_effects = CausalEffects(
    graph=results['graph'],
    graph_type='stationary_dag',  # Type of graph from PCMCIplus; every link must be oriented
    X=X,
    Y=Y,
    verbosity=0
)

# Fit the effect model using linear regression
causal_effects.fit_total_effect(
    dataframe=dataframe,
    estimator=LinearRegression(),
)

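# predict_total_effect returns the expected Web Traffic under do(Advertising = value),
# so the effect of a 1-unit increase is the difference between two interventions
y_do_1 = causal_effects.predict_total_effect(intervention_data=np.array([[1.0]]))
y_do_0 = causal_effects.predict_total_effect(intervention_data=np.array([[0.0]]))
effect = y_do_1 - y_do_0
print(f"Estimated causal effect: {effect[0, 0]:.2f}")

# Expected result:
# Estimated causal effect: ~0.6
# True causal effect: 0.6
# Interpretation: Increasing Advertising by 1 unit
# causes Web Traffic to increase by ~0.6 units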
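
To see why the graph matters for this estimate, the following NumPy-only sketch compares a naive regression of Web Traffic(t) on Advertising(t-1) with one that also adjusts for Web Traffic(t-1). It re-simulates the same linear system under the assumption of standard normal noise, and the adjustment set is hand-picked for illustration; the set Tigramite selects internally may differ. The helpers ols_coefs and predict_do are hypothetical names.

```python
import numpy as np

# Re-simulate the tutorial's system: X(t) = A @ X(t-1) + noise (assumed N(0, 1))
rng = np.random.default_rng(42)
T, burn_in = 2000, 200
A = np.array([[0.5, 0.0, 0.0],
              [0.6, 0.4, 0.0],
              [0.0, 0.7, 0.3]])
data = np.zeros((T + burn_in, 3))
for t in range(1, T + burn_in):
    data[t] = A @ data[t - 1] + rng.standard_normal(3)
data = data[burn_in:]

y = data[1:, 1]    # Web Traffic at t (outcome)
x = data[:-1, 0]   # Advertising at t-1 (cause)
z = data[:-1, 1]   # Web Traffic at t-1 (adjustment variable)

def ols_coefs(features, target):
    """Least-squares coefficients; index 0 is the intercept."""
    design = np.column_stack([np.ones(len(target)), *features])
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs

naive = ols_coefs([x], y)        # ignores the confounding path
adjusted = ols_coefs([x, z], y)  # blocks it by including Web Traffic(t-1)

def predict_do(coefs, x_value, z_values):
    """Average prediction when Advertising is set to x_value for every sample."""
    return np.mean(coefs[0] + coefs[1] * x_value + coefs[2] * z_values)

effect_sketch = predict_do(adjusted, 1.0, z) - predict_do(adjusted, 0.0, z)
print("naive slope:", naive[1])
print("adjusted effect:", effect_sketch)
```

The naive slope absorbs the path Advertising(t-1) ← Advertising(t-2) → Web Traffic(t-1) → Web Traffic(t), so it overstates the effect, while the adjusted estimate should land near the true 0.6.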

Total vs. Direct Effects

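Consider the path: Advertising → Web Traffic → Sales

  • Direct effect: a single link, such as Advertising(t-1) → Web Traffic(t) with strength 0.6
  • Indirect effect: a path through a mediator, such as Advertising(t-2) → Web Traffic(t-1) → Sales(t)
  • Total effect = sum over all causal paths from cause to outcome (direct + indirect)

Advertising has no direct link to Sales in this system, so its total effect on Sales is entirely indirect. Each link takes one time step, so the effect first appears at lag 2, with true value 0.6 × 0.7 = 0.42. At lag 1 there is no causal path from Advertising to Sales.

# Estimate total effect of Advertising on Sales
X = [(0, -2)]  # Advertising at t-2 (two links, one time step each)
Y = [(2, 0)]   # Sales at t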

causal_effects_sales = CausalEffects(
    graph=results['graph'],
    graph_type='stationary_dag',
    X=X, Y=Y,
    verbosity=0
)

causal_effects_sales.fit_total_effect(
    dataframe=dataframe,
    estimator=LinearRegression()
)

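# Difference between do(Advertising = 1) and do(Advertising = 0)
effect_on_sales = (
    causal_effects_sales.predict_total_effect(intervention_data=np.array([[1.0]]))
    - causal_effects_sales.predict_total_effect(intervention_data=np.array([[0.0]]))
)

# The total effect sums every causal path from X to Y, direct and indirect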
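
For a linear system in which every link has lag 1, the true total effect at a given lag can be computed by hand as the sum over paths of the product of link strengths, which equals an entry of a matrix power. The sketch below applies this to the tutorial's true coefficients. The function true_total_effect is a hypothetical helper and the shortcut only holds under these linear, lag-1 assumptions.

```python
import numpy as np

# A[j, i] = strength of X_i(t-1) -> X_j(t), taken from the tutorial's true_links
A = np.array([[0.5, 0.0, 0.0],   # Advertising
              [0.6, 0.4, 0.0],   # Web Traffic
              [0.0, 0.7, 0.3]])  # Sales

def true_total_effect(A, cause, outcome, lag):
    """Sum over all length-`lag` paths of the product of link strengths."""
    return np.linalg.matrix_power(A, lag)[outcome, cause]

adv_to_traffic_lag1 = true_total_effect(A, cause=0, outcome=1, lag=1)
adv_to_sales_lag1 = true_total_effect(A, cause=0, outcome=2, lag=1)
adv_to_sales_lag2 = true_total_effect(A, cause=0, outcome=2, lag=2)

print(adv_to_traffic_lag1)  # single direct link: 0.6
print(adv_to_sales_lag1)    # no path of length 1: 0.0
print(adv_to_sales_lag2)    # one mediated path: 0.6 * 0.7
```

These values are a useful benchmark: an estimate of the Advertising → Sales effect should be compared with the lag-2 value, because no causal path exists at lag 1.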

Intervention Scenarios: "What If" Analysis

# What if we set Advertising to different values?
# Each row is one hard intervention do(Advertising(t-1) = value)
interventions = np.linspace(-2, 2, 20).reshape(-1, 1)

effects = []
for intervention in interventions:
    predicted = causal_effects.predict_total_effect(
        intervention_data=intervention.reshape(1, -1)
    )
    effects.append(predicted[0, 0])

# Plot intervention effects
plt.figure(figsize=(10, 5))
plt.plot(interventions, effects, 'b-', linewidth=2)
plt.xlabel('Advertising set to (intervention value)')
plt.ylabel('Predicted Web Traffic')
plt.title('Intervention Analysis: What If We Change Advertising?')
plt.show()

# This plot answers: 'If we set advertising to x, what web traffic do we expect?'
# With a linear estimator the curve is a straight line whose slope is the causal effect
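
Because the data here come from a known model, a "what if" prediction can be checked against a direct simulation of the intervention. The sketch below runs many independent copies of the system, overwrites Advertising with a chosen value at one time step, and averages Web Traffic one step later. It assumes the same linear lag-1 model with standard normal noise; simulate_do is a hypothetical helper, not a Tigramite function.

```python
import numpy as np

A = np.array([[0.5, 0.0, 0.0],
              [0.6, 0.4, 0.0],
              [0.0, 0.7, 0.3]])

def simulate_do(x_value, n_rep=20000, burn_in=50, seed=0):
    """Mean Web Traffic(t) after forcing Advertising(t-1) = x_value."""
    rng = np.random.default_rng(seed)
    state = np.zeros((n_rep, 3))          # one row per independent replicate
    for _ in range(burn_in):              # let each replicate reach stationarity
        state = state @ A.T + rng.standard_normal((n_rep, 3))
    state[:, 0] = x_value                 # hard intervention on Advertising
    next_state = state @ A.T + rng.standard_normal((n_rep, 3))
    return next_state[:, 1].mean()

grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
simulated = [simulate_do(x) for x in grid]
for x, m in zip(grid, simulated):
    print(f"do(Advertising = {x:+.1f}) -> mean Web Traffic {m:+.3f}")
```

Under these assumptions the simulated means should fall close to 0.6 times the intervention value, which is the straight line the estimated what-if curve is expected to trace.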

Key Takeaways

  1. CausalEffects quantifies HOW MUCH a cause affects an outcome
  2. Total effect includes all causal paths (direct + indirect)
  3. Intervention analysis answers "what if" questions
  4. Requires a causal graph as input: run discovery first, then estimate effects (the estimates are only as reliable as the graph)
  5. Uses adjustment sets internally to remove confounding