Case Study: Smart Building Energy Analysis

Duration: 30 min. Prerequisites: All previous modules

Scenario

You're a data scientist at a smart building company. The building manager complains about unexpected energy spikes. Your job: find what's causing them.

Available Data

  • Temperature: Indoor temperature sensor
  • Occupancy: Number of people in building
  • HVAC_Status: Heating/cooling system state
  • Energy: Total energy consumption (our target!)
  • Outdoor_Temp: External temperature

Questions to Answer

  1. Which variables CAUSE energy consumption?
  2. How strong are these effects?
  3. What interventions could reduce energy use?

Setup

import numpy as np
import matplotlib.pyplot as plt
from tigramite import data_processing as pp
from tigramite import plotting as tp
from tigramite.pcmci import PCMCI
from tigramite.independence_tests.parcorr import ParCorr
from tigramite.causal_effects import CausalEffects
from sklearn.linear_model import LinearRegression

np.random.seed(42)

Step 1: Generate Realistic Smart Building Data

We'll simulate data with a known causal structure to validate our analysis.

# Simulate smart building data (hourly measurements for 30 days)
T = 24 * 30  # 720 hours

# Hour of day (for realistic patterns)
hour = np.tile(np.arange(24), 30)

# Outdoor temperature (follows daily pattern)
outdoor_temp = 20 + 10 * np.sin(2 * np.pi * hour / 24 - np.pi/2) + np.random.randn(T) * 2

# Occupancy (high during work hours)
occupancy = np.zeros(T)
for t in range(T):
    h = hour[t]
    if 9 <= h <= 18:  # Work hours
        occupancy[t] = 50 + np.random.randn() * 10
    else:
        occupancy[t] = 5 + np.random.randn() * 3
occupancy = np.clip(occupancy, 0, None)

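# Indoor temperature (affected by outdoor temp and HVAC)
# HVAC turns on based on occupancy and temperature
# Energy consumption is driven by HVAC, Occupancy, and Outdoor_Temp
# NOTE: the coefficients, lags, and noise levels below are illustrative
# assumptions that encode this structure; they are not measured values.
indoor_temp = np.full(T, 22.0)
hvac = np.zeros(T)
energy = np.zeros(T)
for t in range(1, T):
    hvac[t] = (0.4 * hvac[t-1] + 0.03 * occupancy[t-1]
               + 0.05 * outdoor_temp[t-1] + np.random.randn() * 0.3)
    indoor_temp[t] = (0.6 * indoor_temp[t-1] + 0.1 * outdoor_temp[t-1]
                      - 0.5 * hvac[t-1] + 6.6 + np.random.randn() * 0.3)
    energy[t] = (10 + 5.0 * hvac[t-1] + 0.2 * occupancy[t-1]
                 + 0.1 * outdoor_temp[t-1] + np.random.randn() * 1.0)

# Stack into shape (T, N); the column order must match var_names
data = np.column_stack([indoor_temp, occupancy, hvac, energy, outdoor_temp])
var_names = ['Indoor_Temp', 'Occupancy', 'HVAC', 'Energy', 'Outdoor_Temp']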
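Before handing an array to pp.DataFrame, it can help to confirm its layout: Tigramite expects shape (T, N), with time along the rows and one column per entry in var_names. The sketch below uses a hypothetical helper, check_layout, and a random stand-in array (both are assumptions for illustration, not part of Tigramite).

```python
import numpy as np

def check_layout(data, var_names):
    """Hypothetical helper: validate a (time, variables) array before analysis."""
    data = np.asarray(data, dtype=float)
    if data.ndim != 2:
        raise ValueError("expected a 2-D array of shape (T, N)")
    n_steps, n_vars = data.shape
    if n_vars != len(var_names):
        raise ValueError(f"{n_vars} columns but {len(var_names)} variable names")
    if not np.isfinite(data).all():
        raise ValueError("data contains NaN or infinite values")
    return n_steps, n_vars

# Stand-in data with the same layout as the tutorial: 720 hours, 5 sensors
rng = np.random.default_rng(0)
toy = rng.normal(size=(720, 5))
names = ['Indoor_Temp', 'Occupancy', 'HVAC', 'Energy', 'Outdoor_Temp']

shape = check_layout(toy, names)
print(shape)
```

A transposed array (variables along the rows) fails the column-count check, which is the most common layout mistake.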

Step 2: Explore the Data (ALWAYS FIRST!)

# Create DataFrame
dataframe = pp.DataFrame(data, var_names=var_names)

# Plot time series
tp.plot_timeseries(dataframe, figsize=(14, 10))
plt.suptitle('Smart Building Sensor Data', fontsize=14)
plt.show()

# Check for linear relationships
tp.plot_scatterplots(dataframe=dataframe, figsize=(12, 12))
plt.suptitle('Checking Linearity of Relationships', fontsize=14)
plt.show()

# Relationships look reasonably linear → ParCorr is appropriate

# Find optimal tau_max using lag function
parcorr = ParCorr(significance='analytic')
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=parcorr, verbosity=0)

correlations = pcmci.get_lagged_dependencies(tau_max=24, val_only=True)['val_matrix']
tp.plot_lagfuncs(
    val_matrix=correlations,
    setup_args={'var_names': var_names, 'x_base': 6},
    figsize=(14, 10)
)
plt.suptitle('Lag Function - Choose tau_max where effects decay', fontsize=14)
plt.show()

# Effects seem to decay by lag 6-8 → use tau_max=8
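To make the lag function less of a black box, here is a minimal NumPy sketch of roughly what it shows for a single variable pair: the correlation between x at time t - tau and y at time t, for each lag. The helper lagged_correlation and the toy series (x drives y at lag 2) are assumptions for illustration.

```python
import numpy as np

def lagged_correlation(x, y, tau_max):
    """Pearson correlation between x(t - tau) and y(t) for tau = 0..tau_max."""
    corrs = np.empty(tau_max + 1)
    for tau in range(tau_max + 1):
        x_past = x[:len(x) - tau]   # x shifted back by tau steps
        y_now = y[tau:]             # y aligned with the shifted x
        corrs[tau] = np.corrcoef(x_past, y_now)[0, 1]
    return corrs

# Toy series: y depends on x two steps earlier, plus noise
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = np.zeros(n)
y[2:] = 0.8 * x[:-2]
y += 0.5 * rng.normal(size=n)

corrs = lagged_correlation(x, y, tau_max=6)
best_lag = int(np.argmax(np.abs(corrs)))
print(np.round(corrs, 2), best_lag)
```

The correlation peaks at the lag built into the toy data and stays near zero elsewhere. On real sensor data with strong daily cycles the decay is usually less clean, so treat the chosen tau_max as a judgment call.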

Step 3: Discover Causal Structure

# Run PCMCI+ (the variant that also handles contemporaneous links)
results = pcmci.run_pcmciplus(tau_max=8, pc_alpha=0.05)

# Apply FDR correction
q_matrix = pcmci.get_corrected_pvalues(
    p_matrix=results['p_matrix'],
    tau_max=8,
    fdr_method='fdr_bh'
)

print("Discovered Causal Links (FDR corrected):")
pcmci.print_significant_links(
    p_matrix=q_matrix,
    val_matrix=results['val_matrix'],
    alpha_level=0.05
)

# Visualize the causal graph
corrected_graph = pcmci.get_graph_from_pmatrix(
    p_matrix=q_matrix,
    alpha_level=0.05,
    tau_min=0,
    tau_max=8
)
results['graph'] = corrected_graph

tp.plot_graph(
    graph=results['graph'],
    val_matrix=results['val_matrix'],
    var_names=var_names,
    figsize=(10, 8),
    link_colorbar_label='MCI Strength',
    node_colorbar_label='Auto-MCI',
    show_autodependency_lags=True
)
plt.title('Causal Graph: What Drives Energy Consumption?')
plt.show()
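The 'fdr_bh' option refers to the Benjamini-Hochberg procedure. As a rough sketch of what that correction does to a set of p-values, here is a hand-rolled version on five made-up p-values (the function name and the numbers are illustrative assumptions; Tigramite's own implementation may differ in details).

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) for a 1-D array."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by m / k
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(ranked, 0.0, 1.0)
    return q

p_values = np.array([0.001, 0.008, 0.039, 0.041, 0.6])
q_values = benjamini_hochberg(p_values)

n_raw = int((p_values < 0.05).sum())
n_fdr = int((q_values < 0.05).sum())
print(np.round(q_values, 5), n_raw, n_fdr)
```

Four of the raw p-values fall below 0.05, but only two q-values do. This is why the corrected graph can contain fewer links than the uncorrected one.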

Step 4: Quantify Causal Effects on Energy

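# What are the causal parents of Energy?
# Tigramite convention: graph[i, j, tau] describes the link X_i(t-tau) --> X_j(t),
# so the parents of Energy are found by fixing the SECOND index.
energy_idx = 3
print("Causal drivers of Energy consumption:")

for i in range(len(var_names)):
    for tau in range(8 + 1):  # lags 0..tau_max
        if results['graph'][i, energy_idx, tau] == '-->':
            val = results['val_matrix'][i, energy_idx, tau]
            pval = q_matrix[i, energy_idx, tau]
            print(f"  {var_names[i]}(t-{tau}) → Energy: strength={val:.3f}, q={pval:.4f}")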
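The index order of the graph array is easy to get backwards. In Tigramite, graph[i, j, tau] == '-->' means that variable i at time t - tau points into variable j at time t. The sketch below builds a small hand-written graph (an assumption for illustration, not a discovered result) and lists the parents of one variable with a hypothetical helper, parents_of.

```python
import numpy as np

var_names = ['Indoor_Temp', 'Occupancy', 'HVAC', 'Energy', 'Outdoor_Temp']
n_vars, tau_max = len(var_names), 2

# Toy graph: empty strings mean "no link"
graph = np.full((n_vars, n_vars, tau_max + 1), '', dtype='<U3')
graph[2, 3, 1] = '-->'   # HVAC(t-1) --> Energy(t)
graph[1, 3, 1] = '-->'   # Occupancy(t-1) --> Energy(t)
graph[4, 2, 1] = '-->'   # Outdoor_Temp(t-1) --> HVAC(t)

def parents_of(graph, j):
    """List (variable index, lag) pairs with a directed link into variable j."""
    return [(i, tau)
            for i in range(graph.shape[0])
            for tau in range(graph.shape[2])
            if graph[i, j, tau] == '-->']

energy_parents = parents_of(graph, 3)
print([(var_names[i], -tau) for i, tau in energy_parents])
```

Swapping the first two indices would instead list the variables that Energy points to, which for this toy graph is an empty list.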
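
# Estimate the effect of HVAC on Energy
# Note: graph_type='stationary_dag' needs a fully oriented graph; any unoriented
# contemporaneous links (e.g. 'o-o') have to be resolved or removed first.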
X = [(2, -1)]  # HVAC at lag 1
Y = [(3, 0)]   # Energy at current time

causal_effects = CausalEffects(
    graph=results['graph'],
    graph_type='stationary_dag',
    X=X, Y=Y,
    tau_max=8,
    verbosity=0
)

causal_effects.fit_total_effect(
    dataframe=dataframe,
    estimator=LinearRegression()
)

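# What if we reduce HVAC activity by 0.5 units?
# predict_total_effect returns the expected Energy LEVEL under do(HVAC = value),
# so the change is the difference between two interventions.
hvac_baseline = data[:, 2].mean()
energy_baseline = causal_effects.predict_total_effect(
    intervention_data=np.array([[hvac_baseline]]))
energy_reduced = causal_effects.predict_total_effect(
    intervention_data=np.array([[hvac_baseline - 0.5]]))
energy_change = energy_reduced - energy_baseline

print("\nIntervention Analysis:")
print("If we reduce HVAC activity by 0.5 units...")
print(f"Predicted energy reduction: {-energy_change[0, 0]:.2f} units")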
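One way to read an intervention estimate is as the difference between two interventional predictions rather than as a single predicted level. The NumPy sketch below illustrates this on a toy linear system with one confounder; the variable names, coefficients, and the helper predict_under_do are assumptions for illustration and do not reproduce Tigramite's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Toy linear system: z confounds x and y; the true effect of x on y is 2.0
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 2.0 * x + 1.5 * z + 3.0 + rng.normal(size=n)

# Fit y ~ x + z by ordinary least squares (z plays the role of the adjustment set)
design = np.column_stack([x, z, np.ones(n)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

def predict_under_do(x_value):
    """Average prediction when x is forced to x_value and z keeps its observed values."""
    forced = np.column_stack([np.full(n, x_value), z, np.ones(n)])
    return float((forced @ coef).mean())

baseline = float(x.mean())
y_base = predict_under_do(baseline)
y_reduced = predict_under_do(baseline - 0.5)
effect = y_reduced - y_base
print(round(y_reduced, 2), round(effect, 2))
```

The predicted level under the reduced setting and the change relative to baseline are different numbers. Only the change answers the question of how much the target moves when the driver is reduced by 0.5 units.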

Step 5: Actionable Insights

KEY FINDINGS

  1. HVAC system is the PRIMARY driver of energy consumption
  2. Occupancy has a moderate direct effect
  3. Outdoor temperature indirectly affects energy (via HVAC)
  4. Indoor temperature is an EFFECT, not a cause (don't optimize for it!)

RECOMMENDATIONS

  1. Optimize HVAC scheduling (biggest impact potential)
  2. Pre-cool or pre-heat the building before peak occupancy
  3. Implement smart occupancy detection for HVAC control
  4. Use weather-based predictive HVAC management

WARNING - DO NOT

  • Use indoor temperature as the lever for cutting energy use (it's an effect!)
  • Assume that correlation implies causation without running an analysis like this one
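The second warning can be illustrated with a small simulation. In the sketch below (toy coefficients, chosen as assumptions for illustration), HVAC drives both indoor temperature and energy, and there is no link between the two. They still correlate strongly until HVAC is controlled for, which is the idea behind the partial correlation test used by ParCorr.

```python
import numpy as np

def partial_correlation(a, b, control):
    """Correlate the residuals of a and b after regressing each on control."""
    design = np.column_stack([control, np.ones(len(control))])
    res_a = a - design @ np.linalg.lstsq(design, a, rcond=None)[0]
    res_b = b - design @ np.linalg.lstsq(design, b, rcond=None)[0]
    return float(np.corrcoef(res_a, res_b)[0, 1])

rng = np.random.default_rng(1)
n = 5000

# HVAC is a common driver; indoor_temp has no effect on energy in this toy system
hvac = rng.normal(size=n)
indoor_temp = -0.8 * hvac + 0.5 * rng.normal(size=n)
energy = 2.0 * hvac + 0.5 * rng.normal(size=n)

raw = float(np.corrcoef(indoor_temp, energy)[0, 1])
adjusted = partial_correlation(indoor_temp, energy, hvac)
print(round(raw, 2), round(adjusted, 2))
```

The raw correlation is strongly negative, while the partial correlation given HVAC is close to zero. A purely correlational analysis would flag indoor temperature as a driver of energy use.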

Conclusion

In this case study, we:

  1. Explored the data thoroughly before analysis
  2. Discovered the true causal drivers of energy consumption
  3. Quantified the effect of potential interventions
  4. Generated actionable recommendations

Key insight: Without causal analysis, we might have wrongly targeted indoor temperature (which is an EFFECT of HVAC, not a cause of energy use).

Congratulations!

You've completed the Tigramite beginner tutorial. You now have the skills to:

  • Prepare data for causal analysis
  • Choose appropriate tests and methods
  • Discover causal relationships
  • Quantify causal effects
  • Make data-driven intervention decisions

Happy causal discovery!