Large Causal Models on Time Series


  • Foundation for Research and Technology, Hellas (FORTH)
  • Huawei Ireland Research Center - AIOps Team
  • Huawei Dongguan R&D Campus

Large Causal Model

A Large Causal Model (LCM) for causal discovery is a model that unveils the cause-and-effect relationships between variables in complex systems. These models are particularly useful for understanding how changes in one factor lead to changes in another, rather than just identifying correlations. Additionally, they help identify which variables influence others, especially in complex or high-dimensional datasets where traditional methods might miss key connections.

Approach                               Discover causal connections   Fast inference   Scale to large data quantity
Traditional Causal Discovery Methods   Yes                           No               No
Large Causal Models                    Yes                           Yes              Yes

How does it work?

The LCM takes a dataset of time series as input and automatically predicts the full-time graph, which represents the causal relationships between the time series over a given time window. This graph is then condensed into a simpler representation, the summary graph, in which each time series is a node and each discovered cause-effect relationship is a directed link.
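The condensation step can be illustrated with a minimal sketch. It assumes the full-time graph is stored as a NumPy tensor in which `lagged[i, j, k]` scores a link from series i to series j at lag k+1; this layout, the helper name `summary_from_lagged`, and the 0.5 threshold are assumptions for illustration, not the LCM's actual implementation.

```python
import numpy as np

def summary_from_lagged(lagged, threshold=0.5):
    """Collapse a lagged adjacency tensor into a binary summary adjacency matrix.

    Assumption: lagged[i, j, k] scores a link from series i to series j at lag k + 1.
    """
    lagged = np.asarray(lagged, dtype=float)
    # A link appears in the summary graph if it is strong enough at any lag.
    strongest = lagged.max(axis=2)
    return (strongest >= threshold).astype(int)

# Hypothetical scores for 3 time series and 2 lags
lagged = np.zeros((3, 3, 2))
lagged[0, 1, 0] = 0.9   # series 0 -> series 1 at lag 1
lagged[1, 2, 1] = 0.7   # series 1 -> series 2 at lag 2
lagged[2, 0, 0] = 0.2   # too weak to survive the threshold

summary = summary_from_lagged(lagged, threshold=0.5)
print(summary)
```

Taking the maximum over the lag axis keeps a link if it is present at any lag, which matches the idea of a summary graph that records who influences whom without recording when.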

Prediction Pipeline

Try it yourself!

0. Setup

The setup phase is divided into three steps:

Currently, two pretrained LCM models of different sizes are available:

1. Import the Required Libraries

First, import the necessary modules for data generation, model prediction, and result visualization:


        from pathlib import Path
        from utils.causal_model import CausalModel # architecture module 
        from utils.data_utils import create_example_data # example data creation module
        from utils.plotting_utils import plot_summary_from_pred, plot_summary_graph # plotting module
        

2. Load the Data

The LCM takes as input a temporal dataset of shape (N, D), where N is the number of time samples and D is the number of features (time series). In this example, we generate synthetic data with 1000 time samples, where each column represents a different time series. The data are min-max normalized, and the random seed is set to 42 for reproducibility.


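        # Fix the random seed for reproducibility.
        # Note: set_seed is not among the imports shown in step 1 and must be imported separately.
        set_seed(42)
        
        # Generate a synthetic dataset with 1000 time samples (one column per time series)
        df = create_example_data(n=1000)
        variable_names = list(df.columns)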
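For readers bringing their own data, the min-max normalization mentioned above can be sketched as follows. This is a generic illustration rather than the project's own preprocessing; the helper name `min_max_normalize` is hypothetical, and the handling of constant columns is an assumption.

```python
import numpy as np

def min_max_normalize(data):
    """Scale each column of an (N, D) array to the range [0, 1]."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    # Constant columns have zero range; divide by 1 so they map to 0 instead of NaN.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (data - col_min) / safe_range

rng = np.random.default_rng(42)
raw = rng.normal(size=(1000, 3))   # N = 1000 time samples, D = 3 time series
scaled = min_max_normalize(raw)
print(scaled.shape, scaled.min(axis=0), scaled.max(axis=0))
```

A pandas DataFrame can be passed through `df.to_numpy()` before calling the helper.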
        

3. Load the Pretrained Model

Load the .ckpt pretrained model for causal prediction:


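        models_path = 'res'  # folder containing the pretrained checkpoints
        model_name = 'lcm_CI_RH_12_3_merged_290k'
        
        # Build the model from the checkpoint file <models_path>/<model_name>.ckpt
        model = CausalModel(model_name=model_name, model_path=Path(models_path) / f"{model_name}.ckpt")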
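Before loading, it can help to check which checkpoints are actually present in the models folder. The sketch below uses only the standard library; the helper name `list_checkpoints` is hypothetical, and it assumes every pretrained model is stored as a `.ckpt` file directly inside the folder.

```python
from pathlib import Path

def list_checkpoints(models_path):
    """Return the names (without extension) of all .ckpt files in a folder."""
    folder = Path(models_path)
    if not folder.is_dir():
        return []
    return sorted(p.stem for p in folder.glob("*.ckpt"))

available = list_checkpoints("res")
print(available)
```

Any name in the returned list could then be used as `model_name` in the loading call above.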
        

4. Perform Causal Discovery

Run model.predict to perform causal discovery on the data. The max_lag_to_predict parameter specifies the maximum lag, that is, the size of the time window over which causal relationships are analyzed:


        # Run causal discovery with a maximum lag of 1
        pred = model.predict(df, max_lag_to_predict=1)
        

The result is a lagged adjacency tensor of shape (D, D, max_lag), with D the number of time series, where:

5. Visualize the Results

The predicted causal relationships can be visualized using plot_summary_from_pred. The plt_thr parameter controls the density of the graph: higher values result in fewer edges being displayed.


        plot_summary_from_pred(pred, variable_names, plt_thr=0.25)
        

In the resulting graph, an edge from time series A to time series B labeled t-1 means that A at time t-1 causes B at time t.

Output plot of the summary graph.
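The effect of the threshold and the reading of the lag labels can be reproduced on a toy tensor. The sketch below is not the plotting function itself: the helper `list_lagged_edges` is hypothetical, and the axis convention (`pred[i, j, k]` scoring "series i at time t-(k+1) causes series j at time t") is an assumption that should be checked against the model's actual output.

```python
import numpy as np

def list_lagged_edges(pred, names, threshold):
    """List edges whose score exceeds the threshold as (cause, effect, lag, score)."""
    pred = np.asarray(pred, dtype=float)
    edges = []
    # Assumption: pred[i, j, k] scores "series i at time t-(k+1) causes series j at time t".
    for i, j, k in zip(*np.nonzero(pred > threshold)):
        edges.append((names[i], names[j], int(k) + 1, float(pred[i, j, k])))
    return edges

names = ["A", "B", "C"]
pred = np.zeros((3, 3, 1))   # hypothetical scores, max lag of 1
pred[0, 1, 0] = 0.8          # strong link A -> B
pred[2, 1, 0] = 0.3          # weak link C -> B

for thr in (0.25, 0.5):
    edges = list_lagged_edges(pred, names, thr)
    print(thr, [f"{c}(t-{lag}) -> {e}(t)" for c, e, lag, _ in edges])
```

With the lower threshold both links are kept; with the higher one only the strong A to B link remains, which is why raising the threshold yields a sparser graph.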

6. Alternative Causal Discovery Method

As an alternative to using a specific causal model or threshold, the get_best_graph method can be applied. This method evaluates all available models and thresholds and returns the causal graph that optimally represents the relationships in the dataset.


        import utils.prediction_utils as pu
        G = pu.get_best_graph(df, models_folder = models_path)
        plot_summary_graph(G, variable_names)
        

Publications

The following publications have emerged from this research collaboration:

Assumptions and Limitations of the Current Model

The latest version of the LCMs works under the following assumptions: