Running Experiments

Objectives

  • Run the main analysis script

  • Use background training for long runs

  • Understand the command line options

Usage

1. Activate the Environment

cd ~/wp7-UC1-climate-indices-teleconnection
source venv/bin/activate

2. Run the Main Analysis Script

python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LinearRegression \
    --max_allowed_features 6 \
    --end_lag 50

3. Interactive Exploration with Jupyter

jupyter lab
# Open demonstrator-v1.orchestrator.ipynb

Command Line Options

Option                  Description              Example
--data_file             Path to dataset CSV      dataset/noresm-f-p1000_slow_new_jfm.csv
--target_feature        Variable to predict      amo2, amo3, AMOCann
--modelname             ML model to use          LinearRegression, RandomForestRegressor, MLPRegressor, XGBRegressor
--max_allowed_features  Max features for model   6, 10
--end_lag               Maximum lag in years     50, 100
--step_lag              Lag step size            5
--splitsize             Train/test split ratio   0.6
--n_ensembles           Number of ensemble runs  10, 100
--with_mean_feature     Include mean feature     Flag (no value)
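
To make the difference between valued options and the --with_mean_feature flag concrete, here is a minimal argparse sketch that mirrors the options listed above. It is an illustration only: the function name build_parser, the chosen types, and the absence of defaults are assumptions, and the real parser in main.py may be declared differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options; main.py may differ."""
    parser = argparse.ArgumentParser(description="Teleconnection analysis (sketch)")
    # Defaults are not documented here, so unspecified options are left as None.
    parser.add_argument("--data_file", help="Path to dataset CSV")
    parser.add_argument("--target_feature", help="Variable to predict")
    parser.add_argument("--modelname", help="ML model to use")
    parser.add_argument("--max_allowed_features", type=int, help="Max features for model")
    parser.add_argument("--end_lag", type=int, help="Maximum lag in years")
    parser.add_argument("--step_lag", type=int, help="Lag step size")
    parser.add_argument("--splitsize", type=float, help="Train/test split ratio")
    parser.add_argument("--n_ensembles", type=int, help="Number of ensemble runs")
    # A flag takes no value: present means True, absent means False.
    parser.add_argument("--with_mean_feature", action="store_true",
                        help="Include mean feature")
    return parser


# Parse an explicit argument list instead of sys.argv to keep the example reproducible.
args = build_parser().parse_args(
    ["--target_feature", "amo2", "--end_lag", "50", "--with_mean_feature"]
)
print(args.target_feature, args.end_lag, args.with_mean_feature)
```

In this sketch, passing --with_mean_feature sets the value to True and omitting it leaves it False, which is what "Flag (no value)" means on the command line.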

Background Training (Long Runs)

For long-running experiments, use tmux sessions:

# Start background training
tmux new-session -d -s training 'cd ~/wp7-UC1-climate-indices-teleconnection && source venv/bin/activate && \
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LRforcedPSO \
    --max_allowed_features 6 \
    --end_lag 100 \
    --n_ensembles 100 2>&1 | tee training.log'

# Monitor progress (training.log is written in the project directory)
tail -f ~/wp7-UC1-climate-indices-teleconnection/training.log

# Attach to session
tmux attach -t training

# Detach: Ctrl+B, then D
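
Because the session above was started with a command, tmux closes it when that command exits. A quick way to check whether the run is still going is therefore to ask tmux whether the session exists. The sketch below wraps the standard tmux has-session command; the helper name tmux_session_exists is made up for this example.

```python
import subprocess


def tmux_session_exists(name: str) -> bool:
    """Return True if tmux reports a running session matching this name."""
    try:
        result = subprocess.run(
            ["tmux", "has-session", "-t", name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # tmux is not installed on this machine.
        return False
    # has-session exits with status 0 only when the session exists.
    return result.returncode == 0


if tmux_session_exists("training"):
    print("Training is still running; attach with: tmux attach -t training")
else:
    print("No 'training' session found; the run has finished or was never started.")
```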

Example Experiments

Quick Test (Linear Regression)

python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LinearRegression \
    --max_allowed_features 3 \
    --end_lag 30 \
    --n_ensembles 5

Full Analysis (Multiple Models)

# Linear Regression
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LinearRegression \
    --max_allowed_features 6 \
    --end_lag 100

# Random Forest
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname RandomForestRegressor \
    --max_allowed_features 6 \
    --end_lag 100

# XGBoost (GPU accelerated)
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname XGBRegressor \
    --max_allowed_features 6 \
    --end_lag 100
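
The three commands above differ only in --modelname, so they can also be driven from a short loop. The following is a sketch, not part of the repository: the names build_command and run_all and the dry_run switch are assumptions made for illustration. It should be run from the project root with the virtual environment active.

```python
import subprocess
import sys

SCRIPT = "scripts/lrbased_teleconnection/main.py"
DATA_FILE = "dataset/noresm-f-p1000_slow_new_jfm.csv"
MODELS = ["LinearRegression", "RandomForestRegressor", "XGBRegressor"]


def build_command(modelname: str) -> list[str]:
    """Assemble the argument list for one run, matching the commands above."""
    return [
        sys.executable, SCRIPT,
        "--data_file", DATA_FILE,
        "--target_feature", "amo2",
        "--modelname", modelname,
        "--max_allowed_features", "6",
        "--end_lag", "100",
    ]


def run_all(dry_run: bool = True) -> list[list[str]]:
    """Run the analysis once per model; with dry_run=True only print the commands."""
    commands = [build_command(model) for model in MODELS]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


# Print the commands without executing them; call run_all(dry_run=False) to launch.
run_all(dry_run=True)
```

Starting with a dry run makes it easy to confirm the arguments before committing to several long experiments.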

Result File Format

Results are saved to the results/ directory. Each result file contains the following columns:

Column             Description
model              Model name
target_feature     Predicted variable
max_lag            Maximum lag in years
corr_score         Correlation coefficient
mae_score          Mean Absolute Error
selected_features  Features used by model
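
Once a few runs have finished, the columns above can be used to compare them. The sketch below ranks runs by corr_score using only the standard library. It assumes the result files are CSV files with a header row and a .csv extension, which this page does not state; the function names are likewise invented for the example.

```python
import csv
from pathlib import Path


def load_results(path):
    """Read one result file (assumed CSV with a header row) into a list of dicts."""
    with Path(path).open(newline="") as handle:
        return list(csv.DictReader(handle))


def rank_by_correlation(rows, top_n=5):
    """Return the top_n rows with the highest corr_score."""
    return sorted(rows, key=lambda row: float(row["corr_score"]), reverse=True)[:top_n]


def summarize_directory(results_dir="results", top_n=5):
    """Print the best runs found across every *.csv file in results_dir."""
    rows = []
    for path in sorted(Path(results_dir).glob("*.csv")):
        rows.extend(load_results(path))
    for row in rank_by_correlation(rows, top_n):
        print(row["model"], row["target_feature"], row["max_lag"],
              row["corr_score"], row["mae_score"])
```

Calling summarize_directory("results") from the project root would print one line per top-ranked run. If the files turn out to use a different format or delimiter, only load_results needs to change.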

Keypoints

  • Run experiments using main.py with appropriate parameters

  • Use tmux for long-running background experiments

  • Results are saved to the results/ directory

  • Use the Jupyter notebook for interactive exploration