Downloads & Quick Reference

Important

This repository is self-contained. After cloning, run ./setup.sh to set up the environment automatically.

Setup Script

Clone the repository, then run the setup script from its root directory:

git clone https://github.com/NAICNO/wp7-UC1-climate-indices-teleconnection.git
cd wp7-UC1-climate-indices-teleconnection
./setup.sh

Quick Reference Card

Available Models

Model Name	Description
`LinearRegression`	Simple baseline for linear relationships
`RandomForestRegressor`	Ensemble model with decision trees
`MLPRegressor`	Neural network for non-linear patterns
`XGBRegressor`	Gradient boosting with GPU support

Command Line Options

Option	Default	Description
`--data_file`	Required	Path to dataset CSV
`--target_feature`	Required	Variable to predict (e.g., amo2)
`--modelname`	Required	ML model to use
`--max_allowed_features`	6	Maximum number of features in the model
`--end_lag`	50	Maximum lag in years
`--step_lag`	5	Lag step size
`--splitsize`	0.6	Train/test split ratio
`--n_ensembles`	10	Number of ensemble runs
`--with_mean_feature`	False	Include mean feature (flag)
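
As a rough illustration of how the options and defaults in the table fit together, here is a minimal argparse sketch. It is an assumption that the script parses its arguments this way. `build_parser` is a hypothetical helper written for this example and is not taken from the repository.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options table (not the repository's code)."""
    p = argparse.ArgumentParser(description="Teleconnection options (illustrative)")
    # Required options have no default in the table.
    p.add_argument("--data_file", required=True, help="Path to dataset CSV")
    p.add_argument("--target_feature", required=True, help="Variable to predict")
    p.add_argument("--modelname", required=True, help="ML model to use")
    # Optional settings with the defaults listed in the table.
    p.add_argument("--max_allowed_features", type=int, default=6)
    p.add_argument("--end_lag", type=int, default=50)
    p.add_argument("--step_lag", type=int, default=5)
    p.add_argument("--splitsize", type=float, default=0.6)
    p.add_argument("--n_ensembles", type=int, default=10)
    # Assumed to be a boolean switch: present means True, absent means False.
    p.add_argument("--with_mean_feature", action="store_true")
    return p


# Parse only the three required options; everything else falls back to defaults.
args = build_parser().parse_args([
    "--data_file", "dataset/noresm-f-p1000_slow_new_jfm.csv",
    "--target_feature", "amo2",
    "--modelname", "LinearRegression",
])
print(args.end_lag, args.step_lag, args.with_mean_feature)  # prints: 50 5 False
```

Passing only the three required options leaves every other setting at its table default, which is what the shorter example commands rely on.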

Available Datasets

Dataset	Description
`dataset/noresm-f-p1000_slow_new_jfm.csv`	Historical SLOW forcing (850-2005 AD)
`dataset/noresm-f-p1000_shigh_new_jfm.csv`	Historical HIGH forcing (850-2005 AD)
`dataset/noresm-f-p1000_picntrl_new_jfm.csv`	Pre-industrial control (1000 years)

Common Target Features

amo1, amo2, amo3, AMOCann, amoSSTann, naoPSLjfm, ensoSSTjfm
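
Before launching a run, it can help to confirm that the chosen target is actually a column in the dataset. The sketch below assumes the CSV files are comma-delimited with a header row, which is not stated in this reference. `missing_targets` is a hypothetical helper, and the sample data is a toy header, not real file content.

```python
import csv
import io

COMMON_TARGETS = [
    "amo1", "amo2", "amo3", "AMOCann", "amoSSTann", "naoPSLjfm", "ensoSSTjfm",
]


def missing_targets(csv_file, targets=COMMON_TARGETS):
    """Return the targets that do not appear in the CSV header row."""
    header = next(csv.reader(csv_file))
    columns = {name.strip() for name in header}
    return [t for t in targets if t not in columns]


# Toy data for demonstration only; the real files' columns are not reproduced here.
sample = io.StringIO("year,amo1,amo2,naoPSLjfm\n0,0.1,0.2,0.3\n")
print(missing_targets(sample))  # prints: ['amo3', 'AMOCann', 'amoSSTann', 'ensoSSTjfm']
```

For a real dataset, open the file with `open(path, newline="")` and pass the file object to `missing_targets`.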

Example Commands

Quick Test (< 1 minute)

python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LinearRegression \
    --max_allowed_features 3 \
    --end_lag 30 \
    --n_ensembles 5

Full Analysis with Random Forest

python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname RandomForestRegressor \
    --max_allowed_features 6 \
    --end_lag 100
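
To repeat the same analysis across several models, the command line can be assembled programmatically. The following is a minimal sketch that uses only the script path, model names, and options listed in this reference. `build_command` is a hypothetical helper, and the actual launch line is commented out so the snippet only prints the commands.

```python
import shlex

SCRIPT = "scripts/lrbased_teleconnection/main.py"
MODELS = ["LinearRegression", "RandomForestRegressor", "MLPRegressor", "XGBRegressor"]


def build_command(data_file, target, model, **options):
    """Assemble the argument list for one run (hypothetical helper)."""
    cmd = ["python", SCRIPT,
           "--data_file", data_file,
           "--target_feature", target,
           "--modelname", model]
    for name, value in options.items():
        if isinstance(value, bool):
            # Boolean switches such as with_mean_feature are assumed to take no value.
            if value:
                cmd.append(f"--{name}")
        else:
            cmd += [f"--{name}", str(value)]
    return cmd


for model in MODELS:
    cmd = build_command(
        "dataset/noresm-f-p1000_slow_new_jfm.csv", "amo2", model,
        max_allowed_features=6, end_lag=100,
    )
    print(shlex.join(cmd))
    # To launch each run, import subprocess and call: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.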

Background Training (tmux)

tmux new-session -d -s training '. venv/bin/activate && \
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname XGBRegressor \
    --max_allowed_features 6 \
    --end_lag 100 2>&1 | tee training.log'

# Monitor: tail -f training.log
# Attach: tmux attach -t training

For AI Coding Assistants

If you’re using an AI coding assistant (Claude Code, GitHub Copilot, Cursor, etc.), the repository includes machine-readable instruction files:

AGENT.md - Markdown format (human and agent readable)
AGENT.yaml - YAML format (structured data for programmatic parsing)

These files contain step-by-step instructions that agents can follow to:

Set up the environment on the VM
Run the Jupyter notebook
Execute command-line experiments
Verify results

Quick prompt for your AI assistant:

Read AGENT.md and help me run the teleconnections demonstrator on my NAIC VM.
VM IP: <your_vm_ip>
SSH Key: <path_to_your_key.pem>

The agent will execute the setup and run experiments based on the structured instructions.