# Running Experiments

```{objectives}
- Run the main analysis script
- Use background training for long runs
- Understand the command line options
```

## Usage

### 1. Activate the Environment

```bash
cd ~/wp7-UC1-climate-indices-teleconnection
source venv/bin/activate
```

### 2. Run the Main Analysis Script

```bash
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LinearRegression \
    --max_allowed_features 6 \
    --end_lag 50
```

### 3. Interactive Exploration with Jupyter

```bash
jupyter lab
# Open demonstrator-v1.orchestrator.ipynb
```

## Command Line Options

| Option | Description | Example |
|--------|-------------|---------|
| `--data_file` | Path to dataset CSV | `dataset/noresm-f-p1000_slow_new_jfm.csv` |
| `--target_feature` | Variable to predict | `amo2`, `amo3`, `AMOCann` |
| `--modelname` | ML model to use | `LinearRegression`, `RandomForestRegressor`, `MLPRegressor`, `XGBRegressor` |
| `--max_allowed_features` | Max features for model | `6`, `10` |
| `--end_lag` | Maximum lag in years | `50`, `100` |
| `--step_lag` | Lag step size | `5` |
| `--splitsize` | Train/test split ratio | `0.6` |
| `--n_ensembles` | Number of ensemble runs | `10`, `100` |
| `--with_mean_feature` | Include mean feature | Flag (no value) |

## Background Training (Long Runs)

For long-running experiments, use tmux sessions:

```bash
# Start background training
tmux new-session -d -s training 'cd ~/wp7-UC1-climate-indices-teleconnection && source venv/bin/activate && \
    python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LRforcedPSO \
    --max_allowed_features 6 \
    --end_lag 100 \
    --n_ensembles 100 2>&1 | tee training.log'

# Monitor progress
tail -f training.log

# Attach to session
tmux attach -t training

# Detach: Ctrl+B, then D
```

## Example Experiments

### Quick Test (Linear Regression)

```bash
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LinearRegression \
    --max_allowed_features 3 \
    --end_lag 30 \
    --n_ensembles 5
```

### Full Analysis (Multiple Models)

```bash
# Linear Regression
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname LinearRegression \
    --max_allowed_features 6 \
    --end_lag 100

# Random Forest
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname RandomForestRegressor \
    --max_allowed_features 6 \
    --end_lag 100

# XGBoost (GPU accelerated)
python scripts/lrbased_teleconnection/main.py \
    --data_file dataset/noresm-f-p1000_slow_new_jfm.csv \
    --target_feature amo2 \
    --modelname XGBRegressor \
    --max_allowed_features 6 \
    --end_lag 100
```

## Result File Format

Results are saved to the `results/` directory:

| Column | Description |
|--------|-------------|
| `model` | Model name |
| `target_feature` | Predicted variable |
| `max_lag` | Maximum lag in years |
| `corr_score` | Correlation coefficient |
| `mae_score` | Mean Absolute Error |
| `selected_features` | Features used by model |

```{keypoints}
- Run experiments using main.py with appropriate parameters
- Use tmux for long-running background experiments
- Results are saved to the results/ directory
- Use Jupyter notebook for interactive exploration
```
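
Once a run has finished, the columns listed under Result File Format can be used to compare runs. The following is a minimal sketch rather than part of the repository. It assumes each result file is a CSV with a header row matching that table, which the lesson does not confirm, and the helper names `load_results` and `best_by_correlation` are hypothetical.

```python
import csv
import sys


def load_results(path):
    """Read one result file into a list of dicts.

    Assumption: the file is a CSV whose header matches the
    Result File Format table (model, target_feature, max_lag, ...).
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def best_by_correlation(rows):
    """Keep the row with the highest corr_score per (model, target_feature)."""
    best = {}
    for row in rows:
        key = (row["model"], row["target_feature"])
        score = float(row["corr_score"])
        if key not in best or score > float(best[key]["corr_score"]):
            best[key] = row
    return best


# Only runs when a result file path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for (model, target), row in best_by_correlation(load_results(sys.argv[1])).items():
        print(model, target, row["max_lag"], row["corr_score"], row["mae_score"])
```

Saved under a name of your choosing, the script takes the path to one file in `results/` as its only argument. Ranking by `corr_score` alone is an illustrative choice; `mae_score` could be used instead by swapping the comparison.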