Created comprehensive multi-objective modeling system:

**6 Prediction Tasks:**
1. Match Winner (Binary Classification) - Who wins the match?
2. Map Winner (Binary Classification) - Who wins this specific map?
3. Team 1 Score (Regression) - Predict exact round score for team 1
4. Team 2 Score (Regression) - Predict exact round score for team 2
5. Round Difference (Regression) - Predict score margin
6. Total Maps (Regression) - Predict number of maps in match

**Implementation:**
- Updated preprocessing to generate all target variables
- Created train_multitask.py with separate models per task
- Classification tasks use Random Forest Classifier
- Regression tasks use Random Forest Regressor
- All models logged to MLflow experiment 'csgo-match-prediction-multitask'
- Metrics tracked per task (accuracy/precision for classification, MAE/RMSE for regression)
- Updated DVC pipeline to use new training script

**No Data Leakage:**
- All features are pre-match only (rankings, map, starting side)
- Target variables properly separated and saved with 'target_' prefix

This enables comprehensive match analysis and multiple betting/analytics use cases.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
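The `target_` prefix convention and the per-task metrics named in the commit message can be illustrated with a small standard-library sketch. It is not the code in `train_multitask.py`: the function names and the example column names (`team1_rank`, `target_match_winner`, and so on) are hypothetical, and only the prefix and the metric choices (accuracy/precision, MAE/RMSE) come from the message.

```python
import math

def split_features_targets(row, prefix="target_"):
    """Separate pre-match features from target columns by name prefix."""
    features = {k: v for k, v in row.items() if not k.startswith(prefix)}
    targets = {k: v for k, v in row.items() if k.startswith(prefix)}
    return features, targets

def classification_metrics(y_true, y_pred):
    """Accuracy and precision for a binary task (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    predicted_pos = sum(1 for _, p in pairs if p == 1)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    return {"accuracy": correct / len(pairs), "precision": precision}

def regression_metrics(y_true, y_pred):
    """Mean absolute error and root mean squared error."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}

# Hypothetical row: column names are illustrative, not taken from the repo.
row = {
    "team1_rank": 3,
    "team2_rank": 11,
    "map": "de_dust2",
    "target_match_winner": 1,
    "target_round_diff": 4,
}
features, targets = split_features_targets(row)
```

Keeping the split purely name-based makes it easy to check that no `target_` column reaches a model as an input, which is the leakage guarantee the message describes.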
33 lines
731 B
YAML
stages:
  preprocess:
    cmd: python src/data/preprocess.py
    deps:
      - src/data/preprocess.py
      - data/raw
    params:
      - preprocess.test_size
      - preprocess.random_state
    outs:
      - data/processed/features.csv
      - data/processed/train.csv
      - data/processed/test.csv
    metrics:
      - data/processed/data_metrics.json:
          cache: false

  train:
    cmd: python src/models/train_multitask.py
    deps:
      - src/models/train_multitask.py
      - data/processed/train.csv
      - data/processed/test.csv
    params:
      - train.n_estimators
      - train.max_depth
      - train.random_state
    outs:
      - models/
    metrics:
      - models/metrics.json:
          cache: false
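
The `train` stage declares `models/metrics.json` as a DVC metrics file with `cache: false`, so the training script is expected to write that file on each run. A minimal sketch of how a script could do so with the standard library follows. The `write_metrics` helper and the nested per-task layout of the JSON are assumptions for illustration, not the actual contents of `train_multitask.py`.

```python
import json
from pathlib import Path

def write_metrics(metrics, path="models/metrics.json"):
    """Write a metrics dictionary as JSON, creating the parent directory if needed."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", encoding="utf-8") as fh:
        json.dump(metrics, fh, indent=2, sort_keys=True)
    return out

# Hypothetical layout: one entry per prediction task, values are placeholders
# that a real run would fill with computed scores.
example_metrics = {
    "match_winner": {"accuracy": None, "precision": None},
    "round_difference": {"mae": None, "rmse": None},
}
```

Calling `write_metrics(example_metrics)` at the end of training would produce a file that `dvc metrics show` can read; the default path matches the one declared in the stage.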