Alexis Bruneteau efaf5ff0e1 Fix critical data leakage in feature engineering
Removed features that contain match outcome information:
- result_1, result_2 (actual match scores - only known after match)
- ct_1, t_2, t_1, ct_2 (rounds won per side - only known after match)
- total_rounds, round_diff (derived from results)

These features caused perfect 1.0 accuracy because the model was
essentially "cheating" by knowing the match outcome.
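
To illustrate why these columns leak the label, here is a minimal sketch. The rows are made up for illustration (not project data), and it assumes the target is "team 1 won" and that a lower rank number means a stronger team, neither of which is stated above. A one-line rule on result_1 and result_2 reproduces the label exactly, while a rule limited to pre-match rankings does not.

```python
# Hypothetical rows for illustration only; the values are invented, not project data.
matches = [
    {"result_1": 16, "result_2": 9, "rank_1": 3, "rank_2": 12},
    {"result_1": 11, "result_2": 16, "rank_1": 5, "rank_2": 8},
    {"result_1": 16, "result_2": 14, "rank_1": 20, "rank_2": 7},
]


def team_1_won(row):
    """Assumed label: 1 if team 1 finished with more rounds than team 2."""
    return int(row["result_1"] > row["result_2"])


def leaky_rule(row):
    """A 'model' that only compares the final scores it was handed as features."""
    return int(row["result_1"] - row["result_2"] > 0)


def rank_rule(row):
    """Pre-match baseline: pick the team with the better (assumed lower) rank."""
    return int(row["rank_1"] < row["rank_2"])


def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the assumed label."""
    hits = sum(predict(row) == team_1_won(row) for row in rows)
    return hits / len(rows)


leaky_accuracy = accuracy(leaky_rule, matches)
rank_accuracy = accuracy(rank_rule, matches)
```

A perfect score from such a trivial rule is the signal that the feature encodes the outcome rather than predicts it.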

Now using only pre-match information:
- Team rankings (rank_1, rank_2)
- Historical map performance (map_wins_1, map_wins_2)
- Starting side (starting_ct)
- Derived: rank_diff, map_wins_diff

This should give a realistic estimate of model performance, based only
on what is actually known before a match starts.
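
The feature selection described above could be sketched as follows. The column names come from this commit message; the helper name `build_pre_match_features`, the dict-per-match row format, and the sign convention of the two differences (team 1 minus team 2) are assumptions made for illustration, not the project's actual code.

```python
# Columns that are only known after the match (listed in this commit message).
LEAKY_COLUMNS = frozenset({
    "result_1", "result_2",
    "ct_1", "t_2", "t_1", "ct_2",
    "total_rounds", "round_diff",
})

# Columns assumed to be available before the match starts.
PRE_MATCH_COLUMNS = ("rank_1", "rank_2", "map_wins_1", "map_wins_2", "starting_ct")


def build_pre_match_features(row):
    """Return a feature dict built only from pre-match columns (hypothetical helper)."""
    features = {name: row[name] for name in PRE_MATCH_COLUMNS}
    # Derived features; the direction (team 1 minus team 2) is an assumption.
    features["rank_diff"] = row["rank_1"] - row["rank_2"]
    features["map_wins_diff"] = row["map_wins_1"] - row["map_wins_2"]
    # Guard against a leaky column slipping back into the feature set.
    leaked = LEAKY_COLUMNS.intersection(features)
    if leaked:
        raise ValueError(f"post-match columns in features: {sorted(leaked)}")
    return features


# Illustrative row with invented values; post-match columns are simply ignored.
example_row = {
    "rank_1": 3, "rank_2": 12,
    "map_wins_1": 10, "map_wins_2": 4,
    "starting_ct": 1,
    "result_1": 16, "result_2": 9,
}
example_features = build_pre_match_features(example_row)
```

Keeping the leaky column names in one constant also makes it easy to assert in a test that none of them ever reach the model.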

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-01 20:01:46 +02:00

MLOps Project

An MLOps project for analyzing CS:GO match data and training models on it.

Features

  • Data pipeline with Apache Airflow
  • Model training with PyTorch and scikit-learn
  • MLflow for experiment tracking
  • DVC for data versioning
  • Monitoring with Prometheus
  • FastAPI for API serving

Setup

  1. Install dependencies:

    poetry install
    
  2. Enable the data pipeline so that the Airflow scheduler can run it:

    airflow dags unpause csgo_data_pipeline
    

Project Structure

  • dags/: Airflow DAGs
  • src/: Source code
  • models/: Trained models
  • data/: Data files
  • notebooks/: Jupyter notebooks
  • tests/: Test files
  • config/: Configuration files
  • docker/: Docker files
  • kubernetes/: Kubernetes manifests