Performance Predictor: CTL / ATL / TSB
Objective: model training load (Fitness-Fatigue) and predict race times.
- CTL (Chronic Training Load) = fitness over 42 days
- ATL (Acute Training Load) = fatigue over 7 days
- TSB (Training Stress Balance) = fitness − fatigue = form on the day
- Pace (min/km) = prediction target, normalized across distances and sports
Data source: Laravel MySQL database (Strava + Withings + Nolio)
In [1]:
import sys
sys.path.append('..')
import pandas as pd
import plotly.io as pio
pio.renderers.default = 'notebook'
from src.data.loader import load_activities, load_competitions, load_weight_measurements
from src.features.training_load import compute_tss, build_daily_load, compute_ctl_atl_tsb, build_race_features
from src.models.performance_predictor import train, evaluate_loo, add_pace, FEATURE_COLS, OPTIONAL_FEATURES, TARGET
from src.viz.charts import chart_fitness_fatigue, chart_feature_importance, chart_predicted_vs_actual
1. Loading the data
Set USER_ID to the ID of your user in the database.
In [2]:
USER_ID = 1
activities = load_activities(user_id=USER_ID)
competitions = load_competitions(user_id=USER_ID)
weights = load_weight_measurements(user_id=USER_ID)
print(f"Activities: {len(activities)} sessions")
print(f"Competitions: {len(competitions)} races")
print(f"Weigh-ins: {len(weights)} measurements")
activities.head(3)
Activities: 2969 sessions
Competitions: 21 races
Weigh-ins: 618 measurements
Out[2]:
| | id | user_id | type | name | start_date_local | distance | moving_time | elapsed_time | total_elevation_gain | average_speed | max_speed | average_heartrate | max_heartrate | average_cadence | average_watts | max_watts | suffer_score | distance_km | moving_time_h |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2057 | 1 | Ride | 2014/12/26_13,69_1:13'48" | 2014-12-26 03:41:00 | 13653.1 | 4414.0 | 4414.0 | 13.0 | 3.09 | 7.0 | 151.3 | 193.0 | NaN | 56.0 | NaN | 63.0 | 13.6531 | 1.226111 |
| 1 | 2059 | 1 | Run | 2014/12/26_1,19_7'14" | 2014-12-26 10:17:05 | 1228.4 | 395.0 | 395.0 | NaN | 3.11 | 8.0 | 193.2 | 207.0 | NaN | NaN | NaN | 31.0 | 1.2284 | 0.109722 |
| 2 | 2058 | 1 | Run | 2014/12/26_0,28_1'32"8 | 2014-12-26 10:28:15 | 276.4 | 93.0 | 93.0 | NaN | 2.97 | 3.5 | 173.1 | 195.0 | NaN | NaN | NaN | 3.0 | 0.2764 | 0.025833 |
2. Computing TSS per activity
In [3]:
activities = compute_tss(activities)
print("TSS by sport type:")
tss_coverage = activities.groupby('type').agg(
    total=('id', 'count'),
    with_tss=('tss', lambda x: x.notna().sum()),
    avg_tss=('tss', 'mean')
).round(1)
print(tss_coverage)
TSS by sport type:
| type | total | with_tss | avg_tss |
|---|---|---|---|
| Canoeing | 2 | 1 | 63.8 |
| Crossfit | 9 | 0 | NaN |
| Hike | 91 | 29 | 72.5 |
| Kayaking | 6 | 6 | 68.9 |
| Ride | 715 | 711 | 26.8 |
| Run | 1283 | 834 | 72.9 |
| StandUpPaddling | 1 | 1 | 40.0 |
| Swim | 375 | 224 | 65.0 |
| VirtualRide | 292 | 292 | 31.8 |
| Walk | 34 | 34 | 40.8 |
| WeightTraining | 34 | 33 | 25.4 |
| Workout | 127 | 58 | 6.7 |
3. CTL / ATL / TSB
In [4]:
daily = build_daily_load(activities)
daily = compute_ctl_atl_tsb(daily)
print(f"Period: {daily.index.min().date()} -> {daily.index.max().date()}")
print(f"CTL max: {daily['ctl'].max():.1f} | ATL max: {daily['atl'].max():.1f}")
print(f"TSB min: {daily['tsb'].min():.1f} | TSB max: {daily['tsb'].max():.1f}")
daily.tail()
Period: 2014-12-26 -> 2026-06-21
CTL max: 99.9 | ATL max: 143.8
TSB min: -53.3 | TSB max: 32.9
Out[4]:
| date | tss | activity_count | ctl | atl | tsb |
|---|---|---|---|---|---|
| 2026-06-17 | 58.525634 | 1 | 77.677528 | 80.738581 | -3.061054 |
| 2026-06-18 | 92.516268 | 2 | 78.026658 | 82.306452 | -4.279794 |
| 2026-06-19 | 33.212000 | 1 | 76.972245 | 75.770895 | 1.201350 |
| 2026-06-20 | 57.431266 | 1 | 76.512479 | 73.329485 | 3.182993 |
| 2026-06-21 | 64.781501 | 1 | 76.236468 | 72.191560 | 4.044909 |
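The implementation of `compute_ctl_atl_tsb` is not shown in this notebook. As a minimal sketch, assuming CTL and ATL are exponentially weighted averages of daily TSS with time constants of 42 and 7 days, a smoothing factor of `1 - exp(-1/tau)`, and TSB taken as same-day CTL minus ATL, the recursion below can be seeded with the 2026-06-17 row and rolled forward over the following four days of TSS. The helper names `ewma_step` and `roll_forward` are hypothetical and exist only for this illustration.

```python
import math


def ewma_step(previous: float, tss: float, tau_days: float) -> float:
    """One daily update of an exponentially weighted load.

    Assumption: the smoothing factor is 1 - exp(-1 / tau_days).
    """
    alpha = 1.0 - math.exp(-1.0 / tau_days)
    return previous + alpha * (tss - previous)


def roll_forward(ctl: float, atl: float, daily_tss: list[float]) -> tuple[float, float, float]:
    """Apply successive daily TSS values and return (ctl, atl, tsb)."""
    for tss in daily_tss:
        ctl = ewma_step(ctl, tss, 42.0)  # chronic load, 42-day constant
        atl = ewma_step(atl, tss, 7.0)   # acute load, 7-day constant
    return ctl, atl, ctl - atl           # TSB as same-day CTL minus ATL


# Seed: CTL and ATL from the 2026-06-17 row of the table above.
# Inputs: the tss column for 2026-06-18 through 2026-06-21.
ctl, atl, tsb = roll_forward(
    77.677528,
    80.738581,
    [92.516268, 33.212000, 57.431266, 64.781501],
)
print(f"ctl={ctl:.3f} atl={atl:.3f} tsb={tsb:.3f}")
```

Under these assumptions the result agrees with the 2026-06-21 row of the table to within rounding, which suggests (but does not prove) that the project uses the same recursion.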
4. Fitness-Fatigue visualization
In [5]:
fig = chart_fitness_fatigue(daily, competitions)
fig.show()
5. Race features (CTL/ATL/TSB on race day for each race)
In [6]:
race_features = build_race_features(
    competitions=competitions,
    daily_load=daily,
    weight_df=weights if not weights.empty else None,
)
race_features[['event_name', 'competition_date', 'sport', 'race_distance_km',
               'achieved_time_h', 'ctl', 'atl', 'tsb', 'tss_sum_8w']].round(2)
Out[6]:
| | event_name | competition_date | sport | race_distance_km | achieved_time_h | ctl | atl | tsb | tss_sum_8w |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Foulées du Lac Kir 2018 | 2018-04-14 | run | 5.0 | 0.37 | 14.49 | 26.43 | -11.94 | 889.52 |
| 1 | UT3C 2019 | 2019-05-18 | run | 10.0 | 0.88 | 9.73 | 2.65 | 7.08 | 731.16 |
| 2 | Semi de Paris 2022 | 2022-03-06 | run | 21.1 | 2.03 | 64.94 | 66.25 | -1.31 | 3971.53 |
| 3 | Semi de Troyes 2022 | 2022-05-15 | run | 21.1 | 2.01 | 64.48 | 76.13 | -11.65 | 3486.05 |
| 4 | Semi de Reims | 2022-10-15 | run | 21.1 | 2.11 | 48.19 | 30.55 | 17.64 | 2889.39 |
| 5 | Semi de Paris 2023 | 2023-03-05 | run | 21.1 | 2.22 | 52.11 | 72.28 | -20.18 | 2650.83 |
| 6 | Marathon de Paris 2023 | 2023-04-02 | run | 42.2 | 5.27 | 47.91 | 76.14 | -28.23 | 2255.18 |
| 7 | Ironman 70.3 Vichy 2023 | 2023-08-19 | triathlon | 113.0 | 7.44 | 57.28 | 98.57 | -41.29 | 3120.40 |
| 8 | Semi de Paris 2024 | 2024-03-03 | run | 21.1 | 2.14 | 63.64 | 74.26 | -10.62 | 3519.25 |
| 9 | RAP 300k 2024 | 2024-04-26 | bike | 300.0 | 17.45 | 56.22 | 72.73 | -16.51 | 2714.24 |
| 10 | Ironman 70.3 Sables d'Olonne 2024 | 2024-06-30 | triathlon | 113.0 | 6.48 | 55.60 | 92.33 | -36.73 | 2730.09 |
| 11 | Foulées du Petit Bleu 2024 | 2024-09-08 | run | 5.0 | 0.48 | 31.37 | 43.63 | -12.26 | 1247.97 |
| 12 | Semi Tournefeuille 2024 | 2024-10-13 | run | 21.1 | 2.43 | 43.33 | 70.66 | -27.33 | 2251.52 |
| 13 | Toulouse Run Expérience 2024 | 2024-11-10 | run | 42.2 | 5.57 | 43.35 | 80.41 | -37.06 | 2075.13 |
| 14 | Half Frenchman 2025 | 2025-05-30 | triathlon | 113.0 | 6.57 | 96.48 | 110.93 | -14.45 | 5274.09 |
| 15 | Ironman Sables d'Olonne 2025 | 2025-06-22 | triathlon | 226.3 | 14.04 | 97.20 | 143.81 | -46.60 | 4904.91 |
| 16 | Foulées du Petit Bleu 2025 | 2025-09-14 | run | 10.0 | 1.05 | 61.25 | 56.53 | 4.72 | 3009.11 |
| 17 | Semi Tournefeuille 2025 | 2025-10-12 | run | 21.1 | 2.45 | 57.23 | 60.22 | -2.99 | 2776.76 |
| 18 | Toulouse Run Expérience 2025 | 2025-11-02 | run | 42.2 | 5.79 | 52.64 | 82.93 | -30.29 | 2219.20 |
| 19 | TCS London Marathon 2026 | 2026-04-26 | run | 42.2 | 5.01 | 83.81 | 114.99 | -31.17 | 4544.00 |
| 20 | Half Frenchman 2026 | 2026-05-15 | triathlon | 113.0 | 6.38 | 78.13 | 96.38 | -18.25 | 4334.06 |
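The notebook does not define `tss_sum_8w`. One plausible reading, offered here only as an assumption, is the sum of daily TSS over the 56 days strictly before race day. The sketch below illustrates that windowing on toy data; `tss_sum_before` is a hypothetical helper, not a function from `src.features.training_load`, and whether the real feature includes race day is unknown.

```python
from datetime import date, timedelta


def tss_sum_before(daily_tss: dict[date, float], race_day: date, weeks: int) -> float:
    """Sum daily TSS over the `weeks * 7` days strictly before race_day.

    Assumption: race day itself is excluded from the window.
    """
    start = race_day - timedelta(days=weeks * 7)
    return sum(tss for day, tss in daily_tss.items() if start <= day < race_day)


# Toy data: a constant 50 TSS per day for 70 days, ending on a made-up race day.
race_day = date(2025, 6, 1)
toy_load = {race_day - timedelta(days=k): 50.0 for k in range(70)}

eight_weeks = tss_sum_before(toy_load, race_day, weeks=8)  # 56 days in the window
four_weeks = tss_sum_before(toy_load, race_day, weeks=4)   # 28 days in the window
print(eight_weeks, four_weeks)
```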
6. View by sport
In [7]:
rf_pace = add_pace(race_features.dropna(subset=['race_distance_km']))
print("Races by sport (with a recorded distance):")
summary = rf_pace.groupby('sport').agg(
    n_courses=('achieved_time_h', 'count'),
    pace_moyen_min_km=('pace_min_per_km', 'mean'),
    ctl_moyen=('ctl', 'mean'),
    tsb_moyen=('tsb', 'mean'),
).round(2)
print(summary)
Races by sport (with a recorded distance):
| sport | n_courses | pace_moyen_min_km | ctl_moyen | tsb_moyen |
|---|---|---|---|---|
| bike | 1 | 3.49 | 56.22 | -16.51 |
| run | 15 | 6.42 | 49.23 | -13.04 |
| triathlon | 5 | 3.60 | 76.94 | -31.47 |
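The body of `add_pace` is not shown, but section 10 converts a predicted pace back to hours with `pace * race_distance_km / 60`, so pace is presumably the inverse of that: finish time in minutes divided by distance in kilometres. A minimal sketch under that assumption, using two rows from the race table (the function names are hypothetical):

```python
def pace_min_per_km(time_h: float, distance_km: float) -> float:
    """Pace in min/km from a finish time in hours and a distance in km."""
    return time_h * 60.0 / distance_km


def time_h_from_pace(pace: float, distance_km: float) -> float:
    """Inverse conversion, matching the formula used in section 10."""
    return pace * distance_km / 60.0


# Values taken from the race features table.
bike_pace = pace_min_per_km(17.45, 300.0)     # RAP 300k 2024
marathon_pace = pace_min_per_km(5.27, 42.2)   # Marathon de Paris 2023
print(round(bike_pace, 2), round(marathon_pace, 2))
```

The bike value rounds to 3.49 min/km, which matches the single-race `bike` row of the summary.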
7. Training the model (target: pace in min/km)
In [8]:
try:
    model, importance = train(race_features)
    print("Model trained on pace (min/km).")
    print(importance.to_string(index=False))
except ValueError as e:
    print(f"[!] {e}")
Model trained on pace (min/km).
| feature | importance |
|---|---|
| weight_kg | 0.284545 |
| fat_ratio | 0.162420 |
| tsb | 0.136727 |
| atl | 0.099575 |
| tss_days_4w | 0.078722 |
| tss_days_12w | 0.075373 |
| ctl | 0.064612 |
| tss_sum_12w | 0.040487 |
| tss_sum_8w | 0.031167 |
| tss_sum_4w | 0.022150 |
| tss_days_8w | 0.004221 |
8. Leave-One-Out evaluation
In [9]:
metrics = evaluate_loo(race_features)
n = metrics['n_samples']
print(f"LOO cross-validation ({n} races with distance)")
if metrics['mae_min_per_km']:
    print(f"  MAE pace: {metrics['mae_min_per_km']} min/km")
    print(f"  MAE time: +/- {metrics['mae_min']:.0f} min on average")
else:
    print("  Not enough data for a robust evaluation.")
print("")
print("By sport:")
for sport in race_features['sport'].dropna().unique():
    m = evaluate_loo(race_features, sport=sport)
    if m['mae_min_per_km']:
        print(f"  {sport:12s}: +/- {m['mae_min']:.0f} min ({m['n_samples']} races)")
    else:
        print(f"  {sport:12s}: too little data ({m['n_samples']} races)")
LOO cross-validation (12 races with distance)
  MAE pace: 1.798 min/km
  MAE time: +/- 193 min on average

By sport:
  run         : +/- 16 min (10 races)
  triathlon   : too little data (4 races)
  bike        : too little data (1 races)
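To make the leave-one-out procedure concrete, here is a minimal sketch that holds out each race in turn, fits on the rest, and averages the absolute errors. It is not the code behind `evaluate_loo`: the "model" is simply the mean pace of the remaining races, and the pace values are made up for the illustration.

```python
from statistics import mean


def loo_mae(values: list[float]) -> float:
    """Leave-one-out mean absolute error with a mean-of-the-rest predictor."""
    errors = []
    for i, held_out in enumerate(values):
        training = values[:i] + values[i + 1:]  # every race except the held-out one
        prediction = mean(training)             # stand-in for a real fitted model
        errors.append(abs(prediction - held_out))
    return mean(errors)


# Hypothetical paces in min/km, not taken from the dataset.
toy_paces = [5.0, 6.0, 7.0]
mae_pace = loo_mae(toy_paces)
print(f"LOO MAE: {mae_pace} min/km")
```

Because the time error of a race is its pace error multiplied by its distance, an error of 1 min/km corresponds to about 21 minutes over a half marathon but 300 minutes over a 300 km ride. This may account for part of the gap between the overall figure and the run-only figure above.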
9. Feature importance
In [10]:
try:
    fig = chart_feature_importance(importance)
    fig.show()
except NameError:
    # `importance` is undefined if training failed in section 7.
    print("Model not trained, skipping.")
10. Predicted vs. actual time
In [11]:
try:
    # Keep only the feature columns that are present in race_features.
    feature_cols = [c for c in FEATURE_COLS + OPTIONAL_FEATURES if c in race_features.columns]
    valid = add_pace(race_features.dropna(subset=['race_distance_km']))
    valid = valid.dropna(subset=feature_cols + [TARGET])
    # The model predicts pace (min/km); convert it to a finish time in hours.
    pred_pace = model.predict(valid[feature_cols].values)
    pred_time_h = (pred_pace * valid['race_distance_km']) / 60
    fig = chart_predicted_vs_actual(
        actual=valid['achieved_time_h'],
        predicted=pd.Series(pred_time_h, index=valid.index),
        labels=valid['event_name'] + ' (' + valid['sport'] + ')',
    )
    fig.show()
except Exception:
    import traceback
    traceback.print_exc()