This vignette demonstrates the advanced features of Athlytics for sports physiology analysis:
These features make Athlytics suitable for both individual training analysis and multi-athlete research studies.
Before calculating physiological metrics, it’s crucial to ensure data
quality. The flag_quality() function detects common issues
in activity stream data.
library(Athlytics)
library(dplyr)
# Parse stream data from a FIT/TCX/GPX file
stream_data <- parse_activity_file("path/to/activity.fit")
# Flag quality issues
flagged_data <- flag_quality(
stream_data,
sport = "Run",
hr_range = c(30, 220), # Valid HR range
pw_range = c(0, 1500), # Valid power range
max_run_speed = 7.0, # Max speed (m/s) ≈ 2:23/km pace
max_hr_jump = 10, # Max HR change per second
max_pw_jump = 300, # Max power change per second
min_steady_minutes = 20, # Min duration for steady-state
steady_cv_threshold = 8 # CV% threshold for steady-state
)
# View quality summary
summary_stats <- summarize_quality(flagged_data)
print(summary_stats)Output interpretation: - quality_score:
Overall quality (0-1), where 1 = perfect - flagged_pct:
Percentage of data points with issues - steady_state_pct:
Percentage suitable for EF/decoupling calculations
Without quality control: - HR spikes from sensor errors inflate HRSS calculations - GPS drift creates artificially high speeds, affecting pace metrics - Non-steady-state segments contaminate Efficiency Factor (EF) calculations
With quality control: - Only clean data contributes to metrics - Steady-state detection ensures valid EF/decoupling calculations - Researchers can report data quality in publications
Traditional ACWR uses rolling averages (RA). The EWMA method, introduced by Williams et al. (2017), offers: - Greater sensitivity to recent training changes - Customizable decay rates via half-life parameters - Confidence bands via bootstrap to quantify uncertainty
Reference: Williams, S., et al. (2017). Better way to determine the acute:chronic workload ratio? British Journal of Sports Medicine, 51(3), 209-210. DOI: 10.1136/bjsports-2016-096589
# Load activities
activities <- load_local_activities("export_12345678.zip")
# Calculate ACWR using Rolling Average (traditional)
acwr_ra <- calculate_acwr_ewma(
activities,
activity_type = "Run",
method = "ra",
acute_period = 7,
chronic_period = 28,
load_metric = "duration_mins"
)
# Calculate ACWR using EWMA
acwr_ewma <- calculate_acwr_ewma(
activities,
activity_type = "Run",
method = "ewma",
half_life_acute = 3.5, # Acute load half-life (days)
half_life_chronic = 14, # Chronic load half-life (days)
load_metric = "duration_mins"
)
# Compare the two methods
library(ggplot2)
ggplot() +
geom_line(
data = acwr_ra, aes(x = date, y = acwr_smooth),
color = "blue", linewidth = 1, linetype = "solid"
) +
geom_line(
data = acwr_ewma, aes(x = date, y = acwr_smooth),
color = "red", linewidth = 1, linetype = "dashed"
) +
labs(
title = "ACWR: Rolling Average vs EWMA",
subtitle = "Blue = RA (7:28d) | Red = EWMA (3.5:14d half-life)",
x = "Date", y = "ACWR"
) +
theme_minimal()Key differences: - RA: Equal weight to all days in window; sudden jumps when old workouts exit window - EWMA: Exponential decay; smooth transitions; more responsive to recent changes
Confidence bands quantify uncertainty in ACWR estimates, essential for research reporting.
# Calculate EWMA with 95% confidence bands
acwr_ewma_ci <- calculate_acwr_ewma(
activities,
activity_type = "Run",
method = "ewma",
half_life_acute = 3.5,
half_life_chronic = 14,
load_metric = "duration_mins",
ci = TRUE, # Enable confidence intervals
B = 200, # Bootstrap iterations
block_len = 7, # Block length (days) for bootstrap
conf_level = 0.95 # 95% CI
)
# Plot with confidence bands
ggplot(acwr_ewma_ci, aes(x = date)) +
geom_ribbon(aes(ymin = acwr_lower, ymax = acwr_upper),
fill = "gray70", alpha = 0.5
) +
geom_line(aes(y = acwr_smooth), color = "black", linewidth = 1) +
geom_hline(yintercept = c(0.8, 1.3), linetype = "dotted", color = "green") +
geom_hline(yintercept = 1.5, linetype = "dotted", color = "red") +
labs(
title = "ACWR with 95% Confidence Bands",
subtitle = "Gray band = bootstrap confidence interval",
x = "Date", y = "ACWR"
) +
theme_minimal()Interpretation: - Gray band: 95% confidence interval from moving-block bootstrap - Wider bands = more uncertainty (e.g., during high load variability) - Narrower bands = more stable estimates
The confidence bands use moving-block bootstrap: 1. Daily load sequence is resampled in weekly blocks (preserves within-week correlation) 2. ACWR is recalculated for each bootstrap sample (B = 200 iterations) 3. 2.5th and 97.5th percentiles form the 95% confidence band
This approach is computationally efficient and doesn’t require distributional assumptions.
When analyzing multiple athletes (teams, research cohorts), it’s valuable to compare individuals to group norms.
# Load data for multiple athletes
athlete1 <- load_local_activities("athlete1_export.zip") |>
mutate(athlete_id = "athlete1", sex = "M", age_band = "25-35")
athlete2 <- load_local_activities("athlete2_export.zip") |>
mutate(athlete_id = "athlete2", sex = "F", age_band = "25-35")
athlete3 <- load_local_activities("athlete3_export.zip") |>
mutate(athlete_id = "athlete3", sex = "M", age_band = "35-45")
# Combine
cohort_data <- bind_rows(athlete1, athlete2, athlete3)
# Calculate ACWR for each athlete
cohort_acwr <- cohort_data |>
group_by(athlete_id) |>
group_modify(~ calculate_acwr_ewma(.x, method = "ewma")) |>
ungroup() |>
left_join(
cohort_data |> select(athlete_id, sex, age_band) |> distinct(),
by = "athlete_id"
)# Calculate cohort reference by activity type and sex
reference <- calculate_cohort_reference(
cohort_acwr,
metric = "acwr_smooth",
by = c("type", "sex"), # Group by activity type and sex
probs = c(0.05, 0.25, 0.5, 0.75, 0.95), # Percentiles
min_athletes = 3 # Minimum athletes per group
)
# View reference structure
head(reference)Output:
# A tibble: 6 × 5
date type sex percentile value n_athletes
<date> <chr> <chr> <chr> <dbl> <int>
1 2024-01-01 Run M p05 0.65 5
2 2024-01-01 Run M p25 0.82 5
3 2024-01-01 Run M p50 0.98 5
4 2024-01-01 Run M p75 1.15 5
5 2024-01-01 Run M p95 1.38 5
# Extract one athlete's data
individual <- cohort_acwr |>
filter(athlete_id == "athlete1")
# Plot with cohort reference
p <- plot_with_reference(
individual = individual,
reference = reference |> filter(sex == "M"), # Match athlete's group
metric = "acwr_smooth",
bands = c("p25_p75", "p05_p95", "p50")
)
print(p)Interpretation: - Black line: Individual athlete’s ACWR trend - Dark shaded area (P25-P75): “Normal” range (50% of cohort) - Light shaded area (P5-P95): Extended range (90% of cohort) - Dashed line (P50): Median (cohort center)
Use cases: - Outlier detection: Athletes consistently above P95 or below P5 - Load matching: Ensure similar load profiles when comparing injury rates - Benchmarking: Compare individual’s trajectory to team norms
Here’s a complete research workflow combining all features:
library(Athlytics)
library(dplyr)
library(ggplot2)
# --- Step 1: Load multi-athlete cohort ---
cohort_files <- c("athlete1_export.zip", "athlete2_export.zip", "athlete3_export.zip")
cohort_data <- lapply(cohort_files, function(file) {
load_local_activities(file) |>
mutate(athlete_id = tools::file_path_sans_ext(basename(file)))
}) |> bind_rows()
# --- Step 2: Quality control (conceptual - for stream data) ---
# If you have stream data, flag quality before calculating metrics
# stream <- parse_activity_file("path/to/activity.fit")
# flagged <- flag_quality(stream, sport = "Run")
# summary <- summarize_quality(flagged)
# --- Step 3: Calculate ACWR with EWMA + CI ---
cohort_acwr <- cohort_data |>
group_by(athlete_id) |>
group_modify(~ calculate_acwr_ewma(
.x,
method = "ewma",
half_life_acute = 3.5,
half_life_chronic = 14,
load_metric = "duration_mins",
ci = TRUE,
B = 100 # Reduced for speed in example
)) |>
ungroup()
# --- Step 4: Calculate cohort reference ---
reference <- calculate_cohort_reference(
cohort_acwr,
metric = "acwr_smooth",
by = c("type"),
probs = c(0.05, 0.25, 0.5, 0.75, 0.95),
min_athletes = 3
)
# --- Step 5: Visualize individual with confidence + cohort ---
individual <- cohort_acwr |> filter(athlete_id == "athlete1")
p <- plot_with_reference(individual, reference, metric = "acwr_smooth") +
# Add individual's confidence bands
geom_ribbon(
data = individual,
aes(x = date, ymin = acwr_lower, ymax = acwr_upper),
fill = "blue", alpha = 0.2
)
print(p)
# --- Step 6: Export for statistical analysis ---
write.csv(cohort_acwr, "cohort_acwr_with_ci.csv", row.names = FALSE)
write.csv(reference, "cohort_reference.csv", row.names = FALSE)| Method | Pros | Cons | Recommended Use |
|---|---|---|---|
| RA (Rolling Average) | Stable, well-established, simple interpretation | Sudden jumps, equal weight to all days | Descriptive studies, replicating prior work |
| EWMA | Smooth, responsive, customizable decay | More parameters, less established norms | Monitoring, sensitive change detection |
Default parameters: - RA: 7-day acute, 28-day chronic (most common in literature) - EWMA: 3.5-day acute half-life, 14-day chronic half-life (equivalent decay)
⚠️ Critical distinction: Percentile bands are descriptive population variability, NOT confidence intervals for an individual’s “true” value.
Correct interpretation: - “Athlete was above the 75th percentile of the cohort” ✅ - “Athlete’s ACWR was significantly higher than cohort” ❌ (requires statistical test)
Minimum sample sizes: - P25-P75: n ≥ 5 athletes - P5-P95: n ≥ 20 athletes (for stable extremes)
Q: How do I choose EWMA half-life?
A: Start with defaults (3.5:14d ≈ 7:28d RA). For more responsive
monitoring, reduce both proportionally (e.g., 2.5:10d). For smoother
trends, increase (e.g., 5:20d).
Q: Bootstrap is slow. Can I reduce B?
A: Yes. B = 50–100 is often sufficient for visual inspection. For
publication, use B = 200–500.
Q: What if my cohort has < 5 athletes?
A: Set min_athletes = 3, but report this limitation.
Percentiles with n < 5 are unstable.
Q: Can I use these methods for injury
prediction?
A: No. These are descriptive monitoring
tools. Injury prediction requires proper case-control studies,
controlling for confounders, and validation. See: Carey, D. L., et
al. (2018). Predictive modelling of training loads and injury in
Australian football. International Journal of Computer Science in
Sport, 17(1), 49-66. DOI:
10.2478/ijcss-2018-0002; Impellizzeri, F. M., et al. (2021). What
role do chronic workloads play in the acute to chronic workload ratio?
Time to dismiss ACWR and its underlying theory. Sports
Medicine, 51(3), 581-592. DOI:
10.1007/s40279-020-01378-6
Q: How do I cite Athlytics?
A: Use the standard software citation format with version number and
repository URL. See the main tutorial for the complete citation.
sessionInfo()
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] rmarkdown_2.30
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 R6_2.6.1 fastmap_1.2.0 xfun_0.56
#> [5] maketools_1.3.2 cachem_1.1.0 knitr_1.51 htmltools_0.5.9
#> [9] buildtools_1.0.0 lifecycle_1.0.5 cli_3.6.5 sass_0.4.10
#> [13] jquerylib_0.1.4 compiler_4.5.2 sys_3.4.3 tools_4.5.2
#> [17] evaluate_1.0.5 bslib_0.10.0 yaml_2.3.12 jsonlite_2.0.0
#> [21] rlang_1.1.7