Track CI section timings and chart trends over commits #1080

Open. adrienbernede wants to merge 5 commits into develop from woptim/ci-timings-tracking

adrienbernede commented Jun 27, 2026

Summary

scripts/gitlab/build_and_test.sh already prints collapsible per-section timings into the job log. This PR additionally persists those timings as data and charts how each section's duration evolves across commits, so we can spot build/test regressions over time.

The chart is what was envisioned: one cumulative (stacked-area) graph per (machine, job) — commit on the X axis, seconds on the Y axis, one band per top-level section — published to GitLab Pages and regenerated by a dedicated job after the per-machine child pipelines.

What changed

  • scripts/gitlab/build_and_test.sh: section_start/section_end now also capture each section's title, nesting depth, and parent. A new write_timings_file() helper emits a machine-readable timings.json (commit SHA, machine, job, spec, pipeline id, exit_code, total seconds, and a sections[] array). It is registered as a trap … EXIT so timings are written even when the build fails (each completed section is recorded before any early exit).
  • .gitlab/custom-jobs.yml — adds timings.json to the per-job artifacts and sets artifacts:when: always so the file (and junit.xml) is uploaded even on job failure.
  • .gitlab-ci.yml — new pages job in the radiuss-spack-testing stage, gated to the default branch.
  • scripts/gitlab/collect_and_chart_timings.py (new) — collects each child job's timings.json via the GitLab API, merges into a persistent history.jsonl, renders the charts, and writes an index.html.
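
To make the record shape above concrete, here is a minimal Python sketch of what one timings.json document might contain. The real file is written by write_timings_file() in bash; every key name below (sha, pipeline_id, total_seconds, seconds, and so on) is an assumption for illustration, as are the sample values. Only the list of fields and the "schema": 1 marker come from this description.

```python
import json


def make_timings_record(sha, machine, job, spec, pipeline_id,
                        exit_code, total_seconds, sections):
    """Assemble one per-job record. All key names are hypothetical."""
    return {
        "schema": 1,  # version field mentioned under "Data format"
        "sha": sha,
        "machine": machine,
        "job": job,
        "spec": spec,
        "pipeline_id": pipeline_id,
        "exit_code": exit_code,
        "total_seconds": total_seconds,
        # One entry per completed section: title, nesting depth, parent.
        "sections": [
            {"title": title, "depth": depth, "parent": parent, "seconds": secs}
            for (title, depth, parent, secs) in sections
        ],
    }


# Sample values are made up; depth 0 is assumed to mean "top-level".
record = make_timings_record(
    sha="0123abcd", machine="machine-a", job="gcc-build",
    spec="hypothetical-spec", pipeline_id=1, exit_code=0,
    total_seconds=150.0,
    sections=[
        ("Build", 0, None, 100.0),
        ('Compile "core"', 1, "Build", 60.0),  # title containing quotes
        ("Test", 0, None, 50.0),
    ],
)

# json.dumps handles quote escaping, which the bash helper must do by hand.
text = json.dumps(record, indent=2)
```

Round-tripping the text through json.loads is a cheap way to check that titles containing quotes survive serialization, which is the case the local verification calls out.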

Design choices

  • Why a custom chart job, not GitLab's native performance tracking. GitLab's artifacts:reports:metrics / browser-performance reports only render as merge-request-vs-target-branch widgets (and metrics reports are Premium) — they do not produce a cross-commit time series. The "evolution over commits" view therefore has to be built: collect → persist → render.
  • Persistence without an extra branch or token. Each pipeline's artifacts are isolated and expire, so the accumulated history can't live in a single pipeline. Instead the pages job's own history.jsonl artifact is the store: each default-branch run downloads the latest default-branch pipeline's history.jsonl, appends the new data points (deduped by sha|machine|job, newest ts wins), and re-publishes it. This relies on GitLab's "Keep artifacts from the most recent successful pipelines" setting (default on); expire_in: never is set as belt-and-suspenders. No data branch, no push token.
  • Data format. Per-job JSON + JSONL history — easy to append and to feed a plotting script. A "schema": 1 version field is included for forward-compatibility if the record shape changes.
  • Scope. Timings are collected from every child job on all machines; history accumulation + charting happen only on the default branch, so the trend line tracks mainline.
  • Robustness. needs: on the trigger jobs is optional: true, so the pages job still runs (charting whatever exists) when a machine block is commented out or its availability check fails. Failed builds upload partial timings; each record carries exit_code for optional downstream filtering.
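
The merge step described under "Persistence" can be sketched roughly as follows. This is not the code in collect_and_chart_timings.py: the function names, the record keys (sha, machine, job, ts), and the sort order are assumptions chosen to match the prose (dedupe by sha|machine|job, newest ts wins).

```python
import json


def parse_jsonl(text):
    """Parse a JSONL string into a list of records, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def dump_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "".join(json.dumps(rec, sort_keys=True) + "\n" for rec in records)


def merge_history(existing, new_records):
    """Merge new data points into the history.

    Records are keyed by "sha|machine|job"; when two records share a key,
    the one with the newer "ts" is kept. The result is sorted by "ts"
    (the sort key is an assumption, the description only says "sorted").
    """
    latest = {}
    for rec in list(existing) + list(new_records):
        key = f"{rec['sha']}|{rec['machine']}|{rec['job']}"
        kept = latest.get(key)
        if kept is None or rec["ts"] >= kept["ts"]:
            latest[key] = rec
    return sorted(latest.values(),
                  key=lambda r: (r["ts"], r["machine"], r["job"]))
```

Because the merge is idempotent (merging the same records twice gives the same history), a retried pages job would not duplicate data points.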

Verification done locally

  • bash -n and py_compile pass; both YAML files parse.
  • Drove the real write_timings_file with populated records → valid JSON with correct nesting and quote-escaping (incl. the failure/EXIT-trap path).
  • Ran the collector in offline mode end-to-end: history merged/deduped/sorted, per-(machine, job) PNGs + index.html generated, and the stacked chart renders as intended (nested depth-1 sections correctly excluded from the stack).
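
As an illustration of how the stack could be assembled for one (machine, job) pair, the sketch below pivots history records into one series per top-level section and drops nested sections. The helper name and record keys are hypothetical, and it assumes top-level sections carry depth 0, which is consistent with depth-1 sections being excluded from the stack.

```python
def stack_series(history, machine, job):
    """Return (commits, bands) for one (machine, job) chart.

    commits: list of commit SHAs in "ts" order (the X axis).
    bands:   dict mapping each top-level section title to a list of
             seconds, one value per commit (0.0 where a section is
             missing, e.g. after an early exit).
    """
    points = [r for r in history
              if r["machine"] == machine and r["job"] == job]
    points.sort(key=lambda r: r["ts"])

    commits = [r["sha"] for r in points]
    bands = {}
    for i, rec in enumerate(points):
        for sec in rec["sections"]:
            if sec["depth"] != 0:
                continue  # nested sections are already counted in their parent
            series = bands.setdefault(sec["title"], [0.0] * len(points))
            series[i] += sec["seconds"]
    return commits, bands
```

The result maps directly onto matplotlib's stackplot, for example `ax.stackplot(range(len(commits)), *bands.values(), labels=list(bands.keys()))`, with the commit SHAs used as tick labels.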

To verify on real CI / left to do

  • Child-pipeline completion. The pages job assumes the *-build-and-test trigger jobs propagate downstream completion (GitLab trigger:strategy: depend). The RADIUSS .build-and-test template is expected to set this; if child artifacts are missing in pages, add strategy: depend to the trigger blocks. (Flagged in a comment on the job.)
  • Runner prerequisites. The [shell, oslic] runner needs python3 + outbound network for the venv/pip install matplotlib step and the GitLab API calls. Confirm GitLab Pages is enabled and the keep-latest-artifacts setting is on.
  • (Follow-up, not in this PR) Cache the Python env instead of reinstalling matplotlib each run — e.g. a pinned requirements-charts.txt + a cache: block, or a runner image with matplotlib preinstalled.
  • (Optional) Filter failed builds out of trend lines using exit_code, if partial-timing data proves noisy.
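
If the optional filtering turns out to be needed, it could be as small as the following sketch. The function name is hypothetical, and it assumes exit_code is stored as an integer with 0 meaning success; records without the field are treated as failed.

```python
def successful_only(history):
    """Keep only records whose build exited cleanly (exit_code == 0)."""
    return [rec for rec in history if rec.get("exit_code") == 0]
```

Applying the filter at render time rather than at merge time would keep the partial timings in history.jsonl, so the decision stays reversible.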

🤖 Generated with Claude Code

adrienbernede and others added 5 commits June 27, 2026 17:30
build_and_test.sh now writes a machine-readable timings.json (per-section
durations) via an EXIT trap, so timings are captured even on failure.
A new parent-pipeline `pages` job collects these across child pipelines,
merges them into a persistent history.jsonl, and publishes per-job
stacked-area trend charts to GitLab Pages.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Drop the optional needs from the pages job; the build-and-test trigger
jobs already inherit trigger:strategy: depend from the shared-CI
.build-and-test template, so the build-and-test stage completes only once
all child pipelines finish. The pages job, in the last stage, therefore
runs after them without a DAG dependency (and avoids the empty-optional-
needs early-start case).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>