Bumping llm-d release to latest version v0.8.1 by dumb0002 · Pull Request #1337 · llm-d/llm-d-workload-variant-autoscaler

dumb0002 · 2026-06-25T16:02:30Z

This PR bumps the llm-d release to v0.8.1 (latest).

Proposed Changes

🎁 Migrate EPP install to the upstream llm-d-router-standalone chart
(oci://ghcr.io/llm-d/charts/llm-d-router-standalone), replacing the
kubernetes-sigs GAIE standalone chart. Updates deploy/lib/infra_epp.sh.
🧹 Pin the new llm-d-router-standalone chart and llm-d-router-endpoint-picker
EPP image to v0.9.0 via a new LLM_D_ROUTER_VERSION variable (default v0.9.0),
separate from LLM_D_RELEASE (default v0.8.1, controls only the guide values
fetched from llm-d/llm-d). The two are released independently — the
llm-d/llm-d-router repo never published a v0.8.1 chart, so deriving the
chart version from LLM_D_RELEASE would 404. Threaded through Makefile,
CI workflows, install scripts, and poc.mk.
🧹 Bump EPP image to ghcr.io/llm-d/llm-d-router-endpoint-picker:v0.9.0
(was llm-d-inference-scheduler:v0.8.0) following the upstream rename
Inference Scheduler → llm-d Router. The old image repo no longer ships
new tags.
🐛 Reshape values overlays for the new chart: model-{a,b}-epp-values.yaml
and epp-flow-control.values.yaml now use the chart's router: wrapper
(was flat inferenceExtension:/inferencePool:) and the v0.8.0
EndpointPickerConfig apiVersion (llm-d.ai/v1alpha1). Without this,
the new chart fails to render with
.Values.inferencePool.modelServers.matchLabels is required.
🧹 Default LLM_D_RELEASE to v0.8.1 everywhere
(Makefile, ci-e2e-openshift.yaml, ci-pr-checks.yaml, deploy/README.md,
deploy/kubernetes/README.md). install-epp.sh was already at v0.8.1; the
rest were stuck at v0.7.0, so make deploy-… and direct script invocations
picked different versions.
🧹 Update remaining references to the old chart name in poc.mk,
install-epp.sh, cleanup.sh, deploy-infrastructure.sh, and
kind-emulator/README.md. Comment/docs only.
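
The split between the two version knobs described above can be sketched in shell roughly as follows. This is a minimal illustration, not the contents of deploy/lib/infra_epp.sh: the variable names, defaults, chart URL, and image name come from the list above, while the `print_epp_install` helper and the `epp` release name are hypothetical. It prints the Helm command instead of running it, and it assumes the version string is passed to `--version` unchanged.

```shell
# Sketch only: print_epp_install and the "epp" release name are hypothetical.
# Each knob has its own default, so overriding one never moves the other.
LLM_D_RELEASE="${LLM_D_RELEASE:-v0.8.1}"                # guide values fetched from llm-d/llm-d
LLM_D_ROUTER_VERSION="${LLM_D_ROUTER_VERSION:-v0.9.0}"  # router chart and EPP image

ROUTER_CHART="oci://ghcr.io/llm-d/charts/llm-d-router-standalone"
EPP_IMAGE="ghcr.io/llm-d/llm-d-router-endpoint-picker:${LLM_D_ROUTER_VERSION}"

# Print (rather than execute) the install command so the sketch has no side effects.
print_epp_install() {
  local release_name="$1" values_file="$2"
  printf 'helm upgrade --install %s %s --version %s -f %s\n' \
    "$release_name" "$ROUTER_CHART" "$LLM_D_ROUTER_VERSION" "$values_file"
}

print_epp_install epp model-a-epp-values.yaml
```

Under this sketch, setting `LLM_D_RELEASE=v0.8.1` never changes the chart version, which is what avoids the 404 for the chart that was never published.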

Pre-review Checklist

E2E tests for any new behavior
Docs PR for any user-facing impact
Proposal PR for any new enhancement or change to existing behavior

Release Note

  The bundled EPP image is now `ghcr.io/llm-d/llm-d-router-endpoint-picker:v0.9.0` (was `ghcr.io/llm-d/llm-d-inference-scheduler:v0.8.0`); the upstream image was
  renamed when the project rebranded to llm-d Router. The EPP Helm install also moved from the kubernetes-sigs GAIE standalone chart to the upstream
  `llm-d-router-standalone` chart at `oci://ghcr.io/llm-d/charts/llm-d-router-standalone`, with the chart version now pinned by `LLM_D_ROUTER_VERSION` (default `v0.9.0`),
  independently of `LLM_D_RELEASE` (default `v0.8.1`), which only controls the guide values fetched from llm-d/llm-d.
  Action required: users overriding `LLM_D_RELEASE` to `v0.7.x` are no longer supported by `deploy/install-epp.sh`; users with custom EPP values overrides must rewrap
  them under the chart's new `router:` key (e.g. `inferenceExtension.image` → `router.epp.image`, `inferenceExtension.sidecar` → `router.proxy`). `GAIE_VERSION`
  continues to pin the GAIE CRDs independently.
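
For users who pass overrides as `--set` key paths, the rewrap named in the release note might look like the sketch below. `rewrap_key` is a hypothetical helper, not part of the repository. It covers only the two mappings the note gives as examples and assumes that sub-keys (such as a trailing `.tag`) carry over unchanged, which should be checked against the chart's own values file. Any other `inferenceExtension.*` or `inferencePool.*` key is rejected rather than guessed at.

```shell
# Sketch only: rewrap_key is hypothetical and knows just the two example
# mappings from the release note.
rewrap_key() {
  local key="$1"
  case "$key" in
    inferenceExtension.image|inferenceExtension.image.*)
      # Assumption: sub-keys keep their names under the new prefix.
      printf 'router.epp.image%s\n' "${key#inferenceExtension.image}" ;;
    inferenceExtension.sidecar|inferenceExtension.sidecar.*)
      printf 'router.proxy%s\n' "${key#inferenceExtension.sidecar}" ;;
    inferenceExtension.*|inferencePool.*)
      # Not covered by the release note: refuse instead of inventing a mapping.
      printf 'unmapped key: %s\n' "$key" >&2
      return 1 ;;
    *)
      # Keys outside the old flat layout pass through untouched.
      printf '%s\n' "$key" ;;
  esac
}

rewrap_key inferenceExtension.image.tag
rewrap_key inferenceExtension.sidecar
```

Failing loudly on unmapped keys is a deliberate choice here: a silently dropped override is harder to notice than an install script that stops.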

Docs

Copilot

Pull request overview

This PR updates the repository’s pinned llm-d release version defaults to v0.8.0 for more consistent local deploy and CI behavior, and updates the EPP image reference in the HPA coordinator sample values to the renamed Router EPP image.

Changes:

Bump LLM_D_RELEASE default from v0.7.0 → v0.8.0 in the Makefile, CI workflows, and deployment docs.
Update coordinator sample EPP values to use ghcr.io/llm-d/llm-d-router-endpoint-picker:v0.9.0 (renamed from llm-d-inference-scheduler).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file

| File | Description |
| --- | --- |
| Makefile | Updates default `LLM_D_RELEASE` used by make-driven deploy/test targets. |
| deploy/README.md | Updates example invocation to use `LLM_D_RELEASE=v0.8.0`. |
| deploy/kubernetes/README.md | Clarifies the current default `LLM_D_RELEASE` while keeping guide selection guidance intact. |
| config/samples/hpa/co-ordinator/model-a-epp-values.yaml | Switches sample EPP image repo/name and bumps its tag. |
| config/samples/hpa/co-ordinator/model-b-epp-values.yaml | Switches sample EPP image repo/name and bumps its tag. |
| .github/workflows/ci-pr-checks.yaml | Pins CI e2e jobs to `LLM_D_RELEASE=v0.8.0`. |
| .github/workflows/ci-e2e-openshift.yaml | Pins OpenShift e2e to `LLM_D_RELEASE=v0.8.0`. |

dumb0002 · 2026-06-25T17:35:09Z

/ok-to-test

github-actions · 2026-06-25T17:47:54Z

🚀 Kind E2E (full) triggered by /ok-to-test

View the Kind E2E workflow run

dumb0002 · 2026-06-25T18:37:09Z

/ok-to-test

github-actions · 2026-06-25T18:37:19Z

🚀 Kind E2E (full) triggered by /ok-to-test

View the Kind E2E workflow run

github-actions · 2026-06-25T18:37:25Z

🚀 OpenShift E2E — approve and run (/ok-to-test)

View the OpenShift E2E workflow run

github-actions · 2026-06-25T18:41:27Z

GPU Pre-flight Check ✅

GPUs are available for e2e-openshift tests. Proceeding with deployment.

| Resource | Total | Allocated | Available |
| --- | --- | --- | --- |
| GPUs | 50 | 26 | 24 |

| Cluster | Value |
| --- | --- |
| Nodes | 16 (7 with GPUs) |
| Total CPU | 993 cores |
| Total Memory | 10383 Gi |
| GPUs required | 4 (min) / 6 (recommended) |
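
The arithmetic behind a check like this can be sketched as follows. `gpu_preflight` is a hypothetical function, and the thresholds (fail below the minimum, warn below the recommended count) are assumptions rather than the workflow's actual logic. Only the numbers in the example call come from the tables above.

```shell
# Sketch only: gpu_preflight and its thresholds are assumptions, not the
# logic used by the real pre-flight workflow.
gpu_preflight() {
  local total="$1" allocated="$2" min="$3" recommended="$4"
  local available=$(( total - allocated ))
  if [ "$available" -lt "$min" ]; then
    echo "fail: $available available, $min required"
  elif [ "$available" -lt "$recommended" ]; then
    echo "warn: $available available, $recommended recommended"
  else
    echo "ok: $available available"
  fi
}

# Values from the report above: 50 total, 26 allocated, 4 min, 6 recommended.
gpu_preflight 50 26 4 6
```

With the reported values, 50 total minus 26 allocated leaves 24 available, which clears both thresholds in this sketch.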

mamy-CS · 2026-06-29T19:38:58Z

/ok-to-test

github-actions · 2026-06-29T19:39:08Z

🚀 Kind E2E (full) triggered by /ok-to-test

View the Kind E2E workflow run

github-actions · 2026-06-29T19:39:13Z

🚀 OpenShift E2E — approve and run (/ok-to-test)

View the OpenShift E2E workflow run

github-actions · 2026-06-29T19:42:51Z

GPU Pre-flight Check ⚠️

Low GPU headroom — tests may fail during scale-up phases.

mamy-CS

The release note could be clearer. As I understand it, the chart version is controlled by LLM_D_ROUTER_VERSION, not LLM_D_RELEASE, correct? Also, the default in the actual diff is v0.8.1, not v0.8.0 as the note says. Worth fixing before merge so users overriding these variables know which knob to turn.

mamy-CS · 2026-06-29T19:44:20Z

 POC_WVA_NS  := workload-variant-autoscaler-system
 POC_MON_NS  := workload-variant-autoscaler-monitoring
 GAIE_VERSION ?= v1.5.0
+LLM_D_RELEASE ?= v0.8.1


Is this used? It looks like only LLM_D_ROUTER_VERSION is used.

Yes, it's used to select the right guides to deploy from the llm-d repo (it's used in the script infra_epp.sh).

Signed-off-by: Braulio Dumba <Braulio.Dumba@ibm.com>

dumb0002 · 2026-06-29T21:17:55Z

@mamy-CS @lionelvillard, I was able to run all e2e tests successfully in my local kind cluster; see the attached logs:
WVA_Full_E2E_Tests.txt

This PR is in good shape to be merged if there are no further comments.

dumb0002 · 2026-06-29T21:18:58Z

> The release note could be clearer. As I understand it, the chart version is controlled by LLM_D_ROUTER_VERSION, not LLM_D_RELEASE, correct? Also, the default in the actual diff is v0.8.1, not v0.8.0 as the note says. Worth fixing before merge so users overriding these variables know which knob to turn.

@mamy-CS, the llm-d router version does not follow the same release numbering as upstream llm-d (see https://github.com/llm-d/llm-d/releases/tag/v0.8.1). I updated the release number to v0.8.1 as suggested.

- Delete hack/benchmark folder and hack/benchmark-jobs-template.yaml
- Remove deploy_benchmark_grafana() and deploy_optional_benchmark_grafana() from infra_monitoring.sh

The benchmark tooling was added in #1337 as an undocumented side effect of the llm-d release bump. It was never integrated into CI/CD workflows or actively maintained, and no team is actively using it. Removing it reduces code maintenance burden and clarifies intent.

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings June 25, 2026 16:02

Copilot started reviewing on behalf of dumb0002 June 25, 2026 16:02

Copilot AI reviewed Jun 25, 2026

dumb0002 requested review from asm582 and lionelvillard June 25, 2026 17:38

dumb0002 added the ready-for-review and area/installation labels Jun 25, 2026

lionelvillard previously approved these changes Jun 25, 2026

dumb0002 enabled auto-merge (squash) June 25, 2026 17:45

dumb0002 force-pushed the epp-fixes branch from 568febc to 6e4747b Compare June 25, 2026 17:47

dumb0002 dismissed lionelvillard’s stale review via f9fb79e June 29, 2026 14:54

dumb0002 force-pushed the epp-fixes branch from f9fb79e to f0fac04 Compare June 29, 2026 14:59

dumb0002 requested a review from mamy-CS June 29, 2026 17:32

mamy-CS reviewed Jun 29, 2026

Bumping llm-d release to latest version v0.8.0

0bb373b

Signed-off-by: Braulio Dumba <Braulio.Dumba@ibm.com>

dumb0002 force-pushed the epp-fixes branch from ab84108 to 0bb373b Compare June 29, 2026 20:31

dumb0002 changed the title ~~Bumping llm-d release to latest version v0.8.0~~ Bumping llm-d release to latest version v0.8.1 Jun 29, 2026

mamy-CS approved these changes Jun 30, 2026

mamy-CS disabled auto-merge June 30, 2026 15:23

mamy-CS merged commit 46ae7b6 into llm-d:main Jun 30, 2026
21 checks passed

asm582 mentioned this pull request Jun 30, 2026

fix: remove unused benchmark tooling #1360

Merged

4 participants
