
frontend: workload: Fix Pods overview chart to reflect failing pods #6114

Open
xonas1101 wants to merge 1 commit into kubernetes-sigs:main from xonas1101:fix/workloads-pods-chart-health

Conversation

@xonas1101

Contributor

What this fixes

Fixes #6090.

On the Workloads overview, the Pods status chart always reported 100% / "N Running", even when pods were in Error, CrashLoopBackOff, ImagePullBackOff, Pending, etc. — it was structurally incapable of ever showing an unhealthy pod.

Root cause

WorkloadCircleChart computes the failed count with a replica-mismatch check:

workloadData?.filter(item => getReadyReplicas(item) !== getTotalReplicas(item)).length

getReadyReplicas/getTotalReplicas only read replica fields (status.readyReplicas, status.numberReady, spec.replicas, status.currentNumberScheduled, status.desiredNumberScheduled). A Pod has none of these, so both helpers return 0, the check is always 0 !== 0 → false, and the Pods tile can only ever render 100%. This is correct for replica-based kinds (Deployment/StatefulSet/DaemonSet/ReplicaSet) but wrong for Pods.
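
To make the root cause concrete, here is a minimal sketch of the replica-mismatch check applied to a Pod-shaped object. The `WorkloadLike` type and the helper bodies are simplified stand-ins written for illustration; only the field names and the filter expression come from the description above, and the real helpers may differ in detail.

```typescript
// Hypothetical, simplified shape: only the replica fields named above.
interface WorkloadLike {
  spec?: { replicas?: number };
  status?: {
    readyReplicas?: number;
    numberReady?: number;
    currentNumberScheduled?: number;
    desiredNumberScheduled?: number;
  };
}

// Simplified stand-ins for the real helpers: fall back to 0 when no replica field exists.
function getReadyReplicas(item: WorkloadLike): number {
  return item.status?.readyReplicas ?? item.status?.numberReady ?? 0;
}

function getTotalReplicas(item: WorkloadLike): number {
  return (
    item.spec?.replicas ??
    item.status?.currentNumberScheduled ??
    item.status?.desiredNumberScheduled ??
    0
  );
}

// The binary check quoted above: an item is "failed" when ready !== total.
function countFailed(items: WorkloadLike[]): number {
  return items.filter(item => getReadyReplicas(item) !== getTotalReplicas(item)).length;
}

// A crashing Pod has a phase and container statuses, but none of the replica fields.
const crashingPod: WorkloadLike = { status: {} };
const degradedDeployment: WorkloadLike = { spec: { replicas: 3 }, status: { readyReplicas: 1 } };

const failedPods = countFailed([crashingPod]); // 0, because 0 !== 0 is false
const failedDeployments = countFailed([degradedDeployment]); // 1, because 1 !== 3 is true
```

The same filter that correctly flags the under-replicated Deployment can never flag a Pod, whatever state the Pod is in.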

Approach

  • Add an optional categorize() classifier to WorkloadCircleChart. When provided (Pods only), the chart renders a multi-segment health ring instead of the binary one:
    • 🟢 healthy — Running & Ready, or Succeeded
    • 🟡 degraded — Running but not Ready
    • transitional — Pending / Terminating (not failures)
    • 🔴 failed — Failed phase, or a container in CrashLoopBackOff / ImagePullBackOff / etc.
  • Replica-based tiles keep their existing binary behavior unchanged.
  • getPodHealth() derives the category from the same signals the Pods list already shows (phase, readiness, container waiting reason, deletionTimestamp), so the chart and the list can't disagree. Notably, Terminating and Pending pods are transitional, not failures, which avoids a false-alarm red arc.
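
As an illustration of the bucketing described above, a classifier along these lines would produce the four categories. This is a sketch, not the code in this PR: the `PodLike` shape, the order in which signals are checked, the handling of the Unknown phase, and the two-entry reason list (the description says "etc." without giving the full set) are all assumptions.

```typescript
type PodHealthCategory = 'healthy' | 'degraded' | 'transitional' | 'failed';

// Hypothetical minimal Pod shape carrying only the signals named above.
interface PodLike {
  metadata: { deletionTimestamp?: string };
  status: {
    phase?: 'Pending' | 'Running' | 'Succeeded' | 'Failed' | 'Unknown';
    conditions?: { type: string; status: string }[];
    containerStatuses?: { state?: { waiting?: { reason?: string } } }[];
  };
}

// Assumption: only the two reasons spelled out in the description are listed here.
const FAILING_WAIT_REASONS = ['CrashLoopBackOff', 'ImagePullBackOff'];

function getPodHealth(pod: PodLike): PodHealthCategory {
  const { phase, conditions = [], containerStatuses = [] } = pod.status;

  // Terminating pods are transitional, not failures (checked first by assumption).
  if (pod.metadata.deletionTimestamp) return 'transitional';

  if (phase === 'Failed') return 'failed';

  // A container stuck in a failing waiting reason marks the pod as failed,
  // even while the pod phase is still Pending or Running.
  const hasFailingContainer = containerStatuses.some(cs => {
    const reason = cs.state?.waiting?.reason;
    return reason !== undefined && FAILING_WAIT_REASONS.indexOf(reason) !== -1;
  });
  if (hasFailingContainer) return 'failed';

  if (phase === 'Pending') return 'transitional';
  if (phase === 'Succeeded') return 'healthy';

  if (phase === 'Running') {
    const ready = conditions.some(c => c.type === 'Ready' && c.status === 'True');
    return ready ? 'healthy' : 'degraded';
  }

  // Unknown or missing phase: treated as transitional here purely as an assumption.
  return 'transitional';
}

const runningNotReady: PodLike = {
  metadata: {},
  status: { phase: 'Running', conditions: [{ type: 'Ready', status: 'False' }] },
};
const notReadyCategory = getPodHealth(runningNotReady); // 'degraded'
```

Checking the container waiting reason before the Pending phase matters in this sketch: a pod stuck in ImagePullBackOff is usually still Pending, and it should land in the failed bucket rather than the transitional one.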

Before / after

On a cluster with 89 pods (50 Running&Ready, 13 not-ready, 19 Pending/Terminating, 7 genuinely failing):

| | Center | Legend |
|---|---|---|
| Before | 100% | 89 Running |
| After | 56.2% | 50 Running · 7 Failed · 13 Degraded · 19 Other |
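
The "After" numbers follow directly from the bucket counts: 50 healthy pods out of 89 is 50 / 89 ≈ 56.2%. A small sketch of that arithmetic is below; the `Counts` type, the function names, the one-decimal rounding, and the fixed legend order are assumptions chosen to reproduce the figures in this description, not the chart's actual code.

```typescript
type Category = 'healthy' | 'failed' | 'degraded' | 'transitional';
type Counts = Record<Category, number>;

// Share of healthy pods, rounded to one decimal place (rounding rule is an assumption).
function healthyPercent(counts: Counts): number {
  const total = counts.healthy + counts.failed + counts.degraded + counts.transitional;
  if (total === 0) return 0; // assumption: an empty cluster is shown as 0 here
  return Math.round((counts.healthy / total) * 1000) / 10;
}

// Legend text in the order shown in the "After" row; always printing every
// bucket, including empty ones, is an assumption.
function formatLegend(counts: Counts): string {
  return [
    `${counts.healthy} Running`,
    `${counts.failed} Failed`,
    `${counts.degraded} Degraded`,
    `${counts.transitional} Other`,
  ].join(' · ');
}

// The 89-pod cluster from this description.
const exampleCounts: Counts = { healthy: 50, failed: 7, degraded: 13, transitional: 19 };
const centerPercent = healthyPercent(exampleCounts); // 56.2
const legendText = formatLegend(exampleCounts); // "50 Running · 7 Failed · 13 Degraded · 19 Other"
```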

Testing

  • Updated the Overview storyshot (Pods tile now reflects real health instead of a flat 100%).
  • Charts storyshots for the replica-based tiles are unchanged (binary path untouched).
  • tsc and eslint clean; i18n strings extracted.

Copilot AI review requested due to automatic review settings June 20, 2026 08:53
@k8s-ci-robot added the do-not-merge/invalid-commit-message label (indicates that a PR should not merge because it has an invalid commit message) Jun 20, 2026
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: xonas1101
Once this PR has been reviewed and has the lgtm label, please assign sniok for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot requested review from illume and sniok June 20, 2026 08:53
@k8s-ci-robot added the labels size/L (denotes a PR that changes 100-499 lines, ignoring generated files) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) Jun 20, 2026
@xonas1101 force-pushed the fix/workloads-pods-chart-health branch from 5d4f3d7 to 0c400a6 June 20, 2026 08:55
@k8s-ci-robot removed the do-not-merge/invalid-commit-message label (indicates that a PR should not merge because it has an invalid commit message) Jun 20, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR fixes the Workloads overview “Pods” status chart so it can reflect unhealthy pods by introducing pod-specific health categorization, while keeping replica-based workload tiles on the existing binary replica-mismatch behavior.

Changes:

  • Added an optional categorize() path to WorkloadCircleChart to render a multi-segment health ring (healthy/degraded/transitional/failed).
  • Implemented getPodHealth() to bucket pods based on phase/readiness/container state and wired it into the Pods tile.
  • Added the new i18n key Other and updated the Overview storyshot snapshot accordingly.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
|---|---|
| frontend/src/components/workload/Overview.tsx | Wires categorize() for Pods so the chart uses pod-aware health bucketing. |
| frontend/src/components/workload/Charts.tsx | Adds categorized multi-segment rendering and legend breakdown support. |
| frontend/src/components/pod/List.tsx | Introduces getPodHealth() and health-category typing used by the overview chart. |
| frontend/src/components/workload/snapshots/Overview.Workloads.stories.storyshot | Updates snapshot to reflect the new multi-segment Pods chart output. |
| frontend/src/i18n/locales/en/translation.json | Adds Other translation key used in the new breakdown label. |
| frontend/src/i18n/locales/ar/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/bn/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/de/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/es/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/fr/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/he/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/hi/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/it/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/ja/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/ko/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/pt/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/ru/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/ta/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/ur/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/zh/translation.json | Adds Other key (fallback to English when untranslated). |
| frontend/src/i18n/locales/zh-tw/translation.json | Adds Other key (fallback to English when untranslated). |

Comment thread frontend/src/components/pod/List.tsx Outdated
Comment thread frontend/src/components/pod/List.tsx Outdated
Comment thread frontend/src/components/workload/Overview.tsx Outdated
The Workloads overview reuses WorkloadCircleChart for Pods, but its
default health logic counts an item as failed when getReadyReplicas !=
getTotalReplicas. Pods have none of those replica fields, so both
helpers return 0, the check is always false, and the Pods tile can only
ever render 100% Running regardless of Error/CrashLoopBackOff/
ImagePullBackOff/Pending pods.

Give the chart an optional per-item categorize() classifier. When
provided (Pods only), it renders a multi-segment health ring instead of
the binary one: healthy / degraded (running, not ready) / transitional
(Pending, Terminating) / failed. Replica-based tiles keep their existing
binary behavior unchanged.

Add Pod.getHealth() to derive the category from the same signals the
Pods list already shows (phase, readiness, container waiting/terminated
reason, deletionTimestamp), so the chart and the list can't disagree.
Terminating and Pending pods are treated as transitional rather than
failures (NodeLost stays unhealthy), avoiding false-alarm red arcs. The
helper and the shared WorkloadHealthCategory type live in the k8s model
layer so pages don't need to import the heavy pod list module.

Ref: kubernetes-sigs/headlamp issue 6090
Signed-off-by: xonas1101 <aarushsingh1305@gmail.com>
@xonas1101 force-pushed the fix/workloads-pods-chart-health branch from 0c400a6 to 070c55e June 20, 2026 09:07

Labels

cncf-cla: yes (indicates the PR's author has signed the CNCF CLA)
size/L (denotes a PR that changes 100-499 lines, ignoring generated files)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

frontend: Workloads overview "Pods" chart always shows 100% and never reflects failing pods

3 participants