perf(task-detail): cache cloud diff stats by content identity by pauldambra · Pull Request #2804 · PostHog/code

pauldambra · 2026-06-20T19:30:48Z

Problem

Opening a large cloud task froze the renderer (100% CPU, no loaders — synchronous work). extractCloudToolChangedFiles line-diffs every changed file via getDiffStats, and it re-runs whenever the session events array changes identity. A big session log loads/streams in chunks, so every changed file was re-diffed on every chunk — O(files × updates).

Found by loading a production database (766 workspaces, 5.9 GB of logs, biggest single conversation 449 MB / 203k events) into a dev instance and CPU-profiling task loads over CDP. getDiffStats was the single biggest frame — 28% / ~2,088ms on the worst load.

Change

The diff content objects are stable across these recomputes (they're the session events' own objects), so cache getDiffStats results in a WeakMap keyed on the diff object. Repeated recomputes become O(new diffs) instead of O(all files). Output is unchanged — the cache only avoids recompute.

Measurement

Same CDP profile, before/after:

getDiffStats: ~2,088ms (28% of the worst load) → ~0 (dropped out of the hot path entirely).
A representative large cloud task: 4,101ms → 1,282ms blocked main-thread time on load.

A parameterised test pins the diff-stat values (added/removed across new/modified/removal) and that reusing the same diff object returns identical results.

Scope

This is the cloud-conversation half of the freeze. The sidebar half (re-rendering all task rows on navigation) is a separate PR (memoizing TaskRow). Residual cost on the most extreme 449 MB task is now diffuse (log parsing, file-search indexing) with no single dominant frame — a possible future follow-up is incrementalizing buildCloudEventSummary, but the dominant freeze is resolved.

🤖 Generated with Claude Code

Opening a large cloud task froze the renderer. `extractCloudToolChangedFiles` line-diffs every changed file via `getDiffStats`, and it re-runs whenever the session `events` array changes - which during load/stream of a big log happens on every chunk. So every changed file was re-diffed on every update: O(files x updates), the dominant cost (28% / ~2s) when opening the worst tasks. The diff content objects are stable across these recomputes (they are the session events' own objects), so cache `getDiffStats` by diff-object identity in a WeakMap. Repeated recomputes become O(new diffs) instead of O(all files). Measured on a production-scale dataset (CPU profile over CDP, task-load blocked main-thread time): `getDiffStats` dropped from ~2088ms / 28% of the worst load to ~0 (out of the hot path); a representative large cloud task fell 4101ms -> 1282ms blocked. Results are unchanged - the cache only avoids recompute - and a parameterised test pins the diff-stat values plus same-object reuse. This is the cloud-conversation half of the sidebar/cloud freeze pair; the sidebar half is addressed separately by memoizing the task-list rows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-20T19:31:24Z

React Doctor found no issues in the changed files. 🎉

_{Reviewed by React Doctor for commit d46ed41.}

greptile-apps · 2026-06-20T19:33:12Z

Comments Outside Diff (1)

packages/core/src/task-detail/cloudToolChanges.test.ts, line 112-130 (link)

Cache test only proves idempotency, not caching

The test named "returns identical stats when the same diff object is reused (cache)" verifies that two calls with the same content reference return equal linesAdded/linesRemoved values. That assertion would pass even if the WeakMap were removed and cachedDiffStats delegated directly to getDiffStats every time — both calls produce the same result regardless. To actually guard the caching contract, the test would need to spy on getDiffStats and assert it is called exactly once across the two extractCloudToolChangedFiles invocations.

Prompt To Fix With AI

This is a comment left during a code review.
Path: packages/core/src/task-detail/cloudToolChanges.test.ts
Line: 112-130

Comment:
**Cache test only proves idempotency, not caching**

The test named `"returns identical stats when the same diff object is reused (cache)"` verifies that two calls with the same `content` reference return equal `linesAdded`/`linesRemoved` values. That assertion would pass even if the `WeakMap` were removed and `cachedDiffStats` delegated directly to `getDiffStats` every time — both calls produce the same result regardless. To actually guard the caching contract, the test would need to spy on `getDiffStats` and assert it is called exactly once across the two `extractCloudToolChangedFiles` invocations.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
packages/core/src/task-detail/cloudToolChanges.test.ts:112-130
**Cache test only proves idempotency, not caching**

The test named `"returns identical stats when the same diff object is reused (cache)"` verifies that two calls with the same `content` reference return equal `linesAdded`/`linesRemoved` values. That assertion would pass even if the `WeakMap` were removed and `cachedDiffStats` delegated directly to `getDiffStats` every time — both calls produce the same result regardless. To actually guard the caching contract, the test would need to spy on `getDiffStats` and assert it is called exactly once across the two `extractCloudToolChangedFiles` invocations.

_{Reviews (1): Last reviewed commit: "perf(task-detail): cache cloud diff stat..." | Re-trigger Greptile}

Review (paul + xp) noted the previous "cache" test asserted value-equality, which passes even with the WeakMap removed (getDiffStats is pure) - it tested purity, not memoization. Export cachedDiffStats and assert a cache hit returns the same instance (and a distinct-but-equal diff object recomputes), so the test now fails if the cache regresses. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

pauldambra · 2026-06-20T19:44:09Z

Note

🤖 Automated comment by QA Swarm — not written by a human

Clean-room review: qa-team, paul-reviewer, xp-reviewer, security-audit — all on Opus.

Verdict: ✅ APPROVE (unanimous)

All four approved; this is a clean, correct, well-targeted memoization.

Verified (the load-bearing correctness question)

The cache key — the diff content object — is sound. All four reviewers independently traced it: getDiffStats reads only oldText/newText, which live on the key object itself; the session store appends events (sessionStore push, never mutates in place), so already-seen diff objects keep identity across the buildCloudEventSummary recomputes — the WeakMap hits, and a cached value can never go stale relative to its key. WeakMap is the right structure (self-evicting, no leak). Behaviour is value-identical; only recompute is avoided. security-audit: no cross-session contamination, no findings.

Actioned (convergent nit, paul + xp) — fixed in `d46ed41`

The original cache test asserted value-equality, which passes even with the WeakMap removed (since getDiffStats is pure) — it tested purity, not the cache. Exported cachedDiffStats and rewrote the test to assert a cache hit returns the same instance (and a distinct-but-equal diff object recomputes), so it now fails if the cache regresses. This also pins qa-team's shared-return-reference observation.

Non-blocking notes (left as-is)

Module-level WeakMap is a process-global, but it's a pure memoization table (value derived from key), not mutable shared state — safe; reviewers agreed leaving it.
xp/qa-team noted the deeper fix is incrementalizing buildCloudEventSummary (so the whole summary isn't rebuilt each update) — a possible follow-up; the cache is a correct, contained win for now.

Reviewer	Verdict
🔍 qa-team	APPROVE, risk NONE. Cache key sound, no leak, behaviour preserved.
👤 paul	High trust, ship it. Verified the key holds; assert instance identity (done).
📐 xp	Correctly diagnosed + built; tighten the cache test (done).
🛡 security-audit	No findings.

Automated by QA Swarm — not a human review

github-actions

WeakMap-based memoization of diff stats by object identity — leak-safe, correct, well-tested. No architectural violations or risk to production.

pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jun 20, 2026

github-actions Bot approved these changes Jun 20, 2026

View reviewed changes

pauldambra requested a review from a team June 20, 2026 19:59

tatoalo approved these changes Jun 23, 2026

View reviewed changes

tatoalo merged commit 0d9885a into main Jun 23, 2026
26 checks passed

tatoalo deleted the perf/cache-cloud-diff-stats branch June 23, 2026 12:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(task-detail): cache cloud diff stats by content identity#2804

perf(task-detail): cache cloud diff stats by content identity#2804
tatoalo merged 2 commits into
mainfrom
perf/cache-cloud-diff-stats

pauldambra commented Jun 20, 2026

Uh oh!

github-actions Bot commented Jun 20, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 20, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

pauldambra commented Jun 20, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pauldambra commented Jun 20, 2026

Problem

Change

Measurement

Scope

Uh oh!

github-actions Bot commented Jun 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented Jun 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comments Outside Diff (1)

Uh oh!

pauldambra commented Jun 20, 2026

Verdict: ✅ APPROVE (unanimous)

Verified (the load-bearing correctness question)

Actioned (convergent nit, paul + xp) — fixed in d46ed41

Non-blocking notes (left as-is)

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented Jun 20, 2026 •

edited

Loading

greptile-apps Bot commented Jun 20, 2026 •

edited

Loading

Actioned (convergent nit, paul + xp) — fixed in `d46ed41`