Add transform context for annotating compiler optimizations #216

Draft
gnidan wants to merge 10 commits into main from transform-context

@gnidan gnidan commented Jul 2, 2026

No description provided.

gnidan added 3 commits June 17, 2026 21:56
The TCO back-edge JUMP previously emitted a gather wrapper
around its invoke and return contexts. Multiple discriminator
keys can coexist on a single context object without gather
wrapping, so the JUMP now carries a flat combined context with
both `invoke` and `return` keys directly.

Updates the countCallSites helper in optimizer-contexts.test
to check invoke and return independently rather than as an
either/or, so flat multi-discriminator contexts get counted
in both buckets. The TCO-specific assertion now finds the
back-edge JUMP by the presence of both discriminators rather
than by a gather wrapper.
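
A minimal sketch of the independent counting described above. The `Context` shape is a simplified stand-in, and this `countCallSites` is an illustration of the idea, not the helper in optimizer-contexts.test; `isBackEdge` is a hypothetical name.

```typescript
// Hypothetical, simplified context shape; the real format types are richer.
interface Context {
  invoke?: unknown;
  return?: unknown;
  gather?: Context[];
}

interface CallSiteCounts {
  invokes: number;
  returns: number;
}

// Count invoke and return independently, so a flat context that carries
// both keys lands in both buckets. Gather children are visited recursively.
function countCallSites(contexts: Context[]): CallSiteCounts {
  const counts: CallSiteCounts = { invokes: 0, returns: 0 };
  const visit = (context: Context): void => {
    if ("invoke" in context) counts.invokes += 1;
    if ("return" in context) counts.returns += 1;
    for (const child of context.gather ?? []) visit(child);
  };
  contexts.forEach(visit);
  return counts;
}

// Find the TCO back-edge by the presence of both discriminators,
// with no reliance on a gather wrapper.
function isBackEdge(context: Context): boolean {
  return "invoke" in context && "return" in context;
}
```

An either/or check would count the flat back-edge context only once; the two independent `if` statements are the point of the change.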
* format: make invoke.target optional for internal calls

Internal calls via JUMP normally carry a code pointer to the
callee's entry point. When the compiler inlines a function,
the JUMP is elided — there is no physical call instruction
and no code target to point at. The callee identity
(identifier, declaration, type) remains meaningful, but the
target pointer does not.

Same pattern as #211 (making return.data optional). Unblocks
inlining: bugc can emit invoke contexts on inlined first
instructions without fabricating a target pointer.

- Schema: drop target from InternalCall.required, expand
  description, add worked example for inlined case
- TS types: mark target optional; guard relaxed
- Spec page: document optionality and point at transform +
  gather for inlining annotation
- bugc: guard target access in patchInvokeInContext; tests
  assert target defined before dereferencing
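
A sketch of what the optional target could look like on the TS side. The `InternalCall` fields, the `jump: true` discriminator, and `patchTarget` are assumptions for illustration; they are not copied from packages/format or from bugc's patchInvokeInContext.

```typescript
// Hypothetical, simplified shape of an internal-call invoke.
interface InternalCall {
  jump: true;
  identifier?: string;
  // Optional: absent when the call was inlined and no JUMP exists.
  target?: { pointer: unknown };
}

// Relaxed guard: the discriminator alone identifies an internal call;
// a missing target no longer fails the check.
function isInternalCall(value: unknown): value is InternalCall {
  return (
    typeof value === "object" &&
    value !== null &&
    (value as { jump?: unknown }).jump === true
  );
}

// Patch the code pointer only when a target exists; inlined calls pass
// through unchanged, so no target pointer is fabricated.
function patchTarget(call: InternalCall, offset: number): InternalCall {
  if (call.target === undefined) return call;
  return { ...call, target: { pointer: { location: "code", offset } } };
}
```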

* format: prefer flat form for invoke + transform composition

Pair with #212's flat-form guidance: when an inlined body's
first instruction carries both an invoke and a transform,
those belong as sibling keys on a single context — gather
isn't needed because `invoke` and `transform` don't collide.

* format: add transform context for compiler optimizations

Adds a new context type annotating instructions with the
compiler transformations that produced them. The value is an
array of short identifiers; the list may repeat the same
identifier when the transformation has been applied multiple
times (e.g., ["inline", "inline"] for doubly-inlined code).

Transform is *additional* annotation. The invoke/return contexts
for the logical call are still emitted at the call boundary so
debuggers see the source-level call stack; the transform context
tells debuggers how the call was physically realized. Consumers
that ignore transform contexts get a sound source-level view
from the semantic contexts alone.

v1 identifiers:
  - "inline": marked instruction is part of an inlined function
    body; surrounding invoke/return contexts name the inlined
    callee.
  - "tailcall": marked instruction is a tail-call-optimized
    back-edge JUMP or continuation, where the call was realized
    without pushing/popping a full activation.

The identifier set is extensible. Debuggers unfamiliar with a
given identifier should preserve it as an opaque label. Order
in the array is not semantically significant — the multiset is
what matters.
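
Since only the multiset matters, a consumer comparing two transform lists might count occurrences rather than compare arrays positionally. A minimal sketch; `transformCounts` and `sameTransforms` are hypothetical helper names.

```typescript
// Count how many times each identifier appears; array order is ignored.
function transformCounts(transform: readonly string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const id of transform) {
    counts.set(id, (counts.get(id) ?? 0) + 1);
  }
  return counts;
}

// Two transform lists are equivalent when their multisets match.
// Unknown identifiers are kept as opaque labels and compared like any other.
function sameTransforms(a: readonly string[], b: readonly string[]): boolean {
  const left = transformCounts(a);
  const right = transformCounts(b);
  if (left.size !== right.size) return false;
  let equal = true;
  left.forEach((count, id) => {
    if (right.get(id) !== count) equal = false;
  });
  return equal;
}
```

Under this reading, `["inline", "fold"]` and `["fold", "inline"]` are equivalent, while `["inline"]` and `["inline", "inline"]` are not, because repetition records doubly-inlined code.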

Unblocks the final shape of TCO back-edge annotations in
bugc (#210): a tail-call-optimized JUMP can now carry
`gather: [return, invoke, transform: ["tailcall"]]`.

Includes:
- schemas/program/context/transform.schema.yaml
- schemas/program/context.schema.yaml: wire into the if/$ref
  union.
- packages/format/src/types/program/context.ts: Context.Transform
  interface, isTransform guard, and Transform.Identifier union
  preserving autocomplete for known values.
- packages/format/src/types/program/context.test.ts: register
  Context.isTransform with the schema guard test harness.
- packages/web/spec/program/context/transform.mdx: spec page
  covering role, v1 identifiers, repetition/composition, and
  interaction with gather.

* format: expand transform v1 vocabulary with fold and coalesce

Adds two more identifiers to the v1 transform context
vocabulary, based on bugc optimizer's audit of transformations
the compiler currently performs or will perform:

  - "fold" — compile-time constant folding. The marked
    instruction carries the result (typically a PUSH) replacing
    a compute sequence that appeared in source.
  - "coalesce" — read-write merging. The marked instruction is
    part of a SHL/OR sequence (or similar) introduced by the
    compiler to combine adjacent source-level reads or writes,
    such as packing narrower fields into a single storage slot.

Together with the previously-defined "inline" and "tailcall",
this covers the four transformations bugc emits today or will
emit in the near term (inline once a function inlining pass
lands). Propagate was considered for v1 and deferred as
borderline.
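
To make "fold" and "coalesce" concrete, here is a sketch in TypeScript rather than EVM bytecode. It uses 16-bit fields in a 32-bit word as a scaled-down stand-in for narrow fields in a 256-bit storage slot; `packFields` and `FOLDED` are hypothetical names.

```typescript
// "coalesce": two adjacent source-level writes to narrow fields become one
// shift/or sequence producing a single word. In compiled output, the
// SHL/OR instructions would carry transform: ["coalesce"].
function packFields(high: number, low: number): number {
  // Mask each field to 16 bits, shift the high field up, then combine.
  // The final >>> 0 keeps the result an unsigned 32-bit value.
  return (((high & 0xffff) << 16) | (low & 0xffff)) >>> 0;
}

// "fold": source writes `2 * 3 + 4`; a folding compiler emits one PUSH of
// the result, and that PUSH would carry transform: ["fold"].
const FOLDED = 2 * 3 + 4;
```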

Updates:
- transform.schema.yaml: description enumerates the four v1
  identifiers; examples include single-identifier cases for
  each plus combinations ["inline", "fold"], ["coalesce",
  "coalesce"].
- context.ts: Transform.Identifier union extended with "fold"
  and "coalesce" (still keeps `string & {}` for extensibility
  and autocomplete).
- transform.mdx: subsection for each identifier with a concrete
  EVM-level example, updated repetition/composition section
  with new combinations.
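
A sketch of how the identifier union and guard might be written. The names mirror the commit message, but the bodies are assumptions rather than the code in context.ts; in particular, whether the schema allows an empty array is not stated here, so the guard below accepts one.

```typescript
// Known v1 identifiers plus any other string. The `string & {}` member
// keeps the union open while letting editors autocomplete the literals.
type Identifier = "inline" | "tailcall" | "fold" | "coalesce" | (string & {});

interface Transform {
  transform: Identifier[];
}

// Structural guard: a `transform` key holding an array of strings.
function isTransform(value: unknown): value is Transform {
  if (typeof value !== "object" || value === null) return false;
  const candidate = (value as { transform?: unknown }).transform;
  return (
    Array.isArray(candidate) &&
    candidate.every((id) => typeof id === "string")
  );
}

// Known and unknown identifiers both type-check.
const known: Transform = { transform: ["inline", "fold"] };
const custom: Transform = { transform: ["x-vendor-pass"] };
```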

* format: prefer flat context composition, document gather scope

The context schema's discriminator keys combine via allOf of
if/then rules, so a single context object can carry multiple
keys at once (e.g., `invoke`, `return`, and `transform` all
side by side). Use gather only when two contexts would collide
on the same key.

- transform spec: switch the TCO back-edge example from gather
  to the flat form; revise the tailcall bullet accordingly
- transform schema: note in the description that flat
  composition is preferred; gather is for key collisions
- gather spec: add a "When to use" section flagging the flat
  form as the default and listing the canonical collision
  cases (multiple frames, multiple variables blocks)
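
The flat-first rule can be sketched as a small combinator: merge contexts as sibling keys when no key repeats, and fall back to gather only on a collision. `compose` is a hypothetical helper, and the context values are placeholders.

```typescript
type Context = Record<string, unknown>;

// Merge contexts into one flat object when their keys are disjoint;
// wrap them in `gather` only when two contexts share a key.
function compose(...contexts: Context[]): Context {
  const seen = new Set<string>();
  for (const context of contexts) {
    for (const key of Object.keys(context)) {
      if (seen.has(key)) return { gather: contexts };
      seen.add(key);
    }
  }
  return Object.assign({}, ...contexts);
}

// Disjoint keys: one flat context with three sibling keys.
const backEdge = compose(
  { return: {} },
  { invoke: {} },
  { transform: ["tailcall"] },
);

// Colliding keys (two frames): gather is required.
const twoFrames = compose({ frame: "ir" }, { frame: "source" });
```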

github-actions Bot commented Jul 2, 2026

PR Preview Action v1.8.1

🚀 View preview at
https://ethdebug.github.io/format/pr-preview/pr-216/

Built to branch gh-pages at 2026-07-02 04:51 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

gnidan added 7 commits July 1, 2026 23:00
The TCO back-edge JUMP already carries a flat context with both
invoke (the new iteration's call) and return (the previous
iteration's return). Add a third sibling key,
transform: ["tailcall"], marking the instruction as a
tail-call-optimized back-edge.

This is an additive annotation: it does not replace the
invoke/return pair (which state the source-level facts) but
tells debuggers the pair was realized as a TCO back-edge rather
than a real frame push/pop, so they can avoid inventing a
spurious frame. Consumers that ignore transform contexts still
get a sound source-level view from invoke/return alone.

Widens the emitted context type to Return & Invoke & Transform
and extends the optimizer-contexts test to assert the back-edge
JUMP carries transform containing "tailcall".

Add tailcall (transform context) support to the trace widgets:

- extractTransformFromInstruction: gather/pick-aware collector for
  compiler transform identifiers (duck-typed until #212's guard lands)
- extractCallInfoFromInstruction: attach isTailCall when a tailcall
  transform is present alongside the invoke/return
- buildCallStack: a TCO back-edge carries both return and invoke on
  one instruction; replace the top frame in place (reuse) instead of
  popping to empty, and mark it isTailCall. Fixes a real call-stack
  correctness bug for tail-recursive loops.
- CallStackDisplay: tail-call chip on the reused frame
- CallInfoPanel: tail-call banner variant
- Propagate isTailCall through ResolvedCallFrame / ResolvedCallInfo
- CSS (+ web theme copies) for the transform/tailcall styling
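
A duck-typed sketch of a gather/pick-aware collector, in the spirit of extractTransformFromInstruction. The simplified `Context` shape and the choice to collect from every pick alternative are assumptions; the real widget code may treat pick differently.

```typescript
// Simplified context: any keys, with optional nesting via gather/pick.
interface Context {
  transform?: unknown;
  gather?: Context[];
  pick?: Context[];
  [key: string]: unknown;
}

// Collect transform identifiers from a context and from any contexts
// nested under gather or pick. Non-string entries are skipped, since the
// value is duck-typed rather than validated by a schema guard.
function extractTransform(context: Context | undefined): string[] {
  if (context === undefined) return [];
  const found: string[] = [];
  if (Array.isArray(context.transform)) {
    for (const id of context.transform) {
      if (typeof id === "string") found.push(id);
    }
  }
  const nested = [...(context.gather ?? []), ...(context.pick ?? [])];
  for (const child of nested) {
    found.push(...extractTransform(child));
  }
  return found;
}
```

The same function handles the flat shape (`{ invoke, return, transform }`) and the gather shape, which is what lets a consumer stay agnostic about how the compiler composed the contexts.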

Tested: 9 new unit tests in mockTrace.test.ts covering extraction,
the isTailCall flag, and frame replacement. Does not touch the docs
TraceDrawer opt level or examples (held for product decisions).
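
The frame-replacement rule can be sketched as follows. `Step`, `Frame`, and this `buildCallStack` are simplified illustrations under the assumptions stated in the comments, not the implementation in the trace widgets.

```typescript
// Hypothetical per-instruction summary after context extraction.
interface Step {
  invoke?: { identifier: string };
  return?: object;
  transform?: string[];
}

interface Frame {
  identifier: string;
  isTailCall: boolean;
}

function buildCallStack(steps: Step[]): Frame[] {
  const stack: Frame[] = [];
  for (const step of steps) {
    const marked = (step.transform ?? []).indexOf("tailcall") !== -1;
    if (step.invoke && step.return && marked) {
      // TCO back-edge: reuse the top frame in place instead of popping
      // and pushing, and flag it as a tail call.
      const frame: Frame = { identifier: step.invoke.identifier, isTailCall: true };
      if (stack.length > 0) stack[stack.length - 1] = frame;
      else stack.push(frame);
      continue;
    }
    // Ordinary handling: a return pops, an invoke pushes.
    if (step.return) stack.pop();
    if (step.invoke) stack.push({ identifier: step.invoke.identifier, isTailCall: false });
  }
  return stack;
}
```

Without the marker, the same instruction falls through to ordinary pop/push handling, so stripping `transform` drops the tail-call treatment.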
Add two self-tail-recursive BUG programs (accumulator sum and
factorial) and a Tail-call optimization section to the tracing page.
Both programs are tail-call optimized by bugc's level-2 optimizer (verified: the
recursive call terminator is eliminated and replaced with a loop
trampoline), so they exercise the new tailcall transform context in
the tracer widget.
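
For readers unfamiliar with the transformation, here is the accumulator sum in TypeScript rather than BUG. This is a sketch of the general technique; the actual example programs and bugc's output are not reproduced here.

```typescript
// Self-tail-recursive accumulator sum: the recursive call is the last
// action, so nothing in the caller's frame is needed afterwards.
function sumRec(n: number, acc: number = 0): number {
  if (n === 0) return acc;
  return sumRec(n - 1, acc + n);
}

// What tail-call optimization effectively produces: the parameters are
// reassigned and control jumps back to the function entry. The jump at
// the bottom of the loop corresponds to the back-edge JUMP that carries
// return, invoke, and transform: ["tailcall"].
function sumLoop(n: number, acc: number = 0): number {
  while (n !== 0) {
    acc = acc + n;
    n = n - 1;
  }
  return acc;
}
```

The loop form uses constant stack depth, which is why a debugger should show one reused frame rather than a growing stack.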

The section explains how a TCO back-edge JUMP composes return, invoke,
and transform: ["tailcall"] as sibling keys on one context (the flat
form), and how a debugger can reconcile that with the source-level
call stack.

Also register program/context/transform in the web schemaIndex; it was
missing (gather was present), which broke the docs build for #212's
transform spec page.

In transform.mdx and gather.mdx, the first reference to frame contexts
now links to /spec/program/context/frame, matching the existing
[`gather`](...) link precedent. Frame is the one composition concept a
reader reaching these pages may not have met yet.

cloneFunction dropped the optional Ir.Function.loc and sourceId
fields, returning only { name, parameters, entry, blocks }. Since
the first optimization pass clones the module, every function lost
its declaration source info from optimization level 1 upward.

evmgen gates declaration emission on func.loc && func.sourceId, so
all invoke/return contexts lost their declaration source ranges at
optimized levels — measurably 3/3 declared at level 0, 0/3 at
levels 1-3 on the widget's runtimeInstructions path.

Copy loc/sourceId in cloneFunction's return so declarations survive
optimization. Adds a regression test asserting invoke/return
contexts still carry declaration at levels 1, 2, and 3 (with a
level-0 baseline).
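
The shape of the fix can be sketched like this. `IrFunction` is a simplified stand-in for Ir.Function (field types are assumptions), and `canEmitDeclaration` is a hypothetical name for the evmgen gate.

```typescript
// Simplified stand-in for Ir.Function; loc and sourceId are optional.
interface IrFunction {
  name: string;
  parameters: string[];
  entry: string;
  blocks: Map<string, unknown>;
  loc?: { offset: number; length: number };
  sourceId?: string;
}

// Clone that carries the optional source info along. Listing only
// name/parameters/entry/blocks here is what silently dropped it.
function cloneFunction(func: IrFunction): IrFunction {
  return {
    name: func.name,
    parameters: [...func.parameters],
    entry: func.entry,
    blocks: new Map(func.blocks),
    loc: func.loc,
    sourceId: func.sourceId,
  };
}

// Declaration emission is gated on both fields being present.
function canEmitDeclaration(func: IrFunction): boolean {
  return Boolean(func.loc && func.sourceId);
}
```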
…n panels (#222)

* web: fix tracer-drawer opcodes/state panels not filling height

The trace panels live in a flex:1 grid whose implicit row was auto-
sized to content, so dragging the drawer taller left dead space below
the panels instead of growing them. Give the grid an explicit 1fr row
and min-height:0 (on the grid and its items) so both panels absorb the
added vertical space and scroll internally.

* web: add optimizer-level selector (O0/O2) to tracer drawer

The drawer hardcoded optimizer level 0. Add an O0/O2 toggle in the
drawer header that recompiles + retraces at the chosen level, so
readers can flip to level 2 and watch optimizer transforms (e.g. the
tailcall annotation on TCO back-edges) appear. compileAndTrace now
takes the level explicitly; a ref mirrors the state so the example-
load effect reads the current level without re-running on toggle.

* programs-react: cover the flat (production) TCO back-edge shape

bugc #217 emits the TCO back-edge as a single flat context object
(return + invoke + transform keys together), but the existing tests
only exercised the gather shape. Add flat-shape variants for transform
extraction, the isTailCall flag, and frame replacement, plus a guard
that stripping the marker (the #10 failure mode) drops tail-call
handling.

* web: dedupe call stack + tailcall render + right-column panels

- Reuse the shared, tailcall-aware buildCallStack / extractCallInfo /
  extractTransform from @ethdebug/programs-react via a thin adapter
  (bugc .debug.context -> ethdebug format shape); drop the drawer's
  inline call-stack builder and local extractCallInfo.
- Render the tail-call chip on the reused call-stack frame and a
  tail-call variant on the call-info banner.
- Right column (gnidan's picks 1/2/4): resolved variable values
  (name: value via pointer resolution), gas remaining + per-step delta,
  and a transform annotations panel with per-tag glosses. Sections are
  now collapsible.

* web: widen optimizer selector to O0/O1/O2/O3

gnidan's call: expose all four bugc optimizer levels (each distinct —
L1 fold/prop/DCE, L2 +CSE/TCO/jump-opt, L3 +merging) rather than a
two-state O0/O2 toggle, future-proofing for other transforms. The
recompile+retrace already took the level; just widen the control from
two buttons to four (mapped over OPT_LEVELS, per-level tooltips). The
tailcall demo still lands on O2.
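
A sketch of a level table that a four-button control could map over. Only the name `OPT_LEVELS` and the per-level pass summaries come from the message above; the field names, the level-0 wording, and `tooltipFor` are assumptions.

```typescript
// One entry per bugc optimizer level; `passes` summarizes what each
// level adds. The level-0 text is an assumption (unoptimized baseline).
const OPT_LEVELS = [
  { level: 0, label: "O0", passes: "no optimization" },
  { level: 1, label: "O1", passes: "fold/prop/DCE" },
  { level: 2, label: "O2", passes: "+CSE/TCO/jump-opt" },
  { level: 3, label: "O3", passes: "+merging" },
] as const;

type OptLevel = (typeof OPT_LEVELS)[number]["level"];

// Build the tooltip text for one level's button.
function tooltipFor(level: OptLevel): string {
  for (const entry of OPT_LEVELS) {
    if (entry.level === level) return `${entry.label}: ${entry.passes}`;
  }
  throw new Error(`unknown optimizer level: ${level}`);
}

// A UI would map over OPT_LEVELS to render one button per level.
const buttonLabels = OPT_LEVELS.map((entry) => entry.label);
```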
Update the tail-call-optimization walkthrough to reference the tracer
drawer's "Opt" optimization-level selector (O0-O3) and tell readers to
compare O0 with O2 (TCO kicks in at level 2), replacing the generic
"optimizer control" wording.