Bound HTTP transaction lifetime by kenhuuu · Pull Request #3484 · apache/tinkerpop

kenhuuu · 2026-06-26T19:48:10Z

Bound HTTP transaction lifetime: suspend idle timer while busy + add `maxTransactionLifetime`

What changed

Two related changes to how Gremlin Server bounds the lifetime of an HTTP transaction. Each open transaction owns a dedicated worker thread and a maxConcurrentTransactions slot, so an unbounded transaction is a real resource leak.

Idle timer suspends while busy. The transactionTimeout idle timer was armed on request arrival, so a single operation running longer than the timeout would trip it mid-execution and roll back a perfectly healthy
transaction. It now arms only when the transaction goes genuinely idle (no operation running or queued) and is suspended while work is in flight — a long operation is bounded by evaluationTimeout, not the idle timer. The
setting is renamed transactionTimeout → idleTransactionTimeout to reflect what it actually means, and 0 now correctly disables it (the docs always claimed this; the code never honored it).
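
The suspend-while-busy behavior described above can be sketched as follows. This is a minimal illustration, not the PR's code: the class name `IdleAwareExecutor`, its constructor parameters, and the millisecond unit are assumptions. It relies only on the standard `ThreadPoolExecutor` hooks `beforeExecute`/`afterExecute`, and it does not wrap submitted tasks, so `submit()` still returns the `FutureTask` that the worker actually runs.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a single-thread executor whose idle timer is
// suspended while a task runs and re-armed only when nothing is queued.
final class IdleAwareExecutor extends ThreadPoolExecutor {
    private final ScheduledExecutorService scheduler;
    private final long idleTimeoutMs;      // assumption: milliseconds, 0 disables
    private final Runnable onIdleTimeout;  // e.g. roll back and close the transaction
    private final Object lock = new Object();
    private ScheduledFuture<?> idleTimer;  // guarded by lock
    private boolean busy;                  // guarded by lock

    IdleAwareExecutor(ScheduledExecutorService scheduler, long idleTimeoutMs, Runnable onIdleTimeout) {
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        this.scheduler = scheduler;
        this.idleTimeoutMs = idleTimeoutMs;
        this.onIdleTimeout = onIdleTimeout;
        synchronized (lock) {
            armIfIdle(); // a freshly opened transaction starts out idle
        }
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        synchronized (lock) {
            busy = true;
            cancelTimer(); // work is in flight: suspend the idle timer
        }
        super.beforeExecute(t, r);
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        synchronized (lock) {
            busy = false;
            armIfIdle(); // re-arm only if no further work is queued
        }
    }

    // Caller must hold lock.
    private void armIfIdle() {
        if (idleTimeoutMs <= 0 || busy || isShutdown() || !getQueue().isEmpty()) {
            return;
        }
        cancelTimer();
        idleTimer = scheduler.schedule(onIdleTimeout, idleTimeoutMs, TimeUnit.MILLISECONDS);
    }

    // Caller must hold lock.
    private void cancelTimer() {
        if (idleTimer != null) {
            idleTimer.cancel(false);
            idleTimer = null;
        }
    }
}
```

The sketch leaves out the short window between a task being submitted and the worker picking it up, as well as the close() race the PR handles by re-checking state after arming.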

New maxTransactionLifetime absolute cap. A new setting that bounds total transaction age regardless of activity, closing the gap where a client holds a transaction open indefinitely via one long operation or
keep-alive drips. When it fires it interrupts the running operation and rolls the transaction back, and the in-flight client receives a transaction-timeout (504) rather than a misleading "increase evaluationTimeout" error.
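
A rough sketch of how the cap could interrupt the running operation while still producing the right status code is shown below. Only the `closedByLifetimeCap` flag name, the `Running(future, context)` pairing, and the 504/500 split come from this PR; `LifetimeCapSketch`, `RequestContext`, and every method name are hypothetical. The flag is set before `cancel(true)` so whichever thread writes the response sees it.

```java
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the lifetime-cap kill path.
final class LifetimeCapSketch {
    // Stand-in for the per-request context; only the flag name is from the PR.
    static final class RequestContext {
        final AtomicBoolean closedByLifetimeCap = new AtomicBoolean(false);
    }

    // Immutable pair: the flag and the interrupt always target the same operation.
    static final class Running {
        final Future<?> future;
        final RequestContext context;

        Running(Future<?> future, RequestContext context) {
            this.future = future;
            this.context = context;
        }
    }

    private final AtomicReference<Running> inFlight = new AtomicReference<>();

    void operationStarted(Future<?> future, RequestContext context) {
        inFlight.set(new Running(future, context));
    }

    void operationFinished() {
        inFlight.set(null);
    }

    // Invoked by the scheduler when the transaction reaches its maximum age.
    void onLifetimeCap() {
        Running running = inFlight.get();
        if (running != null) {
            running.context.closedByLifetimeCap.set(true); // flag first
            running.future.cancel(true);                   // then interrupt the real work
        }
        // Rolling back and closing the transaction would follow here.
    }

    // A cap-kill maps to 504; an ordinary evaluation timeout stays 500.
    static int statusFor(RequestContext context) {
        return context.closedByLifetimeCap.get() ? 504 : 500;
    }
}
```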

Why

The committed idle-timer behavior acted like a per-request timeout rather than the documented idle timeout, and there was no ceiling on transaction lifetime at all. Together, these changes give three composable bounds: per-operation (evaluationTimeout), between-operations (idleTransactionTimeout), and whole-transaction (maxTransactionLifetime). This mirrors PostgreSQL's statement_timeout / idle_in_transaction_session_timeout / transaction_timeout.

Notably, the server does not validate these settings or reject begins when bounds are disabled. Instead it ships sane defaults (idle 1 min, lifetime 10 min): a transaction is bounded out of the box, disabling the bounds
is a deliberate operator choice, and a client's per-request timeoutMs is always honored as sent rather than second-guessed.
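
To make the defaults concrete, here is a small sketch of a settings holder. The field names mirror the settings named in this PR and the default values (1 minute, 10 minutes) are the ones stated above; the class itself, the millisecond unit, and the rule that 0 also disables the lifetime cap are assumptions for illustration.

```java
import java.time.Duration;

// Hypothetical settings holder; not the server's actual Settings class.
final class TransactionBoundsSketch {
    final long idleTransactionTimeout; // assumption: milliseconds, 0 disables
    final long maxTransactionLifetime; // assumption: milliseconds, 0 disables

    TransactionBoundsSketch(long idleTransactionTimeout, long maxTransactionLifetime) {
        this.idleTransactionTimeout = idleTransactionTimeout;
        this.maxTransactionLifetime = maxTransactionLifetime;
    }

    // Defaults stated in the PR: idle 1 min, lifetime 10 min.
    static TransactionBoundsSketch defaults() {
        return new TransactionBoundsSketch(
                Duration.ofMinutes(1).toMillis(),
                Duration.ofMinutes(10).toMillis());
    }

    boolean idleTimerEnabled() {
        return idleTransactionTimeout > 0;
    }

    boolean lifetimeCapEnabled() {
        return maxTransactionLifetime > 0;
    }

    // A client-supplied timeoutMs is used as sent; the server value is only a fallback.
    static long evaluationTimeoutFor(Long clientTimeoutMs, long serverEvaluationTimeoutMs) {
        return clientTimeoutMs != null ? clientTimeoutMs : serverEvaluationTimeoutMs;
    }
}
```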

Review guide

UnmanagedTransaction: the executor swap. The single-thread executor is now a ThreadPoolExecutor(1,1) subclass purely to expose beforeExecute/afterExecute + the queue, which drive the suspend-while-busy logic. The key invariant: submitted tasks must not be wrapped. submit() returns the same FutureTask, so the eval-timeout / cap cancel(true) interrupts the real work.

Concurrency on the idle/cap timers. maybeScheduleIdleTimer re-checks accepting after arming (so a concurrent close() can't be raced into re-arming a dying transaction); the in-flight op is tracked as a single immutable Running(future, context) pair so the cap never flags one operation's Context while interrupting another's future.

Ownership split (intentional asymmetry). The idle timer lives in UnmanagedTransaction (it must see the executor hooks); the lifetime cap is scheduled/cancelled by TransactionManager (a fixed schedule tied to registry membership). The cap is armed after putIfAbsent so it can never fire into an unregistered transaction and leak a thread; destroy() cancels it on every close path.

close() ordering is still load-bearing: manager.destroy() before executor.shutdown(), graceful shutdown() (not shutdownNow()). The cap path reuses this exact path; verify it's unchanged.

Error mapping: cap-kill → 504 TransactionException via a Context.closedByLifetimeCap flag set before the interrupt; ordinary eval timeout still → 500. Both the eval-timeout writer and formErrorResponseMessage are cap-aware so the code is correct regardless of which thread writes the response first.

Tests: deterministic timer behavior is unit-tested via a virtual-clock ManualScheduledExecutorService (no Thread.sleep flakiness); integration tests assert the guarantee actually made (transaction reclaimed / subsequent 404), since timing can't reliably catch the mid-op interrupt.
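
The virtual-clock testing idea can be illustrated with a much-simplified stand-in. The PR's ManualScheduledExecutorService is not shown here, and this sketch does not implement `ScheduledExecutorService`; `ManualClockScheduler`, `Handle`, and `advance` are hypothetical names. Time only moves when the test calls `advance`, so timer behavior is deterministic and needs no sleeping.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical virtual-clock scheduler for deterministic timer tests.
final class ManualClockScheduler {
    static final class Handle {
        private final long dueAtMs;
        private final long seq;
        private final Runnable task;
        private boolean cancelled;

        private Handle(long dueAtMs, long seq, Runnable task) {
            this.dueAtMs = dueAtMs;
            this.seq = seq;
            this.task = task;
        }

        void cancel() {
            cancelled = true;
        }
    }

    private long nowMs;
    private long nextSeq;
    // Ordered by due time, then by scheduling order for ties.
    private final PriorityQueue<Handle> pending = new PriorityQueue<>(
            Comparator.comparingLong((Handle h) -> h.dueAtMs).thenComparingLong(h -> h.seq));

    Handle schedule(Runnable task, long delayMs) {
        Handle handle = new Handle(nowMs + delayMs, nextSeq++, task);
        pending.add(handle);
        return handle;
    }

    // Moves virtual time forward and runs every task that becomes due,
    // including tasks scheduled by other tasks within the same window.
    void advance(long ms) {
        long target = nowMs + ms;
        while (!pending.isEmpty() && pending.peek().dueAtMs <= target) {
            Handle next = pending.poll();
            nowMs = next.dueAtMs;
            if (!next.cancelled) {
                next.task.run();
            }
        }
        nowMs = target;
    }

    long nowMs() {
        return nowMs;
    }
}
```

A test would schedule an idle timer at 60000 ms, call `advance(59_999)` and assert nothing fired, then `advance(1)` and assert the callback ran exactly once.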

VOTE +1

xiazcy · 2026-06-27T19:16:24Z

VOTE +1

Cole-Greer · 2026-06-29T17:04:04Z

VOTE +1

The idle timeout was armed on request arrival rather than when the transaction went idle, so a single operation running longer than the timeout tripped it mid-execution, contradicting the documented promise that active transactions are unaffected. A long operation should be bounded by evaluationTimeout; the idle timer should only reclaim abandoned transactions. The per-transaction executor is now a ThreadPoolExecutor(1,1) whose before/afterExecute hooks suspend the idle timer while work runs and re-arm it only once the worker parks with an empty queue. This gives a reliable running-vs-idle signal without wrapping submitted tasks, which would break the evaluation-timeout interrupt that relies on cancelling the real FutureTask. transactionTimeout is renamed to idleTransactionTimeout to reflect its actual meaning (renamed outright as the feature is unreleased), and now honors 0 as "disabled" to match its documentation.
Assisted-by: Claude Code:claude-opus-4-8

The idle timeout only reclaims transactions that go quiet; a client could still hold a transaction (and its dedicated worker thread and concurrency slot) open indefinitely with a single long operation or a keep-alive drip. maxTransactionLifetime bounds total transaction age regardless of activity: when it fires it interrupts the running operation and rolls the transaction back, so the in-flight client gets a transaction-timeout (504) rather than a misleading evaluation-timeout error. Rather than validate timeout configuration and fail begins (or silently override a client's timeoutMs) when bounds are disabled, the server ships sane defaults instead: idle reclamation at 1 minute and a lifetime cap at 10 minutes. A transaction is bounded out of the box, disabling the bounds is a deliberate operator choice, and a per-request timeoutMs is always honored as sent rather than second-guessed on the client's behalf.
Assisted-by: Claude Code:claude-opus-4-8

kenhuuu added 2 commits June 30, 2026 12:21

kenhuuu force-pushed the tx-idle-to branch from 53e5ba1 to a5e94a6 on June 30, 2026 19:23

kenhuuu merged commit 0a149b8 into master Jun 30, 2026
47 of 48 checks passed

kenhuuu deleted the tx-idle-to branch June 30, 2026 20:23
